CLI dies because of mac_addr mismatch #3469
Comments
Some more information: after fixing the mac_addr in `multipassd-vm-instances.json`, the VM still doesn't work as it should though:
But it works using lxc:
So it looks like there's some other state that's out of sync or not properly initialised (either in the VM or in multipass).
Hi @nielsreijers, thanks for taking the time to investigate this and report. May I ask whether you did something from the LXD command line to make those MACs change? Thanks!
Not that I'm aware of. I'm pretty new to both LXD and multipass, so there's really not a lot I could do. The context is that I was trying to spin up a VM to try Sunbeam and got into all sorts of issues with juju getting stuck, no networking because of the iptables issue, and the multipass CLI hanging. I'm new to all of these (really enjoying the process of learning them), so I was messing around quite a bit, and I probably killed a few processes rather ungracefully when things got stuck. I really don't know much about LXD and learned just enough to be able to check that multipass was creating the VMs and that they were indeed running, but I have no idea what I could have done from the command line to change the MAC address. I did create and delete several VMs with the same name, and hard-killed multipassd a number of times when things weren't responding, so I suspect that's where things got out of sync somehow. Unfortunately, I haven't been able to recreate it. I'm travelling at the moment, but I'll have another go at it when I'm back home in a week and a half or so.
Hey @luis4a0! Could you please follow up on this one? Thanks!
Hi @luis4a0 and @townsend2010. I'm finally back from a long trip and will have some free time to spend on fun/useful projects over the next weeks. I'll have another go at reproducing the issue for sure, but I'd also like to get involved in fixing this or other issues in multipass if that's possible — provided I can figure out what's going wrong, of course. But besides the question of how it got into this inconsistent state, it would also be nice if multipass handled the situation a bit more gracefully, don't you think?
Describe the bug
Using the lxd driver, after trying to start one of my VMs, the CLI dies when executing a number of commands, while some others still work.
The cause turned out to be a mismatch in `mac_addr` between `/var/snap/multipass/common/data/multipassd/lxd/multipassd-vm-instances.json` and the actual address in lxd.

`/var/snap/multipass/common/data/multipassd/lxd/multipassd-vm-instances.json` contains:

`lxc network list-leases --project multipass mpbr0` shows:

Note how the mac address is identical for the mptest VM, but differs for the openstack VM.
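The consistency check at issue here can be illustrated with a short sketch (this is not multipass source code; the file layout and lease fields are simplified assumptions, and the sample MACs are made up to mirror the report):

```python
import json

def find_mac_mismatches(instances_json, leases):
    """Return names of VMs whose recorded mac_addr matches no lxd lease."""
    instances = json.loads(instances_json)
    leased_macs = {lease["hwaddr"].lower() for lease in leases}
    return [
        name
        for name, spec in instances.items()
        if spec["mac_addr"].lower() not in leased_macs
    ]

# Sample data mirroring the report: mptest's recorded MAC matches a lease,
# openstack's does not (the MACs themselves are invented for illustration).
recorded = json.dumps({
    "mptest":    {"mac_addr": "52:54:00:aa:bb:01"},
    "openstack": {"mac_addr": "52:54:00:aa:bb:02"},
})
leases = [
    {"hostname": "mptest",    "hwaddr": "52:54:00:aa:bb:01"},
    {"hostname": "openstack", "hwaddr": "52:54:00:cc:dd:03"},
]
print(find_mac_mismatches(recorded, leases))  # ['openstack']
```

A check along these lines, run at daemon startup, could flag (or repair) the kind of drift seen here before any command hangs on it.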
The VM does start correctly, as shown by `sudo lxc list --project multipass`, but because of the `mac_addr` mismatch many commands stop working, including a simple `multipass ls`. I think any command that requires multipassd to ssh into the broken VM would fail.

For those commands the lxd driver tries to determine the VM's IP by querying the lxd leases and looking for the VM's mac address. For example, for the `ls` command the call path is:
`list` in `daemon.cpp` -> `get_all_ipv4` -> `ssh_hostname` -> `ip_address_for` -> `get_ip_for`

`get_ip_for` queries the lxd leases and can't find the mac address, because the VM's actual mac is different from the address it's looking for. `ip_address_for` then keeps retrying for two minutes and fails with a timeout exception.

I think there are two problems here:
To Reproduce
Unfortunately, I haven't been able to reproduce it using multipass commands yet (still trying though). But manually editing `lxd/multipassd-vm-instances.json` should reproduce it.

Expected behavior
a) I would expect `mac_addr` in `lxd/multipassd-vm-instances.json` to match the mac address in lxd.

b) In case it doesn't, I would expect multipass to handle the error more gracefully: either we may be able to find the IP in a different way, in which case everything should just work, or, if not, the CLI should notify the user that there's a problem with one of the VMs but still allow the user to work normally with everything but the offending VM.
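The graceful behaviour suggested in (b) could look roughly like this (a minimal sketch; `resolve_ip` and the placeholder error string are hypothetical, not part of multipass):

```python
def list_instances(vms, resolve_ip):
    """Build ls-style rows, degrading per-VM instead of dying wholesale."""
    rows = []
    for name in vms:
        try:
            rows.append((name, resolve_ip(name)))
        except TimeoutError:
            # One broken VM shouldn't take the whole command down;
            # report it and keep listing the others.
            rows.append((name, "UNKNOWN (mac_addr mismatch?)"))
    return rows

def resolve(name):
    # Stand-in resolver: the openstack VM's MAC never appears in the leases.
    if name == "openstack":
        raise TimeoutError("no lease for recorded MAC")
    return "10.199.64.235"

for name, ip in list_instances(["mptest", "openstack"], resolve):
    print(name, ip)
```

With this shape, `multipass ls` would still show mptest normally and mark openstack as broken, rather than timing out for everyone.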
Logs
This is the most relevant part from `journalctl --unit snap.multipass.multipassd` when it's started with `--verbosity debug`. There's a lot more output of course, but this shows multipassd querying the lxd leases and getting a different mac than the one it's looking for. This process then repeats several times until it times out.

Additional info
`multipass version`

```
multipass   1.13.1
multipassd  1.13.1
```
`multipass info`

```
Name:           mptest
State:          Running
IPv4:           10.199.64.235
Release:        Ubuntu 22.04.4 LTS
Image hash:     da10d667adf5 (Ubuntu 22.04 LTS)
CPU(s):         2
Load:           0.00 0.00 0.00
Disk usage:     1.6GiB out of 48.4GiB
Memory usage:   216.0MiB out of 7.7GiB
Mounts:         --

Name:           openstack
State:          Stopped
IPv4:           --
Release:        --
Image hash:     da10d667adf5 (Ubuntu 22.04 LTS)
CPU(s):         --
Load:           --
Disk usage:     --
Memory usage:   --
Mounts:         --
```
`multipass get local.driver`

```
lxd
```
Additional context
There may be some other issues at play here. I'm quite new to multipass. Potentially a really great tool, but it's been quite unstable on my machine unfortunately.
I'm pretty sure this is an issue, and at times it worked exactly as I expected given my current understanding. As long as my openstack VM was stopped, everything worked fine. After a `multipass start openstack`, the CLI broke for many commands, but `lxc` showed the VM did start. And stopping the VM with `lxc` brought the multipass CLI back to life.

However, as I'm typing this, there seems to be at least one other issue, since stopping the offending VM in `lxc` didn't work, and my `multipass ls` has been unresponsive for several minutes now — much longer than the 2-minute timeout exception I got previously. So at the moment things seem to be going wrong at a different level.