Error when collecting sosreport from live environment: Could not enumerate network devices: [Errno 2] No such file or directory: '/mnt/sys/class/net'
#3307
Comments
I'm not sure what the ask is here? Does sos run from that point on, and just not collect network device information? Does it spew out a traceback and exit? You mention you'd expect it to exit but it's not clear what the behavior you're seeing is after the error. If it exits on that error, doesn't that match your expectation? |
Hello,
It hangs. It shows this message and doesn't proceed further: Could not enumerate network devices: [Errno 2] No such file or directory: '/mnt/sys/class/net'
No traceback is shown, and sos does not exit. |
Ah, ok. It looks like that is percolating up from I'm working on something locally and hope to have a PR for testing/review before too long. |
Although, that being said - I'm not sure why it is hanging on you. Can you post the results (pastebin ideally) of |
Hello, I attempted the simplest command with
Is that enough information to proceed? I can provide more information on my test setup if that would be helpful. |
Wow, ok...that's not what I was expecting, but it does at least confirm it's within I'm going to try and have a PR up later today for testing that should avoid the exception that gets trapped and prints that message - which in turn should hopefully break us out of this. I still don't know why it's hanging on you there, though. |
That sounds good. I do get alerts when this gets replies, so let me know if I can help. =/ |
If sos is being used in a live environment to diagnose an issue, using sysroot can cause the network device enumeration via /sys/class/net crawling to fail. This will be the case for systems that do not use `nmcli`. When in a live environment, network devices will not be under `/$sysroot/sys/class/net` but the "regular" path for the booted environment. Similarly, if sos is being run in a container that is properly configured, network devices will appear under `/sys/class/net` and not (necessarily) under the sysroot path that mounts the host's filesystem. As such, disregard a configured sysroot when enumerating network devices by crawling `/sys/class/net`, and trap any exceptions that may percolate up from this in edge case environments. Closes: sosreport#3307 Signed-off-by: Jake Hunsaker <jacob.r.hunsaker@gmail.com>
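The approach described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `enumerate_net_devices` and its `sysfs_net` parameter are hypothetical, and the only behaviour taken from the message is that the "regular" `/sys/class/net` is crawled regardless of any sysroot and that exceptions are trapped rather than allowed to percolate up.

```python
import os


def enumerate_net_devices(sysfs_net="/sys/class/net"):
    """Return a sorted list of network device names found under sysfs_net.

    Hypothetical helper for illustration only. The path deliberately does
    not get a sysroot prefix, since in a live environment or a container
    the devices live under the booted system's own /sys.
    """
    try:
        return sorted(os.listdir(sysfs_net))
    except OSError as err:
        # Trap the failure (e.g. ENOENT) so the caller can carry on
        # without network device information instead of aborting.
        print(f"Could not enumerate network devices: {err}")
        return []
```

With a sketch like this, a missing directory yields an empty device list plus a warning, and the caller decides whether to continue without network data.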
I've opened #3313 for this. Upon checking further, I don't believe there's an actual use case where using sysroot for this check would actually be valid, so I've removed it entirely so we should always check the "regular" Please give this a try, and let us know if this resolves your scenario. If so, we can likely include this in 4.5.6 which is closing tomorrow. |
Okay, I'll add that to my TODO list. Thank you. =) |
Sorry for the delay. I ran the PR, the error is gone, but Steps:
Then, trying again with
Which also hangs. Then, I tried to
It starts to hang here:
So it looks like it's not releasing a mutex properly? At any rate, I hope that's helpful. Take care. --P |
With my limited knowledge of
We lack timestamps, but the The hung sosreport should generate a directory
Could you please provide us that file to understand the phase of sosreport run when it got stuck? Or ideally re-run with better
to have |
Okay, schedule permitting, I'll carry out those steps. Thanks. |
As requested I ran the command and killed the process after letting it hang for a few moments:
Then, strangely, I could not find the temp files...
But I did find some in
Though looking at the content I'm not sure they will be helpful:
The |
Sigh, I assumed some more file content in the
text, If my understanding of Until @TurboTurtle got a different idea, could you get
(this happens when Also @TurboTurtle: does it make sense to add some debug logging to this pre-setup phase, to diagnose this type of issue more easily next time? (Or is this issue too isolated to sacrifice microseconds of each and every sos run for that?) |
I'm not opposed to more debug logging, but I'm curious what would be helpful here. Also, the fact that this only occurs in a rescue environment is puzzling. I don't have a better idea off the top of my head than drilling down with a coredump, unfortunately. |
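As one possible alternative to a full coredump, Python's standard `faulthandler` module can print the stack of every thread in a hung process. The sketch below is generic Python and not something sos ships: `enable_hang_diagnostics` is a hypothetical helper, and it assumes a Unix system where `SIGUSR1` is available and that the call could be placed early in a local test copy of the program.

```python
import faulthandler
import signal
import sys


def enable_hang_diagnostics(timeout=None, stream=None):
    """Arrange for thread stack dumps from a possibly hung process.

    Hypothetical helper for illustration only.
    - `kill -USR1 <pid>` dumps the stacks of all threads to `stream`.
    - If `timeout` (seconds) is given, the stacks are also dumped once
      after that long, without terminating the process.
    `stream` must be a real file object (it needs a file descriptor).
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.register(signal.SIGUSR1, file=stream, all_threads=True)
    if timeout is not None:
        faulthandler.dump_traceback_later(timeout, exit=False, file=stream)
```

Once enabled, sending `SIGUSR1` to the stuck process would show which call each thread is blocked in, which may be enough to tell a lock problem from a blocked read.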
This should not have been closed, re-opening. I'm guessing the original |
Hi Pavel, just to clarify, did you want me to try this part?
Thanks. |
Hello,
Last two options require |
Here's one part:
|
Then the core dump:
see: |
I assume you mean https://en.wikipedia.org/wiki/Tracing_(software)?
Then after several runs, a pattern emerged:
I'm guessing that the plugins are run in different threads, and that either: (1) There's some issue with Looking forward to your reply. |
Hi Jake, I wanted to check on the status of this. I have been keeping tabs on it, as using sosreport from a live system would help me complete the KB(s) I'm writing. Thanks. |
Scenario: Writing a KB on a possible scenario where a customer has a system that does not boot normally, but can be booted into a live environment. For troubleshooting it is desirable to collect information from the host system, while in the live environment.
The issue was observed when the live environment was booted, the host root partition was mounted on `/mnt`, and `sos report` was used in the following forms:
sudo sos report -a --all-logs --sysroot=/mnt --chroot=always --estimate-only
sudo sos report -a --all-logs --sysroot=/mnt --estimate-only
sudo sos report -a --all-logs --sysroot=/mnt --chroot=always
sudo sos report -a --all-logs --sysroot=/mnt
Which resulted in the error:
Could not enumerate network devices: [Errno 2] No such file or directory: '/mnt/sys/class/net'
Looking more closely at the `/mnt/sys/class/net` directory, it was found that the `/mnt/sys` directory was empty. This is to be expected when booting from a live environment.
Expected behaviour: I would expect that sosreport might error on finding `/mnt/sys` empty, and exit. Or continue with some other workaround; for example, falling back to using the live-system kernel. But when executing the above commands, `sos` hangs and will not exit unless killed.
Additional information:
Live environment:
Host environment:
Host is a VM created through libvirt.