Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow logs from non-main threads to be safely logged to the debug file #3917

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

daj-code
Copy link

@daj-code daj-code commented Apr 3, 2021

Why this change?

Robot Framework currently silently drops all logs from non-main threads, as documented in the user guide. This can make it difficult to diagnose test failures when multiple threads are involved, even with the use of the robotbackgroundlogger.

This change makes all logs logged to the Robot logger (directly or via the standard Python logging module) be logged in the debug file (when it is used), including from non-main threads. These logs to the debug file are logged at the time that they are logged with the relevant logger, rather than having to be explicitly logged to file at a later point, as you have with the robotbackgroundlogger, so it makes it much easier to quickly have a consistent view of logs from all relevant entities in the test in the right order. To help with this, I have made it so that logs to the debug file explicitly add the thread name to the log line to distinguish the different threads now logging to this file.

To protect the debug file from getting output written inconsistently from different threads (and multiprocesses), I have made the debug file use a multiprocessing lock to protect write access. This is enabled by default, but I have added a parameter to the relevant classes so that users can override this lock protection if needed (e.g. because they never run threads in Robot and it was found to be slowing down tests for some reason).

This change doesn't change the behaviour for any log files other than the debug file.

Relevant testing

I have added a testcase to non_main_threads_logging.robot which verifies that logs from non-main threads (as well as the main thread) are written to the debug file without clobbering each other. I have also ensured that all existing tests still pass locally.

@tysonite
Copy link
Contributor

Personally, I very like this change. We also have such cases when we need to log smth from multiple threads. Right now, we are stick to robotbackgroundlogger, but it would be really nice if its functionality migrated to core Robot. However we are mostly using HTML log rather than debug file, thus I would personally love to see it may accept log records from multiple threads too.

Not sure about the reasoning why it can't be done, may be some performance penalties are present. If this is so, behavior of logging from multiple threads (e.g. enable/disable) can be made configurable via, for example, some method of robot logger.

@pekkaklarck
Copy link
Member

Sorry for missing this PR earlier. On a very quick changes look reasonable but also a bit too risky for RF 4.1.1 that's currently under development. I'd like to return to this when we start RF 5.0 development and it's safer to do bigger changes. At that point we could also think could we enable logging from non-main threads in general. It might not be too much more complicated than special casing debug files.

Could you please submit a separate issue about this? We use them for release planning and having and thus having an issue is important. I also tend to follow new issues more closely than PRs (but I should get better at the latter as well as this PR demonstrates).

Somewhat related to this change, I'd like to rewrite the internal logging functionality in Robot altogether. The current system is a mess and I'm honestly impressed you were able to add new functionality to it. I'm not sure is RF 5.0 the right version where to do such bigger changes, though. There are other somewhat big internal changes needed in that release and making the release overly big isn't a good idea.

@daj-code
Copy link
Author

Sorry for the delay in getting back to you on this @pekkaklarck - thanks for the comments, I've now raised #4092 to track the issue here. Please update the issue as appropriate to better reflect what you'd like it to track.

I did have a look at how it might be able to be done more generally, but as you commented, the current implementation was quite tricky to navigate, so I didn't immediately see a clear way to do so, without also risking mainline performance (e.g. due to lock contention introduced by adding locks on each log file) or needing a substantial rewrite. That was the main reason I chose to just implement this for the debug file initially - it seemed like a positive step I could make to improve things here, which was least likely to negatively impact other users.

I'm about to push a catch-up merge which resolves the (very minor) merge conflict on this PR, but otherwise I'm happy to leave this with you for the time being to decide what's best to do with it. If you do decide you want this PR, I'm happy to handle any review comments at that time, if that would be helpful.

@pekkaklarck
Copy link
Member

At least at the current form PR is too complicated. A big reason why the code is pretty complicated is that our logging code is a mess, but I believe it could nevertheless be enhanced somehow. A bigger problem is that I don't really understand what tests are doing.

Related to the functionality itself, I don't think it's a good idea to log the name of the thread when the main thread is logging. Using threads is a special case and logging the main thread name just adds noise in the common case.

Let me know if you are still interested in this, and we can discuss how this should be improved to make it good enough to be merged.

@daj-code
Copy link
Author

Thanks for the feedback @pekkaklarck. I'm still interested in this, and happy to try and make changes as you suggest.

It should be quite straightforward to not log the thread name for the main thread and I agree that that makes sense.

Regarding the tests, I was trying to test:

  • non-main threads' logs don't get logged when the debug file is not in use (this is the existing testcase Log messages from non-main threads should be ignored)
  • non-main threads' logs get logged to the debug file when the debug file is in use, and only the debug file in this case (this is covered in the new testcase Log messages from non-main threads should be written to debug file)
  • logs from different threads running simultaneously don't overwrite or interleave each other in the debug file (this is also covered in the new testcase Log messages from non-main threads should be written to debug file)
    • i.e. you can have:
    Thread-1 log
    Thread-2 log
    Thread-2 another-log
    Thread-1 another-log-again
    
    but don't get
    Thread-1 lo
    Thread-2 lgog
    Thread-2 another
    Thread-log-1 another-log-again
    

The basis for both tests was the same, so I commonised some of the logic into keywords (e.g. Check Log Messages In Main Testcase) and re-structured the test suite so that each test ran the base test with the relevant arguments (one with and one without the debug file enabled). The base test calls into atest/testdata/test_libraries/ThreadLoggingLib.py to generate log messages, so this was enhanced to drop specifically identifiable logs from each thread, which could then be explicitly checked in the test. These log messages were deliberately made to be reasonably large with newlines, so that we can test for overwriting/interleaving of log messages in the debug file. They were also deliberately logged at two different levels to check that high level logs didn't slip through into the main log files. The new test for this PR uses the new keywords in the test to create the expected log message strings that can then be checked for in the debug file (that was generated by running the base test which should have logged them).

I appreciate that it's probably not very easy to understand these currently (particularly coming back to this now), so please let me know how I can help to improve and clarify these.

@michalisv
Copy link

hi @pekkaklarck , @daj-code

may i ask what's the status of this PR ? Is this solution - or something similar - planned for some future release of the framework? My use case is pretty much identical - i'm running tests against multiple devices all of which produce logs, and i'd like to be able to log them into the RF debug file during execution so that they are interleaved with the keyword output, making it easier to detect errors and correlate them with the execution stage.

@pekkaklarck
Copy link
Member

Sorry for dropping the ball on this. If someone is interested to continue with this PR, or possibly create a new one, there would be still time to get it into RF 7.0. If continuing wit this PR, the first step would be resolving conflicts. I can then do a more thorough review.

@michalisv
Copy link

thanks for the feedback Pekka - i'll checkout RF7 and see what can be done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants