Add thread safety checks to async_create_task #116339

Merged
merged 13 commits into from Apr 28, 2024

bdraco
Member

@bdraco bdraco commented Apr 28, 2024

Proposed change

Calling async_create_task from a thread other than the event loop thread almost always results in a fast crash. Most internals already use async_create_background_task or other task APIs, so the performance impact of adding a check here is small. Since this is the API integrations seem to get wrong the most, add a thread safety check here.
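To illustrate the kind of check being described, here is a minimal sketch. It is not the code in homeassistant/core.py: `MiniCore`, `verify_loop_thread`, and `create_task_checked` are hypothetical names, and the sketch assumes the object is constructed on the event loop thread.

```python
import asyncio
import threading


class MiniCore:
    """Hypothetical stand-in for an object that owns an event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self.loop = loop
        # Assumption: the constructor runs on the event loop thread.
        self.loop_thread_id = threading.get_ident()

    def verify_loop_thread(self, what: str) -> None:
        """Raise if the caller is not on the event loop thread."""
        if threading.get_ident() != self.loop_thread_id:
            raise RuntimeError(
                f"{what} called from a thread other than the event loop thread"
            )

    def create_task_checked(self, coro):
        """Create a task, refusing calls that come from other threads."""
        self.verify_loop_thread("create_task_checked")
        return self.loop.create_task(coro)


async def work() -> str:
    return "done"


async def demo() -> tuple[str, str]:
    core = MiniCore(asyncio.get_running_loop())

    # Called on the loop thread: the check passes and the task runs.
    ok = await core.create_task_checked(work())

    def from_worker() -> str:
        coro = work()
        try:
            core.create_task_checked(coro)
        except RuntimeError as err:
            coro.close()  # avoid a "coroutine was never awaited" warning
            return type(err).__name__
        return "no error"

    # Called from an executor thread: the check rejects the call.
    rejected = await core.loop.run_in_executor(None, from_worker)
    return ok, rejected


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

From a worker thread, the standard library's `asyncio.run_coroutine_threadsafe(coro, loop)` is one safe way to hand a coroutine to the loop instead.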

Reasoning: eager_start is now the default, so any thread-unsafe operation starts right away, and mistakes (generally in custom components) that call from the wrong thread have a higher impact. Crashing quickly is nicer than crashing randomly, because a fast crash lets the mistake be found while random crashes make for hard-to-find issues. Still, in the short term we need a way to catch these mistakes in integrations (custom components), even if there is a performance cost.

We turned on asyncio debug in April 2024 in the dev containers in the hope of catching some of the issues that have been reported. It will take a while to get all the issues fixed in custom components.

In 2025.5 we should guard the verify_event_loop_thread check behind the hass.config.debug flag, since long term we don't want to run this check in production environments because it is a performance hit. For context, the run time of verify_event_loop_thread is slightly faster than asyncio.get_running_loop(), so while the performance hit isn't large, it's still undesirable.
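A rough sketch of what such a guard could look like follows. The `Config` class, the standalone `verify_loop_thread` function, and `call_from_worker` are hypothetical and only mirror the idea of skipping the check unless a debug flag is set; they are not the Home Assistant implementation.

```python
import threading
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical config object; only the debug flag matters here."""

    debug: bool = False


def verify_loop_thread(config: Config, loop_thread_id: int, what: str) -> None:
    """Check the calling thread only when debug mode is enabled."""
    if not config.debug:
        return  # production path: skip the check entirely
    if threading.get_ident() != loop_thread_id:
        raise RuntimeError(
            f"{what} called from a thread other than the event loop thread"
        )


def call_from_worker(config: Config, loop_thread_id: int) -> str:
    """Run the check from a separate thread and report what happened."""
    outcome: list[str] = []

    def target() -> None:
        try:
            verify_loop_thread(config, loop_thread_id, "create_task")
        except RuntimeError:
            outcome.append("rejected")
        else:
            outcome.append("not checked")

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return outcome[0]


if __name__ == "__main__":
    main_id = threading.get_ident()  # stands in for the loop thread id
    print(call_from_worker(Config(debug=True), main_id))
    print(call_from_worker(Config(debug=False), main_id))
```

The trade-off is the one described above: with the flag off, a wrong-thread call is no longer caught, in exchange for not paying for the check on every call.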

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

Calling async_create_task from a thread almost always results in a
fast crash. Since most internals are using async_create_background_task
or other task APIs, and this is the one integrations seem to get wrong
the most, add a thread safety check here.
@bdraco bdraco added this to the 2024.5.0 milestone Apr 28, 2024
@bdraco bdraco requested a review from a team as a code owner April 28, 2024 13:38
@bdraco bdraco marked this pull request as draft April 28, 2024 13:38
@bdraco bdraco marked this pull request as ready for review April 28, 2024 14:45
Member

@balloob balloob left a comment

Let's give it a shot to see how many things we catch.

Btw, one thing I was wondering is whether verify_event_loop_thread should include a suggestion of which method to use instead (for another PR).

@balloob balloob merged commit 164403d into dev Apr 28, 2024
38 checks passed
@balloob balloob deleted the async_create_task_thread_safety branch April 28, 2024 22:29
@bdraco
Member Author

bdraco commented Apr 28, 2024

Btw, one thing I was wondering is whether verify_event_loop_thread should include a suggestion of which method to use instead (for another PR).

I was thinking of having it emit a URL and creating pages on the developer site that explain what to do instead for each one. Before I do that, I need to move all the checks that currently trigger on async fire in the other registries so they trigger sooner and give a useful suggestion.
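One possible shape for that idea, purely as a sketch: a lookup table that maps the misused API to an alternative and a documentation anchor. The base URL is a placeholder (no such pages are confirmed in this thread), `Hint`, `HINTS`, and `thread_safety_message` are hypothetical names, and the `hass.create_task` alternative is an assumption about which thread-safe API would be suggested.

```python
from dataclasses import dataclass

# Placeholder URL: the real developer-site pages are not named in this thread.
DOCS_BASE = "https://example.invalid/asyncio-thread-safety"


@dataclass(frozen=True)
class Hint:
    """What to suggest when a loop-only API is called from another thread."""

    alternative: str  # assumed thread-safe replacement
    anchor: str  # hypothetical anchor on the docs page


HINTS: dict[str, Hint] = {
    # Assumption: hass.create_task is the thread-safe counterpart.
    "hass.async_create_task": Hint("hass.create_task", "hassasync_create_task"),
}


def thread_safety_message(what: str) -> str:
    """Build an error message, adding a suggestion and URL when one is known."""
    message = (
        f"Detected that {what} was called from a thread "
        "other than the event loop thread."
    )
    hint = HINTS.get(what)
    if hint is None:
        return message
    return f"{message} Use {hint.alternative} instead. See {DOCS_BASE}#{hint.anchor}"


if __name__ == "__main__":
    print(thread_safety_message("hass.async_create_task"))
    print(thread_safety_message("some.other_api"))
```

Keeping the suggestions in one table would make it straightforward to add an entry per API as each documentation page is written.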

balloob pushed a commit that referenced this pull request Apr 29, 2024
* Add thread safety checks to async_create_task

Calling async_create_task from a thread almost always results in a
fast crash. Since most internals are using async_create_background_task
or other task APIs, and this is the one integrations seem to get wrong
the most, add a thread safety check here.

* missed one

* Update homeassistant/core.py

* fix mocks

* one more internal

* more places where internal can be used

* internal one more place since this is high volume and was already eager_start
@github-actions github-actions bot locked and limited conversation to collaborators Apr 30, 2024