-
-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep observation data valid for 60 min and retry with no data for nws #117109
Conversation
Hey there @kamiyo, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
Just to be certain, the intent is to NOT tag this for a patch release? |
I mean it could, but this one is quality of life upgrade only. Which is why I intended it to do in a new monthly release. If you feel it can go in a patch, Im okay with that too. Edit: the others were definitely fixes to existing issues and were more critical. This one just adds some nice to have feature. |
Getting these myself, which shows the usefulness of this addition. I think I need to make this draft for now anyway as the errors could be handled better in the coordinator: [traceback shortened]
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/python/lib/python3.12/concurrent/futures/_base.py", li
ne 401, in __get_result
raise self._exception
File "/mnt/data/home-assistant/venv/lib/python3.12/site-packages/
tenacity/_asyncio.py", line 50, in __call__
result = await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/home-assistant/venv/lib/python3.12/site-packages/
pynws/simple_nws.py", line 262, in update_observation
raise NwsNoDataError("Observation received with no data.")
pynws.nws.NwsNoDataError: Observation received with no data. |
I see that is common to catch custom errors and wrap with |
Now that I have a potential addition for catching the error better, the server goes back up! |
Error in log is much better now:
|
Hmmm. I did find a bug in pynws that would interact with this PR, but it is only for the forecasts and not the observations that you are showing. The bug only means that this PR would cause the caching of the forecast data to get overwritten when no data is returned by NWS. This would result in behavior similar to dev branch. I will fix this and bump here as we are already bumping it here. Will mark as draft. This looks like NWS server flakiness to me, which tends to fluctuate over time. Some time periods it is solid, some time periods there are frequent drop outs. The only way to find out with the current logging is to monitor the API calls and data. Set pynws logger to If these drop outs are for no data, then in the current dev branch, you would have had |
pynws version bumped again to fix forecasts caching particularly with retries. Added PR release notes to description. This won't affect the lack of data in observations as noted above. |
I noticed the API worked in Chrome, but not via the lib. The no data problem seems to have gone away if I change the headers Maybe I'm just getting lucky though... |
This will likely be a temporary fix until the server starts messing up the cache for this new user agent again. This has happened in the past on another end point on this server. See weather-gov/api#492. They explicitly request developers not to use cache-busting techniques however (I'm not suggesting that you are proposing this). My plan is to gather some key raw data and open a ticket. I have confirmed in the past that our request interval would be nowhere near frequent enough to cause a problem (in regards to use terms, but practically I'm not sure?). |
One thought for feedback... We keep track of the last successful update times (including whether data is returned) in the |
Most significant changes in 8b99d0e include using a custom In #117533, it was observed that getting no data in HA correlated with getting 2+ hr old data from a simpler I'm opening this for review now. We could consider wrapping the forecast coordinators into this change too. Pending feedback, I opted to not include this since it would make this PR even larger and there is already caching done in |
If there are no breaking changes in the library bump, please do that in a separate PR first. Thanks! |
|
One unresolved pain point in this implementation is that if the first update fails, it will wait a full 10 minutes to make another update request. To keep the PR less unwieldy as it is already getting very lengthy, I would propose this for a future PR. |
I'll put this on my production system in a bit |
We can't fix the server not returning valid data for 90 minutes without cache busting, which is against the terms of use. This PR brings extra robustness in the HA integration while still marking the observations as invalid when applicable. The current state in the dev branch will continue to show data > 1 hr old, and users will not be able to know this. |
This might be a separate PR, but should we add a sensor for the data being stale? |
The observation data will obviously be stale in this PR when the data is marked as "unavailable" in the entity. A sensor that holds the time of the observation sounds fine to me in a separate PR. Unfortunately the forecast data model isn't the best in HA in regards to this and I'm planning on opening an issue about it. |
with patch("homeassistant.components.nws.utcnow") as mock_utc: | ||
with patch("homeassistant.components.nws.coordinator.utcnow") as mock_utc: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we use FrozenDateTimeFactory
instead of patching utcnow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to keep PR semi-manageable. This was changed to accommodate the change in location, not a change in functionality. I agree that these tests needs to be updated in the future.
Proposed change
Retry when no data is returned during an update. Use a custom
DataUpdateCoordinator
to mark data as unavailable after 60 minutes for observations.Unlike recent PRs for nws, this one should go in a new minor release.
Type of change
Additional information
Checklist
ruff format homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
..coveragerc
.To help with the load of incoming pull requests: