-
-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more prometheus labels (including area, device) #113849
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍 |
Hey there @knyar, mind taking a look at this pull request as it has been labeled with an integration ( Code owner commandsCode owners of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
Most of the tests check label values, so you will need to adjust those as well.
@@ -345,6 +361,11 @@ def _labels(state: State) -> dict[str, Any]: | |||
"entity": state.entity_id, | |||
"domain": state.domain, | |||
"friendly_name": state.attributes.get(ATTR_FRIENDLY_NAME), | |||
"area": state.attributes.get(ATTR_AREA_ID), | |||
# TODO: how to get zone here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just removed zone
for now. We can address that in another pr (if the need arises, I'm not using it currently)
"area": state.attributes.get(ATTR_AREA_ID), | ||
# TODO: how to get zone here? | ||
# "zone": state.attributes.get(ATTR_AREA_ID), | ||
"entity_id": state.object_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is called object_id
internally, could you please use object_id
for the label name as well?
@@ -345,6 +361,11 @@ def _labels(state: State) -> dict[str, Any]: | |||
"entity": state.entity_id, | |||
"domain": state.domain, | |||
"friendly_name": state.attributes.get(ATTR_FRIENDLY_NAME), | |||
"area": state.attributes.get(ATTR_AREA_ID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the attribute is unset, the label value should be empty rather than the string None
.
"area": state.attributes.get(ATTR_AREA_ID), | |
"area": state.attributes.get(ATTR_AREA_ID) or "", |
# TODO: how to get zone here? | ||
# "zone": state.attributes.get(ATTR_AREA_ID), | ||
"entity_id": state.object_id, | ||
"device_class": state.attributes.get(ATTR_DEVICE_CLASS), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if device class can be unset, but just in case it is:
"device_class": state.attributes.get(ATTR_DEVICE_CLASS), | |
"device_class": state.attributes.get(ATTR_DEVICE_CLASS) or "", |
Hi! Thank you for the earlier feedback. I added the labels (except for As of result of finding the helper necessary, I did add some simple smoke tests for the helper. Let me know what you think! It's exciting to try and contribute back to a project I use so much. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the labels (except for
zone
) and also updated the unit tests. It ended up being a huge effort to update the unit tests, they are unintentionally brittle for label changes. I refactored things into a helper class. I know that kind of reduces the legibility of a single test and abstracts a bit, but it should make other label changes much easier going forward.
Thank you!
I don't mind refactoring tests, but I would advise against combining such refactoring in a single PR with other changes.
At the moment it's hard to see the impact of the code change you have made, because all tests have changed as well. Coupling two changes like this makes reviewing (and potentially reverting) each of them more difficult.
May I suggest splitting test refactoring into a separate PR? I don't mind which of the two you'd like to discuss first. I shared some feedback on the specifics of the test refactoring here, and will be happy to discuss more in a separate issue or PR.
@@ -344,6 +359,9 @@ def _labels(state: State) -> dict[str, Any]: | |||
"entity": state.entity_id, | |||
"domain": state.domain, | |||
"friendly_name": state.attributes.get(ATTR_FRIENDLY_NAME), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're changing all the metrics, we should probably fix friendly_name="None"
:
"friendly_name": state.attributes.get(ATTR_FRIENDLY_NAME), | |
"friendly_name": state.attributes.get(ATTR_FRIENDLY_NAME) or "", |
@@ -116,6 +119,25 @@ async def generate_latest_metrics(client): | |||
return body | |||
|
|||
|
|||
@pytest.mark.parametrize("namespace", [""]) | |||
async def test_metrics_labels_list( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test checks that two lists defined in two different modules have the same values. Do you have a bug or a failure condition in mind that would be caught by this test and not by any of the other tests we have?
MetricsTestHelper._perform_sensor_metric_assert( | ||
"homeassistant_sensor_temperature_celsius", | ||
"12.3", | ||
"Outside Temperature", | ||
"outside_temperature", | ||
body, | ||
device_class=SensorDeviceClass.TEMPERATURE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the diff here, I think for an individual assertion the new version is less legible and clear.
The previous version was very simple: it confirmed that a particular string was present in a list.
In the new version, usage of positional arguments makes it difficult to understand what those arguments are. Perhaps the helpers should accept a dict
with label keys & values, to make it more obvious what is being checked? E.g.:
assert_metric_present(body, name="homeassistant_sensor_temperature_celsius", value=12.3, labels=dict(
entity="sensor.outside_temperature",
domain="sensor",
friendly_name="Outside Temperature",
object_id="outside_temperature",
device_class="temperature",
))
"Radio Energy", | ||
"radio_energy", | ||
body, | ||
device_class=SensorDeviceClass.POWER, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please use strings for checking label values instead of referring to constants in other modules?
"""Helper methods for prometheus tests.""" | ||
|
||
|
||
class MetricsTestHelper: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this class? It seems like all of its methods can just be separate functions in this module.
device_class=device_class, | ||
metric_value=metric_value, | ||
) | ||
if positive_comparison: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more convenient to provide separate assertion functions (e.g. assert_metric_present
vs assert_metric_absent
) rather than make the user set a bool.
device_class=None, | ||
): | ||
device_class_label_line = cls.get_device_class_label_line(device_class) | ||
final_metric_value = f" {metric_value}" if metric_value else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my comment about positive_comparison
, perhaps assertions without a value should be handled by separate functions.
'humidifier_target_humidity_percent{domain="humidifier",' | ||
'entity="humidifier.humidifier",' | ||
'friendly_name="Humidifier"} 68.0' in body | ||
MetricsTestHelper._perform_humidifier_metric_assert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These device_class specific assertion methods seem like an unnecessary layer of abstraction. The _perform_humidifier_metric_assert
is only used a handful of times here - it might be simpler to use a generic function, or have a helper function that's defined closer to where it's used.
I'm willing to split this into 2 prs and keep test refactoring and logic changes separate. How about let's focus on test refactoring first, because simply adding "area" as a label to prometheus will break every single test and I'd love to address that first (I'm also interested in a later pass to include the new tagging and labeling features in Home Assistant in prometheus metrics as well, but let's discuss that later). Maybe I can make this PR the test refactoring PR since most of the work is around that. And I break out the logic changes into a follow up PR? Thanks so much for the responses so far! I'm really looking forward to having these features in my Home Assistant set up. |
That sounds good! I've left a few comments about tests in my previous message. In general I think we should be careful about over-abstraction here: the purpose of these tests is to verify that exported metrics have the format that we expect, and to allow folks to understand and debug them I would suggest erring on the side of keeping them simple & explicit even if they seem verbose. Do you have a list of goals for this refactoring or problems you've identified and want to fix? I think this might be helpful to describe (even if at a high level). For example, if you find the current approach of hand-crafting full metric strings annoying, I think an alternative might be to parse metric output and examine it in a more structured way. One thing that I find pretty inconvenient in our current testing approach is that most tests are broken into two functions: a fixture that creates Home Assistant entities, and the test function itself. These two functions are located in different parts of the same file, making it difficult to understand and modify a single test. Many fixtures are only used once. Perhaps moving fixtures closer to the test functions, or maybe combining them in some cases would be simpler? Not sure what's common here in Home Assistant ecosystem - I've not looked at how other components structure their testing, but I think it's worth examining and trying to be consistent with if we are making large changes here. |
So, long term goal: I wanted to add 2/3 labels here and ended up needing to adjust ~800 lines of tests. But the tests themselves are pretty good. So just hoping that we can have test helpers that are still legible and human-readable but flexible enough to make prometheus label changes reasonable going forward. I do agree that I overcomplicated this first pass. I do see what you mean about fixtures and stuff. I also saw your comment about following patterns elsewhere-- it seems this project is so big there's not exactly one way of doing things. But this approach did seem to make sense for the most part, except I definitely think I over-abstracted the refactor. |
Proposed change
I was inspired by a closed pull request here: #85553 to add some more labels
Type of change
Additional information
Checklist
ruff format homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
..coveragerc
.To help with the load of incoming pull requests: