-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tf: update Elasticsearch alerting #1600
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good; I think we need to take a second look at grouping and it might be easier to do this after updating the google TF provider
condition_absent = "300s" | ||
min_group_by = "metric.label.es_cluster" | ||
}, | ||
"es-cluster-available-shards-${var.environment}" = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to see this go; I wonder if we ought to add (in a follow up so not blocking) an alert for the shard size growing beyond a certain size. That would be a comparable alert to this one we are removing for growth in search content we may not have noticed
threshold_value = 1 | ||
duration = "60s" | ||
threshold_value = 0 | ||
duration = "900s" | ||
condition_absent = "300s" | ||
min_group_by = "metric.label.es_cluster" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was somewhat struggling to confirm my suspicion but I suspect after inverting the COMPARISON_LT
to COMPARISON_GT
this should become max_group_by
(and same logic for the red
check)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look's like the provider has undergone a major version bump beyond where we are at (5.29.1) and then looking here https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy is seems like the format may have changed somewhat in general for grouping to become more standard with the other apis and we might now need to us the https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/monitoring_alert_policy#cross_series_reducer for this logic.
These need to go out before we update the Google Terraform provider |
This patch is a no-op on production and will make these changes on staging |
whoops, this was wrong - I'd not clocked the final commit in this PR also touched production |
No description provided.