-
Notifications
You must be signed in to change notification settings - Fork 780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make TSDB max exemplars config per tenant #5080
Conversation
Signed-off-by: sahnib <sahnib@amazon.com>
Signed-off-by: sahnib <sahnib@amazon.com>
Thanks for the contribution!
Can we make it hot reloadable then as it is already reloadable in Prometheus. We can update it in the loop https://github.com/cortexproject/cortex/blob/master/pkg/ingester/ingester.go#L776 here by calling https://github.com/prometheus/prometheus/blob/main/tsdb/head.go#L855. |
I have added the hot reload changes to the PR. TestingRan cortex using docker with a update exemplars time of 10 seconds. Below are logs showing updates happening for a "fake" org_id.
|
Signed-off-by: sahnib <sahnib@amazon.com>
} | ||
|
||
level.Info(logutil.WithUserID(userID, i.logger)).Log("meg", "updating max exemplars configuration.") | ||
err := userDB.db.ApplyConfig(cfg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method currently updates OutOfOrderTimeWindow
and MaxExemplars
value. I could not find usage of OutOfOrderTimeWindow
in ingester. Please point me to where its configured per tsdb if it needs to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That one is fine. The OOO config is still WIP so we can ignore that in this pr.
Signed-off-by: sahnib <sahnib@amazon.com>
pkg/ingester/ingester.go
Outdated
@@ -821,6 +829,31 @@ func (i *Ingester) updateLoop(ctx context.Context) error { | |||
} | |||
} | |||
|
|||
func (i *Ingester) updateTSDBMaxExemplars() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we will support OOO config in the future, can we rename the function to not only updateTSDBMaxExemplars
? Maybe something like updateUserTSDBs
or updateDBConfigs
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about updateUserTSDBConfigs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM!
pkg/ingester/ingester.go
Outdated
@@ -92,7 +94,8 @@ type Config struct { | |||
// Config for metadata purging. | |||
MetadataRetainPeriod time.Duration `yaml:"metadata_retain_period"` | |||
|
|||
RateUpdatePeriod time.Duration `yaml:"rate_update_period"` | |||
RateUpdatePeriod time.Duration `yaml:"rate_update_period"` | |||
MaxExemplarsUpdatePeriod time.Duration `yaml:"max_exemplars_update_period"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be better to use a different config name to not only cover exemplars.
…db config. Signed-off-by: sahnib <sahnib@amazon.com>
Signed-off-by: sahnib <sahnib@amazon.com>
I have made the API renaming changes. |
@@ -173,7 +170,6 @@ func (cfg *TSDBConfig) RegisterFlags(f *flag.FlagSet) { | |||
f.IntVar(&cfg.WALSegmentSizeBytes, "blocks-storage.tsdb.wal-segment-size-bytes", wal.DefaultSegmentSize, "TSDB WAL segments files max size (bytes).") | |||
f.BoolVar(&cfg.FlushBlocksOnShutdown, "blocks-storage.tsdb.flush-blocks-on-shutdown", false, "True to flush blocks to storage on shutdown. If false, incomplete blocks will be reused after restart.") | |||
f.DurationVar(&cfg.CloseIdleTSDBTimeout, "blocks-storage.tsdb.close-idle-tsdb-timeout", 0, "If TSDB has not received any data for this duration, and all blocks from TSDB have been shipped, TSDB is closed and deleted from local disk. If set to positive value, this value should be equal or higher than -querier.query-ingesters-within flag to make sure that TSDB is not closed prematurely, which could cause partial query results. 0 or negative value disables closing of idle TSDB.") | |||
f.IntVar(&cfg.MaxExemplars, "blocks-storage.tsdb.max-exemplars", 0, "Enables support for exemplars in TSDB and sets the maximum number that will be stored. 0 or less means disabled.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alanprot Is it okay to remove an existing flag cause it breaks backward compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shuold keep around for 2 releases as per governance, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Maybe we can do something like #5068
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept the original flag in storage config, and marked it as deprecated.
Further, cortex will fall back to storage config value if exemplars value for the user is set to zero. Discussed these changes with @alanprot offline.
Signed-off-by: sahnib <sahnib@amazon.com>
Signed-off-by: sahnib <sahnib@amazon.com>
Signed-off-by: sahnib <sahnib@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work!
* Make TSDB max exemplars config per tenant - cortexproject#5016 Signed-off-by: sahnib <sahnib@amazon.com> * Fix documentation as per new flags Signed-off-by: sahnib <sahnib@amazon.com> * Enable hot reload of max exemplars value. Signed-off-by: sahnib <sahnib@amazon.com> * Fix goimports in ingester. Signed-off-by: sahnib <sahnib@amazon.com> * Rename config flag, and method to be more generic towards updating tsdb config. Signed-off-by: sahnib <sahnib@amazon.com> * Remove un-intended single process config change. Signed-off-by: sahnib <sahnib@amazon.com> * Remove un-necessary logging. Signed-off-by: sahnib <sahnib@amazon.com> * Add fallback on previous limit in block storage config Signed-off-by: sahnib <sahnib@amazon.com> * Added unit-test cases for fallback logic, and updated docs. Signed-off-by: sahnib <sahnib@amazon.com> --------- Signed-off-by: sahnib <sahnib@amazon.com> Signed-off-by: Alex Le <leqiyue@amazon.com>
Signed-off-by: sahnib sahnib@amazon.com
What this PR does:
Makes TSDB max exemplars config per tenant. Note that the MaxExemplars value is passed down to prometheus tsdb at
tsdb.Open
call, hence this configuration would not be hot loaded at this time - unless the ingesters re-open the database handle, or go through a restart..Which issue(s) this PR fixes:
Fixes #5016
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]