Receiver: cache matchers for series calls #7353

Open · wants to merge 2 commits into main from store-proxy-cache-matchers
Conversation

@pedro-stanaka (Contributor) commented May 13, 2024

Summary

We have tried caching matchers before with a time-based expiration cache; this time we are trying an LRU cache.

We saw some of our receivers busy compiling regexes, with high CPU usage similar to the profile of the benchmark I added here:

(screenshot: CPU profile showing time spent compiling regexes)

Benchmark results

Result on store-proxy-cache-matchers:

```
BenchmarkProxySeriesRegex-11    1545795    768.7 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1548040    769.4 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1545019    778.3 ns/op    1144 B/op    19 allocs/op
BenchmarkProxySeriesRegex-11    1539387    771.1 ns/op    1144 B/op    19 allocs/op
```

Result on main:

```
BenchmarkProxySeriesRegex-11     130292    8803 ns/op    10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     124045    8533 ns/op    10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     125092    8712 ns/op    10288 B/op    78 allocs/op
BenchmarkProxySeriesRegex-11     120110    8676 ns/op    10288 B/op    78 allocs/op
```

The results indicate that the "store-proxy-cache-matchers" branch considerably outperforms the "main" branch in all observed aspects of the BenchmarkProxySeriesRegex function: it is roughly 10 times faster in execution time, uses about 9 times less memory, and makes about 4 times fewer allocations per operation. These improvements suggest significant optimizations in the regex handling in the "store-proxy-cache-matchers" branch compared to the "main" branch.
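For reference, a hedged sketch of what a benchmark along these lines might look like; the matcher value and setup are illustrative, not the PR's exact `BenchmarkProxySeriesRegex`, which exercises the proxy store's Series path:

```go
func BenchmarkProxySeriesRegex(b *testing.B) {
	// A regex matcher forces a regexp compile inside MatchersToPromMatchers on
	// every call unless the compiled matcher is cached.
	ms := []storepb.LabelMatcher{
		{Type: storepb.LabelMatcher_RE, Name: "pod", Value: "thanos-receive-.*"},
	}

	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := storepb.MatchersToPromMatchers(ms...); err != nil {
			b.Fatal(err)
		}
	}
}
```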

Changes

  • Added a matcher cache for `MatchersToPromMatchers` and a new version of the function which uses it (a hedged sketch follows below).
  • The main change is in the `matchesExternalLabels` function, which now receives a cache instance.
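A minimal sketch of the idea, assuming an LRU from hashicorp/golang-lru keyed by a string derived from the protobuf matcher. The names `MatchersCache`, `NewItemFunc`, and `GetOrSet` mirror the PR, but the constructor signature, the key encoding, and the body are illustrative, and the PR's metrics wiring is omitted:

```go
package storecache

import (
	lru "github.com/hashicorp/golang-lru/v2"
	"github.com/prometheus/prometheus/model/labels"

	"github.com/thanos-io/thanos/pkg/store/storepb"
)

// NewItemFunc builds the compiled matcher on a cache miss.
type NewItemFunc func() (*labels.Matcher, error)

type MatchersCache struct {
	cache *lru.Cache[string, *labels.Matcher]
}

func NewMatchersCache(size int) (*MatchersCache, error) {
	c, err := lru.New[string, *labels.Matcher](size)
	if err != nil {
		return nil, err
	}
	return &MatchersCache{cache: c}, nil
}

// key encodes type, name, and value so equal matchers hit the same entry.
func key(m storepb.LabelMatcher) string {
	return m.Type.String() + "\x00" + m.Name + "\x00" + m.Value
}

// GetOrSet returns the cached compiled matcher, or builds and stores it on a
// miss, so regexes are compiled once instead of on every Series call.
func (c *MatchersCache) GetOrSet(m storepb.LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error) {
	k := key(m)
	if cached, ok := c.cache.Get(k); ok {
		return cached, nil
	}
	item, err := newItem()
	if err != nil {
		return nil, err
	}
	c.cache.Add(k, item)
	return item, nil
}
```

The PR itself also counts requests and hits via Prometheus counters (see `c.metrics.requestsTotal` in the diff further down); that wiring is left out here for brevity.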

Verification

  • I have added tests for the change, as well as new benchmarks.

@pedro-stanaka marked this pull request as ready for review May 13, 2024 08:13
@pedro-stanaka marked this pull request as draft May 13, 2024 08:14
@pedro-stanaka force-pushed the store-proxy-cache-matchers branch 2 times, most recently from 0528b9c to a58508d, May 13, 2024 09:29
@pedro-stanaka changed the title from "Receivers|Store: cache matchers for series calls" to "Receiver: cache matchers for series calls" May 13, 2024
@pedro-stanaka force-pushed the store-proxy-cache-matchers branch 2 times, most recently from 23c786a to 34e4852, May 13, 2024 12:02

Commits (all signed off by Pedro Tanaka <pedro.tanaka@shopify.com>):

  • adding matcher cache and refactor matchers (co-authored by Andre Branchizio <andre.branchizio@shopify.com>)
  • Using the cache in proxy and tsdb stores (only receiver)
  • fixing problem with deep equality
  • adding some docs
  • Adding benchmark
  • undo unnecessary changes
  • Adjusting metric names
  • adding changelog
  • wiring changes to the receiver
  • Fixing linting
@pedro-stanaka marked this pull request as ready for review May 14, 2024 13:05
@GiedriusS (Member) left a comment


> The results indicate that the "store-proxy-cache-matchers" branch considerably outperforms the "main" branch in all observed aspects of the BenchmarkProxySeriesRegex function: it is roughly 10 times faster in execution time, uses about 9 times less memory, and makes about 4 times fewer allocations per operation. These improvements suggest significant optimizations in the regex handling in the "store-proxy-cache-matchers" branch compared to the "main" branch.

Was this AI generated? 😄


```go
func (c *MatchersCache) GetOrSet(key LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error) {
	c.metrics.requestsTotal.Inc()
	if item, ok := c.cache.Get(key); ok {
```

I suggest using singleflight here to reduce allocations even more
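A minimal sketch of how that could look, extending the string-keyed cache sketched in the summary above and using `golang.org/x/sync/singleflight`; illustrative, not the PR's code:

```go
import "golang.org/x/sync/singleflight"

type MatchersCache struct {
	cache *lru.Cache[string, *labels.Matcher]
	sf    singleflight.Group
}

func (c *MatchersCache) GetOrSet(m storepb.LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error) {
	k := key(m)
	if cached, ok := c.cache.Get(k); ok {
		return cached, nil
	}
	// Concurrent misses for the same key share a single newItem call,
	// so only one goroutine compiles and allocates the matcher.
	v, err, _ := c.sf.Do(k, func() (interface{}, error) {
		item, err := newItem()
		if err != nil {
			return nil, err
		}
		c.cache.Add(k, item)
		return item, nil
	})
	if err != nil {
		return nil, err
	}
	return v.(*labels.Matcher), nil
}
```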

```diff
@@ -973,6 +986,8 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
 		"about order.").
 		Default("false").Hidden().BoolVar(&rc.allowOutOfOrderUpload)
 
+	cmd.Flag("matcher-cache-size", "The size of the cache used for matching against external labels. Using 0 disables caching.").Default("0").IntVar(&rc.matcherCacheSize)
```

Should we add this to other components as well like Thanos Store?
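If so, the wiring would presumably mirror the receiver flag above; a hedged sketch for the store component, where the `sc.matcherCacheSize` field is hypothetical:

```go
cmd.Flag("matcher-cache-size", "The size of the cache used for matching against external labels. Using 0 disables caching.").
	Default("0").IntVar(&sc.matcherCacheSize)
```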

```go
		tms []*labels.Matcher
		err error
	)
	if cache != nil {
```

Maybe we could put *storepb.MatchersCache behind an interface to avoid this if cache != nil { ... } else { ... } everywhere?
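A minimal sketch of that suggestion; the interface and no-op type names are illustrative and assume the LRU-backed type is adapted to satisfy the same interface:

```go
// MatchersCache hides whether compiled matchers are actually cached.
type MatchersCache interface {
	GetOrSet(m storepb.LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error)
}

// NoopMatchersCache always rebuilds the matcher, so callers can drop the
// `if cache != nil { ... } else { ... }` branches and always go through the interface.
type NoopMatchersCache struct{}

func (NoopMatchersCache) GetOrSet(_ storepb.LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error) {
	return newItem()
}
```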

```go
	}
}

func NewMatchersCache(opts ...MatcherCacheOption) (*MatchersCache, error) {
```

Maybe we can just use pkg/cache/inmemory.go? It's another LRU implementation that already exists in the tree.

Labels: none yet · Projects: none yet · Linked issues: none yet · 2 participants