
Allow Content-Encoding zstandard / zstd for scraping metrics #13866

Open
mrueg opened this issue Apr 1, 2024 · 3 comments

mrueg commented Apr 1, 2024

Proposal

Right now, Prometheus supports scraping metrics either uncompressed or via gzip.

I created arbitrary metrics via avalanche:

docker run -p 9001:9001 quay.io/prometheuscommunity/avalanche:main  --metric-count=50000
curl localhost:9001 > foo.metrics

When compressing these via the command line, it looks like zstd has a bit higher CPU usage, but a better Wall time.

time gzip -k foo.metrics
gzip -k foo.metrics  0,97s user 0,03s system 99% cpu 1,000 total
time zstd -k foo.metrics
zstd -k foo.metrics  0,13s user 0,05s system 151% cpu 0,113 total

At the same time, the zstd-compressed metrics come out about 27.5% smaller than the gzip-compressed ones:

compression    size    duration to compress
uncompressed   216M    -
gzip           4M      1.0s
zstd           2.9M    0.113s

From this perspective, it might be promising for Prometheus to support zstd content-encoding for scrapes as well: Prometheus could offer "zstd, gzip" in its Accept-Encoding header and fall back to gzip (or even uncompressed) if the scrape target does not support zstd.

A possible candidate for this could be kube-state-metrics, which creates a lot of metrics with many labels that should compress well with zstd.

Unrelated to the Prometheus ecosystem, it might be interesting to see whether HTTP-based benchmarks appear, since Chrome will support content-encoding with zstd as well: https://chromestatus.com/feature/6186023867908096


bboreham commented Apr 9, 2024

> higher CPU usage, but a better Wall time

Why would you choose those, for scraping?


mrueg commented Apr 10, 2024

> higher CPU usage, but a better Wall time
>
> Why would you choose those, for scraping?

In total this results in fewer CPU cycles spent (utilization × wall time = CPU time), so it should be more efficient. In the synthetic case above the compressed output is also smaller, so it should save some network transfer as well.

I have some initial results here: prometheus/client_golang#1496

bboreham commented

Oh you meant higher utilization?
