
v1.12.0 exposes some high cardinality metrics #967

Closed
prymitive opened this issue Jan 19, 2022 · 30 comments · Fixed by #974 or #1033

@prymitive
Contributor

It seems that v1.12.0 exposes some extra high-cardinality metrics, most likely auto-generated from Go's runtime/metrics package via #955.

Here's a full diff of metrics between 1.11 and 1.12 - https://gist.github.com/prymitive/3d16b79a8187448bf6e0d61db7336ca0

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

@kakkoyun
Member

Thanks for reporting. It seems like we need to think more about default bucket configuration.
@mknyszek what do you think?

@kakkoyun
Member

cc @bwplotka

@beorn7
Member

beorn7 commented Jan 20, 2022

Yeah, that's rather extreme for default metrics. Perhaps those could be opt-in at that granularity.

I would also like to change that before releasing the final Prometheus v2.33.

@mknyszek
Contributor

Yeah, the histograms in runtime/metrics are large by some standards. The time-related ones have pretty high granularity, but it's fairly straightforward to downsample and make them exponentially bucketed. Maybe some code to check a threshold on bucket granularity and then downsample if necessary?

FWIW I think it would be nice to keep the allocs-by-size and frees-by-size as-is -- they're bounded by the number of size classes (currently; that can change but likely won't change drastically), so that's 136 total. Is that too much? Perhaps that's why MemStats.BySize wasn't included initially.

@mknyszek
Contributor

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

Out of curiosity, what kind of problems do you expect?

@beorn7
Member

beorn7 commented Jan 20, 2022

Having so many extra metrics exposed by default might cause problems for people with lots of Go services.

Out of curiosity, what kind of problems do you expect?

If these metrics came with a newly introduced package, it might be fine. Users could do the resource planning for it or decide to drop them on ingestion. However, the problem is that these default Go metrics are in a lot of existing binaries, and all of them will suddenly expose a significantly increased number of metrics after merely bumping the dependencies to the current minor release. This might easily push many users beyond their cardinality budget.

@mknyszek
Contributor

This might easily push many users beyond their cardinality budget.

Not to derail this too hard, but just for my understanding: what does this cardinality budget look like? Memory? Throughput?

@prymitive
Contributor Author

Each unique time series eats ~4KB on average in our production environments. The diff I posted has more than 1k lines. Adding 1k extra metrics adds ~4MB of Prometheus memory per instance for every Go service scraped by Prometheus (plus some memory for exposing those metrics in those services). 4MB might not be much but it adds up quickly.
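For illustration, here's a quick back-of-envelope sketch of how that adds up (the 500-target count below is just an example for the sake of the math, not our actual fleet size):

```go
package main

import "fmt"

func main() {
	const (
		bytesPerSeries = 4 * 1024 // ~4 KiB of Prometheus memory per unique series (estimate above)
		extraSeries    = 1000     // roughly the number of new series in the posted diff
		goTargets      = 500      // hypothetical number of scraped Go instances, purely illustrative
	)
	perTarget := float64(bytesPerSeries*extraSeries) / (1 << 20)
	fmt.Printf("~%.1f MiB per target, ~%.1f GiB across %d targets\n",
		perTarget, perTarget*goTargets/1024, goTargets)
}
```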

@mknyszek
Contributor

Thanks, that's good to know. I meant to say this earlier, but I'll send a PR to reduce the bucket granularity.

I'm thinking something like a function that iterates over the buckets and merges buckets within either a factor of 2 or a factor of 10 (depending on the metric unit), making the buckets exponential. Does this sound reasonable?
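Something along these lines (a rough sketch only -- the function name and signature here are made up, not what I'd actually commit):

```go
package main

import "fmt"

// downsampleBuckets keeps a bucket boundary only once it is at least
// `factor` times the previously kept boundary, merging everything in
// between. A sketch of the idea, not the client_golang implementation.
func downsampleBuckets(boundaries []float64, factor float64) []float64 {
	if len(boundaries) == 0 {
		return nil
	}
	out := []float64{boundaries[0]}
	last := boundaries[0]
	for _, b := range boundaries[1:] {
		if last <= 0 || b >= last*factor {
			out = append(out, b)
			last = b
		}
	}
	return out
}

func main() {
	// e.g. factor 2 for "bytes" histograms, 10 for "seconds" histograms.
	fine := []float64{1, 1.5, 2, 3, 4, 6, 8, 12, 16, 24, 32}
	fmt.Println(downsampleBuckets(fine, 2)) // [1 2 4 8 16 32]
}
```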

Opt-in to higher granularity can be future work; an exponential distribution is already pretty useful.

@prymitive
Contributor Author

FWIW I think it would be nice to keep the allocs-by-size and frees-by-size as-is -- they're bounded by the number of size classes (currently; that can change but likely won't change drastically), so that's 136 total. Is that too much? Perhaps that's why MemStats.BySize wasn't included initially.

Is 136 too much?
An example Go service I run with only a few custom metrics exposes ~54 time series on average across all instances. Adding 136 time series to that more than doubles the metrics footprint of client_golang, which is a big jump. It's nowhere near as large as adding over 1k time series, but still significant.
It's probably going to be fine for some people and too much for others. The new metrics can be dropped via relabel rules if needed, but that requires everyone to update their scrape configs.

@mknyszek
Contributor

@prymitive Sure. I can also reduce the impact of this further if there's a way to alias metrics (i.e. just say "here's a metric name, but it's really just this other metric, so use that data instead"), since some existing metrics are re-exported under new names. I can also drop the new names for just those metrics altogether, but that would make for a somewhat worse user experience. Currently, the names of the metrics are generated from the runtime/metrics keys, so there's a clear mapping and it's easy to find a metric from its runtime/metrics key and vice versa.

@beorn7
Member

beorn7 commented Jan 20, 2022

I leave it to @bwplotka and @kakkoyun to make the call on what increase in metric cardinality is fine for a minor release. As I might have said before, this library is used a lot, so we have to navigate with care and, when in doubt, make potentially disruptive features opt-in.

@mknyszek
Contributor

mknyszek commented Jan 20, 2022

For sure. The current state of things is not great. I didn't anticipate that each histogram bucket would be treated as an individual metric, and I also forgot quite how big I made these histograms. 😅

Going back over the list, it looks like at a bare minimum (without removing "duplicate" metrics -- I'd rather not because the new forms of them are more orthogonal and clearer; the divisions and groupings of memory in runtime.MemStats are pretty arbitrary), that's 25 new individual metrics and 4 new histograms.

2 of those new histograms are time histograms. They're high dynamic range histograms which are a combination of exponentially and linearly bucketed. Collapsing those down to be exponentially bucketed, we get 45 buckets each. We can then square the base of the exponential series and cut that down to 23, and then cut out a lot of the higher buckets (the range goes all the way up to O(days), we can just catch everything above 1 second into its own bucket for all of the existing ones). That gets us down to about 15. For the other two histograms, we can just do power-of-two exponential bucketing which collapses the 67 or so buckets down to 13 or 14.
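To make that arithmetic concrete, here's a toy illustration with a made-up base-2 exponential series (not the real runtime/metrics boundaries): squaring the base keeps every other boundary, and capping the range folds everything above 1 second into a final catch-all bucket.

```go
package main

import "fmt"

func main() {
	// Hypothetical exponential boundaries: 1ns * 2^k for k = 0..44 (45 buckets,
	// topping out in the hours range).
	var bounds []float64
	for k := 0; k < 45; k++ {
		bounds = append(bounds, 1e-9*float64(uint64(1)<<k))
	}

	// Squaring the base (2 -> 4) is equivalent to keeping every other boundary.
	var squared []float64
	for i := 0; i < len(bounds); i += 2 {
		squared = append(squared, bounds[i])
	}

	// Cap the range: everything above 1 second falls into one catch-all bucket.
	var capped []float64
	for _, b := range squared {
		if b > 1 {
			break
		}
		capped = append(capped, b)
	}

	fmt.Println(len(bounds), len(squared), len(capped)) // 45 23 15
}
```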

To be clear, this code would be generic, so any new runtime/metrics histogram metrics with unit "bytes" or "seconds" would immediately be minimized. I can add a test that ensures new metrics added via the generator are audited and don't accidentally create a whole bunch of bloat.

Another alternative is a Summary but I have a slight preference against that, just because it implicitly assumes a normal distribution of the underlying data where it may not exist, resulting in potentially misleading information.

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can cut out metrics that already exist in their old form (in some fashion) (EDIT: cut out new metrics that represent the same information as old metrics with a new name and a slightly different form) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it. Earlier I asked about an aliasing mechanism because some of these are a 1:1 mapping, and so we could avoid the storage of both by just having one point to the other's data in the backend.

Side-note: the reasoning behind these huge histograms is that it doesn't cost much (< 10 KiB, I'd guess?) to actually manage these in the runtime, and it's better for the runtime to expose more granular data than less granular data. It's totally reasonable for downstream users to adjust that granularity.

@kakkoyun
Member

kakkoyun commented Jan 21, 2022

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can cut out metrics that already exist in their old form (in some fashion) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it.

First of all, let's not remove old metrics. There are lots of dashboards out there that would break.

And @mknyszek, as you suggested, let's try to aggressively reduce the number of buckets. With a reasonable cardinality increase and better communication, this should be doable.

To be clear, this code would be generic, so any new runtime/metrics histogram metrics with unit "bytes" or "seconds" would immediately be minimized. I can add a test that ensures new metrics added via the generator are audited and don't accidentally create a whole bunch of bloat.

Yes, tests are essential for this case.

Also, as @beorn7 suggested, maybe we should make these additional costly metrics opt-in rather than enabling them implicitly until the next major version bump.

Or, as already discussed, we could start with a really conservative bucket distribution for all of them and increase the granularity via configuration, rather than having an additional collector.

Personally, I would prefer this approach. Otherwise, it would take longer for people to discover these new, useful metrics.

What do you think? @bwplotka

@mknyszek
Contributor

mknyszek commented Jan 21, 2022

So, in total, that's 83 additional metrics assuming we just shrink the histograms as I suggested above. According to @prymitive's back-of-envelope math that's an added on-server bloat of about 400 KiB for this release. Like I said earlier we can cut out metrics that already exist in their old form (in some fashion) and get back an additional 23. Personally I'd prefer being more aggressive about scaling down the histograms than getting rid of the "duplicate" metrics, but if this isn't acceptable then so be it.

First of all, let's not remove old metrics. There are lots of dashboards out there that would break.

Er, very sorry. I was definitely not suggesting that. I recognize how important that is. I can see how what I wrote reads that way, but I meant that we can cut out new metrics that already exist in some form.

And @mknyszek, as you suggested, let's try to aggressively reduce the number of buckets. With a reasonable cardinality increase and better communication, this should be doable.

I have a WIP implementation of the reduced bucketing that results in 10 exponential buckets for 2 of the histograms and 13 for the other 2. That makes for a total of 25+10+10+13+13=71 new metrics. I can have this up for review, with tests, tomorrow.

@bwplotka
Member

Thanks, reducing those histograms is definitely the first step.

What if we make it so a user has 3 options:

  • Have new ones from MemStats (default)
  • Have new with extended buckets and ones from MemStats
  • Have only old ones from MemStats, for those who don't want the extra 71 metrics?

Potentially also a 4th one, without MemStats?

@beorn7
Member

beorn7 commented Jan 21, 2022

Also a general note: I love high-res histograms, but Prometheus doesn't really do those yet. We are working on it, but it will take months before they are generally available (at which point we should definitely offer those awesome metrics for users of the new high-res histograms at fine granularity).

@mknyszek
Contributor

@bwplotka Yeah, that works. My apologies for not just doing this up-front, but now that I understand the constraints, I get where you're coming from.

Looking back over the comment history, if the biggest concern here is that 71 metrics being added for a minor release is too much, then I'm fine with only exporting the old metrics, backed by MemStats, and then flipping to the added 71 metrics on the major release (is that something that's happening relatively soon?).

Finally, how should I expose the options? A new GoCollector API in the collector package that has the options, with the default one just an implementation that chooses one of them? e.g.

```go
func CustomGoCollector(kind GoCollectorKind) Collector
...
func GoCollector() Collector {
    return CustomGoCollector(GoCollectorMemStatsOnly)
}
```

(This is not the exact code I'd write, just a mock-up to express the idea.)

@beorn7 Cool! Looking forward to that. Honestly though I wasn't necessarily expecting the high granularity histograms to always be used in production anyway; this whole issue is basically just because I forgot how big they were, otherwise I would've done the bucket minimization up-front.

@jameshartig

jameshartig commented Jan 23, 2022

I found this issue after seeing this:
[screenshot]

We only have this deployed to 1 service (out of 50+), and 2 of the new histograms are already in the top 3 of our metrics: go_sched_latencies_seconds_bucket and go_gc_pauses_seconds_total_bucket each have over 700 buckets per instance. With over 3000 Go instances, this could easily be millions of metrics if deployed to all of our services.

I definitely agree this is completely unexpected for a minor release.

@beorn7
Member

beorn7 commented Jan 23, 2022

Yeah, I think this deserves a timely bugfix release. Together with #969.

@mknyszek
Contributor

Sent #974 to mitigate the immediate problem. Waiting on advice on the best way to expose multiple options so users may opt out of the additional 79 values introduced in this release (I was off by a few because I forgot about the sum and count dimensions of histograms).

@bwplotka
Member

bwplotka commented Jan 25, 2022

Sorry for the lag @mknyszek and thanks for the quick reaction! Let's aim to release this week.

Finally, how should I expose the options?

I would propose adding a variadic option pattern: go from `func NewGoCollector() prometheus.Collector` to `func NewGoCollector(opts ...GoOption) prometheus.Collector`, where `GoOption` is some `func(o *goOptions)`. Then we can have `WithMemStatsEnabled(bool)` and `WithRuntimeMetricsEnabled(bool)`, both true by default? 🤔 Then also `WithExpandedBuckets` some day?
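A minimal sketch of that shape (the option names are the ones floated in this thread; the final API may differ):

```go
// Package collectors: a sketch of the variadic-option idea above, not the
// actual client_golang implementation.
package collectors

import "github.com/prometheus/client_golang/prometheus"

type goOptions struct {
	memStats       bool // expose the classic MemStats-based metrics
	runtimeMetrics bool // expose the new runtime/metrics-based metrics
}

// GoOption configures the Go collector.
type GoOption func(o *goOptions)

func WithMemStatsEnabled(enabled bool) GoOption {
	return func(o *goOptions) { o.memStats = enabled }
}

func WithRuntimeMetricsEnabled(enabled bool) GoOption {
	return func(o *goOptions) { o.runtimeMetrics = enabled }
}

// NewGoCollector applies the options, with both collections enabled by default.
func NewGoCollector(opts ...GoOption) prometheus.Collector {
	o := goOptions{memStats: true, runtimeMetrics: true}
	for _, opt := range opts {
		opt(&o)
	}
	// Construction of the actual collector from o is omitted in this sketch.
	var c prometheus.Collector
	return c
}
```

With that, `NewGoCollector(WithRuntimeMetricsEnabled(false))` would fall back to only the MemStats-based metrics.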

@mknyszek
Contributor

Given how big #974 has gotten, I'll do the options as a separate change. Sorry for the delay this week, I'm on a rotation.

@beorn7
Member

beorn7 commented Jan 28, 2022

Thank you very much for fixing this quickly.

@kakkoyun are you planning to cut 1.12.1 any time soon? It would help me to cut the final Prometheus v2.33.0 release (the rc.1 has client_golang 1.12.0 with the race condition and the excessive amount of Go metrics).

@kakkoyun
Member

kakkoyun commented Jan 29, 2022 via email

@prymitive
Contributor Author

Thanks for fixing the excessive bucket count.
I see it still adds 80 extra metrics; that's not a small number, but something most of us can live with.
There's only one bamboozle I have after upgrading to v1.12.1:

```
# HELP go_gc_pauses_seconds_total Distribution individual GC-related stop-the-world pause latencies.
# TYPE go_gc_pauses_seconds_total histogram
go_gc_pauses_seconds_total_bucket{le="-5e-324"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-10"} 0
go_gc_pauses_seconds_total_bucket{le="9.999999999999999e-09"} 0
go_gc_pauses_seconds_total_bucket{le="1.2799999999999998e-07"} 0
go_gc_pauses_seconds_total_bucket{le="1.2799999999999998e-06"} 0
go_gc_pauses_seconds_total_bucket{le="1.6383999999999998e-05"} 7598
go_gc_pauses_seconds_total_bucket{le="0.00016383999999999998"} 31961
go_gc_pauses_seconds_total_bucket{le="0.0020971519999999997"} 33919
go_gc_pauses_seconds_total_bucket{le="0.020971519999999997"} 34705
go_gc_pauses_seconds_total_bucket{le="0.26843545599999996"} 34708
go_gc_pauses_seconds_total_bucket{le="+Inf"} 34708
go_gc_pauses_seconds_total_sum NaN
go_gc_pauses_seconds_total_count 34
```

Isn't le="-5e-324" a negative number? Is that a valid le value?

@beorn7
Member

beorn7 commented Feb 3, 2022

It is valid (or in other words: Prometheus histograms do support buckets for negative observations). But in this case, the observed GC pauses should never last a negative amount of time (that would mean the pause ended before it started ;)).

I assume that's just a floating point precision issue, and the bucket boundary should perhaps be adjusted (to zero?).

@mknyszek you will know more about the internals here.

@mknyszek
Contributor

mknyszek commented Feb 3, 2022

@beorn7 is correct. It's the `math.Nextafter` value for zero (toward -Inf). It is unfortunately sometimes possible, due to OS or runtime bugs, for samples to end up in this bucket (probably not for GC pauses, but for scheduling latencies, for example). This bucket is supposed to capture any of those cases, but I think there's a point to be made that it might not be worth surfacing that to the user. I'm fine with leaving it in or absorbing it into the next bucket.
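For reference, that boundary is literally the closest float64 to zero from below:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// The lowest bucket boundary in the exposition above is the nearest
	// representable float64 below zero.
	fmt.Println(math.Nextafter(0, math.Inf(-1))) // -5e-324
}
```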

And also apologies for not getting around to the opt-out API yet, I've gotten distracted with other things.

@bwplotka
Member

bwplotka commented Feb 5, 2022

No worries @mknyszek and thanks for helping!

I created another issue for the opt-out API (#983), so hopefully users like @prymitive can fall back to the old behaviour if needed. Also, help wanted if anyone else wants to contribute this! 🤗

@bwplotka
Member

Hi All.

With the new optionality, I would propose reverting the addition of the new metrics by default. Please voice your opinion here: #1033

bwplotka reopened this Apr 13, 2022
kakkoyun pushed a commit that referenced this issue Apr 13, 2022
… metrics by default. (#1033)

Fixes #967

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
breed808 added a commit to prometheus-community/windows_exporter that referenced this issue May 16, 2022
v1.12.2 removes some high cardinality metrics introduced in v1.12.0.
See prometheus/client_golang#967 for context.

Signed-off-by: Ben Reedy <breed808@breed808.com>
kakkoyun added a commit that referenced this issue Jul 6, 2022
* Cut v1.12.0

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Bump the day

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Make the Go 1.17 collector thread-safe (#969)

* Use simpler locking in the Go 1.17 collector (#975)

A previous PR made it so that the Go 1.17 collector locked only around
uses of rmSampleBuf, but really that means that Metric values may be
sent over the channel containing some values from future metrics.Read
calls. While generally-speaking this isn't a problem, we lose any
consistency guarantees provided by the runtime/metrics package.

Also, that optimization to not just lock around all of Collect was
premature. Truthfully, Collect is called relatively infrequently, and
its critical path is fairly fast (10s of µs). To prove it, this change
also adds a benchmark.

name            old time/op  new time/op  delta
GoCollector-16  43.7µs ± 2%  43.2µs ± 2%   ~     (p=0.190 n=9+9)

Note that because the benchmark is single-threaded it actually looks
like it might be getting *slightly* faster, because all those Collect
calls for the Metrics are direct calls instead of interface calls.

Signed-off-by: Michael Anthony Knyszek <mknyszek@google.com>

* API client: make http reads more efficient (#976)

Replace `io.ReadAll` with `bytes.Buffer.ReadFrom`.
Both need to resize a buffer until they have finished reading;
the former increases by 1.25x each time while the latter uses 2x.

Also added a benchmark to demonstrate the benefit:
name             old time/op    new time/op    delta
Client/4KB-8       35.9µs ± 4%    35.3µs ± 3%     ~     (p=0.310 n=5+5)
Client/50KB-8      83.1µs ± 8%    69.5µs ± 1%  -16.37%  (p=0.008 n=5+5)
Client/1000KB-8     891µs ± 6%     750µs ± 0%  -15.83%  (p=0.016 n=5+4)
Client/2000KB-8    1.74ms ± 2%    1.35ms ± 1%  -22.72%  (p=0.008 n=5+5)

name             old alloc/op   new alloc/op   delta
Client/4KB-8       20.2kB ± 0%    20.4kB ± 0%   +1.26%  (p=0.008 n=5+5)
Client/50KB-8       218kB ± 0%     136kB ± 0%  -37.65%  (p=0.008 n=5+5)
Client/1000KB-8    5.88MB ± 0%    2.11MB ± 0%  -64.10%  (p=0.008 n=5+5)
Client/2000KB-8    11.7MB ± 0%     4.2MB ± 0%  -63.93%  (p=0.008 n=5+5)

name             old allocs/op  new allocs/op  delta
Client/4KB-8         75.0 ± 0%      72.0 ± 0%   -4.00%  (p=0.008 n=5+5)
Client/50KB-8         109 ± 0%        98 ± 0%  -10.09%  (p=0.079 n=4+5)
Client/1000KB-8       617 ± 0%       593 ± 0%   -3.89%  (p=0.008 n=5+5)
Client/2000KB-8     1.13k ± 0%     1.09k ± 0%   -3.27%  (p=0.008 n=5+5)

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Reduce granularity of histogram buckets for Go 1.17 collector (#974)

The Go runtime/metrics package currently exports extremely granular
histograms. Exponentially bucket any histogram with unit "seconds"
or "bytes" instead to dramatically reduce the number of buckets, and
thus the number of metrics.

This change also adds a test to check for expected cardinality to
prevent cardinality surprises in the future.

Signed-off-by: Michael Anthony Knyszek <mknyszek@google.com>

* Cut v1.12.1 (#978)

* Cut v1.12.1

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Apply review suggestions

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Fix deprecated `NewBuildInfoCollector` API

Update `examples/random/main.go`:
  `prometheus.NewBuildInfoCollector` is deprecated. Use `collectors.NewBuildInfoCollector` instead.

Signed-off-by: alissa-tung <alissa-tung@outlook.com>

* gocollector: Added options to Go Collector for changing the (#1031)

* Renamed files.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* gocollector: Added options to Go Collector for different collections.

Fixes #983

Also:

* fixed TestMemStatsEquivalence, it was noop before (:
* Removed gc_cpu_fraction metric completely, since it's not working completely for Go1.17+

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* gocollector: Reverted client_golang v1.12 addition of runtime/metrics metrics by default. (#1033)

Fixes #967

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

* prometheus: Fix convention violating names for generated collector metrics (#1048)

* Fix convention violating names for generated collector metrics

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Add new Go collector example

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Remove -Inf buckets from go collector histograms (#1049)

* Remove -Inf buckets from go collector histograms

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Update prometheus/collectors/go_collector_latest_test.go

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Simplify

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

* Cut v1.12.2 (#1052)

* Cut v1.12.2

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Apply suggestions

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>

Co-authored-by: Kemal Akkoyun <kakkoyun@gmail.com>
Co-authored-by: Michael Knyszek <mknyszek@google.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
Co-authored-by: Kemal Akkoyun <kakkoyun@users.noreply.github.com>
Co-authored-by: alissa-tung <alissa-tung@outlook.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>