
Throughput benchmarks #424

Open · snowp opened this issue Apr 20, 2021 · 11 comments

@snowp (Contributor) commented Apr 20, 2021

Add benchmarks providing some baseline for performance. An example of this would be a producer that publishes new snapshots at a rapid pace while we measure how long it takes for those changes to get sent to the client.

Should at the very least cover the new delta code (as it's the more performant protocol), but could also target SotW (state-of-the-world).

This would help us better understand the impact of larger changes (like #413) and the cost of per-resource computation in delta.
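
A minimal, self-contained sketch of that measurement idea (a toy stand-in using channels, not the real go-control-plane API): a producer publishes versioned updates at a rapid pace while a consumer records how long each one takes to arrive.

```go
package main

import (
	"fmt"
	"time"
)

type update struct {
	version string
	sentAt  time.Time
}

func main() {
	// The channel stands in for the cache -> gRPC stream path to the client.
	updates := make(chan update, 1024)

	done := make(chan struct{})
	go func() {
		defer close(done)
		var total time.Duration
		n := 0
		for u := range updates {
			total += time.Since(u.sentAt) // propagation latency for this update
			n++
		}
		fmt.Printf("avg propagation latency over %d updates: %s\n", n, total/time.Duration(n))
	}()

	// Producer: publish new snapshot versions as fast as possible.
	for i := 0; i < 100000; i++ {
		updates <- update{version: fmt.Sprintf("v%d", i), sentAt: time.Now()}
	}
	close(updates)
	<-done
}
```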


@github-actions (bot)

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label May 20, 2021
@snowp snowp added help wanted and removed stale labels May 20, 2021
@alecholmez (Contributor)

@snowp We definitely want this. Once the delta code lands, I think it'd be worthwhile to spend some time really figuring out the performance gains of delta.

Just an FYI: I introduced this a while ago in #362, but it never got merged. Might be worth revisiting at some point.

@alecholmez (Contributor)

I think I'm going to take this next; we need to test the performance of what we have now. Using the integration test with something like pprof to scrape statistics seems appealing.
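
For reference, one common way to expose those runtime profiles in Go (assuming the benchmark process is allowed to open a local HTTP port) is the stock net/http/pprof handler:

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
	// Profiles can then be scraped with e.g.:
	//   go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```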

@snowp (Contributor, Author) commented Jun 25, 2021

Running on top of the integration tests seems like a good start!

One thing to think about beyond that is how performance differs under different load patterns: lots of streams against the same type, lots of streams against an opaque type, lots of streams with different types, etc. Also worth considering is how the shape of the updates affects each stream: what percentage of streams are getting updates, the ADS case of multiple resources being sent over the same stream, and so on.
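
One way to make those axes concrete is a small parameter struct that a harness iterates over; all names below are hypothetical:

```go
package benchmark

// loadPattern captures the axes mentioned above; a harness would run the
// benchmark once per pattern and report latency/throughput for each.
type loadPattern struct {
	Streams        int     // number of concurrent client streams
	TypesPerStream int     // distinct resource types subscribed per stream
	SharedType     bool    // all streams watch the same type vs. different ones
	UpdateFraction float64 // fraction of streams that receive each update
	ADS            bool    // multiplex multiple resource types over one stream
}
```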

@alecholmez (Contributor)

Excellent, @dougfort and I might iterate on a design doc to see what we can come up with. We'll see what we can capture; there's a gold mine of data here.

@alecholmez (Contributor) commented Jun 29, 2021

@snowp if we intend to use the integration tests for this, is it safe to assume we don't plan on any long-term storage of the throughput data? I was iterating on the design with a few co-workers, and we were curious what you think is a good fit for this situation:

  1. A one-time executable benchmark that simply outputs formatted data for users of this repo to consume
  2. A more complex system that retains data over time

We initially sketched out a mix of 1 and 2, but we don't want to creep the scope into something unnecessary for the data we want to collect.

Here's a quick proposal we drafted, along with a code example of the producer we describe. It just sketches out the concepts behind each component of the benchmark; we can go back and add technical artifacts to it later.

I think we want a system that will at least allow us to run pprof so we can see what's going on at runtime. The Go benchmarking framework is limited, as we've come to learn. If we do end up going with a system that complex, it might even be worth separating it from the integration test entirely and having some sort of standalone test under pkg/test/benchmark.
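
As a sketch of that standalone-harness idea (the runLoad hook is hypothetical), runtime/pprof lets a plain binary capture a CPU profile without going through `go test -bench`:

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// runLoad is a hypothetical hook that would drive snapshot updates
// against a running control-plane server.
func runLoad() {}

func main() {
	f, err := os.Create("cpu.out")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	runLoad()
}
```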

@alecholmez (Contributor)

Want to note that eventually we should use this benchmark to test code changes too: #451

@alecholmez (Contributor)

Just FYI, this PR enables profiling for CPU, lock contention (block), mutex contention, goroutine spawning, and memory usage. It currently runs off the integration test, but to test throughput I want to build a test client that puts the server through its paces. Once that's done, I can profile that code and collect runtime data so we can have some throughput numbers.
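
For reference, a sketch of enabling and dumping those runtime profiles in a standalone harness; the sampling rates shown record every event and are only illustrative:

```go
package main

import (
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// A rate/fraction of 1 records everything; real harnesses would use
	// coarser values to limit overhead.
	runtime.SetBlockProfileRate(1)     // lock contention (block) profile
	runtime.SetMutexProfileFraction(1) // mutex contention profile

	// ... run the benchmark workload here ...

	for _, name := range []string{"block", "mutex", "goroutine", "heap"} {
		f, err := os.Create(name + ".out")
		if err != nil {
			panic(err)
		}
		pprof.Lookup(name).WriteTo(f, 0)
		f.Close()
	}
}
```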

@hiromis commented Aug 24, 2021

Hi @snowp, @alecholmez,

I am new to go-control-plane and am trying to gauge the scope of this issue. I see that Alec has a PR up for benchmarking the integration tests. What can I do to help complete this issue?

I was thinking maybe I could add benchmarking to the SetSnapshot function to measure "the cost of per-resource computation" mentioned in the original description. Would that be enough for a first iteration?
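
As a sketch of that starting point, assuming the v3 cache API (the NewSnapshot/SetSnapshot signatures have changed across releases, so adjust to the version in use):

```go
package benchmark

import (
	"context"
	"testing"

	cachev3 "github.com/envoyproxy/go-control-plane/pkg/cache/v3"
)

func BenchmarkSetSnapshot(b *testing.B) {
	c := cachev3.NewSnapshotCache(false, cachev3.IDHash{}, nil)
	// An empty snapshot keeps the sketch minimal; a realistic benchmark
	// would populate clusters/listeners/etc. here.
	snap, err := cachev3.NewSnapshot("v1", nil)
	if err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if err := c.SetSnapshot(context.Background(), "node-a", snap); err != nil {
			b.Fatal(err)
		}
	}
}
```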

As for #424 (comment), what did you have in mind exactly? I am looking at internal/example as well as the integration tests, and I'm wondering whether having just one proxy running to consume updates would suffice, or whether you were thinking of standing up an environment with multiple proxies and realistic snapshots.

Do let me know how I can contribute!

@alecholmez (Contributor)

@snowp any input here? Hiromi has found that the Go benchmarks aren't particularly accurate for measuring throughput here. Did you have in mind a separate framework that we might need to build out to measure this? I guess we're just looking for some clarification.

@snowp (Contributor, Author) commented Sep 1, 2021

My original thought was to have a system where we simulate continuous updates to the cache and measure how long it takes for those updates to reach clients under an increasing rate of change. Benchmarking small pieces of the code might be beneficial as microbenchmarks that can be optimized independently, but in order to understand the actual throughput of the system we probably need something end to end.
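
A rough sketch of that end-to-end shape, where publishAndMeasure is a hypothetical hook that pushes one cache update and blocks until a subscribed client sees it:

```go
package main

import (
	"fmt"
	"time"
)

// publishAndMeasure is a hypothetical hook: push one snapshot update and
// return how long it took to reach a subscribed client.
func publishAndMeasure() time.Duration {
	return time.Millisecond // placeholder
}

func main() {
	for _, rate := range []int{10, 100, 1000} { // updates per second
		interval := time.Second / time.Duration(rate)
		var worst time.Duration
		for i := 0; i < rate; i++ { // run each step for roughly one second
			if d := publishAndMeasure(); d > worst {
				worst = d
			}
			time.Sleep(interval)
		}
		fmt.Printf("rate=%d/s worst observed latency=%s\n", rate, worst)
	}
}
```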
