
Optimistically cache mapped metadata when cluster metadata is periodically refreshed #725

Merged: 1 commit merged into twmb:master on May 26, 2024

Conversation

@pracucci (Contributor) commented May 9, 2024

In Mimir, we have a process creating one Client (setting `MetadataMinAge = MetadataMaxAge = 10s`) and then using it to do two things:

  1. Consume a partition
  2. Request ListOffsets every 1s

I noticed that this Client issues 2 Metadata requests every MetadataMinAge (remember that we set MetadataMaxAge = MetadataMinAge).

One request is due to the periodic Metadata refresh, and one is due to ListOffsets, which needs Metadata as well. The ListOffsets request calls Client.fetchMappedMetadata(), which caches the metadata for MetadataMinAge, effectively leading to 2 Metadata requests every MetadataMinAge.
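For context, here is a minimal sketch of the setup described above using franz-go's kgo and kadm packages. The broker address, topic, and partition are placeholders, and this is only an illustration of the usage pattern, not Mimir's actual code:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/twmb/franz-go/pkg/kadm"
	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	ctx := context.Background()

	// One client, with metadata min/max age pinned to the same value.
	cl, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"), // placeholder seed broker
		kgo.MetadataMinAge(10*time.Second),
		kgo.MetadataMaxAge(10*time.Second),
		kgo.ConsumePartitions(map[string]map[int32]kgo.Offset{
			"my-topic": {0: kgo.NewOffset().AtStart()}, // placeholder topic/partition
		}),
	)
	if err != nil {
		panic(err)
	}
	defer cl.Close()
	adm := kadm.NewClient(cl)

	// 1. Consume the partition.
	go func() {
		for {
			fetches := cl.PollFetches(ctx)
			fetches.EachRecord(func(r *kgo.Record) { _ = r })
		}
	}()

	// 2. Issue ListOffsets every second; this goes through the end-user
	// request path and therefore through fetchMappedMetadata's cache.
	for range time.Tick(time.Second) {
		if _, err := adm.ListEndOffsets(ctx, "my-topic"); err != nil {
			fmt.Println("ListOffsets failed:", err)
		}
	}
}
```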

In this PR I'm proposing to optimistically cache the mapped metadata when cluster metadata is periodically refreshed.

Notes

  • I'm not an expert on this client, so I may have missed something obviously wrong in this PR.
  • The fetchMappedMetadata() comment says "this is garbage heavy, so it is only used in one off requests in this package". Part of that garbage comes from caching the mapped metadata, which with this change now happens every time metadata is refreshed. Do you see any issue?
  • I'm not a big fan of passing a callback function to storeCachedMappedMetadata(), but I haven't come up with a better idea that doesn't either make the code more complex or introduce more allocations (see the sketch after this list). Thoughts?
  • I've tested it for our use case, and the number of Metadata requests dropped from 2 to 1 per MetadataMinAge.
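The actual change lives in franz-go's internals; purely to illustrate the shape discussed in the notes above (a callback passed to the store function, and an optimistic store on every periodic refresh), here is a heavily simplified, hypothetical sketch. All names and types below are invented for illustration and are not franz-go's real code:

```go
package metadatacache

import (
	"sync"
	"time"
)

// mappedTopic stands in for the per-topic mapping derived from a Metadata
// response; the real structure is richer than this.
type mappedTopic struct {
	partitionLeaders map[int32]int32 // partition -> leader broker ID
}

// cache is a stand-in for the mapped-metadata cache guarded by the client.
type cache struct {
	mu     sync.Mutex
	when   time.Time
	mapped map[string]mappedTopic
	minAge time.Duration // MetadataMinAge
}

// store records a freshly mapped view of the cluster. It takes a callback that
// produces the mapping, mirroring the storeCachedMappedMetadata(callback)
// shape mentioned in the notes above.
func (c *cache) store(now time.Time, build func() map[string]mappedTopic) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.mapped = build()
	c.when = now
}

// load returns the cached mapping if it is still younger than MetadataMinAge;
// otherwise the caller has to issue its own Metadata request.
func (c *cache) load(now time.Time) (map[string]mappedTopic, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.mapped != nil && now.Sub(c.when) < c.minAge {
		return c.mapped, true
	}
	return nil, false
}

// onPeriodicRefresh is the "optimistic" part of the idea: when the client is
// refreshing cluster metadata anyway, it also populates the mapped cache, so a
// ListOffsets issued shortly afterwards hits load() instead of triggering a
// second Metadata request.
func (c *cache) onPeriodicRefresh(now time.Time, fresh map[string]mappedTopic) {
	c.store(now, func() map[string]mappedTopic { return fresh })
}
```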

Commit: Optimistically cache mapped metadata when cluster metadata is periodically refreshed

Signed-off-by: Marco Pracucci <marco@pracucci.com>
@twmb (Owner) commented May 23, 2024

pt 2, "Do you see any issue" -- probably not. I currently take deliberate care within the client to not go through the end-user Request path, so as not to have this secondary caching for the general consumer/producer. Of the 7 requests that use fetchMappedMetadata, 5 are admin requests, and the other two are ListOffsets and OffsetForLeaderEpoch. These latter two are only used while consuming, and I have a fairly large chunk of code in consumer.go dedicated to issuing these requests directly (and handling the responses in a paranoid way). So, the change here to always pipe producer/consumer metadata responses through the separate fetchMappedMetadata cache will increase memory.

I'll look through the code change / callback more when I get time soon; the past two months have been busier than I was expecting, but getting through issues and PRs is at the top of my priority list at the moment.

@twmb twmb added TODO minor and removed TODO labels May 26, 2024
@twmb (Owner) commented May 26, 2024

Good enough, we'll see if somebody even notices a minor memory bump. +1!

@twmb twmb merged commit 49e70fb into twmb:master May 26, 2024
4 checks passed
@twmb twmb removed the minor label May 26, 2024
@pracucci pracucci deleted the improve-mapped-metadata-cache branch May 27, 2024 08:11
@pracucci (Contributor, Author) commented:

Thanks for the review! Feel free to revert (or ping me) if anyone complains about it.
