-
Notifications
You must be signed in to change notification settings - Fork 962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tags merge optimization #4959
base: main
Are you sure you want to change the base?
Tags merge optimization #4959
Conversation
Hello @jonatan-ivanov! Could you please review this MR? Thanks a lot in advance! |
Hello @shakuzen, could you please review this merge request when you have a chance? Thank you in advance! |
Thanks for the pull request. It's on our list of things to review. |
Thanks! Is there anything I can do to facilitate review, like maybe separating MR into 2 smaller independent MRs: one with benchmark and follow up with refactoring? So it may be easier to review? |
Hello there! May I kindly ask what is the status of review of this PR? May I somehow facilitate review making it happen any time soon? Thanks a lot in advance! |
32ced6c
to
dfbdde3
Compare
18a1aed
to
6b96b4d
Compare
6b96b4d
to
e7a42c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the work on this. It's very appreciated. We just haven't had a chance to give it the attention it deserves yet. I'll be busy traveling for a conference through the end of next week, but I should be able to properly review and test this after that. I've left some initial quick feedback.
private Tags(Tag[] sortedSet, int length) { | ||
assert isSortedSet(sortedSet, length) : "bug on caller side: construction invariant violated"; | ||
this.sortedSet = sortedSet; | ||
this.length = length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not understanding why we would take a length parameter here instead of get the length from the sortedSet
array. I would guess it is mostly for the assertion effectively checking the logic of the dedup
method. If that's the case, see my other comment about the assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constructor now assumes that the first length
elements of sortedSet
array are ordered and unique - so basically sortedSet[0..length]
is a sorted set of tags. This is enforced by "construction", I mean constructor is private and other factory methods guarantee that arguments passed into sortedSet
+ length
satisfy sorted set property.
This constructor now used not only after sort
+ dedup
calls which guarantee that sorted set property of sortedSet[0..length]
holds, but also from several other places like creating Tags
from two other Tags
objects when two ordered sets are merged such that merged sortedSet[0..length]
must also be ordered set.
assert
here is just an extra safety measure to make sure the enforcement by "construction" is additionally checked on non-production runs like tests. I'm quite confident there is no bugs in merge procedure so I can remove this assert.
Arrays.sort(this.tags); | ||
dedup(); | ||
private Tags(Tag[] sortedSet, int length) { | ||
assert isSortedSet(sortedSet, length) : "bug on caller side: construction invariant violated"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any other Java asserts in our code now. I think it'd be better to consider introducing them separate from the performance optimization proposed by this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe then also split this PR into one which adds only benchmark without modifying Tags
and another which does optimisation inside Tags
?
I've observed unexpectedly significant CPU time spent in Tags merge function in one of the production services which rely on micrometer:
I've slightly refactored internals of
Tags
class to take advantage of tags set is being already sorted array, which could be then combined with other sorted set in linear time.I've also added microbenchmark for tags concat operation to see the performance gain before I can test this change in production.
Below I'm providing results from
Apple M2 Pro
+JDK 21
machine:Before:
Full JMH output (before)
After:
Full JMH output (after)
Part of #5140.