Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations by using same UnsafeAttributes in Instrumenter.onEnd as onStart. #11092

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

johnbley
Copy link
Member

This idea takes the UnsafeAttributes allocated in onStart, stores it in the context (with the start time), and then reuses it in onEnd.

  • I audited all 72 AttributeExtractor.onEnd implementations and found no issues with passing in the same object as onStart. There were 34 that did nothing with the parameter, 28 that simply did puts, 1 funny one that did a remove (HttpMethodAttributeExtractor), 7 that were just tests, and 2 that existed just to proxy type transforms.
  • I audited the OperationListeners too and they all got a lot simpler from this (in addition to further reducing allocations). Many of them were storing the starting attributes and then doing a lot of copying to make a new combined start+end attributes. Only one (HttpServerExperimentalMetrics) specifically needed to keep the original onStart attributes on their own, and his code was adjusted.
  • In the analysis, even Instrumenters without OperationListeners (i.e., not an http or rpc client or server library) still reduced allocations, since the UnsafeAttribute is more expensive to allocate than the context storage for itself.
  • ./gradlew :benchmark-overhead-jmh:jmh shows that gc.alloc.rate.norm drops from 41119 B/op to 40453 B/op (666 thrash GC bytes per op saving).

The only funny edge case to me is if, somehow, the context storage doesn't work. In that scenario, the starting attributes are "lost" and the operation listeners can't be run. I added logging and checks for this (similar to the logging that existed in this "can't happen" case for the operation listeners doing the same work previously) but it's not as clean as I would like, I think in part because there isn't an "error API" for the context storage insertion (this concept doesn't really make sense to me) which could allow for better fallback code.

@johnbley johnbley requested a review from a team as a code owner April 10, 2024 14:23
@johnbley johnbley changed the title Reduce allocations by using same UnsafeAttributes in Instrumener.onEnd as onStart. Reduce allocations by using same UnsafeAttributes in Instrumenter.onEnd as onStart. Apr 10, 2024
* @param endNanos The nanosecond timestamp marking the end of the operation. Can be used to
* compute the duration of the entire operation.
*/
void onEnd(Context context, Attributes endAttributes, long endNanos);
void onEnd(Context context, Attributes startAndEndAttributes, long startNanos, long endNanos);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately instrumentation-api, that contains this class, is marked as stable. This means that backwards incompatible changes to API can not be made without bumping the major version. If we wish to proceed with this PR we need to figure out how to introduce these changes without breaking the API. @trask do you have any suggestions? We could duplicate the functionality into a new interface or I guess we could try something like

default void onEnd(Context context, Attributes startAndEndAttributes, long startNanos, long endNanos) {
}

@Deprecated
default void onEnd(Context context, Attributes endAttributes, long endNanos) {
      throw new UnsupportedOperationException("This method is deprecated and will be removed in the next major release.");
}

usually we'd do that and make the new method delegate to the old, but here the attributes they expect are different. How would we tell which method to call? Use reflection on the target class and try to figure out which one it has? Add one more method to indicate which to call? Or indicate which to call when the operation listener is registered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drat.

One alternative I've seen used for interface evolution like this is public boolean canPassStartNanosToEnd() where the result of calling that determines if the new or the old variation on the method is called, with a default implementation of both canPass and onEnd(...startNanos) that assume implementations want the old version called - no exceptions thrown.

Besides this awkward "v1/v2 interface," I could recode this and analyze it without this api change. Most of the operation listeners will then need to allocate context storage just to store the startNanos, but it will probably still be a win because they won't need to smash the start and end attributes together (far more expensive).

Let me know how you want to proceed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, typing that out made me wonder if something as simple as

   default void onEnd(... startNanos...) {
      this.onEnd(originalVersion, noStartNanos); 
   }

But then overriding the new onEnd variation in all the in-repo implementations would work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh. No, actually, any external implementations of this api might already be hurt by the assumption that endAttributes is only the end attributes, and not all of them. So, there's probably no way around some sort of interface change, which could be done in a backward-compatible way (with extra logic and allocations to satisfy any such implementations).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants