New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prototype batch api #4024
Prototype batch api #4024
Conversation
return new SdkBatchRecorder(batchLatch); | ||
} | ||
|
||
static class BatchLatch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the naive synchronization mechanism that ensures the recordings are included in a collection atomically.
When BatchRecorder#record(Attributes, Context)
is called, startBatchRecord()
is called on start. This blocks if a collection is in progress. If not, it acquires a permit from batchSemaphore
that blocks collections until the record is completed and finishBatchRecord()
is called to release the permit.
When a collect occurs, startCollect()
is called. This sets collectLatch = new CountDownLatch(1)
, which blocks batch records from occurring until the collect completes. It then blocks while acquiring all the permits from batchSemaphore
, which allows in progress recordings to complete. When the collect is done, it calls finishCollect()
, and counts down the latch and returns the permits.
.addMeasurements(10.0, doubleHistogram, doubleCounter, doubleUpDownCounter) | ||
.record(Attributes.builder().put("foo", "bar").build(), Context.current()); | ||
|
||
collectAndPrint(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This compares the API usage currently versus with the batch API.
Good investigation!
I had hoped we could tackle both the Even more crazy, if we had something using some annotation processor magic to do something like:
Additionally, for Async, we even thought about something like:
For sync batch, I was hoping for something more like:
Specifically, it'd be ideal if there was some kind of "registration" phase where we can see what instruments are included in the batch and pre-optimise (in the SDK) a record path. |
I'm struggling to imagine a scenario where an instrument would know the value ahead of time, besides perhaps when incrementing a counter. If an instrument is able to know ahead of time the value and attributes it will be recording, it doesn't seem like its measuring anything very useful, since at that point only the context can change. It's more reasonable that an instrument would know the attributes ahead of time. In this case, the implementation could use the bind API to preallocate. But doing so limits the usefulness of the batch API, since a batch API designed for attributes not known up front would look quite different. I guess I'd like to better understand what problems a batch API aims to solve. The only definitive advantage of a batch API I could identify is being able to atomically record to multiple instruments, ensuring that all appear in the same collection. This only seems marginally useful. Other advantages can be characterized as syntactic sugar, which I think can be implemented via helper functions and extensions on top of the API. |
Sorry I wasn't clear here. I meant pre-register the attributes not the value. |
I was reading through spec PR #2363 and thinking about what a batch API might look like in java. I agree that the concept of having a single callback that can record to multiple async instruments is appealing, since its hard for a user to organize this type of thing if the callback functions are expensive. This PR doesn't really accomplish that very well, but we could mimic what @jamcd proposed with something like:
Notes:
|
|
||
@Test | ||
void batch() { | ||
Meter meter = GlobalOpenTelemetry.get().getMeterProvider().get("meter"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why you would use the global for this. Why not just construct an instance, rather than rely on the global state and resetting it each time?
I think I would probably reverse the parameter order here, and send in the observers first, then the callback at the end (which I think will be nice for kotlin users, for example). but, this does seem to be a reasonable approach. Not sure about the naming of |
In #2363 I don't see the observers passed into the registerbatchcallback method, is there a need to be doing that here? Also, I suspect we can do something like "run all batch callbacks first" during collection. If so, those could just be synchronous instruments I guess? There doesn't seem to be a difference between the |
What's the status of this - I definitely have use cases where this would be very useful. |
@kittylyst its blocked on this spec issue. I imagine that our first stable metrics release will not have support for batch callbacks. |
Closing in favor of #4376. |
A few weeks ago in the java SIG we discussed that it might be valuable to prototype what a batch API. IIRC, the thought was that prototyping a batch API might find some intersection with the bind API we recently dropped. I dug through the spec and found this old API reference to a batch API, and used some of the notes as inspiration.
The idea is that you can "atomically" record values for several instruments using the same attributes. I interpreted atomic as guaranteeing that either all or none of the recordings make it into a particular collection. This seems only somewhat useful to me. However, the API experience of recording a batch of measurements does feel a bit improved.
The naive implementation I put together in this PR has worse performance than if you were record the instruments using the existing API. This is because the implementation doesn't get to take advantage of acquiring fewer locks by recording in a batch, and actually does some additional synchronization to ensure that all the recordings occur atomically for a collection. There's likely some room for improvement, but is always going to take extra work to guarantee the atomicity.