New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding simple support for backend telemetry. #19257
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! A few comments.
final var telemetryMetrics = metricsProviders.entrySet() | ||
.stream() | ||
.collect(Collectors.toMap(Map.Entry::getKey, entry -> entry.getValue().get())); | ||
telemetryMetrics.forEach((key, value) -> value.map(TelemetryEvent::metrics) | ||
.ifPresent(metrics -> telemetryClient.capture(key, metrics))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we catch any exception here and log a good message instead of letting the periodical system log a generic error message?
...og2-server/src/main/java/org/graylog2/telemetry/scheduler/TelemetrySubmissionPeriodical.java
Show resolved
Hide resolved
pom.xml
Outdated
@@ -490,6 +491,11 @@ | |||
<artifactId>auto-value-javabean</artifactId> | |||
<version>${auto-value-javabean.version}</version> | |||
</dependency> | |||
<dependency> | |||
<groupId>com.posthog.java</groupId> | |||
<artifactId>posthog</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library looks a bit funky. Using System.out.println
and e.printStackTrace()
for logging, for example. 😬
I wonder if it's worth including or if we should rather implement our own small client based on our OkHttp client. Not sure how much work the client saves us. I don't think we need the client's queueing mechanism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no way to specify a proxy for the client, so that is already a good reason to roll our own, unfortunately. My only concern are future changes that we cannot mitigate by just bumping the client, but maybe at that point the client is in a better shape, so we can switch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Their HTTP API should be rather stable. If that's not the case, it would be pretty painful for everyone using the product. 😄
this.posthog = () -> new PostHog.Builder(telemetryConfiguration.getTelemetryApiKey()) | ||
.host(telemetryConfiguration.getTelemetryApiHost()) | ||
.build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we stick with the client (see comment in pom.xml
below), we need to set the proxy server if it's configured in the server config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for replacing the lib! 🙏
Two comments below.
} | ||
} | ||
|
||
@POST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the /batch
path missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦
.map(entry -> PosthogAPI.Event.create(clusterId, entry.getKey(), entry.getValue().metrics())) | ||
.toList(); | ||
final var request = new PosthogAPI.BatchRequest(apiKey, batch); | ||
posthog.batchSend(request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's handle response codes and errors here. (https://posthog.com/docs/api/post-only-endpoints#responses)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments. I didn't test the code, but the approach looks good! 👍
|
||
@Override | ||
public boolean startOnThisNode() { | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed: we can do the telemetryClient.isEnabled()
call here so the periodical doesn't even run if telemetry is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed that. Had to rebase the branch unfortunately, messed up a merge.
c535e9c
to
05b1c4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Description
Motivation and Context
This PR is providing a simple foundation for sending telemetry from the backend. It bootstraps the client using the (already available) API key and endpoint and adds a simple periodical (
TelemetrySubmissionPeriodical
) which is executed every day (to avoid sending excessive telemetry events) to submit metrics (if they are present).For this, it provides the ability to bind simple suppliers (implemented as subclasses of
TelemetryMetricSupplier
) which create (optional) metrics for each invocation. If none are registered or none of them returns actual metrics, metrics submission is skipped. Each supplier is registered with a (unique) key, which is used as the event id for the captured event.A simple supplier could be registered as simple as this:
A supplier does not have to provide a metric, if e.g. the feature was not used:
/nocl
How Has This Been Tested?
Screenshots (if appropriate):
Types of changes
Checklist: