Use OTEL for metrics gathering (WIP) #3148
Draft
What is the problem this PR solves?
An onweek project to change fleet-server's metrics collection to use otel + the APM exporter as a bridge.
We are currently using a mixture of elastic-agent-libs and prometheus+an APM bridge.
This is a little messy and introduces unneeded dependencies and complexity.
OTEL metrics for the routes are now tagged with server.host and server.port attributes to allow us to determine when the internal/external API ports have issues.
A translation/export mechanism is provided so that we can continue to provide fleet-server metrics on the /stats endpoint for metricbeat collection/monitoring until we have determined how elastic-agent will monitor components with otel.
Note that it completely removes the option for the prometheus endpoint.
It also removes collecting the generic system/cpu/mem datasets.
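To illustrate the route tagging described above, here is a minimal sketch of how a per-route request count could be keyed by server.host and server.port. It is not the code in this PR: requestCounter is a hypothetical in-memory stand-in for an OTel counter instrument, and routeAttrs, splitServerAddr, instrument, the example address, and the example route are all assumptions made for illustration. The only library behaviour it relies on is that net/http stores the accepting connection's local address in the request context under http.LocalAddrContextKey.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strconv"
	"sync"
)

// routeAttrs is a hypothetical attribute set; the two server fields mirror the
// server.host and server.port attributes named in the description.
type routeAttrs struct {
	Route      string
	ServerHost string
	ServerPort int
}

// requestCounter is an in-memory stand-in for an OTel counter instrument.
type requestCounter struct {
	mu     sync.Mutex
	counts map[routeAttrs]int64
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: make(map[routeAttrs]int64)}
}

// Add increments the count for one attribute set.
func (c *requestCounter) Add(a routeAttrs) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[a]++
}

// Get returns the current count for one attribute set.
func (c *requestCounter) Get(a routeAttrs) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[a]
}

// splitServerAddr turns a listener address such as "127.0.0.1:8220" into
// a host string and a numeric port.
func splitServerAddr(addr string) (string, int, error) {
	host, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, err
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", 0, err
	}
	return host, port, nil
}

// instrument wraps a handler and counts each request under the address of the
// listener that accepted it, so two API ports produce separate series.
func instrument(route string, c *requestCounter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		a := routeAttrs{Route: route}
		// net/http places the local address of the accepting connection
		// in the request context under http.LocalAddrContextKey.
		if addr, ok := r.Context().Value(http.LocalAddrContextKey).(net.Addr); ok {
			if host, port, err := splitServerAddr(addr.String()); err == nil {
				a.ServerHost, a.ServerPort = host, port
			}
		}
		c.Add(a)
		next.ServeHTTP(w, r)
	})
}

func main() {
	c := newRequestCounter()
	// Example address and route only; not taken from the PR.
	host, port, err := splitServerAddr("127.0.0.1:8220")
	if err != nil {
		panic(err)
	}
	a := routeAttrs{Route: "/api/status", ServerHost: host, ServerPort: port}
	c.Add(a)
	fmt.Println(a.ServerHost, a.ServerPort, c.Get(a))
}
```

Because the attributes come from the connection's local address rather than from the request's Host header, requests that arrive on different listeners are counted separately even when clients use the same hostname.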
TODO
How to test this PR locally
Start a server and ping 5066/stats to view the metricbeat stats endpoint.
Checklist
./changelog/fragments using the changelog tool