Service analytics #3256

Open
kolesnikovae opened this issue Apr 26, 2024 · 0 comments
Assignees
Labels
backend (Mostly go code), enhancement (New feature or request), ux

Collaborator

kolesnikovae commented Apr 26, 2024

Currently, we expect users to know what they are looking for, and we offer only limited means of exploration. Pyroscope should give users insight into their data: it should be possible to identify "interesting" services or service instances, as well as code hotspots, at a glance, without querying anything specific.

Global View

At the highest level, we should present an overview of the user's environment (the part of it we are aware of): global statistics on the services sending data to Pyroscope.

We should identify a subset of "interesting" (or important) services that the user may want to take a closer look at. This does not mean we should collect statistics only for those services, but we should surface them to the user first, as suggestions.

This ought to be user-specific, although we could probably address this by utilizing multi-tenancy; the target audience is software engineers. Some of the criteria (in no particular order):

  • Owned by the user, user favorites.
  • Data statistics (ingested/stored/queried), projections/predictions.
  • Pattern changes (a surge or drop in traffic or samples).
  • New services (+version?).
  • Dynamics, like new code hotspots (see below), or dimension statistics changes (see below).
  • AI featured.
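
The "pattern changes" criterion above could, for example, be approximated by comparing a service's recent sample rate against a longer baseline window. The following is only a sketch under stated assumptions: `ServiceRate`, `Change`, `DetectPatternChanges`, and the threshold factor are hypothetical names and values chosen for illustration, not existing Pyroscope APIs.

```go
package main

import (
	"fmt"
	"sort"
)

// ServiceRate is a hypothetical pair of ingestion rates for one service.
type ServiceRate struct {
	Service  string
	Baseline float64 // average samples/s over a longer reference window
	Current  float64 // average samples/s over the most recent window
}

// Change describes a detected surge or drop.
type Change struct {
	Service string
	Ratio   float64 // Current / Baseline
	Kind    string  // "surge" or "drop"
}

// magnitude maps a ratio to a symmetric "how far from 1x" value,
// so that a 5x drop ranks above a 3x surge.
func magnitude(ratio float64) float64 {
	if ratio >= 1 {
		return ratio
	}
	return 1 / ratio
}

// DetectPatternChanges flags services whose current rate differs from the
// baseline by at least the given factor (assumed to be > 1). Services with
// no baseline are skipped: they are "new" rather than "changed".
func DetectPatternChanges(rates []ServiceRate, factor float64) []Change {
	var out []Change
	for _, r := range rates {
		if r.Baseline <= 0 {
			continue
		}
		ratio := r.Current / r.Baseline
		switch {
		case ratio >= factor:
			out = append(out, Change{Service: r.Service, Ratio: ratio, Kind: "surge"})
		case ratio <= 1/factor:
			out = append(out, Change{Service: r.Service, Ratio: ratio, Kind: "drop"})
		}
	}
	// Largest relative change first.
	sort.Slice(out, func(i, j int) bool {
		return magnitude(out[i].Ratio) > magnitude(out[j].Ratio)
	})
	return out
}

func main() {
	rates := []ServiceRate{
		{Service: "checkout", Baseline: 100, Current: 300},
		{Service: "search", Baseline: 100, Current: 110},
		{Service: "billing", Baseline: 100, Current: 20},
	}
	for _, c := range DetectPatternChanges(rates, 2) {
		fmt.Printf("%s: %s (%.2fx of baseline)\n", c.Service, c.Kind, c.Ratio)
	}
}
```

A fixed ratio is the simplest possible heuristic; a real implementation would likely need to account for seasonality and noisy low-volume services.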

Service catalog (registry)

List of all the services, including details such as:

  • Number of active instances.
  • Profile types.
  • Language/runtime (+version).
  • SDK/agent (+version).
  • Data statistics (ingested/stored/queried).

A user should be able to get answers to questions like:

  • How much data is ingested, stored, and queried, both total and per dimension (service, language, profile type, SDK/agent).
  • How many instances are sending data, both total and per dimension.
  • Average rate per service instance, average profile size.

These are frequently asked questions, and we can greatly help users if we provide them with all the necessary information.
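
To make the questions above concrete, here is a rough sketch of how catalog entries could be rolled up per dimension. The `ServiceEntry` and `Totals` types and the `AggregateBy` helper are assumptions made for illustration; they do not describe an existing Pyroscope data model.

```go
package main

import "fmt"

// ServiceEntry is a hypothetical catalog record for a single service.
type ServiceEntry struct {
	Name          string
	Language      string
	SDK           string
	Instances     int
	IngestedBytes uint64
	StoredBytes   uint64
	QueriedBytes  uint64
}

// Totals accumulates statistics over a group of services.
type Totals struct {
	Services      int
	Instances     int
	IngestedBytes uint64
	StoredBytes   uint64
	QueriedBytes  uint64
}

// IngestedPerInstance returns the average ingested bytes per instance,
// or 0 when the group has no instances.
func (t Totals) IngestedPerInstance() float64 {
	if t.Instances == 0 {
		return 0
	}
	return float64(t.IngestedBytes) / float64(t.Instances)
}

// AggregateBy groups catalog entries by an arbitrary dimension, selected
// by the key function (language, SDK, and so on), and sums their stats.
func AggregateBy(entries []ServiceEntry, key func(ServiceEntry) string) map[string]Totals {
	out := make(map[string]Totals)
	for _, e := range entries {
		k := key(e)
		t := out[k]
		t.Services++
		t.Instances += e.Instances
		t.IngestedBytes += e.IngestedBytes
		t.StoredBytes += e.StoredBytes
		t.QueriedBytes += e.QueriedBytes
		out[k] = t
	}
	return out
}

func main() {
	catalog := []ServiceEntry{
		{Name: "checkout", Language: "go", SDK: "pyroscope-go", Instances: 2, IngestedBytes: 100},
		{Name: "search", Language: "go", SDK: "alloy", Instances: 3, IngestedBytes: 400},
		{Name: "reports", Language: "python", SDK: "pyroscope-python", Instances: 1, IngestedBytes: 50},
	}
	byLanguage := AggregateBy(catalog, func(e ServiceEntry) string { return e.Language })
	for lang, t := range byLanguage {
		fmt.Printf("%s: %d services, %d instances, %.0f bytes/instance\n",
			lang, t.Services, t.Instances, t.IngestedPerInstance())
	}
}
```

Passing the dimension as a key function keeps a single aggregation path for "total and per dimension": a constant key yields the global totals. The figures and SDK names in `main` are made-up sample data.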

Dimension statistics

We should provide a detailed breakdown of dimensions (labels) for each service. For each service label, we identify the top-K values, and for each selected label value we collect statistics such as share of samples, ingestion traffic, and data on disk. Label query matches could be tracked as well.
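
As an illustration of the top-K breakdown, the sketch below ranks the values of one label by sample count and computes each value's share of samples. The names `LabelValueStat` and `TopKLabelValues` are hypothetical, and the sketch assumes exact per-value counts are already available in memory; at scale, an approximate streaming top-K structure might be needed instead.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelValueStat holds statistics for one value of a service label.
type LabelValueStat struct {
	Value   string
	Samples uint64
	Share   float64 // fraction of all samples carrying this label value
}

// TopKLabelValues returns the k label values with the most samples,
// together with their share of the total. Ties are broken by value
// so that the result is deterministic.
func TopKLabelValues(counts map[string]uint64, k int) []LabelValueStat {
	var total uint64
	stats := make([]LabelValueStat, 0, len(counts))
	for v, n := range counts {
		total += n
		stats = append(stats, LabelValueStat{Value: v, Samples: n})
	}
	sort.Slice(stats, func(i, j int) bool {
		if stats[i].Samples != stats[j].Samples {
			return stats[i].Samples > stats[j].Samples
		}
		return stats[i].Value < stats[j].Value
	})
	if k < 0 {
		k = 0
	}
	if k < len(stats) {
		stats = stats[:k]
	}
	if total > 0 {
		for i := range stats {
			stats[i].Share = float64(stats[i].Samples) / float64(total)
		}
	}
	return stats
}

func main() {
	// Made-up sample counts for a hypothetical "region" label.
	region := map[string]uint64{"us-east": 60, "eu-west": 30, "ap-south": 10}
	for _, s := range TopKLabelValues(region, 2) {
		fmt.Printf("region=%s samples=%d share=%.0f%%\n", s.Value, s.Samples, s.Share*100)
	}
}
```

The same shape would apply to ingestion traffic or data on disk by swapping the counted quantity.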

Later on, this information could also help us to add new features like:

  • Query cost estimation.
  • Adaptive profiling (generating optimisation suggestions based on access patterns).

See: #2648, #3037, #3226

Code hot spots

We should provide a list of functions (or call sites) that the user might be interested in. In the simplest form, this is a top-K list of functions (both flat and cumulative). More sophisticated analysis is possible, because the statistics are supposed to be collected service-wide in the background.
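
The simplest form mentioned above, a top-K list by flat and cumulative value, can be sketched as follows. The `Sample` representation (a root-first stack of function names with a value) and the helper names are simplifying assumptions for illustration, not the profile format Pyroscope actually stores.

```go
package main

import (
	"fmt"
	"sort"
)

// Sample is a simplified profile sample: a call stack and its value.
type Sample struct {
	Stack []string // function names, root first, leaf last
	Value int64
}

// FuncStat holds per-function totals.
type FuncStat struct {
	Name string
	Flat int64 // value attributed to the function itself (leaf frames)
	Cum  int64 // value of all samples in which the function appears
}

// FunctionStats computes flat and cumulative values per function.
// A function that occurs several times in one stack (recursion) is
// counted once per sample for the cumulative value.
func FunctionStats(samples []Sample) []FuncStat {
	byName := make(map[string]*FuncStat)
	get := func(name string) *FuncStat {
		st, ok := byName[name]
		if !ok {
			st = &FuncStat{Name: name}
			byName[name] = st
		}
		return st
	}
	for _, s := range samples {
		if len(s.Stack) == 0 {
			continue
		}
		get(s.Stack[len(s.Stack)-1]).Flat += s.Value
		seen := make(map[string]bool, len(s.Stack))
		for _, fn := range s.Stack {
			if seen[fn] {
				continue
			}
			seen[fn] = true
			get(fn).Cum += s.Value
		}
	}
	out := make([]FuncStat, 0, len(byName))
	for _, st := range byName {
		out = append(out, *st)
	}
	return out
}

// TopK returns the k functions with the largest flat (or cumulative,
// if byCum is set) value. Ties are broken by name.
func TopK(stats []FuncStat, k int, byCum bool) []FuncStat {
	sorted := append([]FuncStat(nil), stats...)
	sort.Slice(sorted, func(i, j int) bool {
		vi, vj := sorted[i].Flat, sorted[j].Flat
		if byCum {
			vi, vj = sorted[i].Cum, sorted[j].Cum
		}
		if vi != vj {
			return vi > vj
		}
		return sorted[i].Name < sorted[j].Name
	})
	if k < 0 {
		k = 0
	}
	if k < len(sorted) {
		sorted = sorted[:k]
	}
	return sorted
}

func main() {
	samples := []Sample{
		{Stack: []string{"main", "handle", "parse"}, Value: 5},
		{Stack: []string{"main", "handle"}, Value: 3},
		{Stack: []string{"main", "gc"}, Value: 2},
	}
	stats := FunctionStats(samples)
	fmt.Println("top by flat:", TopK(stats, 2, false))
	fmt.Println("top by cumulative:", TopK(stats, 2, true))
}
```

Because the statistics are meant to be collected service-wide in the background, the per-function totals could be merged across instances and time windows before ranking.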

Recent activity / query history

This is somewhat unrelated, but we may also want to keep track of user actions globally (a kind of audit log):

  • It would be very useful to have access to one's own recent queries and recently visited pages. I personally miss this very much.
  • This could help owners to adopt profiling in their environments/companies/teams.

This part should be handled on the client side (in the Grafana app plugin).

Service relationships

It might be handy to indicate relations between services:

  • Using external services (hello Tempo!)
  • Explicit configuration: e.g., we could group services by user-specified labels
  • Employing ML/DL techniques: ingestion/query pattern matching, dimension clusters, a "Users also query ..." feature, etc.

This part could be handled on the client side (in the Grafana app plugin).
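
Of the options above, explicit configuration is the most straightforward to illustrate. The sketch below groups services by the value of a user-specified label; `GroupByLabel` and the input shape (service name mapped to its labels) are hypothetical, and the sketch is written in Go only for consistency with the backend, even though this logic could equally live in the plugin.

```go
package main

import (
	"fmt"
	"sort"
)

// GroupByLabel groups service names by the value of a user-chosen label.
// Services that do not carry the label end up under the empty-string key.
func GroupByLabel(services map[string]map[string]string, label string) map[string][]string {
	groups := make(map[string][]string)
	for name, labels := range services {
		v := labels[label]
		groups[v] = append(groups[v], name)
	}
	// Sort names within each group for deterministic output.
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}

func main() {
	// Made-up services and labels.
	services := map[string]map[string]string{
		"checkout": {"team": "payments"},
		"billing":  {"team": "payments"},
		"search":   {"team": "discovery"},
		"cron":     {},
	}
	for team, names := range GroupByLabel(services, "team") {
		fmt.Printf("team=%q: %v\n", team, names)
	}
}
```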

@kolesnikovae kolesnikovae self-assigned this Apr 26, 2024
@kolesnikovae kolesnikovae added the enhancement (New feature or request), backend (Mostly go code), and ux labels Apr 26, 2024