Is your feature request related to a problem? Please describe.
Currently, at large scale, Pyroscope struggles with a few issues:
- compaction
  - uses a lot of resources, particularly RAM
  - this is mostly because data has to be deduplicated, which in turn is caused by replication
- replication (x3)
  - uses a lot of resources (CPU, disk, RAM)
  - is hard to maintain
  - leads to complex bugs
  - reduces read performance because data has to be deduplicated
- reads and writes are not isolated, so a spike in reads often affects writes
Describe the solution you'd like
We could do the following:
- remove ingesters
- have distributors create small blocks in memory and flush them to object storage
- remove deduplication code from the read path and compaction
- tweak compaction to work with the increased number of blocks
- tweak the read path to work with the increased number of blocks
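To make the proposed write path concrete, here is a minimal sketch. It assumes a hypothetical `InMemoryObjectStore` stand-in and a hypothetical `SegmentWriter` class; neither is Pyroscope's actual API, and the size-based flush threshold is only one possible trigger. The point is that each profile is buffered in memory by a single writer and lands in exactly one small block, so nothing downstream has to deduplicate it.

```python
import json
from dataclasses import dataclass, field


class InMemoryObjectStore:
    """Stand-in for a real object store (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data


@dataclass
class SegmentWriter:
    """Buffers profiles in memory and flushes them as small blocks."""

    store: InMemoryObjectStore
    # Assumed flush threshold; a real system would likely also flush on a timer.
    max_profiles: int = 3
    _buffer: list = field(default_factory=list)
    _next_block_id: int = 0

    def write(self, profile):
        self._buffer.append(profile)
        if len(self._buffer) >= self.max_profiles:
            self.flush()

    def flush(self):
        if not self._buffer:
            return  # nothing to write
        key = f"blocks/{self._next_block_id:08d}.json"
        self.store.put(key, json.dumps(self._buffer).encode())
        self._next_block_id += 1
        self._buffer = []


store = InMemoryObjectStore()
writer = SegmentWriter(store)
for i in range(7):
    writer.write({"service": "demo", "sample": i})
writer.flush()  # write out the partially filled tail block
print(sorted(store.objects))
```

Seven writes with a threshold of three produce two full blocks plus one tail block, which is exactly the "many small blocks" situation that compaction and the read path would then need to handle.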
Concerns / Risks
These are not blockers, but rather a list of things that might derail the project, so we should keep them in mind and address them early:
- increased object storage write costs
- reduced query performance due to too many small blocks
- running into unforeseen performance limitations of the underlying object storage
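A rough way to reason about the first two concerns is to estimate how many blocks (and therefore object storage write requests) the new write path would produce per day. The sketch below assumes each distributor writes one block per flush interval; the function name and the example figures are hypothetical and only illustrate how the count scales.

```python
SECONDS_PER_DAY = 86_400


def blocks_per_day(num_distributors, flush_interval_seconds):
    """Estimate blocks written per day, assuming one block per flush per distributor."""
    if num_distributors < 0 or flush_interval_seconds <= 0:
        raise ValueError("need num_distributors >= 0 and flush_interval_seconds > 0")
    flushes_per_distributor = SECONDS_PER_DAY / flush_interval_seconds
    return round(num_distributors * flushes_per_distributor)


# Hypothetical figures, chosen only to show the scaling.
print(blocks_per_day(10, 1.0))   # 864000
print(blocks_per_day(10, 10.0))  # 86400
```

Under these assumptions the block count grows linearly with the number of distributors and inversely with the flush interval, so the flush interval is the main knob for trading write cost and block count against ingestion latency.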
Acceptance Criteria
- it should work with traffic in ops
- costs should not go up; it is fine if we have to trade reduced ingester costs for increased querier costs, though, because we can address this later
- performance should not go down
Timeline / Staffing
The great news is that we already have all of the components built for this, so the project is much more about moving things around and tweaking the system than about building new stuff.
I think if we split up the task we could get a good working prototype in about 4 weeks with 3 people. I think @kolesnikovae should lead this project.
Assuming the project succeeds, we could then spend another 4 weeks doing migrations and further tweaking algorithms to work with the new system.
Outcome
The main thing is that the system becomes simpler. To elaborate on that, these changes would:
- improve maintainability of the system
- reduce toil
- significantly reduce TCO
- improve read performance
Additional context
All credit for this idea should go to @kolesnikovae — I'm just trying to document the proposed solution. Also, this is somewhat of a meta-issue. I imagine a lot of other existing issues could be steps towards implementation of this project.