Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chore] Refactor allocation strategies #2928

Merged
merged 9 commits into from May 20, 2024

Conversation

swiatekm-sumo
Copy link
Contributor

Allocation strategies are currently messy and duplicate a lot of logic. This is an attempt at making them more DRY.

  • Factored out the bookkeeping into a new Allocator struct. It keeps track of targets, collectors, allocation, metrics, etc.
  • Added a new Strategy interface that strategies need to implement. Right now, the minimum needed is just a method that takes a collector map and a target, and returns the assigned collector.
  • Added some generic allocator tests that run for all strategies.

I'm not entirely happy with the end result, and I've left some TODOs in the code I'd like to address in a follow-up. I haven't included those changes in this PR in the interest of keeping it simpler.

@swiatekm-sumo swiatekm-sumo marked this pull request as ready for review May 6, 2024 12:38
@swiatekm-sumo swiatekm-sumo requested a review from a team as a code owner May 6, 2024 12:38
@jaronoff97
Copy link
Contributor

yeah this is super close to what i tried to do a year or two ago, and am definitely in favor of this change. The trouble I ran into back then was how you give the right context to the strategy without duplicating memory. i.e. it's not great if the strategy has to keep its own collector map and the allocator struct does too. Overall, I think the approach makes sense. I'll give it a better read later today!

@swiatekm-sumo
Copy link
Contributor Author

yeah this is super close to what i tried to do a year or two ago, and am definitely in favor of this change. The trouble I ran into back then was how you give the right context to the strategy without duplicating memory. i.e. it's not great if the strategy has to keep its own collector map and the allocator struct does too. Overall, I think the approach makes sense. I'll give it a better read later today!

Is that a realistic problem? The collector count is going to be less than 100 in almost all practical circumstances, and we're only storing pointers anyway, so the memory cost is small. None of my strategy implementations need to store collectors anyway, the per-node strategy only does it for performance reasons.

In a follow-up, I'd like to restrict what data actually gets passed to the strategy in the SetCollectors call. Right now, the strategy can assume some of this data is immutable until the next call (like the name or node), but not all of it (NumTargets can change). I'd like to express this more directly in the API instead of relying on convention.

@jaronoff97
Copy link
Contributor

I was worried about the labelsets cardinality, but yeah if we're just using some pointers for only the collectors we should be alright. Either way, I would love to see some benchmarks to ensure that's the case. It feels like we've had some brush ups with memory and the TA recently.

@swiatekm-sumo
Copy link
Contributor Author

I was worried about the labelsets cardinality, but yeah if we're just using some pointers for only the collectors we should be alright. Either way, I would love to see some benchmarks to ensure that's the case. It feels like we've had some brush ups with memory and the TA recently.

Ran the Benchmark_Setting benchmark, which looks the most appropriate for my changes. Tried others and saw no change.

                                                             │    old.txt    │               new.txt                │
                                                             │    sec/op     │    sec/op     vs base                │
_Setting/consistent-hashing_num_cols_100_num_jobs_100-32        6.042µ ±  3%   6.054µ ±  5%        ~ (p=0.481 n=10)
_Setting/consistent-hashing_num_cols_100_num_jobs_1000-32       51.12µ ±  4%   50.71µ ±  6%        ~ (p=0.529 n=10)
_Setting/consistent-hashing_num_cols_100_num_jobs_10000-32      723.1µ ±  5%   726.4µ ±  1%        ~ (p=0.579 n=10)
_Setting/consistent-hashing_num_cols_100_num_jobs_100000-32     8.688m ± 12%   8.763m ±  3%        ~ (p=0.529 n=10)
_Setting/consistent-hashing_num_cols_1000_num_jobs_100-32       48.88µ ±  7%   49.89µ ±  6%        ~ (p=0.218 n=10)
_Setting/consistent-hashing_num_cols_1000_num_jobs_1000-32      103.6µ ±  3%   105.2µ ±  2%        ~ (p=0.684 n=10)
_Setting/consistent-hashing_num_cols_1000_num_jobs_10000-32     781.2µ ±  5%   781.5µ ±  2%        ~ (p=0.853 n=10)
_Setting/consistent-hashing_num_cols_1000_num_jobs_100000-32    8.701m ±  8%   9.831m ± 12%  +12.99% (p=0.007 n=10)
_Setting/per-node_num_cols_100_num_jobs_100-32                 42.934µ ±  1%   5.933µ ±  5%  -86.18% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_1000-32                175.94µ ±  1%   50.84µ ±  6%  -71.10% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_10000-32               1723.0µ ±  3%   726.5µ ±  6%  -57.84% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_100000-32              28.073m ± 20%   8.726m ±  4%  -68.92% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100-32                346.49µ ±  1%   48.89µ ±  7%  -85.89% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_1000-32                496.6µ ±  1%   105.4µ ±  3%  -78.77% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_10000-32              2065.8µ ±  2%   768.4µ ±  3%  -62.80% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100000-32             27.328m ± 15%   8.874m ± 10%  -67.53% (p=0.000 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_100-32            5.885µ ±  4%   5.971µ ±  4%        ~ (p=0.089 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_1000-32           51.27µ ±  6%   50.84µ ±  8%        ~ (p=0.579 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_10000-32          701.0µ ±  5%   703.9µ ±  3%        ~ (p=0.190 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_100000-32         8.838m ± 10%   8.635m ± 16%        ~ (p=0.063 n=10)
_Setting/least-weighted_num_cols_1000_num_jobs_100-32           49.83µ ±  4%   49.19µ ±  3%        ~ (p=0.469 n=10)
_Setting/least-weighted_num_cols_1000_num_jobs_1000-32          105.2µ ±  2%   105.1µ ±  2%        ~ (p=0.579 n=10)
_Setting/least-weighted_num_cols_1000_num_jobs_10000-32         761.8µ ±  5%   783.8µ ±  5%        ~ (p=0.739 n=10)
_Setting/least-weighted_num_cols_1000_num_jobs_100000-32        8.951m ±  8%   8.805m ±  7%        ~ (p=0.315 n=10)
geomean                                                         473.1µ         302.6µ        -36.04%

                                                             │    old.txt     │                new.txt                │
                                                             │      B/op      │    B/op     vs base                   │
_Setting/consistent-hashing_num_cols_100_num_jobs_100-32           192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_1000-32          192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_10000-32         192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_100000-32        192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_100-32          192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_1000-32         192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_10000-32        192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_100000-32       192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/per-node_num_cols_100_num_jobs_100-32                   27842.5 ± 0%   192.0 ± 0%   -99.31% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_1000-32                  85440.0 ± 0%   192.0 ± 0%   -99.78% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_10000-32                661422.0 ± 0%   192.0 ± 0%   -99.97% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_100000-32              6421405.5 ± 0%   192.0 ± 0%  -100.00% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100-32                 299523.5 ± 0%   192.0 ± 0%   -99.94% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_1000-32                357104.0 ± 0%   192.0 ± 0%   -99.95% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_10000-32               933133.0 ± 0%   192.0 ± 0%   -99.98% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100000-32             6693000.0 ± 0%   192.0 ± 0%  -100.00% (p=0.000 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_100-32               192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_1000-32              192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_10000-32             192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_100000-32            192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_100-32              192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_1000-32             192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_10000-32            192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_100000-32           192.0 ± 0%   192.0 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                                          2.639Ki        192.0        -92.90%
¹ all samples are equal

                                                             │     old.txt     │                new.txt                │
                                                             │    allocs/op    │ allocs/op   vs base                   │
_Setting/consistent-hashing_num_cols_100_num_jobs_100-32            4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_1000-32           4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_10000-32          4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_100_num_jobs_100000-32         4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_100-32           4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_1000-32          4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_10000-32         4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/consistent-hashing_num_cols_1000_num_jobs_100000-32        4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/per-node_num_cols_100_num_jobs_100-32                    222.000 ± 0%   4.000 ± 0%   -98.20% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_1000-32                  1122.000 ± 0%   4.000 ± 0%   -99.64% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_10000-32                10122.000 ± 0%   4.000 ± 0%   -99.96% (p=0.000 n=10)
_Setting/per-node_num_cols_100_num_jobs_100000-32              100122.000 ± 0%   4.000 ± 0%  -100.00% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100-32                  1185.000 ± 0%   4.000 ± 0%   -99.66% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_1000-32                 2085.000 ± 0%   4.000 ± 0%   -99.81% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_10000-32               11085.000 ± 0%   4.000 ± 0%   -99.96% (p=0.000 n=10)
_Setting/per-node_num_cols_1000_num_jobs_100000-32             101085.000 ± 0%   4.000 ± 0%  -100.00% (p=0.000 n=10)
_Setting/least-weighted_num_cols_100_num_jobs_100-32                4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_1000-32               4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_10000-32              4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_100_num_jobs_100000-32             4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_100-32               4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_1000-32              4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_10000-32             4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
_Setting/least-weighted_num_cols_1000_num_jobs_100000-32            4.000 ± 0%   4.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                                             44.15        4.000        -90.94%

So consistent-hashing is slightly slower and per-node is much faster and uses much less memory.

@jaronoff97
Copy link
Contributor

WOW! yeah that per-node allocation is 🔥 great stuff.

Copy link
Contributor

@jaronoff97 jaronoff97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely in favor of these, but I would love a second look from @kristinapathak or @matej-g

Copy link
Contributor

@kristinapathak kristinapathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! 🥳 Left a few very minor comments.

cmd/otel-allocator/allocation/allocator.go Outdated Show resolved Hide resolved
TargetsPerCollector.WithLabelValues(item.CollectorName, a.strategy.GetName()).Set(float64(c.NumTargets))
}
delete(a.targetItems, k)
delete(a.targetItemsPerJobPerCollector[item.CollectorName][item.JobName], item.Hash())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the last value in a.targetItemsPerJobPerCollector[item.CollectorName] for the job, we could probably also delete the job entry:

if len(a.targetItemsPerJobPerCollector[item.CollectorName][item.JobName]) == 0 {
    delete(a.targetItemsPerJobPerCollector[item.CollectorName], item.JobName)
}

but I don't think this has to be done in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up refactoring the deletion and unassignment routines a bit and did this as well.

for k, v := range allocator.collectors {
collectorsCopy[k] = v
func (s *leastWeightedStrategy) GetCollectorForTarget(collectors map[string]*Collector, item *target.Item) (*Collector, error) {
// if a collector is already assigned, do nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sneaky but perfect for maintaining this functionality 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow up, I want to remove the CollectorName field from target.Item and instead track this in a separate map in the allocator. Then we can pass this map as an argument to this function and be explicit about the dependency.

@swiatekm-sumo
Copy link
Contributor Author

@jaronoff97 @kristinapathak I ended up making a fairly substantial change while addressing your feedback, so I re-requested reviews from both of you.

On a separate note, should I add a changelog entry for this change? It's technically a refactor, but I've changed enough that I think it'd be fair to alert users it happened. That would make it easier to attribute any bugs I might've introduced.

cmd/otel-allocator/allocation/allocator.go Outdated Show resolved Hide resolved
cmd/otel-allocator/allocation/allocator.go Outdated Show resolved Hide resolved
cmd/otel-allocator/allocation/allocator.go Outdated Show resolved Hide resolved
@jaronoff97
Copy link
Contributor

@swiatekm-sumo i think it's worth adding a changelog given all that's going on here. I also think it's worth mentioning the performance win in there as that's been some feedback we've gotten in the past few weeks. (maybe this issue in particular #2916)

cmd/otel-allocator/allocation/allocator.go Outdated Show resolved Hide resolved
err := a.addTargetToTargetItems(item)
if err != nil {
assignmentErrors = append(assignmentErrors, err)
item.CollectorName = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do this here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of dubious test failures, mostly. One of the problems with having CollectorName be an attribute of a target is that it's not clear if targets supplied in SetTargets can have that attribute set, and what that means. The real target pipeline never sets this, but some tests sneakily do. I set it here instead of fixing those tests, because I really wanted my changes to pass existing tests unmodified.

This is one of the things I want to fix in a follow-up.

Copy link

linux-foundation-easycla bot commented May 18, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@swiatekm-sumo swiatekm-sumo merged commit ddd9f34 into open-telemetry:main May 20, 2024
33 checks passed
@swiatekm-sumo swiatekm-sumo deleted the refactor-strategies branch May 20, 2024 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants