New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[jaeger-v2] Add remotesampling
extension
#5389
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #5389 +/- ##
===========================================
- Coverage 94.54% 41.12% -53.42%
===========================================
Files 346 171 -175
Lines 16947 9091 -7856
===========================================
- Hits 16022 3739 -12283
- Misses 724 5018 +4294
- Partials 201 334 +133
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
I am thinking of this approach.:
That's how it works in the current Jaeger collector, but we are serving this API in the collector server. Here, we have to create a new one. suggestions |
One more question: How can I run hotrod with jaeger-v2 binary? |
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
We need a
We also need an
All of this should require very minimal implementation, primarily just wiring up the existing components already implemented in |
The existing component (from jaeger v1) are in blue: flowchart LR
Receiver --> AdaptiveSamplingProcessor --> BatchProcessor --> Exporter
Exporter -->|"(1) get storage"| JaegerStorageExension
Exporter -->|"(2) write trace"| TraceStorage
AdaptiveSamplingProcessor -->|"getStorage()"| StorageConfig
OTEL_SDK[OTEL
SDK]
OTEL_SDK -->|"(1) GET /sampling"| HTTP_endpoint
HTTP_endpoint -->|"(2) getStrategy()"| StrategiesProvider
style HTTP_endpoint fill:blue,color:white
subgraph Jaeger Collector
Receiver
BatchProcessor[Batch
Processor]
Exporter
TraceStorage[(Trace
Storage)]
AdaptiveSamplingProcessor[Adaptive
Sampling
Processor]
AdaptiveSamplingProcessorV1[Adaptive
Sampling
Processor_v1]
style AdaptiveSamplingProcessorV1 fill:blue,color:white
AdaptiveSamplingProcessor -->|"[]*model.Span"| AdaptiveSamplingProcessorV1
AdaptiveSamplingProcessorV1 ---|use| SamplingStorage
subgraph JaegerStorageExension[Jaeger Storage Exension]
Storage[[Storage
Config]]
end
subgraph RemoteSamplingExtension[Remote Sampling Extension]
StrategiesProvider -->|"(3b) getStrategy()"| AdaptiveProvider
StrategiesProvider -->|"(3a) getStrategy()"| FileProvider
FileProvider --> FileConfig
AdaptiveProvider --> StorageConfig
HTTP_endpoint[HTTP
endpoint]
StrategiesProvider[Strategies
Provider]
FileProvider[File
Provider]
AdaptiveProvider[Adaptive
Provider]
style StrategiesProvider fill:blue,color:white
style FileProvider fill:blue,color:white
style AdaptiveProvider fill:blue,color:white
subgraph Config
FileConfig[[File Config]]
StorageConfig[[Storage Config]]
end
StorageConfig --- SamplingStorage
SamplingStorage[(Sampling
Storage)]
style SamplingStorage fill:blue,color:white
end
end
|
remotesampling
extension
…acing#5382) This PR started as a dependabot upgrade of OTEL Collector dependencies, but the [tests were failing](https://github.com/jaegertracing/jaeger/actions/runs/8854720626/job/24318352528?pr=5382). The main change here is adding a ping from `e2eInitialize` to the jaeger-v2 binary to make sure that it started before proceeding with tests (ideally OTEL Collector should have a healthcheck endpoint that signals when all components have been started successfully). Also added some better logging. --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Yuri Shkuro <github@ysh.us> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Yuri Shkuro <github@ysh.us>
Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.15.10 to 0.15.11. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/anchore/sbom-action/releases">anchore/sbom-action's releases</a>.</em></p> <blockquote> <h2>v0.15.11</h2> <h2>Changes in v0.15.11</h2> <ul> <li>chore(deps): update Syft to v1.3.0 (<a href="https://redirect.github.com/anchore/sbom-action/issues/456">#456</a>) [<a href="https://github.com/anchore-actions-token-generator">anchore-actions-token-generator</a>]</li> <li>chore: remove outdated snapshot workflow (<a href="https://redirect.github.com/anchore/sbom-action/issues/457">#457</a>) [<a href="https://github.com/spiffcs">spiffcs</a>]</li> <li>fix: don't pass in a separate env. This makes it impossible to pass env vars via the action context to syft. (<a href="https://redirect.github.com/anchore/sbom-action/issues/455">#455</a>) [<a href="https://github.com/iNoahNothing">iNoahNothing</a>]</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/anchore/sbom-action/commit/7ccf588e3cf3cc2611714c2eeae48550fbc17552"><code>7ccf588</code></a> chore(deps): update Syft to v1.3.0 (<a href="https://redirect.github.com/anchore/sbom-action/issues/456">#456</a>)</li> <li><a href="https://github.com/anchore/sbom-action/commit/7f33cf5b409c25dd63ce12b34f585feaed60b5bf"><code>7f33cf5</code></a> chore: remove outdated snapshot workflow (<a href="https://redirect.github.com/anchore/sbom-action/issues/457">#457</a>)</li> <li><a href="https://github.com/anchore/sbom-action/commit/04a486a88617c017d26e11a4d30b3cd198f44824"><code>04a486a</code></a> fix: extend existing environment when invoking syft instead of creating a new...</li> <li>See full diff in <a href="https://github.com/anchore/sbom-action/compare/ab5d7b5f48981941c4c5d6bf33aeb98fe3bae38c...7ccf588e3cf3cc2611714c2eeae48550fbc17552">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=anchore/sbom-action&package-manager=github_actions&previous-version=0.15.10&new-version=0.15.11)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Description of the changes - Use consistent naming for config files ## How was this change tested? - CI Signed-off-by: Yuri Shkuro <github@ysh.us>
…gertracing#5399) ## Which problem is this PR solving? - Part of jaegertracing#5334 ## Description of the changes - An API design of Storage V2 spanstore and its proto file. - For the detailed discussion and how we arrived to this design, please take a look at https://docs.google.com/document/d/1IvUcDsdRxMPK-xTUE32w3NSAGTkUcmnDQttN6ijUnWs/edit?usp=sharing ## How was this change tested? - This PR hasn't been tested yet since it only contains interfaces and no actual implementation to be tested. ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [ ] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` --------- Signed-off-by: James Ryans <james.ryans2012@gmail.com>
…5401) Bumps google.golang.org/protobuf from 1.33.0 to 1.34.0. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=google.golang.org/protobuf&package-manager=go_modules&previous-version=1.33.0&new-version=1.34.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Description of the changes - This is the first jaeger release where Jaeger UI changes are included in the release notes and where we have version parity between the two repos. Signed-off-by: Albert Teoh <albert@packsmith.io> Co-authored-by: Albert Teoh <albert@packsmith.io>
## Which problem is this PR solving? - part of jaegertracing#5345 ## Description of the changes - Added Purge method for ES/OS - optimized integration test for es/os storage ## How was this change tested? - `STORAGE=elasticsearch make storage-integration-test` ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` --------- Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
## Which problem is this PR solving? - The cache was initialized with incorrect parameter, and when I looked around it wasn't even used anywhere ## Description of the changes - delete indexCache and indexCacheTTL argument ## How was this change tested? ``` $ go test ./plugin/storage/es/spanstore ok github.com/jaegertracing/jaeger/plugin/storage/es/spanstore 0.471s ``` Signed-off-by: Yuri Shkuro <github@ysh.us>
…racing#5410) ## Which problem is this PR solving? - Resolves jaegertracing#5350 ## Description of the changes - Added the logic same in the docker-image CI to build linux/amd64 architecture only ## How was this change tested? - Not yet ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` Signed-off-by: Vamshi Maskuri <gwcchintu@gmail.com>
## Which problem is this PR solving? - Objective is to cut down on the number of CI failures that are often due to codecov uploads failing - Resolves jaegertracing#5173 - Supersedes and closes jaegertracing#5184 ## Description of the changes - Add a new helper action which encapsulates defining a CODECOV token and uploading with retries ## How was this change tested? - CI --------- Signed-off-by: Yuri Shkuro <github@ysh.us>
…ertracing#5345) ## Which problem is this PR solving? - part of jaegertracing#5254 ## Description of the changes - Utilizing existing `StorageIntegration` to test the jaeger-v2 OTel Collector and gRPC storage backend with the provided config file at `cmd/jaeger/config-elasticsearch.yaml`. ## How was this change tested? - Start a elasticsearch or opensearch docker instance. - Run `STORAGE=elasticsearch SPAN_STORAGE_TYPE=elasticsearch make jaeger-v2-storage-integration-test` ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` --------- Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
## Which problem is this PR solving? - jaegertracing#5398 (comment) ## Description of the changes - added purge for method for cassandra ## How was this change tested? - via integration tests ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` --------- Signed-off-by: Harshvir Potpose <hpotpose62@gmail.com> Signed-off-by: Yuri Shkuro <github@ysh.us> Co-authored-by: Yuri Shkuro <github@ysh.us>
## Which problem is this PR solving? - Previous PR did not include markup and the diagram is not rendering correctly ## Description of the changes - Add markup Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
i haven;t pushed latest changes yet. |
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, in the right direction
cmd/jaeger/internal/extension/remotesampling/testdata/strategy.json
Outdated
Show resolved
Hide resolved
Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com>
return nil | ||
} | ||
|
||
if j.cfg.Adaptive.StrategyStore != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these if()s are getting very busy, I recommend extracting their content into helper functions
} | ||
|
||
func (tp *traceProcessor) start(_ context.Context, host component.Host) error { | ||
ss, err := remotesampling.GetSamplingStorageFactory(tp.config.StrategyStore, host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're only using ss
to call ss.CreateSamplingStore(maxBuckets)
. Do you actually need remotesampling.GetSamplingStorageFactory
, or perhaps only remotesampling.GetSamplingStorage
(without maxBuckets argument)? This way you would not need to repeat maxBuckets
config in the processor, only in the remotesampling extension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, please review
## Which problem is this PR solving? - part of #5389 ## Description of the changes - processor is co-located in strategy_store and aggregator. - In aggregator to run `generateStrategyResponses`, `runCalculationLoop`. - In strategy_store to run `loadProbabilities`, `runUpdateProbabilitiesLoop` ## How was this change tested? - `make test` ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `yarn lint` and `yarn test` --------- Signed-off-by: Pushkar Mishra <pushkarmishra029@gmail.com> Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
Signed-off-by: pushkarm029 <pushkarmishra029@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you plan to test it?
) | ||
|
||
var ( | ||
errNoSource = errors.New("no sampling strategy specified, has to be either 'adaptive' or 'static'") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we use file
now instead of static
?
var componentType = component.MustNewType("remote_sampling") | ||
|
||
const ( | ||
HTTPPort = 12345 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have constants for ports defined in the /ports
package. Don't we already have one for sampling endpoint?
// Delay is the amount of time to delay probability generation by, ie. if the CalculationInterval | ||
// is 1 minute, the number of buckets is 10, and the delay is 2 minutes, then at one time | ||
// we'll have [now()-12m,now()-2m] range of throughput data in memory to base the calculations | ||
// off of. This delay is necessary to counteract the rate at which the jaeger clients poll for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// off of. This delay is necessary to counteract the rate at which the jaeger clients poll for | |
// off of. This delay is necessary to counteract the rate at which the SDKs poll for |
import "time" | ||
|
||
type Config struct { | ||
StrategyStore string `mapstructure:"strategy_store"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StrategyStore string `mapstructure:"strategy_store"` | |
// name of the strategy storage defined in the jaegerstorage extension | |
StrategyStore string `mapstructure:"strategy_store"` |
is that correct?
@@ -129,6 +129,23 @@ func (a *aggregator) RecordThroughput(service, operation string, samplerType spa | |||
} | |||
} | |||
|
|||
func RecordThroughput(agg strategystore.Aggregator, span *span_model.Span, logger *zap.Logger) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does this need to change in this PR only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, creating a new PR.
|
||
const defaultResourceName = "sampling_store_leader" | ||
|
||
type RsExtension struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- does it need to be a public type?
- there are no other extensions in this package, no need to namespace it with Rs
} | ||
|
||
func (ext *RsExtension) Start(ctx context.Context, host component.Host) error { | ||
if ext.Cfg.File.Path != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is very busy and it's easy to miss that it is also starting a server. I suggest moving file/adaptive intialization into separate helper functions
ext.strategyStore = ss | ||
} | ||
|
||
if err := ext.startHTTPServer(ctx, host); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think HTTP is not enough, Jaeger-v1 also has gRPC server that is used by some OTEL SDKs. In fact, I would check if HTTP is used by any OTEL SDK, maybe they all use gRPC.
httpMux := http.NewServeMux() | ||
|
||
handler := clientcfghttp.NewHTTPHandler(clientcfghttp.HTTPHandlerParams{ | ||
ConfigManager: &clientcfghttp.ConfigManager{SamplingStrategyStore: ext.strategyStore}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConfigManager: &clientcfghttp.ConfigManager{SamplingStrategyStore: ext.strategyStore}, | |
ConfigManager: &clientcfghttp.ConfigManager{ | |
SamplingStrategyStore: ext.strategyStore, | |
}, |
|
||
ext.telemetry.Logger.Info("Starting HTTP server", zap.String("endpoint", ext.Cfg.HTTP.ServerConfig.Endpoint)) | ||
var hln net.Listener | ||
if hln, err = ext.Cfg.HTTP.ServerConfig.ToListener(ctx); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does OTEL collector ever allow multiple handlers to be attached to a single server? In Jaeger-v1 we often reused many ports for different services, e.g. for both submitting traces into backend and for retrieving sampling strategies.
Description of the changes
How was this change tested?
go run -tags=ui ./cmd/jaeger --config ./cmd/jaeger/config-badger.yaml
make test
Checklist
jaeger
:make lint test
jaeger-ui
:yarn lint
andyarn test