[Feature]: Visualize uninstrumented services in the dependency diagrams #3804

Open
yurishkuro opened this issue Jul 8, 2022 · 8 comments · May be fixed by #5062
Labels
enhancement, help wanted (features that maintainers are willing to accept but do not have cycles to implement)

Comments

@yurishkuro
Member

Requirement

Visualize services in the dependency diagram even when they are not instrumented, but known from the caller side.

Problem

When the leaf nodes of a trace represent outbound calls to uninstrumented services, those services are not shown in the dependency diagram (e.g. see how Zipkin shows them in #3803).

Proposal

Jaeger can infer that there is an existing callee service when the caller service logs a span with tag span.kind=client without the corresponding span.kind=server span.

There are several places in the code base where this will need to be accounted for:

  • in the in-memory storage used by all-in-one (the easiest to start with)
  • in the Flink/Spark jobs for production usage

Aside from changing the graph logic, another alternative is to have a trace enrichment which will add artificial server spans to the trace. Then the graph building logic would not need to change at all, and the inferred nodes could also be shown in the single-trace views.
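
To make the enrichment alternative concrete, here is a minimal sketch of what such a pass could look like. It does not use Jaeger's real model types: `Span`, `EnrichTrace`, the `nameOf` callback, the `-inferred` ID suffix and the `inferred` tag are all hypothetical names chosen for illustration. It also assumes that "no corresponding server span" can be approximated by "client span with no children".

```go
package main

// Span is a simplified, hypothetical stand-in for a Jaeger span.
type Span struct {
	SpanID   string
	ParentID string // empty for the root span
	Service  string
	Tags     map[string]string
}

// EnrichTrace returns a copy of the trace with one synthetic server span
// appended for every client span that has no child span. nameOf decides
// what the inferred callee service is called.
func EnrichTrace(spans []Span, nameOf func(Span) string) []Span {
	// Record which spans are referenced as a parent by some other span.
	hasChild := make(map[string]bool, len(spans))
	for _, s := range spans {
		if s.ParentID != "" {
			hasChild[s.ParentID] = true
		}
	}

	out := append([]Span(nil), spans...)
	for _, s := range spans {
		// Assumption: a childless client span means the callee is uninstrumented.
		if s.Tags["span.kind"] != "client" || hasChild[s.SpanID] {
			continue
		}
		out = append(out, Span{
			SpanID:   s.SpanID + "-inferred", // hypothetical ID scheme
			ParentID: s.SpanID,
			Service:  nameOf(s),
			Tags:     map[string]string{"span.kind": "server", "inferred": "true"},
		})
	}
	return out
}
```

Because the synthetic spans are ordinary children of the client spans, running the pass a second time adds nothing, and any graph-building code that only looks at parent/child pairs would pick them up unchanged.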

Open questions

Deciding what to call the missing callee services can be tricky. We will need to implement a heuristic that derives the name from some of the tags of the client span:

  • based on OpenTracing semantic conventions
    • peer.service
    • peer.address
    • peer.ip? + peer.port
  • based on similar OTEL semantic conventions

There was a discussion in OTEL once about labeling the type of downstream service (e.g. an SQL db, etc), which could also be taken into account when naming the derived services.
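
A rough sketch of such a naming heuristic, limited to the OpenTracing `peer.*` tags: the function name `InferCalleeName`, the precedence order, and the flattening of tags into a `map[string]string` are assumptions made for illustration, not something Jaeger defines. OTEL attributes would be additional entries in the same lists.

```go
package main

import "net"

// nameTags are tried first because they already carry a logical name.
var nameTags = []string{"peer.service", "peer.address"}

// hostTags are host-level fallbacks, combined with peer.port when present.
var hostTags = []string{"peer.hostname", "peer.ipv4", "peer.ipv6"}

// InferCalleeName derives a display name for an uninstrumented callee from
// the tags of a client span. The second result is false when no usable tag
// is found. Tag values are assumed to be pre-converted to strings.
func InferCalleeName(tags map[string]string) (string, bool) {
	for _, key := range nameTags {
		if v := tags[key]; v != "" {
			return v, true
		}
	}
	for _, key := range hostTags {
		host := tags[key]
		if host == "" {
			continue
		}
		if port := tags["peer.port"]; port != "" {
			// JoinHostPort adds brackets around IPv6 literals.
			return net.JoinHostPort(host, port), true
		}
		return host, true
	}
	return "", false
}
```

Returning a boolean rather than a placeholder such as "unknown" leaves it to the caller to decide whether an unnamed callee should appear in the diagram at all.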

@paule96

paule96 commented Jul 9, 2022

Hi @yurishkuro,

I like your idea of just adding it to the trace enrichment. Can you help me find the code for this, so I can try my luck at implementing it? :)
From a first look into the code I didn't find the right place.

@yurishkuro
Member Author

@paule96 the dependencies calculation done by all-in-one happens inside memory storage:

func (m *Store) GetDependencies(ctx context.Context, endTs time.Time, lookback time.Duration) ([]model.DependencyLink, error) {

It may not be very straightforward to combine this with the enricher idea, because the dependencies logic does not query for traces; it accesses them directly.

@pranoyk

pranoyk commented Jul 4, 2023

@yurishkuro I am planning to look into this issue. Will share my findings here and also create a WIP PR.
Also, do let me know if there is anything I should be aware of before making changes for this issue.

@nidhey27

Hi @yurishkuro,
I noticed that this issue is still open. Is it currently being worked on? If not, I would like to take this up and contribute a fix.
Please let me know if there are any requirements that I should be aware of to make changes for this issue.

@nidhey27

nidhey27 commented Oct 13, 2023

Hi @yurishkuro,
I’ve been looking into this issue and have come up with a preliminary approach to address the visualization of uninstrumented services in the dependency diagram. I would appreciate your feedback and insights to ensure alignment with Jaeger’s design principles and performance expectations.

1. In-Memory Storage Modification:

  • Enhance the GetDependencies function to identify spans with span.kind=client that lack corresponding span.kind=server spans, indicating uninstrumented callee services.
  • Introduce a mechanism to create and add artificial server spans to represent these uninstrumented services in the in-memory store.
  • Ensure these artificial spans are considered in dependency calculations and visualizations.

2. Flink/Spark Jobs Modification (for Production Usage):

  • Implement a parallel logic within Flink/Spark jobs that mirrors the in-memory storage modification, ensuring uninstrumented services are visualized in production environments as well.

Alternative - Trace Enrichment:

  • Develop a separate process or module that can dynamically add artificial server spans to existing traces when uninstrumented callee services are detected. This would be an alternative to modifying the core dependency graph logic or the in-memory/Flink/Spark storage.
  • Ensure that these artificially added spans are seamlessly integrated into the existing visualization tools, so they appear naturally in the dependency diagrams and single-trace views.

I am eager to kick start the implementation upon your feedback and any additional insights or considerations that should be taken into account to align with Jaeger’s existing architecture.

@yurishkuro
Member Author

@nidhey27 I don't completely follow your write-up. E.g. in step 1, are you presenting multiple options, or do you think all 3 steps need to be done?

I wouldn't go with a trace enrichment approach. The basic logic in both implementations is to construct a tree of spans, and walk it while outputting parent-child links. In that algorithm, it's pretty easy to add an extra conditional branch to handle leaf CLIENT spans.
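
As an illustration only, the tree walk with the extra branch might look roughly like the sketch below. It uses simplified, hypothetical types (`Span`, `Link`, `DependencyLinks`) rather than the real in-memory store code, skips parent/child pairs within the same service as an assumption, and reduces callee naming to a plain `peer.service` lookup as a placeholder.

```go
package main

// Span is a simplified, hypothetical stand-in for a Jaeger span.
type Span struct {
	SpanID   string
	ParentID string // empty for the root span
	Service  string
	Tags     map[string]string
}

// Link identifies a caller -> callee edge in the dependency graph.
type Link struct {
	Parent string
	Child  string
}

// DependencyLinks builds the span tree of one trace and walks it from the
// roots, counting cross-service parent -> child calls. Spans whose parent is
// missing from the trace are not reachable and are ignored.
func DependencyLinks(spans []Span) map[Link]int {
	children := make(map[string][]Span, len(spans))
	var roots []Span
	for _, s := range spans {
		if s.ParentID == "" {
			roots = append(roots, s)
		} else {
			children[s.ParentID] = append(children[s.ParentID], s)
		}
	}

	links := make(map[Link]int)
	var walk func(s Span)
	walk = func(s Span) {
		kids := children[s.SpanID]

		// Extra branch: a leaf CLIENT span implies an uninstrumented callee.
		if len(kids) == 0 && s.Tags["span.kind"] == "client" {
			// Placeholder naming; a real heuristic would inspect more tags.
			if callee := s.Tags["peer.service"]; callee != "" {
				links[Link{Parent: s.Service, Child: callee}]++
			}
			return
		}

		for _, k := range kids {
			// Assumption: calls within the same service are not graph edges.
			if k.Service != s.Service {
				links[Link{Parent: s.Service, Child: k.Service}]++
			}
			walk(k)
		}
	}
	for _, r := range roots {
		walk(r)
	}
	return links
}
```

For a trace where `frontend` calls an instrumented `backend`, which in turn makes a client call tagged `peer.service=mysql`, this yields two edges: `frontend -> backend` and `backend -> mysql`.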

The other issue you will run into is figuring out the name of the destination service from the client span. Sometimes it may have a peer.service tag, but it may not. You will likely need to build a bit of a heuristic to infer the destination from a combination of tags on the client span. This will probably require understanding a number of semantic conventions, e.g. to detect that it's a database call (and specifically which database). Unfortunately, this logic will be way more complicated than the piece I mentioned above and, even more unfortunately, we have independent Go and Java implementations for it.

@nidhey27

@yurishkuro I had initially set my sights on inserting artificial server spans into the in-memory store to represent those uninstrumented services, and then bringing these artificial spans into play during dependency calculations and visualizations.

However, having absorbed your insights, I’ve pivoted my approach for something a bit more streamlined and, dare I say, elegant.

  • Modify the existing algorithm to precisely pinpoint leaf CLIENT spans lacking corresponding SERVER spans.
  • Inject a conditional branch to process these identified CLIENT spans seamlessly, ensuring the visualization of uninstrumented services is accurate and efficient.

As for the heuristic approach to nail down the destination service name, that’s still a work in progress - I’m exploring and weighing my options there.

Could you please confirm if this approach is in sync with your expectations, or is there room for some tweaks?

@yurishkuro
Member Author

The approach makes sense to me.

4 participants