
allow otlp clients to use existing grpc connection #2002

Merged
merged 2 commits into open-telemetry:main on Oct 11, 2021

Conversation

tonistiigi
Contributor

This is needed when the same endpoint is already reused for other gRPC services.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
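
In the released otlptracegrpc package this option landed as WithGRPCConn. A minimal sketch of the usage this PR enables; the endpoint and credentials here are illustrative:

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	ctx := context.Background()

	// One client connection, shared by the OTLP exporter and any other
	// gRPC services exposed on the same endpoint.
	conn, err := grpc.DialContext(ctx, "localhost:4317",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// The exporter reuses the caller-owned connection instead of dialing
	// its own.
	exp, err := otlptracegrpc.New(ctx, otlptracegrpc.WithGRPCConn(conn))
	if err != nil {
		log.Fatal(err)
	}
	defer exp.Shutdown(ctx)
}
```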

@Aneurysm9
Member

How does this interact with the reconnection logic? It looks like this would prevent a reconnection attempt from ever happening.

@tonistiigi
Contributor Author

> How does this interact with the reconnection logic? It looks like this would prevent a reconnection attempt from ever happening.

IIUC:

disconnectedCh only gets signaled on SetStateDisconnected, which is only called when c.connect returns an error, and the only way c.connect returns an error is if c.dialToCollector returns an error. With this change, dialToCollector cannot return an error when GRPCConn is set. The other case is that SetStateDisconnected is called manually on an Export error. IMHO that is not correct anyway, since gRPC internally redials on connection drop, but if we are not going to change that at the moment, how about I change *grpc.ClientConn to GRPCDialer func(target string, opts ...DialOption) (*ClientConn, error)? Then this function defines when new gRPC connections are created, or it can choose to ignore the request and let the gRPC library handle reconnection.
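
A sketch of that alternative, with hypothetical names (the merged PR kept the *grpc.ClientConn approach instead):

```go
package main

import (
	"context"

	"google.golang.org/grpc"
)

// GRPCDialer is the hypothetical option type floated above: the exporter
// asks the caller for a connection instead of receiving one up front, so
// the caller decides when (or whether) new connections are created.
type GRPCDialer func(target string, opts ...grpc.DialOption) (*grpc.ClientConn, error)

// dialToCollector sketches how the exporter could defer to a configured
// dialer, and otherwise fall back to dialing itself.
func dialToCollector(ctx context.Context, endpoint string, dialer GRPCDialer, opts ...grpc.DialOption) (*grpc.ClientConn, error) {
	if dialer != nil {
		return dialer(endpoint, opts...)
	}
	return grpc.DialContext(ctx, endpoint, opts...)
}
```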

@paivagustavo
Member

I have a feeling that our reconnection logic implementation adds unnecessary complexity. As @tonistiigi pointed out, gRPC already handles reconnection.

I'm curious why we needed to implement that in the first place; I would love to simply delete the reconnection logic if we don't actually need it. That would be a good thing to do before accepting the changes proposed in this PR.

@Aneurysm9
Member

It looks like the background reconnection logic has been there since the exporter was initially created in #497. I'm honestly not sure whether it is required or not. Do we have an easy way of figuring it out, one way or the other?

@codecov

codecov bot commented Jun 14, 2021

Codecov Report

Merging #2002 (9e0c53c) into main (d3f1ea2) will decrease coverage by 0.1%.
The diff coverage is 0.0%.

Impacted file tree graph

```
@@           Coverage Diff           @@
##            main   #2002     +/-   ##
=======================================
- Coverage   72.5%   72.4%   -0.2%
=======================================
  Files        168     168
  Lines      11874   11888     +14
=======================================
- Hits        8611    8608      -3
- Misses      3029    3045     +16
- Partials     234     235      +1
```

| Impacted Files | Coverage Δ |
| --- | --- |
| .../otlp/otlpmetric/internal/connection/connection.go | 3.4% <0.0%> (-0.1%) ⬇️ |
| ...ers/otlp/otlpmetric/internal/otlpconfig/options.go | 65.8% <ø> (ø) |
| ...xporters/otlp/otlpmetric/otlpmetricgrpc/options.go | 75.0% <0.0%> (-8.4%) ⬇️ |
| ...s/otlp/otlptrace/internal/connection/connection.go | 14.6% <0.0%> (-1.9%) ⬇️ |
| ...ters/otlp/otlptrace/internal/otlpconfig/options.go | 72.9% <ø> (ø) |
| exporters/otlp/otlptrace/otlptracegrpc/options.go | 75.0% <0.0%> (-8.4%) ⬇️ |

@MadVikingGod
Contributor

How does this interact with a connection that has been instrumented?

I haven't tested it, but it looks like we are enabling a path for loops to happen. I understand that this could be fixed by a future OTEP, but I think it might be wise to wait to add this feature until those loops are preventable.

@tonistiigi
Contributor Author

tonistiigi commented Jun 15, 2021

> until those loops are preventable.

What do you mean by that?

Since the caller passes the conn, it is the caller's responsibility to pass a correct one that avoids the loops. In the code where I'm using it, moby/buildkit@f2d9f02#diff-bd3a55a72186f59e2e63efb4951573b2f9e4a7cc98086e922b0859f8ccc1dd09R243, I am actually tracing the gRPC connection itself, so I made sure to create it with an interceptor filter. Do you mean that this should simply be easier (e.g., OpenTracing has a built-in method for this, while in OpenTelemetry it needs to be done manually)? If that is the case, I agree that otel-contrib should have a filter method as well, and I'm willing to contribute it, but I don't think it is blocking.
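
For illustration, a minimal sketch of such a filter built from plain gRPC interceptors; withoutSelfTracing and the service prefix check are assumptions for this sketch, not the BuildKit code:

```go
package main

import (
	"context"
	"strings"

	"google.golang.org/grpc"
)

// withoutSelfTracing wraps a tracing interceptor and bypasses it for the
// exporter's own OTLP export RPCs, so exporting a span does not itself
// produce a new span on the shared, instrumented connection.
func withoutSelfTracing(traced grpc.UnaryClientInterceptor) grpc.UnaryClientInterceptor {
	return func(ctx context.Context, method string, req, reply interface{},
		cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
		// OTLP export RPCs live under this service prefix; invoke them
		// directly, skipping the tracing interceptor.
		if strings.HasPrefix(method, "/opentelemetry.proto.collector.") {
			return invoker(ctx, method, req, reply, cc, opts...)
		}
		return traced(ctx, method, req, reply, cc, invoker, opts...)
	}
}
```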

@MrAlias
Contributor

MrAlias commented Jul 22, 2021

> It looks like the background reconnection logic has been there since the exporter was initially created in #497. I'm honestly not sure whether it is required or not. Do we have an easy way of figuring it out, one way or the other?

I'm pretty sure this was inherited from OpenCensus: https://github.com/census-ecosystem/opencensus-go-exporter-ocagent/blob/master/connection.go

> I have a feeling that our reconnection logic implementation adds unnecessary complexity. As @tonistiigi pointed out, gRPC already handles reconnection.

> I'm curious why we needed to implement that in the first place; I would love to simply delete the reconnection logic if we don't actually need it. That would be a good thing to do before accepting the changes proposed in this PR.

I'm guessing we might have inherited the code from a time before gRPC handled this. I agree we should validate that and remove it if it is not used.

@thaJeztah

@tonistiigi needs a rebase

@pellared
Member

@pellared left a comment


My first impression is that the existing reconnection logic is not needed (and can even be harmful). I have not looked into the details.

Thanks to this PR, the user can use things like:

  1. retry middleware from https://github.com/grpc-ecosystem/go-grpc-middleware
  2. "vanilla" gRPC Go retry feature: https://github.com/grpc/grpc-go/tree/master/examples/features/retry

@thaJeztah

@tonistiigi needs a rebase (again)

@pellared @Aneurysm9 @MrAlias anything left to do on this one (besides the rebase)? Trying to get rid of (what we hoped would be a short-lived) fork in BuildKit: https://github.com/moby/buildkit/blob/d211b5a50553ad12b7c6c8012e9212ce033beda6/go.mod

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@MrAlias MrAlias merged commit c71afaf into open-telemetry:main Oct 11, 2021