This repository has been archived by the owner on Dec 23, 2023. It is now read-only.

TimeLimitedHandler creates an executor on every invocation of export() and does not shut it down causing application shutdown to hang #2031

Open
wfhartford opened this issue Apr 14, 2020 · 3 comments

@wfhartford

What version of OpenCensus are you using?

0.26.0

What JVM are you using (java -version)?

Tested on 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 and adoptopenjdk:14_36-jre-hotspot

What did you do?

Set up trace exporting to Zipkin roughly like this:

      ZipkinTraceExporter.createAndRegister(
          ZipkinExporterConfiguration.builder()
              .setV2Url(zipkinUrl)
              .setServiceName(serverName)
              .setEncoder(SpanBytesEncoder.JSON_V2)
              .setDeadline(Duration.create(60, 0))
              .build()
      )

      val traceConfig = Tracing.getTraceConfig()
      val params = traceConfig.activeTraceParams
      traceConfig.updateActiveTraceParams(
          params.toBuilder()
              .setSampler(Samplers.alwaysSample())
              .build()
      )

Run some code that generates spans so that the io.opencensus.exporter.trace.util.TimeLimitedHandler.export() method gets called. Shut down the exporter by calling:

Tracing.getExportComponent().shutdown()

Exit the application by returning from the main method.

What did you expect to see?

Application exits quickly.

What did you see instead?

Application hangs. Upon further investigation, the application is being kept alive by some number of non-daemon threads named pool-#-thread-1. I eventually tracked these threads down to the zipkin exporter and specifically, the TimeLimitedHandler class's export method:

  @Override
  public void export(final Collection<SpanData> spanDataList) {
    final Scope exportScope = newExportScope();
    try {
      TimeLimiter timeLimiter = SimpleTimeLimiter.create(Executors.newSingleThreadExecutor());
      timeLimiter.callWithTimeout(
          new Callable<Void>() {
            @Override
            public Void call() throws Exception {
              timeLimitedExport(spanDataList);
              return null;
            }
          },
          deadline.toMillis(),
          TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      handleException(e, "Timeout when exporting traces: " + e);
    } catch (InterruptedException e) {
      handleException(e, "Interrupted when exporting traces: " + e);
    } catch (Exception e) {
      handleException(e, "Failed to export traces: " + e);
    } finally {
      exportScope.close();
    }
  }

Every invocation of this method creates a new single-thread executor for the SimpleTimeLimiter. The executor is never shut down, so its non-daemon thread idles indefinitely and keeps the JVM alive.
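One possible direction for a fix, sketched here under assumptions and not taken from the OpenCensus code base: keep the per-call executor but always shut it down in a finally block. To stay self-contained, the sketch uses only java.util.concurrent (a Future with a timed get) instead of Guava's SimpleTimeLimiter; the class name BoundedExportSketch and the method callWithTimeout are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: not the actual TimeLimitedHandler implementation.
public class BoundedExportSketch {

  /** Runs the task with a deadline on a per-call executor that is always shut down. */
  public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
      throws TimeoutException, InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<T> future = executor.submit(task);
      try {
        return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        // Interrupt the worker so a slow export does not keep running.
        future.cancel(true);
        throw e;
      }
    } finally {
      // Without this call the worker thread stays alive after the method returns.
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for timeLimitedExport(spanDataList).
    String result = callWithTimeout(() -> "exported", 1000);
    System.out.println(result); // prints "exported"
  }
}
```

An alternative would be a single executor shared across calls and backed by daemon threads, which avoids creating a thread per export; whether that suits the exporter's shutdown semantics is a separate design question.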

@wfhartford wfhartford added the bug label Apr 14, 2020
@mariusoe

Hi,

Yesterday we encountered the same problem and narrowed it down to the code mentioned above, which creates a new Executor and SimpleTimeLimiter on every export. In our case, it was triggered by the Jaeger exporter rather than Zipkin.

The following exception is thrown:

[io.opencensus.trace.export.ExportComponent] (ExportComponent.ServiceExporterThread-0) Exception thrown by the service export io.opencensus.exporter.trace.jaeger.JaegerTraceExporter:
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:717)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:957)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
        at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:681)
        at com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:153)
        at io.opencensus.exporter.trace.util.TimeLimitedHandler.export(TimeLimitedHandler.java:86)
...

In our case, the application has quite a lot of memory available, so we assume that the unused Executors are not garbage collected because no GC is necessary at that point. Their threads are therefore never freed, and the operating system eventually runs out of available native threads.

@mjh-c

mjh-c commented Jun 21, 2020

Same problem here which causes applications not to exit.
This is a blocker for us. Is there any known workaround?

@mjh-c

mjh-c commented Jun 22, 2020

It is easy to reproduce the problem:

Execute the official Zipkin trace example class TracingToZipkin from https://opencensus.io/quickstart/java/tracing as a standalone Java process (not within mvn exec:java as described on the page).

After execution the application hangs with a pool thread "pool-1-thread-1" still running.
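To see which threads are keeping the JVM alive, something like the following could be called just before main returns. This is only a diagnostic sketch using standard JDK calls; the class name NonDaemonThreadSketch and the method liveNonDaemonThreadNames are hypothetical, and the leaked executor in main merely imitates the behavior described in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical diagnostic helper: not part of OpenCensus.
public class NonDaemonThreadSketch {

  /** Returns the names of live non-daemon threads other than the calling thread. */
  public static List<String> liveNonDaemonThreadNames() {
    List<String> names = new ArrayList<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
        names.add(t.getName());
      }
    }
    return names;
  }

  public static void main(String[] args) throws Exception {
    // Imitate the leak: an executor that has run a task and was never shut down.
    ExecutorService leaked = Executors.newSingleThreadExecutor();
    leaked.submit(() -> {}).get();

    // The list is expected to include a "pool-N-thread-1" style name.
    System.out.println(liveNonDaemonThreadNames());

    // Shut down so this demo itself can exit.
    leaked.shutdownNow();
  }
}
```

Any name in that list belongs to a thread that will prevent a normal JVM exit until it terminates.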
