BugReport.handleCrash in updateThread hangs on currentThread().join() #22051

Closed
werkt opened this issue Apr 18, 2024 · 2 comments
Labels
help wanted Someone outside the Bazel team could own this P2 We'll consider working on this in future. (Assignee optional) team-CLI Console UI type: bug

Comments

@werkt
Contributor

werkt commented Apr 18, 2024

Description of the bug:

UiEventHandler's updateThread will attempt to joinUninterruptibly() itself when an uncaught exception on that thread is routed through BlazeRuntime's BugReport handler. The Guava join utility does not detect the deadlock that occurs when a thread joins itself.
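For illustration, here is a minimal sketch of why a self-join cannot complete. It is not Bazel code: the class and method names are hypothetical, and it uses the timed `Thread.join(long)` from the JDK so that the demo returns instead of hanging the way the untimed join in the trace does.

```java
class SelfJoinDemo {
    /**
     * Joins the calling thread on itself with a timeout and reports whether
     * the thread was still alive when the join gave up.
     */
    static boolean selfJoinTimesOut(long timeoutMillis) {
        Thread self = Thread.currentThread();
        try {
            // join() returns only once the target thread has terminated.
            // Here the target is the caller, so only the timeout ends the wait;
            // an untimed join() at this point would block forever.
            self.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that the wait was cut short.
            Thread.currentThread().interrupt();
            return false;
        }
        return self.isAlive();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean stillAlive = selfJoinTimesOut(200);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("still alive after self-join: " + stillAlive);
        System.out.println("waited about " + elapsedMillis + " ms");
    }
}
```

The thread is necessarily still alive when the timed join returns, which is the same condition that keeps the untimed join waiting indefinitely.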

The trace is from 6.2.0, but the line numbers are updated here to match their positions on master, where this is still possible:

"cli-update-thread" #2223 daemon prio=5 os_prio=0 cpu=357.24ms elapsed=14822.22s tid=0x00007fc8b09a2000 nid=0x3da066 in Object.wait()  [0x00007fc7452d5000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.6/Native Method)
        - waiting on <no object reference available>
        at java.lang.Thread.join(java.base@11.0.6/Unknown Source)
        - waiting to re-lock in wait() <0x00007fca0d4b0920> (a java.lang.Thread)
        at java.lang.Thread.join(java.base@11.0.6/Unknown Source)
        at com.google.common.util.concurrent.Uninterruptibles.joinUninterruptibly(Uninterruptibles.java:162)
        at com.google.devtools.build.lib.runtime.UiEventHandler.stopUpdateThread(UiEventHandler.java:964)

Uninterruptibles.joinUninterruptibly(threadToWaitFor);

Where threadToWaitFor == Thread.currentThread()

        at com.google.devtools.build.lib.runtime.UiEventHandler.handle(UiEventHandler.java:464)

        at com.google.devtools.build.lib.events.Reporter.handle(Reporter.java:127)

Line matches

        at com.google.devtools.build.lib.bugreport.BugReport.handleCrash(BugReport.java:274)

ctx.getEventHandler().handle(Event.fatal(crashMsg));

Note that the deadlock detection above this line has not triggered.

        - locked <0x00007fc9cd132f70> (a java.lang.Object)
        at com.google.devtools.build.lib.bugreport.BugReport.handleCrash(BugReport.java:231)

handleCrash(Crash.from(throwable), CrashContext.halt().withArgs(args));

        at com.google.devtools.build.lib.runtime.BlazeRuntime.lambda$newRuntime$4(BlazeRuntime.java:1217)

        at com.google.devtools.build.lib.runtime.BlazeRuntime$$Lambda$90/0x00007fc94a36fc40.handleException(Unknown Source)
        at com.google.devtools.build.lib.runtime.BlazeRuntime.lambda$newRuntime$5(BlazeRuntime.java:1227)

(thread, throwable) -> subscriberExceptionHandler.handleException(throwable, null));

        at com.google.devtools.build.lib.runtime.BlazeRuntime$$Lambda$91/0x00007fc94a36f458.uncaughtException(Unknown Source)
        at java.lang.ThreadGroup.uncaughtException(java.base@11.0.6/Unknown Source)
        at java.lang.ThreadGroup.uncaughtException(java.base@11.0.6/Unknown Source)
        at java.lang.Thread.dispatchUncaughtException(java.base@11.0.6/Unknown Source)

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Throw an uncaught exception in the updateThread ("cli-update-thread") while the Thread.setDefaultUncaughtExceptionHandler -> BugReport.handleCrash sequence established in BlazeRuntime.java is in place.
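The key ingredient of the reproduction is that an uncaught-exception handler runs on the thread that is crashing, so anything the handler calls sees that thread as `Thread.currentThread()`. The sketch below shows only that property, under stated assumptions: the class, method, and thread names are hypothetical, and it installs a per-thread handler with `Thread.setUncaughtExceptionHandler` rather than the default handler used in BlazeRuntime, so the demo does not change global JVM state.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CrashHandlerThreadDemo {
    /** Returns true if the uncaught-exception handler ran on the crashing thread. */
    static boolean handlerRunsOnCrashingThread() {
        AtomicBoolean sameThread = new AtomicBoolean(false);
        CountDownLatch handled = new CountDownLatch(1);

        // Hypothetical stand-in for the update thread: it crashes immediately.
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated crash");
        }, "demo-update-thread");

        worker.setUncaughtExceptionHandler((thread, throwable) -> {
            // The JVM dispatches the handler on the terminating thread itself,
            // so a join on `thread` from here would be a self-join.
            sameThread.set(Thread.currentThread() == thread);
            handled.countDown();
        });
        worker.start();

        try {
            if (!handled.await(5, TimeUnit.SECONDS)) {
                return false; // handler never ran within the time limit
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sameThread.get();
    }

    public static void main(String[] args) {
        System.out.println("handler ran on crashing thread: "
            + handlerRunsOnCrashingThread());
    }
}
```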

Which operating system are you running Bazel on?

linux

What is the output of bazel info release?

6.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

Original trace:

"cli-update-thread" #2223 daemon prio=5 os_prio=0 cpu=357.24ms elapsed=14822.22s tid=0x00007fc8b09a2000 nid=0x3da066 in Object.wait()  [0x00007fc7452d5000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@11.0.6/Native Method)
        - waiting on <no object reference available>
        at java.lang.Thread.join(java.base@11.0.6/Unknown Source)
        - waiting to re-lock in wait() <0x00007fca0d4b0920> (a java.lang.Thread)
        at java.lang.Thread.join(java.base@11.0.6/Unknown Source)
        at com.google.common.util.concurrent.Uninterruptibles.joinUninterruptibly(Uninterruptibles.java:162)
        at com.google.devtools.build.lib.runtime.UiEventHandler.stopUpdateThread(UiEventHandler.java:964)
        at com.google.devtools.build.lib.runtime.UiEventHandler.handle(UiEventHandler.java:464)
        at com.google.devtools.build.lib.events.Reporter.handle(Reporter.java:127)
        at com.google.devtools.build.lib.bugreport.BugReport.handleCrash(BugReport.java:274)
        - locked <0x00007fc9cd132f70> (a java.lang.Object)
        at com.google.devtools.build.lib.bugreport.BugReport.handleCrash(BugReport.java:231)
        at com.google.devtools.build.lib.runtime.BlazeRuntime.lambda$newRuntime$4(BlazeRuntime.java:1217)
        at com.google.devtools.build.lib.runtime.BlazeRuntime$$Lambda$90/0x00007fc94a36fc40.handleException(Unknown Source)
        at com.google.devtools.build.lib.runtime.BlazeRuntime.lambda$newRuntime$5(BlazeRuntime.java:1227)
        at com.google.devtools.build.lib.runtime.BlazeRuntime$$Lambda$91/0x00007fc94a36f458.uncaughtException(Unknown Source)
        at java.lang.ThreadGroup.uncaughtException(java.base@11.0.6/Unknown Source)
        at java.lang.ThreadGroup.uncaughtException(java.base@11.0.6/Unknown Source)
        at java.lang.Thread.dispatchUncaughtException(java.base@11.0.6/Unknown Source)
@tjgq
Contributor

tjgq commented Apr 23, 2024

It sounds like we could fix this by checking whether we're joining with the current thread. Could you send a PR?
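A minimal sketch of the kind of guard being suggested, assuming a hypothetical helper rather than the actual UiEventHandler code or the change that was eventually merged: skip the join when the thread to wait for is the calling thread. The uninterruptible wait is written out with plain `Thread.join()` to keep the example self-contained instead of depending on Guava.

```java
class UpdateThreadStopper {
    /**
     * Interrupts threadToWaitFor and waits for it to finish, unless it is the
     * calling thread. Returns true if a join was performed, false if skipped.
     */
    static boolean stopAndJoin(Thread threadToWaitFor) {
        if (threadToWaitFor == null || threadToWaitFor == Thread.currentThread()) {
            // Joining ourselves could never return, so there is nothing to wait for.
            return false;
        }
        threadToWaitFor.interrupt();

        boolean interrupted = false;
        try {
            while (true) {
                try {
                    threadToWaitFor.join();
                    return true;
                } catch (InterruptedException e) {
                    // Keep waiting, and restore the flag once the join completes.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("joined self: " + stopAndJoin(Thread.currentThread()));
    }
}
```

Whether the caller can safely proceed with shutdown after a skipped join depends on what else the stopping path does, which this sketch does not model.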

@werkt
Contributor Author

werkt commented Apr 25, 2024

> It sounds like we could fix this by checking whether we're joining with the current thread. Could you send a PR?

I certainly can. Hopefully there's no chance that we're stopUpdateThread-ing when in the updateThread where we aren't handling a FATAL, otherwise I don't know if it will shut down...

werkt added a commit to werkt/bazel that referenced this issue Apr 25, 2024
If the cli-update-thread is crashing, it may attempt to interrupt and
join on itself. Hopefully no updateThread could be in stopUpdateThread
without going through handleCrash() -> Event.FATAL sequence through
BlazeRuntime.

Fixes bazelbuild#22051
@meisterT meisterT added P2 We'll consider working on this in future. (Assignee optional) help wanted Someone outside the Bazel team could own this and removed untriaged labels May 14, 2024
bazel-io pushed a commit to bazel-io/bazel that referenced this issue May 14, 2024
If the cli-update-thread is crashing, it may attempt to interrupt and join on itself. Hopefully no updateThread could be in stopUpdateThread without going through handleCrash() -> Event.FATAL sequence through BlazeRuntime.

Fixes bazelbuild#22051

Closes bazelbuild#22122.

PiperOrigin-RevId: 633653817
Change-Id: Iaef5df56358d74bd7210ad8cb3562b452de9eb6a
github-merge-queue bot pushed a commit that referenced this issue May 14, 2024
If the cli-update-thread is crashing, it may attempt to interrupt and
join on itself. Hopefully no updateThread could be in stopUpdateThread
without going through handleCrash() -> Event.FATAL sequence through
BlazeRuntime.

Fixes #22051

Closes #22122.

PiperOrigin-RevId: 633653817
Change-Id: Iaef5df56358d74bd7210ad8cb3562b452de9eb6a

Commit
6306240

Co-authored-by: George Gensure <werkt0@gmail.com>
6 participants