Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/ssa: commit 75ff53bc introduced a race condition #67079

Open
aykevl opened this issue Apr 27, 2024 · 8 comments
Open

x/tools/go/ssa: commit 75ff53bc introduced a race condition #67079

aykevl opened this issue Apr 27, 2024 · 8 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@aykevl
Copy link

aykevl commented Apr 27, 2024

Go version

go version go1.22.2 linux/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/home/ayke/.cache/go-build'
GOENV='/home/ayke/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ayke/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ayke'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go1.22.2'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go1.22.2/pkg/tool/linux_arm64'
GOVCS=''
GOVERSION='go1.22.2'
GCCGO='gccgo'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ayke/src/tinygo/tinygo/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build553721638=/tmp/go-build -gno-record-gcc-switches'

What did you do?

I upgraded the x/tools package to v0.17.0.

What did you see happen?

Lots of infrequent crashes starting from golang/tools@75ff53b (see tinygo-org/tinygo#4206).
The symptoms are all kinds of weird crashes, but in particular what I'm seeing often is functions that appear to be unfinished (no terminator in a basic block).

Interestingly, once I build the program using -race the race condition goes away. This makes it particularly difficult to pinpoint where exactly the bug is located.

I've also tried the master branch (golang/tools@0b45163) but sadly it still shows the same crashes.

I can try to work on a standalone reproducer, but with the nature of this crash it's difficult to find one. Still, if it's difficult to find the culprit just from the bisected commit I can try to work on a reproducer.

What did you expect to see?

No race condition.

@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Apr 27, 2024
@gopherbot gopherbot added this to the Unreleased milestone Apr 27, 2024
@adonovan
Copy link
Member

adonovan commented May 3, 2024

Hi, maintainer of go/ssa and author of https://go.dev/cl/538358 here. I don't have a good hypothesis yet: that CL looks sound w.r.t. concurrency, as does the same code at master. Interestingly, tinygo does not appear to use Program.RuntimeTypes or ssautil.AllFunctions (which is good: these functions should be deprecated!), which were the primary focus of that CL. The tinygo logic to construct go/ssa code looks generally sound too.

Can you share the test case you used for bisecting to this CL? Even if it takes hundreds of runs to reliably fail, that would make it much easier to investigate.

@timothy-king
Copy link
Contributor

@aykevl It looks like ssa.Package.Build is called with some synchronization between packages? I don't fully understand the synchronization though.

https://github.com/tinygo-org/tinygo/blob/release/builder/build.go#L350
https://github.com/tinygo-org/tinygo/blob/release/builder/jobs.go#L123

A lack of synchronization could be a source of races once generics are involved. (ssa.Program.Build solves this.) I don't know why https://go.dev/cl/538358 would be involved.

A test case would really help.

@adonovan
Copy link
Member

adonovan commented May 3, 2024

@aykevl It looks like ssa.Package.Build is called with some synchronization between packages? I don't fully understand the synchronization though.

https://github.com/tinygo-org/tinygo/blob/release/builder/build.go#L350
https://github.com/tinygo-org/tinygo/blob/release/builder/jobs.go#L123

A lack of synchronization could be a source of races once generics are involved. (ssa.Program.Build solves this.) I don't know why https://go.dev/cl/538358 would be involved.

It's quite safe to go/ssa-Build all Packages in parallel--that's exactly what Program.Build does--so that doesn't explain it.

tinygo seems to use concurrency for building and compiling each SSA package in parallel while respecting topological order, which seems sound. (There's more available parallelism of course, in reading, parsing, type-checking, etc.) The only synchronization within each compile job (package) is a cross-process lock on the output file, presumably to prevent concurrent writes from different tinygo processes. There's no fine-grained concurrency, so the lack of Mutex.Lock operations isn't obviously a problem.

I wonder about the fact that some functions in low-level packages of intrinsics (runtime et al) are used from all other packages (e.g. when compiling ssa.Go); if the intrinsics package is not fully llvm-built before the other packages this could lead to a race.

A test case would really help.

Agreed.

@timothy-king
Copy link
Contributor

timothy-king commented May 3, 2024

It's quite safe to go/ssa-Build all Packages in parallel--that's exactly what Program.Build does--so that doesn't explain it.

ssa is very careful to only read from specific fixed fields on functions from other packages or created functions while building. Reading from generic instantiations before they are done can race. Waiting for every package to finish building is good enough, which is what ssa.Program does. Annoyingly diamonds can cause the same instantiation to be created so package topology is not enough. This definitely used to be the case, and I don't remember fixing this. (My memory might be off though.)

@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 6, 2024
@deadprogram
Copy link

if the intrinsics package is not fully llvm-built before the other packages this could lead to a race.

That sounds very much like the behavior I have observed anecdotally. However, I do not have a reproducible test case.

@aykevl
Copy link
Author

aykevl commented May 13, 2024

The way I can reproduce this issue in TinyGo is as follows:

  1. Build TinyGo using a system-installed LLVM (I haven't tried with a self-built LLVM, it probably also works). This is Linux and macOS only at the moment. Guide: https://tinygo.org/docs/guides/build/. Importantly, use go install to build (without -race, and not make to run the build).
  2. cd tests/os/smoke
  3. while true; rm ~/.cache/tinygo/pkg-*; tinygo test -c -target=pybadge; end to try the build until it fails (this is the fish shell, bash will have a slightly different syntax).

Normally it prints lines like this:

FAIL    github.com/tinygo-org/tinygo/tests/os/smoke     0.000s

But sometimes it will result in a crash, like this (though there are many forms this crash can take, this is a common one):

panic: runtime error: index out of range [0] with length 0

goroutine 180 [running]:
github.com/tinygo-org/tinygo/compiler.(*builder).createFunctionStart(0x400c88bc80, 0x0)
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:1207 +0xe44
github.com/tinygo-org/tinygo/compiler.(*builder).createFunction(0x400c88bc80)
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:1296 +0x2c
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getFunction(0x4010faf7c0, 0x400d1221a0)
        /home/ayke/src/tinygo/tinygo/compiler/symbol.go:222 +0xc08
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeMethodSet(0x4010faf7c0, {0x920408, 0x400d9a72d0})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:619 +0x1d4
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeCode(0x4010faf7c0, {0x920408, 0x400d9a72d0})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:417 +0x2b64
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeCode(0x4010faf7c0, {0x9202f0, 0x40001ba540})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:291 +0x226c
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeCode(0x4010faf7c0, {0x9203e0, 0x40039551d0})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:394 +0x3268
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeCode(0x4010faf7c0, {0x9202f0, 0x40001c67e0})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:292 +0x2288
github.com/tinygo-org/tinygo/compiler.(*compilerContext).getTypeCode(0x4010faf7c0, {0x920408, 0x40028d6e50})
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:322 +0x2938
github.com/tinygo-org/tinygo/compiler.(*builder).createMakeInterface(0x400c88b380, {0x9243a8?}, {0x920408, 0x40028d6e50}, 0x7b7d80?)
        /home/ayke/src/tinygo/tinygo/compiler/interface.go:80 +0x6c
github.com/tinygo-org/tinygo/compiler.(*builder).createExpr(0x400c88b380, {0x923f70, 0x400d016550})
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:2217 +0x1838
github.com/tinygo-org/tinygo/compiler.(*builder).createInstruction(0x400c88b380, {0x925050, 0x400d016550})
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:1448 +0x89c
github.com/tinygo-org/tinygo/compiler.(*builder).createFunction(0x400c88b380)
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:1337 +0x704
github.com/tinygo-org/tinygo/compiler.(*compilerContext).createPackage(0x4010faf7c0, {0x4006a5e4e0?}, 0x4001bd4d00)
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:853 +0x3ac
github.com/tinygo-org/tinygo/compiler.CompilePackage({0x40003cfa50?, 0x58?}, 0x40003ea000, 0x4001bd4d00, {0x0?}, 0x87e50c?, 0x0?)
        /home/ayke/src/tinygo/tinygo/compiler/compiler.go:312 +0x3a0
github.com/tinygo-org/tinygo/builder.Build.func3(0x400da54780)
        /home/ayke/src/tinygo/tinygo/builder/build.go:378 +0x18c
github.com/tinygo-org/tinygo/builder.runJob(0x400da54780, 0x4010701800)
        /home/ayke/src/tinygo/tinygo/builder/jobs.go:222 +0x48
created by github.com/tinygo-org/tinygo/builder.runJobs in goroutine 23
        /home/ayke/src/tinygo/tinygo/builder/jobs.go:123 +0x458

I have tested this with the current dev branch, https://github.com/tinygo-org/tinygo/tree/1d9f26cee1f1501b09647186a25ba29aa6a0c58c.

@adonovan
Copy link
Member

An easy and helpful experiment would be to bracket each ssa.Package.Build operation for package p with log.Print("go/ssa start|end %s", p) statements, and do the same thing around the LLVM build operation. Then attach the log of those statements to the issue, along with the the output of go list -json -deps <mainpackage>, assuming you're comfortable sharing it.

That would allow us to see whether both SSA and LLVM build steps are respecting the topological order of the graph.

@aykevl
Copy link
Author

aykevl commented May 13, 2024

I am currently trimming down the list of packages that, when built together, trigger this race. That's rather time-consuming. Right now I'm down to this list:

"internal/testlog"
"encoding"
"sync"
"context"
"time"

So it's not related to the runtime package it seems.

@adonovan thanks for the suggestion! I'll try that once I've trimmed down the package list some more. Though I think I could also trigger the crash when I built all of the SSA at the start using program.Build() (instead of building each package individually). I'll try that again after trimming down the list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

6 participants