Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Earthly randomly hangs on Init step in v0.6.24+ #2247

Closed
vladaionescu opened this issue Oct 5, 2022 · 11 comments · Fixed by earthly/homebrew-earthly#20
Closed

Earthly randomly hangs on Init step in v0.6.24+ #2247

vladaionescu opened this issue Oct 5, 2022 · 11 comments · Fixed by earthly/homebrew-earthly#20
Assignees
Labels
type:bug Something isn't working

Comments

@vladaionescu
Copy link
Member

vladaionescu commented Oct 5, 2022

SIGABRT stack trace: https://gist.github.com/jazzdan/505c9211c30efc533ab687831a169306

The stack trace shows hanging on docker info. Though that doesn't make a lot of sense.

The fact that this does not reproduce in v0.6.23 seems to point to this possible change: #2207.

EDIT: A workaround has been identified: brew uninstall earthly && brew install earthly/earthly/earthly. This only happens on the homebrew core version of Earthly, but not on the version from our own tap.

@vladaionescu vladaionescu added the type:bug Something isn't working label Oct 5, 2022
@vladaionescu vladaionescu self-assigned this Oct 5, 2022
@vladaionescu
Copy link
Member Author

vladaionescu commented Oct 6, 2022

Trying to emulate @jazzdan's usage (running Earthly via Tilt, which kills it frequently) - I tried running earthly in a loop and killing it randomly. I could never get it to hang. @jazzdan's docker version is the same as mine and we both use M1.

This is the script I used:

#!/bin/bash

set -eux -o pipefail

earthly=./build/darwin/arm64/earthly

one_test() {
    it="$1"
    echo "Iteration $it"
    "$earthly" --no-output +for-darwin-m1 &
    pid=$!
    sleep $(( ( RANDOM % 10 ) + 0 ))
    kill "$pid" || true
}

for i in {1..100}; do
    one_test "$i" &
done

wait

@vladaionescu
Copy link
Member Author

This script does cause docker info to time out:

package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

type commmandContextOutput struct {
	stdout strings.Builder
	stderr strings.Builder
}

func command(ctx context.Context, suffix int) {
	output := &commmandContextOutput{}
	cmd := exec.CommandContext(ctx, "docker", "info")
	cmd.Env = os.Environ() // Ensure all shellouts are using the current environment, picks up DOCKER_/PODMAN_ env vars when they matter
	cmd.Stdout = &output.stdout
	cmd.Stderr = &output.stderr

	err := cmd.Run()
	if err != nil {
		out := output.stdout.String()
		e := output.stderr.String()
		if false {
			fmt.Printf("%s, %s", out, e)
		}
		// fmt.Printf("error during run: %s\n", err)
	} else {
		// fmt.Printf("got info %d\n", suffix)
	}
}

func forever() {
	var i = 0
	for {
		i++
		ctx, _ := context.WithTimeout(context.Background(), time.Minute*5)
		go func() {
			<-ctx.Done()
			if ctx.Err() != nil {
				fmt.Printf("context error: %s\n", ctx.Err())
			}
		}()
		command(ctx, i)
	}
}

func main() {
	for i := 0; i < 200; i++ {
		go forever()
	}
	forever()
}

But not this bash script

for i in {1..100000}; do
    docker info &
done

@vladaionescu
Copy link
Member Author

I've started getting this myself on my M1 mac. No repeated restarts or anything. SIGABRT stack trace points to the same docker info call.

@vladaionescu
Copy link
Member Author

It seems that this only affects earthly installed via brew install earthly and NOT earthly installed via brew install earthly/earthly/earthly.

A workaround is to brew uninstall earthly && brew install earthly/earthly/earthly.

CC @jazzdan

@vladaionescu
Copy link
Member Author

FTR, this reproduces the issue very consistently:

for i in {1..20}; do earthly +t; done

It doesn't matter what the build is, so it can be something really simple like RUN true.

vladaionescu added a commit that referenced this issue Oct 14, 2022
Also, add verbose printing on executed frontend-related commands.

Re #2247

Co-authored-by: Vlad A. Ionescu <vladaionescu@users.noreply.github.com>
@jazzdan
Copy link
Contributor

jazzdan commented Nov 4, 2022

I noticed this happening again on an earthly/earthly/earthly earthly. At least I'm pretty sure that I am using an earthly/earthly/earthly version

earthly version v0.6.28 7e4f1df4c124db1644d51d312b19313217cbe478 darwin/arm64; macOS 12.6
Running custom build cmd "earthly --verbose --satellite cloud-dev --no-output --push --build-arg ENV=development --build-arg BACKEND_DIR=$PWD --build-arg LOGGER= --build-arg ENV_HONEYCOMB_DATASET= --build-arg DEV_RECORDING_ENABLED=false --build-arg CONTROL_IMAGE_REF=$EXPECTED_REF +control"
 1. Init 🚀
————————————————————————————————————————————————————————————————————————————————

            frontend | Running command: docker info

ps aux | grep earthly | grep control
dan              10660  98.2  0.0 409262256    288 s001  R+    3:26PM  37:08.06 earthly --verbose --satellite cloud-dev --no-output --push --build-arg ENV=development --build-arg BACKEND_DIR=/Users/dan/devel/backend --build-arg LOGGER= --build-arg ENV_HONEYCOMB_DATASET= --build-arg DEV_RECORDING_ENABLED=false --build-arg CONTROL_IMAGE_REF=439372405271.dkr.ecr.us-east-2.amazonaws.com/recordreplay-control:tilt-build-1667600813 +control

And now no matter how many times I send SIGHUP it doesn't even print a stack trace. But maybe I"m sending it to the wrong process somehow?

@jazzdan
Copy link
Contributor

jazzdan commented Nov 9, 2022

I got a stack trace from a more recent version of earthly that exhibited this problem. This was installed with brew install earthly/earthly/earthly:

earthly --version
earthly version v0.6.29 7e4f1df4c124db1644d51d312b19313217cbe478 darwin/arm64; macOS 12.6
SIGABRT: abort
PC=0x1a81fe270 m=0 sigcode=0

goroutine 0 [idle]:
runtime.pthread_cond_wait(0x105c8e240, 0x105c8e200)
	runtime/sys_darwin.go:450 +0x20 fp=0x16b8d2ee0 sp=0x16b8d2eb0 pc=0x104584240
runtime.semasleep(0xffffffffffffffff)
	runtime/os_darwin.go:66 +0x78 fp=0x16b8d2f40 sp=0x16b8d2ee0 pc=0x1045610d8
runtime.notesleep(0x105c8e008)
	runtime/lock_sema.go:181 +0xc0 fp=0x16b8d2f80 sp=0x16b8d2f40 pc=0x1045391a0
runtime.mPark()
	runtime/proc.go:1457 +0x20 fp=0x16b8d2fa0 sp=0x16b8d2f80 pc=0x104569620
runtime.stoplockedm()
	runtime/proc.go:2448 +0x50 fp=0x16b8d2ff0 sp=0x16b8d2fa0 pc=0x10456b670
runtime.schedule()
	runtime/proc.go:3193 +0x34 fp=0x16b8d3030 sp=0x16b8d2ff0 pc=0x10456d644
runtime.park_m(0x110?)
	runtime/proc.go:3363 +0x138 fp=0x16b8d3060 sp=0x16b8d3030 pc=0x10456dbf8
runtime.mcall()
	runtime/asm_arm64.s:192 +0x54 fp=0x16b8d3070 sp=0x16b8d3060 pc=0x104595984

goroutine 1 [syscall, 3 minutes]:
syscall.syscall(0x140004fc688?, 0x1045a5230?, 0x140004fc740?, 0x1400002c000?)
	runtime/sys_darwin.go:23 +0x54 fp=0x140004fc650 sp=0x140004fc5c0 pc=0x1045946e4
syscall.readlen(0x105cbe3a0?, 0x140005e6228?, 0x3?)
	syscall/syscall_darwin.go:241 +0x3c fp=0x140004fc690 sp=0x140004fc650 pc=0x1045a6d5c
syscall.forkExec({0x140005f20f0?, 0x12d92d658?}, {0x140005e4420, 0x2, 0x2}, 0x10521d1e0?)
	syscall/exec_unix.go:221 +0x398 fp=0x140004fc7a0 sp=0x140004fc690 pc=0x1045a5328
syscall.StartProcess(...)
	syscall/exec_unix.go:255
os.startProcess({0x140005f20f0, 0x15}, {0x140005e4420, 0x2, 0x2}, 0x140004fca60)
	os/exec_posix.go:54 +0x250 fp=0x140004fc890 sp=0x140004fc7a0 pc=0x1046036f0
os.StartProcess({0x140005f20f0, 0x15}, {0x140005e4420, 0x2, 0x2}, 0x15?)
	os/exec.go:109 +0x50 fp=0x140004fc8e0 sp=0x140004fc890 pc=0x1046031f0
os/exec.(*Cmd).Start(0x140005fe160)
	os/exec/exec.go:524 +0x490 fp=0x140004fcab0 sp=0x140004fc8e0 pc=0x104c06180
os/exec.(*Cmd).Run(0x104fa98bd?)
	os/exec/exec.go:434 +0x20 fp=0x140004fcad0 sp=0x140004fcab0 pc=0x104c05cb0
github.com/earthly/earthly/util/containerutil.(*shellFrontend).commandContextOutput(0x140000260f0, {0x1054272e8?, 0x14000708480}, {0x140004fcfe8, 0x1, 0x104544714?})
	github.com/earthly/earthly/util/containerutil/shell_shared.go:320 +0x2b0 fp=0x140004fcd00 sp=0x140004fcad0 pc=0x104e4f7b0
github.com/earthly/earthly/util/containerutil.(*shellFrontend).commandContextOutputWithRetry(0x140000260f0, {0x105427278, 0x14000519380}, 0xa, 0xf0?, {0x140004fcfe8, 0x1, 0x1})
	github.com/earthly/earthly/util/containerutil/shell_shared.go:297 +0xe0 fp=0x140004fcf30 sp=0x140004fcd00 pc=0x104e4f250
github.com/earthly/earthly/util/containerutil.NewDockerShellFrontend({0x105427278, 0x14000519380}, 0x14000026000)
	github.com/earthly/earthly/util/containerutil/docker.go:44 +0x1f0 fp=0x140004fd020 sp=0x140004fcf30 pc=0x104e494b0
github.com/earthly/earthly/util/containerutil.frontendIfAvailable({0x105427278, 0x14000519380}, {0x104fb138e, 0xc}, 0x105cbf0c0?)
	github.com/earthly/earthly/util/containerutil/frontend.go:91 +0xcc fp=0x140004fd0a0 sp=0x140004fd020 pc=0x104e4ab9c
github.com/earthly/earthly/util/containerutil.autodetectFrontend({0x105427278, 0x14000519380}, 0xe8?)
	github.com/earthly/earthly/util/containerutil/frontend.go:70 +0xb8 fp=0x140004fd140 sp=0x140004fd0a0 pc=0x104e4aa18
github.com/earthly/earthly/util/containerutil.FrontendForSetting({0x105427278?, 0x14000519380?}, {0x104fa75d2?, 0x0?}, 0x0?)
	github.com/earthly/earthly/util/containerutil/frontend.go:57 +0x48 fp=0x140004fd180 sp=0x140004fd140 pc=0x104e4a908
main.(*earthlyApp).initFrontend(0x14000712600, 0x140006a3380)
	github.com/earthly/earthly/cmd/earthly/buildkit.go:31 +0x21c fp=0x140004fd480 sp=0x140004fd180 pc=0x104f8290c
main.(*earthlyApp).actionBuildImp(0x14000712600, 0x140006a3380, {0x1400069f330, 0x0, 0x1}, {0x1400069f340?, 0x1, 0x1})
	github.com/earthly/earthly/cmd/earthly/build_cmd.go:216 +0xab0 fp=0x140004fec20 sp=0x140004fd480 pc=0x104f7fa40
main.(*earthlyApp).actionBuild(0x14000712600, 0x140006a3380)
	github.com/earthly/earthly/cmd/earthly/build_cmd.go:79 +0x324 fp=0x140004fecd0 sp=0x140004fec20 pc=0x104f7ea44
main.(*earthlyApp).actionBuild-fm(0x105228aa0?)
	<autogenerated>:1 +0x30 fp=0x140004fecf0 sp=0x140004fecd0 pc=0x104fa35b0
github.com/urfave/cli/v2.(*App).RunContext(0x140005009c0, {0x105427278?, 0x14000519380}, {0x1400003c140, 0x13, 0x14})
	github.com/urfave/cli/v2@v2.3.0/app.go:322 +0x6dc fp=0x140004fee90 sp=0x140004fecf0 pc=0x1046bf45c
main.(*earthlyApp).run(0x14000712600, {0x105427278, 0x14000519380}, {0x1400003c140, 0x13, 0x14})
	github.com/earthly/earthly/cmd/earthly/main.go:538 +0x70 fp=0x140004ffac0 sp=0x140004fee90 pc=0x104f8d3e0
main.main()
	github.com/earthly/earthly/cmd/earthly/main.go:268 +0x6fc fp=0x140004fff70 sp=0x140004ffac0 pc=0x104f8b26c
runtime.main()
	runtime/proc.go:250 +0x24c fp=0x140004fffd0 sp=0x140004fff70 pc=0x1045668ac
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140004fffd0 sp=0x140004fffd0 pc=0x104598024

goroutine 2 [force gc (idle), 3 minutes]:
runtime.gopark(0x1c5be52b93b2f?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000074fa0 sp=0x14000074f80 pc=0x104566c74
runtime.goparkunlock(...)
	runtime/proc.go:369
runtime.forcegchelper()
	runtime/proc.go:302 +0xac fp=0x14000074fd0 sp=0x14000074fa0 pc=0x104566b0c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x14000074fd0 sp=0x14000074fd0 pc=0x104598024
created by runtime.init.6
	runtime/proc.go:290 +0x24

goroutine 3 [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000075770 sp=0x14000075750 pc=0x104566c74
runtime.goparkunlock(...)
	runtime/proc.go:369
runtime.bgsweep(0x0?)
	runtime/mgcsweep.go:297 +0x10c fp=0x140000757b0 sp=0x14000075770 pc=0x1045521ec
runtime.gcenable.func1()
	runtime/mgc.go:178 +0x28 fp=0x140000757d0 sp=0x140000757b0 pc=0x104546748
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000757d0 sp=0x140000757d0 pc=0x104598024
created by runtime.gcenable
	runtime/mgc.go:178 +0x74

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x11148f?, 0xe9029?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000075f50 sp=0x14000075f30 pc=0x104566c74
runtime.goparkunlock(...)
	runtime/proc.go:369
runtime.(*scavengerState).park(0x105c8d620)
	runtime/mgcscavenge.go:389 +0x5c fp=0x14000075f80 sp=0x14000075f50 pc=0x10455013c
runtime.bgscavenge(0x0?)
	runtime/mgcscavenge.go:622 +0xac fp=0x14000075fb0 sp=0x14000075f80 pc=0x10455074c
runtime.gcenable.func2()
	runtime/mgc.go:179 +0x28 fp=0x14000075fd0 sp=0x14000075fb0 pc=0x1045466e8
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x14000075fd0 sp=0x14000075fd0 pc=0x104598024
created by runtime.gcenable
	runtime/mgc.go:179 +0xb8

goroutine 5 [finalizer wait]:
runtime.gopark(0x0?, 0x1054178f0?, 0x60?, 0x60?, 0x2000000020?)
	runtime/proc.go:363 +0xe4 fp=0x14000074580 sp=0x14000074560 pc=0x104566c74
runtime.goparkunlock(...)
	runtime/proc.go:369
runtime.runfinq()
	runtime/mfinal.go:180 +0x120 fp=0x140000747d0 sp=0x14000074580 pc=0x104545970
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000747d0 sp=0x140000747d0 pc=0x104598024
created by runtime.createfing
	runtime/mfinal.go:157 +0x84

goroutine 6 [GC worker (idle), 3 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000076740 sp=0x14000076720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000767d0 sp=0x14000076740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000767d0 sp=0x140000767d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 18 [GC worker (idle), 3 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000070740 sp=0x14000070720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000707d0 sp=0x14000070740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000707d0 sp=0x140000707d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 34 [GC worker (idle), 3 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x140000bc740 sp=0x140000bc720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000bc7d0 sp=0x140000bc740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bc7d0 sp=0x140000bc7d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 35 [GC worker (idle), 3 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x140000bcf40 sp=0x140000bcf20 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000bcfd0 sp=0x140000bcf40 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bcfd0 sp=0x140000bcfd0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 7 [GC worker (idle), 3 minutes]:
runtime.gopark(0x1c5a007ea4247?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000076f40 sp=0x14000076f20 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x14000076fd0 sp=0x14000076f40 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x14000076fd0 sp=0x14000076fd0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 19 [GC worker (idle), 3 minutes]:
runtime.gopark(0x1c5be52c414db?, 0x1?, 0xc1?, 0xc5?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000070f40 sp=0x14000070f20 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x14000070fd0 sp=0x14000070f40 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x14000070fd0 sp=0x14000070fd0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 20 [GC worker (idle), 3 minutes]:
runtime.gopark(0x105cbf0c0?, 0x1?, 0x77?, 0x85?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000071740 sp=0x14000071720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000717d0 sp=0x14000071740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000717d0 sp=0x140000717d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 8 [GC worker (idle), 3 minutes]:
runtime.gopark(0x1c5be52c3c6e4?, 0x1?, 0x71?, 0xf2?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x14000077740 sp=0x14000077720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000777d0 sp=0x14000077740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000777d0 sp=0x140000777d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 36 [GC worker (idle), 3 minutes]:
runtime.gopark(0x1c5a00b2b0b50?, 0x3?, 0xb8?, 0x4?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x140000bd740 sp=0x140000bd720 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000bd7d0 sp=0x140000bd740 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bd7d0 sp=0x140000bd7d0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 37 [GC worker (idle), 3 minutes]:
runtime.gopark(0x1c5be52c3e6f5?, 0x1?, 0xe6?, 0x26?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x140000bdf40 sp=0x140000bdf20 pc=0x104566c74
runtime.gcBgMarkWorker()
	runtime/mgc.go:1235 +0xec fp=0x140000bdfd0 sp=0x140000bdf40 pc=0x104548b2c
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bdfd0 sp=0x140000bdfd0 pc=0x104598024
created by runtime.gcBgMarkStartWorkers
	runtime/mgc.go:1159 +0x28

goroutine 9 [select, 3 minutes, locked to thread]:
runtime.gopark(0x140000bf7a0?, 0x2?, 0x88?, 0xf6?, 0x140000bf79c?)
	runtime/proc.go:363 +0xe4 fp=0x140000bf630 sp=0x140000bf610 pc=0x104566c74
runtime.selectgo(0x140000bf7a0, 0x140000bf798, 0x0?, 0x0, 0x0?, 0x1)
	runtime/select.go:328 +0x688 fp=0x140000bf750 sp=0x140000bf630 pc=0x104577588
runtime.ensureSigM.func1()
	runtime/signal_unix.go:991 +0x190 fp=0x140000bf7d0 sp=0x140000bf750 pc=0x10457ba20
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bf7d0 sp=0x140000bf7d0 pc=0x104598024
created by runtime.ensureSigM
	runtime/signal_unix.go:974 +0xf4

goroutine 10 [syscall, 3 minutes]:
runtime.sigNoteSleep(0x0)
	runtime/os_darwin.go:123 +0x20 fp=0x140000bff90 sp=0x140000bff50 pc=0x104561330
os/signal.signal_recv()
	runtime/sigqueue.go:149 +0x2c fp=0x140000bffb0 sp=0x140000bff90 pc=0x104593e6c
os/signal.loop()
	os/signal/signal_unix.go:23 +0x1c fp=0x140000bffd0 sp=0x140000bffb0 pc=0x10461abac
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x140000bffd0 sp=0x140000bffd0 pc=0x104598024
created by os/signal.Notify.func1.1
	os/signal/signal.go:151 +0x2c

goroutine 50 [chan receive, 3 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:363 +0xe4 fp=0x1400008be70 sp=0x1400008be50 pc=0x104566c74
runtime.chanrecv(0x14000100e40, 0x1400008bfb8, 0x1)
	runtime/chan.go:583 +0x454 fp=0x1400008bf00 sp=0x1400008be70 pc=0x1045341e4
runtime.chanrecv2(0x0?, 0x0?)
	runtime/chan.go:447 +0x14 fp=0x1400008bf30 sp=0x1400008bf00 pc=0x104533d74
main.main.func2()
	github.com/earthly/earthly/cmd/earthly/main.go:180 +0xac fp=0x1400008bfd0 sp=0x1400008bf30 pc=0x104f8b7bc
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x1400008bfd0 sp=0x1400008bfd0 pc=0x104598024
created by main.main
	github.com/earthly/earthly/cmd/earthly/main.go:179 +0x178

goroutine 82 [select]:
runtime.gopark(0x1400051ff88?, 0x2?, 0x20?, 0x7?, 0x1400051ff6c?)
	runtime/proc.go:363 +0xe4 fp=0x1400051fe00 sp=0x1400051fde0 pc=0x104566c74
runtime.selectgo(0x1400051ff88, 0x1400051ff68, 0x140005d0720?, 0x0, 0x0?, 0x1)
	runtime/select.go:328 +0x688 fp=0x1400051ff20 sp=0x1400051fe00 pc=0x104577588
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0x1400007e140)
	google.golang.org/grpc@v1.47.0/balancer_conn_wrappers.go:112 +0x68 fp=0x1400051ffb0 sp=0x1400051ff20 pc=0x1049f8fd8
google.golang.org/grpc.newCCBalancerWrapper.func1()
	google.golang.org/grpc@v1.47.0/balancer_conn_wrappers.go:73 +0x28 fp=0x1400051ffd0 sp=0x1400051ffb0 pc=0x1049f8f38
runtime.goexit()
	runtime/asm_arm64.s:1172 +0x4 fp=0x1400051ffd0 sp=0x1400051ffd0 pc=0x104598024
created by google.golang.org/grpc.newCCBalancerWrapper
	google.golang.org/grpc@v1.47.0/balancer_conn_wrappers.go:73 +0x22c

r0      0x104
r1      0x0
r2      0x1200
r3      0x0
r4      0x0
r5      0xa0
r6      0x0
r7      0x0
r8      0x16b8d2dc8
r9      0x1e01
r10     0x105c8e218
r11     0x2
r12     0x0
r13     0x0
r14     0x0
r15     0x0
r16     0x131
r17     0x20235b638
r18     0x0
r19     0x105c8e200
r20     0x105c8e240
r21     0x1062fc660
r22     0x0
r23     0x0
r24     0x1200
r25     0x1e01
r26     0x1f00
r27     0x20
r28     0x105c8d920
r29     0x16b8d2e40
lr      0x1a823883c
sp      0x16b8d2db0
pc      0x1a81fe270
fault   0x1a81fe270

@vladaionescu
Copy link
Member Author

Hmm... my little script that used to reproduce it for me very consistently no longer reproduces it on either homebrew or earthly/earthly/earthly version.

The stack trace points to the same thing: a docker info command being issued. From our experimentation, the command doesn't actually get issued when this reproduces. So it's something to do with the interaction between Go and Mac OS.

I noticed that you're on Mac 12.6. Any chance you'd be able to upgrade to Mac 13 and let us know if this still reproducing? When it was reproducing for me in the past, it would reproduce highly consistently with a script like this:

for i in {1..20}; do earthly +t; done

Where +t is a basic target with just RUN true in it.

@vladaionescu
Copy link
Member Author

Oh I also just got it on earthly/earthly/earthly on my Mac OS 13.0.1. It just happens more rarely now or something.

@vladaionescu
Copy link
Member Author

Some more experimentation - it seems that this is related to CGO_ENABLED which is on only for homebrew builds, but not for the official Earthly binaries.

I verified this fix by running Earthly 500 times - with CGO_ENABLED=0 it doesn't hang, whereas without it, building natively (not via the official Earthly build), when CGO_ENABLED defaults to enabled, I get it about 1/100 times.

Script I used for reference:

for i in {1..500}; do ./build/earthly +t; if [ "$?" != 0 ]; then break; fi; done

vladaionescu added a commit to vladaionescu/homebrew-core that referenced this issue Nov 10, 2022
vladaionescu added a commit to earthly/homebrew-earthly that referenced this issue Nov 10, 2022
BrewTestBot pushed a commit to Homebrew/homebrew-core that referenced this issue Nov 10, 2022
Fixes earthly/earthly#2247

Closes #115350.

Signed-off-by: Sean Molenaar <1484494+SMillerDev@users.noreply.github.com>
Signed-off-by: BrewTestBot <1589480+BrewTestBot@users.noreply.github.com>
vladaionescu added a commit to earthly/homebrew-earthly that referenced this issue Nov 10, 2022
@jazzdan
Copy link
Contributor

jazzdan commented Nov 10, 2022

CGO strikes again, great find!

vladaionescu added a commit that referenced this issue Nov 10, 2022
This is normally not necessary, but in some cases, it does default to 1,
and that can be problematic sometimes (an example:
#2247).

Co-authored-by: Vlad A. Ionescu <vladaionescu@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug Something isn't working
Projects
None yet
2 participants