"Docker Desktop is stopping" on macOS M1 Max and hanging #6472

Closed
2 of 3 tasks
avioli opened this issue Sep 5, 2022 · 129 comments

Comments

@avioli

avioli commented Sep 5, 2022

  • I have tried with the latest version of Docker Desktop
  • I have tried disabling enabled experimental features
  • I have uploaded Diagnostics
  • Diagnostics ID: F2606449-114C-4887-8E78-7926B365C0EE/20220905023134

Expected behavior

Docker Desktop starts and I can use the cli tool.

Actual behavior

I cannot run the CLI tool, nor is Docker Desktop usable: it shows "Docker Desktop is stopping" and the whole UI is unusable, except for the bug report/reset tool (and various not-so-useful views).

Information

  • macOS Version: 12.5.1 (21G83)
  • Intel chip or Apple chip: Apple chip - M1 Max
  • Docker Desktop Version: 4.12.0 (85629)

This behaviour is not reproducible on any other Mac computer I have access to other than the one I used to produce the diagnostics.

The problem is new: Docker Desktop worked last week (that was Friday; today is Monday) on a previous version, either 4.11 or 4.7 or something close. I did not note the version, but I remember seeing a version number when trying to re-install via homebrew.

The problem did not appear with an update but with a restart of the computer, and updating to the latest Docker Desktop did not resolve it.

Output of /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check

$ /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check
Starting diagnostics

[PASS] DD0027: is there available disk space on the host?
[PASS] DD0028: is there available VM disk space?
[FAIL] DD0031: does the Docker API work? Cannot connect to the Docker daemon at unix://docker.raw.sock. Is the docker daemon running?
[FAIL] DD0004: is the Docker engine running? Get "http://ipc/docker": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:16.780765000Z][com.docker.diagnose][I] ipc.NewClient: 08ac81ec-com.docker.diagnose -> lifecycle-server.sock VMDockerdAPI
[linuxkit/pkg/desktop-host-tools/pkg/client.NewClientForPath(...)
[	linuxkit/pkg/desktop-host-tools/pkg/client/client.go:59
[linuxkit/pkg/desktop-host-tools/pkg/client.NewClient({0x100ad5e20, 0x13})
[	linuxkit/pkg/desktop-host-tools/pkg/client/client.go:53 +0x90
[common/pkg/diagkit/gather/diagnose.isDockerEngineRunning()
[	common/pkg/diagkit/gather/diagnose/dockerd.go:21 +0x28
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x1011e3940)
[	common/pkg/diagkit/gather/diagnose/test.go:46 +0x40
[common/pkg/diagkit/gather/diagnose.Run.func1(0x1011e3940)
[	common/pkg/diagkit/gather/diagnose/run.go:17 +0x40
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x1011e3a40?, 0x1011e3940)
[	common/pkg/diagkit/gather/diagnose/run.go:140 +0x80
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1011e3940, 0x1400059f718)
[	common/pkg/diagkit/gather/diagnose/run.go:146 +0x38
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x1011e3a40?, 0x1400059f718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x20?, 0x1400059f718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkOnce(0x100ca1660?, 0x1400059f890)
[	common/pkg/diagkit/gather/diagnose/run.go:135 +0x8c
[common/pkg/diagkit/gather/diagnose.Run(0x1011e3cc0, 0x140002a21e0?, {0x1400059fb08, 0x1, 0x1})
[	common/pkg/diagkit/gather/diagnose/run.go:16 +0x160
[main.checkCmd({0x140001aa010?, 0x6?, 0x4?}, {0x0, 0x0})
[	common/cmd/com.docker.diagnose/main.go:133 +0xdc
[main.main()
[	common/cmd/com.docker.diagnose/main.go:99 +0x30c
[2022-09-05T02:33:16.781725000Z][com.docker.diagnose][I] (4aba8ee2) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /docker
[2022-09-05T02:33:16.782051000Z][com.docker.diagnose][W] (4aba8ee2) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /docker (321.334µs): Get "http://ipc/docker": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:16.782277000Z][com.docker.diagnose][I] (4aba8ee2-1) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:16.782547000Z][com.docker.diagnose][W] (4aba8ee2-1) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (269.125µs): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:17.783651000Z][com.docker.diagnose][I] (4aba8ee2-2) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:17.786448000Z][com.docker.diagnose][W] (4aba8ee2-2) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.790875ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:18.787894000Z][com.docker.diagnose][I] (4aba8ee2-3) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:18.790422000Z][com.docker.diagnose][W] (4aba8ee2-3) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.51675ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:19.791560000Z][com.docker.diagnose][I] (4aba8ee2-4) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:19.793782000Z][com.docker.diagnose][W] (4aba8ee2-4) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.285791ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:20.794599000Z][com.docker.diagnose][I] (4aba8ee2-5) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:20.797034000Z][com.docker.diagnose][W] (4aba8ee2-5) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.432708ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:21.797765000Z][com.docker.diagnose][I] (4aba8ee2-6) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:21.800243000Z][com.docker.diagnose][W] (4aba8ee2-6) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.470583ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:22.801217000Z][com.docker.diagnose][I] (4aba8ee2-7) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:22.803775000Z][com.docker.diagnose][W] (4aba8ee2-7) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.547917ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory
[2022-09-05T02:33:23.806518000Z][com.docker.diagnose][I] (4aba8ee2-8) 08ac81ec-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-09-05T02:33:23.809398000Z][com.docker.diagnose][W] (4aba8ee2-8) 08ac81ec-com.docker.diagnose C<-S NoResponse GET /ping (2.888416ms): Get "http://ipc/ping": dial unix lifecycle-server.sock: connect: no such file or directory

[FAIL] DD0011: are the LinuxKit services running? failed to ping VM diagnosticsd with error: Get "http://ipc/ping": dial unix diagnosticd.sock: connect: no such file or directory
[2022-09-05T02:33:23.811509000Z][com.docker.diagnose][I] ipc.NewClient: 049ec18d-diagnose -> diagnosticd.sock diagnosticsd
[common/pkg/diagkit/gather/diagnose.glob..func12()
[	common/pkg/diagkit/gather/diagnose/linuxkit.go:18 +0x8c
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x1011e38c0)
[	common/pkg/diagkit/gather/diagnose/test.go:46 +0x40
[common/pkg/diagkit/gather/diagnose.Run.func1(0x1011e38c0)
[	common/pkg/diagkit/gather/diagnose/run.go:17 +0x40
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x1011e3940?, 0x1011e38c0)
[	common/pkg/diagkit/gather/diagnose/run.go:140 +0x80
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x3, 0x1011e38c0, 0x140000d1718)
[	common/pkg/diagkit/gather/diagnose/run.go:146 +0x38
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1011e3940?, 0x140000d1718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x1011e3a40?, 0x140000d1718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x20?, 0x140000d1718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkOnce(0x100ca1660?, 0x1400059f890)
[	common/pkg/diagkit/gather/diagnose/run.go:135 +0x8c
[common/pkg/diagkit/gather/diagnose.Run(0x1011e3cc0, 0x140002a21e0?, {0x1400059fb08, 0x1, 0x1})
[	common/pkg/diagkit/gather/diagnose/run.go:16 +0x160
[main.checkCmd({0x140001aa010?, 0x6?, 0x4?}, {0x0, 0x0})
[	common/cmd/com.docker.diagnose/main.go:133 +0xdc
[main.main()
[	common/cmd/com.docker.diagnose/main.go:99 +0x30c
[2022-09-05T02:33:23.813326000Z][com.docker.diagnose][I] (9b3bad8b) 049ec18d-diagnose C->S diagnosticsd GET /ping
[2022-09-05T02:33:23.814174000Z][com.docker.diagnose][W] (9b3bad8b) 049ec18d-diagnose C<-S NoResponse GET /ping (841.584µs): Get "http://ipc/ping": dial unix diagnosticd.sock: connect: no such file or directory

[FAIL] DD0016: is the LinuxKit VM running? vm is not running: failed to open kmsg.log: open log/vm/kmsg.log: no such file or directory
[PASS] DD0001: is the application running?
[PASS] DD0018: does the host support virtualization?
[FAIL] DD0017: can a VM be started? vm has not started: failed to open kmsg.log: open log/vm/kmsg.log: no such file or directory
[PASS] DD0015: are the binary symlinks installed?

The tool never finishes!
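
With output this long, it can help to reduce it to just the failing checks. The following is a minimal sketch, not part of the Docker tooling: `failed_checks` is a hypothetical helper name, and it assumes the `[FAIL] DDnnnn: question? details` line shape seen in the output above.

```shell
# Hypothetical helper: read com.docker.diagnose output on stdin and print
# only the failing check IDs with their questions, dropping the error details
# and the stack traces.
failed_checks() {
  grep '^\[FAIL\]' \
    | sed -E 's/^\[FAIL\] (DD[0-9]+): ([^?]*\?).*$/\1 \2/'
}
```

Because the tool hung here, one option is to save its output to a file first (interrupting it if needed) and then run `failed_checks < saved-output.txt`.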

Steps to reproduce the behavior

I cannot provide steps, since I'm unable to reproduce it on any other machine I've got access to.

I'll create a new user account and see if Docker Desktop loads within that.

@avioli
Author

avioli commented Sep 5, 2022

Quick update. When I quit Docker Desktop from the menubar - the com.docker.diagnose check tool resumed outputting data:

[PASS] DD0015: are the binary symlinks installed?
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
[FAIL] DD0003: is the Docker CLI working? exit status 1
[PASS] DD0013: is the $PATH ok?
[FAIL] DD0007: is the backend responding? failed to ping com.docker.backend with error: Get "http://ipc/ping": dial unix backend.sock: connect: no such file or directory
[2022-09-05T02:50:40.788714000Z][com.docker.diagnose][I] ipc.NewClient: ee494a31-diagnose -> backend.sock BackendAPI
[common/pkg/backend.NewClientForPath({0x100acd4b6?, 0x0?}, {0x14000492140?, 0x100bd4560?})
[	common/pkg/backend/client.go:170 +0x3c
[common/pkg/backend.NewClient({0x100acd4b6, 0x8})
[	common/pkg/backend/client.go:165 +0x54
[common/pkg/diagkit/gather/diagnose.glob..func8()
[	common/pkg/diagkit/gather/diagnose/ipc.go:25 +0x28
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x1011e3d40)
[	common/pkg/diagkit/gather/diagnose/test.go:46 +0x40
[common/pkg/diagkit/gather/diagnose.Run.func1(0x1011e3d40)
[	common/pkg/diagkit/gather/diagnose/run.go:17 +0x40
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x2?, 0x1011e3d40)
[	common/pkg/diagkit/gather/diagnose/run.go:140 +0x80
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x1011e3d40, 0x14000717718)
[	common/pkg/diagkit/gather/diagnose/run.go:146 +0x38
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x20?, 0x14000717718)
[	common/pkg/diagkit/gather/diagnose/run.go:149 +0x74
[common/pkg/diagkit/gather/diagnose.walkOnce(0x100ca1660?, 0x1400059f890)
[	common/pkg/diagkit/gather/diagnose/run.go:135 +0x8c
[common/pkg/diagkit/gather/diagnose.Run(0x1011e3cc0, 0x140002a21e0?, {0x1400059fb08, 0x1, 0x1})
[	common/pkg/diagkit/gather/diagnose/run.go:16 +0x160
[main.checkCmd({0x140001aa010?, 0x6?, 0x4?}, {0x0, 0x0})
[	common/cmd/com.docker.diagnose/main.go:133 +0xdc
[main.main()
[	common/cmd/com.docker.diagnose/main.go:99 +0x30c

[FAIL] DD0014: are the backend processes running? 5 errors occurred:
	* qemu-system-aarch64 is not running
	* com.docker.backend is not running
	* vpnkit-bridge is not running
	* com.docker.vpnkit is not running
	* com.docker.driver.amd64-linux is not running

... but it stopped there and still has not finished.

Activity Monitor still reports a few items that have not been killed when I initiated the "Quit Docker Desktop" command.

Screen Shot 2022-09-05 at 12 54 26 pm

@avioli
Author

avioli commented Sep 5, 2022

Quick update. Starting under a separate user on the same machine didn't help. Also tried after a restart.

@anpr

anpr commented Sep 5, 2022

I have the same problem. Disabling the new virtualization framework made docker for mac start again.

  1. Open activity monitor, search for "docker" and Force Quit docker for desktop and all other processes.
  2. Edit the file ~/Library/Group Containers/group.com.docker/settings.json:
    "useVirtualizationFramework": false
  3. Restart the docker process (and wait a few minutes).

This fixed it, at least on my machine.
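
The settings edit in step 2 can also be scripted. This is a rough sketch under assumptions: the function name `disable_virtualization_framework` is made up for illustration, and it assumes the key appears in the file as `"useVirtualizationFramework": true`. It keeps a `.bak` copy next to the file before rewriting it.

```shell
# Hypothetical helper: set "useVirtualizationFramework" to false in the
# settings file given as $1, keeping a backup copy alongside it.
disable_virtualization_framework() {
  settings="$1"
  [ -f "$settings" ] || { echo "no such file: $settings" >&2; return 1; }
  cp "$settings" "$settings.bak" || return 1   # take a backup first
  # Rewrite from the backup so the original is never edited in place.
  sed -E 's/("useVirtualizationFramework"[[:space:]]*:[[:space:]]*)true/\1false/' \
    "$settings.bak" > "$settings"
}
```

After force-quitting all Docker processes, it would be called as `disable_virtualization_framework "$HOME/Library/Group Containers/group.com.docker/settings.json"`.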

@avioli
Author

avioli commented Sep 6, 2022

Thanks @anpr. That fixed it for me too.

Feel free to close this issue.

@anpr

anpr commented Sep 6, 2022

Good to know. I still would like to know why this fails - I would actually prefer to use the virtualization framework.

@joeforshaw

Also ran into the same issue after updating to the latest Docker Desktop. I downgraded back to 4.11.1 and the "stopping" issue disappeared and I was able to enable the virtualisation framework again. Would be great to get a solution for this.

@djs55
Contributor

djs55 commented Sep 6, 2022

@avioli thanks for the diagnostics. The logs have something interesting:

[2022-09-05T02:29:49.708705000Z][com.docker.backend][I] com.docker.vpnkit with pid: 32673 shutdown by signal: segmentation fault

(This may be related to #6435 (comment) )

The latest development build has a fix for this issue, could you try:

If it still doesn't work, could you capture diagnostics while it's in the broken state with

/Applications/Docker.app/Contents/MacOS/com.docker.diagnose gather -upload

and quote the new ID here? Thanks!

@DiegoGiovany

DiegoGiovany commented Sep 6, 2022

Hi, docker diagnose stops on:
[PASS] DD0007: is the backend responding?

Screenshot 2022-09-06 at 18 20 43

stays there for more than 30 minutes...

@avioli
Author

avioli commented Sep 7, 2022

My working version stopped working when I quit Docker Desktop and shut it down. Once started later on today - it didn't become operational afterwards.

@djs55 I downloaded the Apple Silicon version and replaced 85629 with the new build 86088, but I still get the same endless "stopping".

I also got the following Fatal Error window, saying connecting to VPNKit on vpnkit.eth.sock: EOF:

Screen Shot 2022-09-07 at 2 09 11 pm

I did a "Reset Docker to factory defaults", but that didn't improve anything.

Diagnostics ID: F2606449-114C-4887-8E78-7926B365C0EE/20220907040930

I've removed $HOME/Library/Group Containers/group.com.docker/ and $HOME/Library/Containers/com.docker.docker/ and re-started it, but got the same VPNKit Fatal Error as above.

Here's another Diagnostics ID: 6BDF8BC7-5DC2-4E87-8D86-0B8F72FC6B5C/20220907041813

@djs55
Contributor

djs55 commented Sep 7, 2022

@avioli thanks for trying the build and the diagnostics. Could you see if you have any Apple diagnostics reports from Docker processes (in particular com.docker.vpnkit) in ~/Library/Logs/DiagnosticReports? If so could you attach them here? Unfortunately the current diagnostics capture doesn't include them.
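
To find the relevant reports quickly, something like the following sketch could be used. `list_docker_crash_reports` is a hypothetical helper name; the file-name patterns are assumptions based on names such as `com.docker.vpnkit-2022-09-07-HHMMSS.ips` and `Docker-2022-09-07-HHMMSS.ips`.

```shell
# Hypothetical helper: list Apple crash reports that belong to Docker
# processes. Defaults to the per-user DiagnosticReports directory.
list_docker_crash_reports() {
  dir="${1:-$HOME/Library/Logs/DiagnosticReports}"
  find "$dir" -maxdepth 1 -type f \
    \( -name 'com.docker.*.ips' -o -name 'Docker-*.ips' \) | sort
}
```

Running `list_docker_crash_reports` with no argument prints the matching files, which can then be zipped and attached.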

@avioli
Author

avioli commented Sep 8, 2022

There are 18 from yesterday which have the name of com.docker.vpnkit-2022-09-07-HHMMSS.ips, but there are also three, named Docker-2022-09-07-HHMMSS.ips which I've also included in the below archive.

DiagnosticReports - 2022-09-07.zip

Edit: Just a note - if these logs are generated when an executable is force-killed and not only when it crashes, then many of them will be because of that - I've force-killed them when I was wiping everything out (usually before a computer restart).

Edit 2: I've also tried running Docker 4.7.11, so some of the most recent ones could be from that.

@djs55
Contributor

djs55 commented Sep 8, 2022

@avioli thanks for the reports, they look very interesting. For the com.docker.vpnkit crashes there are 2 kinds:

  1. the old version of com.docker.vpnkit crashes like this:
VM Region Info: 0x16ef5bff8 is in 0x16b758000-0x16ef5c000;  bytes after start: 58736632  bytes before end: 7
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_MEDIUM (reserved)    147800000-148000000    [ 8192K] rw-/rwx SM=NUL  ...(unallocated)
      GAP OF 0x23758000 BYTES
--->  STACK GUARD                 16b758000-16ef5c000    [ 56.0M] ---/rwx SM=NUL  ... for thread 0
      Stack                       16ef5c000-16f758000    [ 8176K] rw-/rwx SM=PRV  thread 0

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   com.docker.vpnkit             	       0x100cb9a8c camlStdlib__list__map_236 + 4
1   com.docker.vpnkit             	       0x100cb9ac8 camlStdlib__list__map_236 + 64
2   com.docker.vpnkit             	       0x100d64d60 caml_callback2_exn + 60
3   com.docker.vpnkit             	       0x100d247cc fs_callback + 232
4   com.docker.vpnkit             	       0x100d2fe64 uv__work_done + 192
5   com.docker.vpnkit             	       0x100d32b98 uv__async_io + 260
6   com.docker.vpnkit             	       0x100d41218 uv__io_poll + 904
7   com.docker.vpnkit             	       0x100d32fc8 uv_run + 372
8   com.docker.vpnkit             	       0x100d17df8 uwt_run_loop + 280
9   com.docker.vpnkit             	       0x100d6fe5c caml_c_call + 28
10  com.docker.vpnkit             	       0x1007609e0 camlUwt__run_3961 + 304

-- it looks like it overflows the stack running a (not tail-recursive) List.map in the filesystem watch callback.

  2. the new version of com.docker.vpnkit crashes like this:
VM Region Info: 0x16b3fbff8 is in 0x167bf8000-0x16b3fc000;  bytes after start: 58736632  bytes before end: 7
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_MEDIUM (reserved)    157800000-158000000    [ 8192K] rw-/rwx SM=NUL  ...(unallocated)
      GAP OF 0xfbf8000 BYTES
--->  STACK GUARD                 167bf8000-16b3fc000    [ 56.0M] ---/rwx SM=NUL  ... for thread 0
      Stack                       16b3fc000-16bbf8000    [ 8176K] rw-/rwx SM=PRV  thread 0

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   com.docker.vpnkit             	       0x104871e04 camlStdlib__list__map_236 + 4
1   com.docker.vpnkit             	       0x104871e40 camlStdlib__list__map_236 + 64
2   ???                           	       0x148324d70 ???

So a very similar List.map call hitting a stack guard page. It might be something as simple as updating the compiler toolchain used to build these binaries, or to adjust the stack size settings. I'll investigate.

Edit: I think I'm building with OCaml 4.12, but there's a useful change in 4.13+: ocaml/ocaml#10549

djs55 added a commit to djs55/vpnkit that referenced this issue Sep 8, 2022
In particular we would like ocaml/ocaml#10549
to make diagnosing docker/for-mac#6472 (comment)
easier.

Signed-off-by: David Scott <dave@recoil.org>
@djs55
Contributor

djs55 commented Sep 8, 2022

@avioli would you mind trying a development build of vpnkit for me? I've made a PR for moby/vpnkit and built it locally. I've attached the binary to this comment. It has the following shasum:

94f54180e6918cd3dad9072fbcb7736665608c20  com.docker.vpnkit

vpnkit-ocaml-4.14.zip

It can be installed by:

  1. stop Desktop
  2. run from a terminal
# take a backup
mv /Applications/Docker.app/Contents/Resources/bin/com.docker.vpnkit /Applications/Docker.app/Contents/Resources/bin/com.docker.vpnkit.orig
# overwrite file
cp com.docker.vpnkit /Applications/Docker.app/Contents/Resources/bin/com.docker.vpnkit
  3. start Desktop

Hopefully it will work better. If it doesn't, hopefully there will be an exception in the logs rather than a segfault 😅 If it fails, could you upload a diagnostic and quote the ID:

/Applications/Docker.app/Contents/MacOS/com.docker.diagnose gather -upload

(Just in case it goes wrong, it's worth looking for a new DiagnosticReport too)

@avsm

avsm commented Sep 8, 2022

Just took a look at this with @djs55 and can confirm that the OCaml 4.14 build should have a better exception on the M1 Macs than the old (4.12-based) build. Linking ocaml/ocaml#10549 (comment) for the record.

@avioli
Author

avioli commented Sep 8, 2022

@djs55 🥳 The new vpnkit did the trick and Docker (4.12.0 85629) is now running on my rig as expected. I'll monitor how things go in the following few days. I'll do some restarts just to be sure and let you know if there are any hiccups.

Thank you

@anpr

anpr commented Sep 9, 2022

At least on my machine, using the new vpnkit did not help with being able to activate the virtualization framework again...

@djs55
Contributor

djs55 commented Sep 9, 2022

@anpr could you upload some diagnostics and quote the ID here? Also take a look to see if there are any crash reports.

There's also a known issue switching the framework on and off: perhaps try turning it on, then stopping and starting Docker (avoid the "restart" menu item, completely stop and start)

@shff

shff commented Sep 15, 2022

I was still having issues with 4.12.0 85629. It only shows "Docker Desktop stopping..." for several minutes and nothing happens. vmnetd seems crashed and unresponsive.

My problem seems to be due to blocking of the sessions.bugsnag.com domain in my DNS server. After unblocking it seems to work, but that seems like a non-solution to me.

@halfpastfouram

halfpastfouram commented Sep 19, 2022

Experiencing the same issue here, just uploaded diagnostics: 7B32AB05-AFDB-42F9-AD70-2059500A3485/20220919085834.

After replacing vpnkit as suggested above I see no improvement. Uploaded diagnostics again after this: 7B32AB05-AFDB-42F9-AD70-2059500A3485/20220919090940

What I have tried so far:

  • I've checked if my network or DNS server blocks sessions.bugsnag.com and that is not the case.
  • Manually stopping and starting Docker does not work. I have to forcefully quit the process by performing sudo killall docker and then quit Docker Desktop using Activity Monitor.
  • Reïnstalling Docker Desktop does not change anything.

The only way for me to use Docker right now is with all the experimental features off, resulting in a very slow experience compared to using VirtioFS that I was able to use before last update.

@ghost

ghost commented Sep 19, 2022

I was still having issues with 4.12.0 85629. It only shows "Docker Desktop stopping..." for several minutes and nothing happens. vmnetd seems crashed and unresponsive.

My problem seems to be due to blocking of the sessions.bugsnag.com domain in my DNS server. After unblocking it seems to work, but that seems like a non-solution to me.

That ... worked for me too, this domain (sessions.bugsnag.com) is in a common kill-file I use. That's the issue, srsly?

@andyleanlibrary

@halfpastfouram I am experiencing the same. Tried all the above but still hanging and can only run with experimental features off.

@ghost

ghost commented Sep 21, 2022

I repeated my experiment @andyleanlibrary ...

  • enabled my DNS blocker
  • Docker went into "stopping..." stall
  • killed all Docker processes in Activity monitor
  • disabled DNS blocker
  • restarted Docker
  • went into "stopped" state and then after roughly a minute fell back onto its feet

Tried that twice. Repeatable. However it does not seem to be that bugsnag.com entry in particular. Removed just that one and still got the stall.
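
For anyone repeating this kind of experiment, a quick way to see whether a given domain is on a blocklist is sketched below. It assumes the blocklist is a hosts-format file (`address hostname ...` per line), which may not match every DNS blocker; `is_blocked_in_hosts` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if domain $1 appears as a hostname in the
# hosts-format blocklist file $2. Commented-out entries are ignored.
is_blocked_in_hosts() {
  domain="$1"
  blocklist="$2"
  awk -v d="$domain" '
    /^[[:space:]]*#/ { next }            # skip comment lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # stop at a trailing comment
        if ($i == d) found = 1
      }
    }
    END { exit !found }
  ' "$blocklist"
}
```

For example, `is_blocked_in_hosts sessions.bugsnag.com /path/to/blocklist && echo blocked`, where the blocklist path is whatever file the blocker reads.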

@ddbrodsky

For me, 4.12.0 was pretty stable, but 4.13.0 started to exhibit the issue above when I upgraded to it.

@ghost

ghost commented Oct 23, 2022

For me, 4.12.0 was pretty stable, but 4.13.0 started to exhibit the issue above when I upgraded to it.

Here it's reversed, 4.12 was having the startup issue (and only running with the DNS blocker off), while latest 4.13 is coming up fine.

@avioli
Author

avioli commented Oct 24, 2022

Just upgraded to 4.13.0 and it seems to be working fine for me. I'll keep you posted if anything changes.

@halfpastfouram

My trouble also seems to have been fixed in 4.13.0.

@cassus

cassus commented Oct 24, 2022

For me, 4.12.0 was pretty stable, but 4.13.0 started to exhibit the issue above when I upgraded to it.

Same here.

Even after installing the development build of vpnkit from #6472 (comment)

@ianjukes

ianjukes commented Jan 6, 2023

@verluci @ddbrodsky Another bit of evidence pointing to it being in the realm of a network/update issue. I am running 4.15 and have turned off "Automatically check for updates" and "Always download updates", and so far it's been stable.

@kennethredler

An interesting data point. I run Docker Desktop on a system that uses either AnyConnect or Zscaler to connect to a VPN. I have been mostly using Zscaler and since 4.12 I have been seeing the above issue. In the last couple of days I switched to AnyConnect and Docker Desktop has been stable for over 24 hours now. Could the issue be a networking problem?

💡 Zscaler is also in play for me. 🤔 I'm assuming that may be why I saw the diagnostic failure captured in the screenshot I shared above.

I might try disabling the checks @ianjukes mentioned to see if that stabilizes things for me as well.

@codethought

I've tried both Rancher desktop and colima- and both completely reset all of my configurations when a restart occurs.. it's like they have no idea that I ever configured any of my containers and the ones I have tried so far all act like they are brand new container configurations...

I'll stick to and fight with Docker Desktop a bit longer..

@ianjukes

ianjukes commented Jan 9, 2023

I might try disabling the checks @ianjukes mentioned to see if that stabilizes things for me as well.

@kennethredler Did disabling updates work for you? I can confirm, when I re-enabled automatically checking for updates, that the problem returned.

EDIT: I can also confirm when I disabled the automatic check for updates, Docker Desktop is completely stable again. I have a Mac Studio M1 Max running Ventura 13.1.

@zhiweio

zhiweio commented Jan 10, 2023

I might try disabling the checks @ianjukes mentioned to see if that stabilizes things for me as well.

@kennethredler Did disabling updates work for you? I can confirm, when I re-enabled automatically checking for updates, that the problem returned.

EDIT: I can also confirm when I disabled the automatic check for updates, Docker Desktop is completely stable again. I have a Mac Studio M1 Max running Ventura 13.1.

It works for me 👍

@ianjukes

Thank you @zhiweio - that's good to know.

So to the Docker Desktop team: it would appear that something about the automated update process, particularly when it tries to call back across the Internet, is causing Docker Desktop to fail in certain conditions.

@kennethredler

I might try disabling the checks @ianjukes mentioned to see if that stabilizes things for me as well.

@kennethredler Did disabling updates work for you? I can confirm, when I re-enabled automatically checking for updates, that the problem returned.
EDIT: I can also confirm when I disabled the automatic check for updates, Docker Desktop is completely stable again. I have a Mac Studio M1 Max running Ventura 13.1.

It works for me 👍

Yeah, I haven't seen Docker go out to lunch since disabling auto version check. Zscaler is still in play for me, FWIW. FYI @djs55

@derekperkins

I upgraded to v4.16.1 (95567), disabled auto version checks and downloads, and it still dies on me the same as before

@ianjukes

I upgraded to v4.16.1 (95567), disabled auto version checks and downloads, and it still dies on me the same as before

@derekperkins Do you have any extensions enabled by any chance?

@derekperkins

Yes, I had Disk Usage and Performance extensions from Docker installed and running. I just disabled extensions, so we'll see how long I go before Docker dies again.

@ianjukes

Yes, I had Disk Usage and Performance extensions from Docker installed and running. I just disabled extensions, so we'll see how long I go before Docker dies again.

@derekperkins I suspect that will be the issue, as I enabled extensions and the problem returned. I think the extensions system also makes a network call back to Docker to check if there are updates to the extensions. My hunch is something is broken with the update system. I have also disabled extensions to see if it's stable again. Fingers crossed! 🤞

@wslawski-printify

Seems like disabling automatic updates fixed the issue for me as well.

@fortran01

fortran01 commented Jan 17, 2023

An interesting data point. I run Docker Desktop on a system that uses either AnyConnect or Zscaler to connect to a VPN. I have been mostly using Zscaler and since 4.12 I have been seeing the above issue. In the last couple of days I switched to AnyConnect and Docker Desktop has been stable for over 24 hours now. Could the issue be a networking problem?

I have Tailscale. I haven't tested yet by disabling Tailscale but putting this info just in case others have the same environment.

Update: disabling automatic updates did not fix it for me.

@jfbourne

I'm on a Mac Mini Intel i7 2012; disabling the update also seems to have fixed the issue for me. 🤞

Fyi: Additionally I disabled extension and everything else I don’t need lol

@Cromm

Cromm commented Jan 17, 2023

Mac M1 Mini running Ventura 13.0.1
Docker Desktop 4.15
I switched off auto-update following reading this thread a few days ago. Ran smoothly for 3-4 days then the '..is stopping' issue recurred.

I'm now updating to 4.16.1 and disabling extensions...I'll report back.
Thanks to all for documenting issue and possible fixes!

@HawaiiRyan

4.16.1 with auto-update disabled with the Portainer + Logs Explorer extensions installed works fine on Ventura 13.1 w/ Intel processor.

@derekperkins

@ianjukes 4.16.1 with extensions disabled and with auto-updates disabled still died on me today

@ianjukes

ianjukes commented Jan 18, 2023

@ianjukes 4.16.1 with extensions disabled and with auto-updates disabled still died on me today

@derekperkins That's a real shame - I figured that would be it. Just to verify, you have uninstalled all extensions, and then unchecked the 'Enable Docker Extensions' option in the settings?

I have uninstalled and disabled extensions, and I'm stable again. At least so far.

Screenshot 2023-01-18 at 9 18 45 am

@andyleanlibrary

My company recently uninstalled Symantec from the Mac. It's been a lot better since then, but I'm not sure if it's related.
Do you all have strict firewalls/protection set up? If so, could it be that it's interfering with the auto-update?

@ianjukes

@andyleanlibrary I'm at home with quite a simple network set up, and do not have any antivirus software installed. Docker should be stable. 🤷‍♂️

@Cromm

Cromm commented Jan 18, 2023

@andyleanlibrary yes, similar setup to ianjukes - domestic install without antivirus. Extensions was activated but I don't believe there were any installed at any point before I de-activated extensions - I can check later.

@andyleanlibrary

@ianjukes thanks - good to rule it out, if only for me :-)

@jfbourne

Even with extensions disabled, mine still stopped. It ran well for only 22 hours. I’m on 4.15.1

Apparently it is supposed to be fixed in 4.16, but I have not tried it yet.

#6606 (comment)

@nicks

nicks commented Jan 18, 2023

I'm going to close this issue because the OP reported it fixed in #6472 (comment). I'm a little worried it's becoming a basketcase of unrelated issues, which makes it hard for us to separate out who is still having problems and who's not.

Docker Desktop 4.16 has some major fixes to network stability. If you're still having problems after using 4.16, please file an issue (preferably with diagnostics and/or repro steps). Thanks!

nicks closed this as completed Jan 18, 2023
@derekperkins

FWIW, experimental features were enabled for me on 4.16.1. I don't remember enabling them recently, but it's very possible I manually did it 1+ years ago. It's now turned off to see if maybe that was causing it to crash.

image

@nicks

nicks commented Jan 18, 2023

@derekperkins re: "there's no repro steps other than it just dying after being open for a while." here's a good page on how to upload diagnostics when you don't have repro steps - https://docs.docker.com/desktop/troubleshoot/overview/#diagnose

re: "why would you close this issue with plenty of people still reporting issues even on 4.16?" - that's a very philosophical question! why would we close one bug when there are still other bugs that cause the same symptoms? I like this old-internet post - https://www.chiark.greenend.org.uk/~sgtatham/bugs.html

docker locked as resolved and limited conversation to collaborators Jan 18, 2023
@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

/lifecycle locked
