What happened?
We've noticed that with version 1.29.1 we can no longer issue multiple crictl stop commands for the same container. If there is a StopContainer operation active, any other StopContainer operation will time out on the client side, waiting for the server to take a lock that is held by the initial stop container operation.
Context
In a scenario where you have a container that is unwilling to die gracefully for a long period of time (e.g. a big database that keeps ignoring SIGTERM while it closes connections), the following regression happens:
Sending crictl stop -t 300 <container> will send a SIGTERM and, after 300s, a SIGKILL. This command hangs, as it should, while we wait for the container to die gracefully or be forcefully terminated after 300s.
The container ignores SIGTERM, and won't die on its own. So we need to wait for the full timeout.
We decide we don't want to wait 300s after all, and now issue crictl stop -t 0 <container> to kill it immediately.
Currently, the second command will not take effect, and crictl will get a context timeout from the CRI-O server.
This behavior was not present in version 1.27.1, where multiple stop container operations could be issued.
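To make the suspected server-side behavior concrete, here is a minimal, self-contained Go sketch. This is not CRI-O's actual code; the names only echo the stopLock and WaitOnStopTimeout identifiers from the goroutine dump below. It shows how a per-container lock held for the whole graceful-stop wait makes a second stop request block until its client-side deadline expires:

package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// container mimics a runtime-side container object whose stop lock is held
// for the entire duration of a stop operation.
type container struct {
	stopLock sync.Mutex
}

// stop takes the stop lock and then waits out the graceful timeout, roughly
// what a goroutine parked in something like WaitOnStopTimeout would do.
func (c *container) stop(ctx context.Context, graceful time.Duration) error {
	c.stopLock.Lock()
	defer c.stopLock.Unlock()
	select {
	case <-time.After(graceful): // grace period elapses, SIGKILL would follow
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	c := &container{}

	// First stop: long graceful timeout (think crictl stop -t 300); it holds the lock.
	go c.stop(context.Background(), 300*time.Second)
	time.Sleep(100 * time.Millisecond) // let the first stop acquire the lock

	// Second stop (think crictl stop -t 0 with a short client-side deadline):
	// it blocks on the same lock, so the deadline expires first, which is the
	// DeadlineExceeded error crictl reports.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	done := make(chan error, 1)
	go func() { done <- c.stop(ctx, 0) }()
	select {
	case err := <-done:
		fmt.Println("second stop returned:", err)
	case <-ctx.Done():
		fmt.Println("second stop still blocked:", ctx.Err())
	}
}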
What did you expect to happen?
I'd expect the same behavior we had in 1.27, where the second crictl stop command goes through and manages to kill the container. The container runtime should honor all stop requests: even if one is currently waiting on a longer timeout, we should be able to stop a container forcefully after having mistakenly or purposely tried with a longer timeout before.
I think this might be a regression introduced around the time of this refactor.
How can we reproduce it (as minimally and precisely as possible)?
To reproduce the issue, on a host running Kubernetes with CRI-O as the container runtime:
1. Create a pod that uses a busybox image to run a sleep command for 600 seconds and has a terminationGracePeriodSeconds of 3600 (an example manifest is sketched below).
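A manifest along these lines matches that description; the pod and container names are taken from the outputs below, and the remaining fields are a reasonable guess since the original manifest isn't included here:

apiVersion: v1
kind: Pod
metadata:
  name: sleepy-dummy
  namespace: default
spec:
  terminationGracePeriodSeconds: 3600
  containers:
  - name: sleepy-dummy
    image: busybox
    command: ["sleep", "600"]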
2. Check that the pod is running:
$ kubectl get pods -n default sleepy-dummy
NAME READY STATUS RESTARTS AGE
sleepy-dummy 1/1 Running 0 14s
3. Attempt to delete the pod using kubectl. This will trigger a StopContainer call with a timeout of 3600, which will hold the stop lock on the CRI-O server and prevent any future stop from working.
$ kubectl delete pod -n infra-test sleepy-dummy
pod "sleepy-dummy" deleted
<hangs>
4. Attempt to stop the container using crictl stop:
$ crictl ps | grep sleepy
8c528df1cddbe ba5dc23f65d4cc4a4535bce55cf9e63b068eb02946e3422d3587e8ce803b6aab About a minute ago Running sleepy-dummy 0 05cfc9fc3bfed sleepy-dummy
$ sudo -i crictl stop 8c528df1cddbe
E0417 10:14:59.915882 798939 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="8c528df1cddbe"
FATA[0002] stopping the container "8c528df1cddbe": rpc error: code = DeadlineExceeded desc = context deadline exceeded
Attempting to stop with explicit timeouts:
$ time sudo -i crictl -t 5s stop -t 0 8c528df1cddbe
E0417 10:15:20.765337 800099 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="8c528df1cddbe"
FATA[0005] stopping the container "8c528df1cddbe": rpc error: code = DeadlineExceeded desc = context deadline exceeded
real 0m5.071s
user 0m0.049s
sys 0m0.031s
Note: the same can be achieved using crictl stop -t 3600 <container> instead of kubectl delete in step 3.
Anything else we need to know?
I've taken a goroutine dump of the state of my CRI-O server during step 4 of the reproduction steps and confirmed that there is at least one goroutine in the WaitOnStopTimeout method while all the others cannot enter the main stop method.
This is the first crictl stop issued, with the high timeout, waiting on this timeout channel if I'm not mistaken. It is holding the stopLock. Here is one of the subsequent crictl stop calls, attempting to take the same lock in the ShouldBeStopped function.
I've looked a bit at the tests you shared and they look like the same scenario I'm describing. However, if I understand correctly, every crictl command in the tests is run with a 10m client-side timeout, which is high enough that none of the crictl stop commands in the test fail due to a timeout (the first SIGKILL will come after 150s, which will unblock the rest of the stop commands). If that is the case, I suspect reducing the client timeout (or increasing the SIGTERM timeout) would cause the test to fail. That's my take, as far as I understand it.
CRI-O and Kubernetes version
OS version
Additional environment details (AWS, VirtualBox, physical, etc.)