efs-utils making the stunnel enter zombie state. #175

Open
Sharukh95 opened this issue Sep 12, 2023 · 5 comments

@Sharukh95

We're facing an issue where the stunnel process running on our AWS EC2 instance enters a zombie state, which results in NFS server timeouts (we're using EFS). We're running Amazon Linux 2 (kernel version 4.14.318-241.531) with stunnel version 5.6.4. The EFS utils version we're using is 3.1.33.

@paurosello

We are facing the same issue on Kubernetes: when the CSI driver reboots, the connection to the EFS drive times out.

@Ashley-wenyizha

Hi, thanks for reporting this issue, assuming you are using 1.33.3?

> The EFS utils version we're using is 3.1.33

We fixed the zombie stunnel issue in a later version. Could you upgrade to the latest version and see if it still persists?

Thanks

@whites11

> Hi, thanks for reporting this issue, assuming you are using 1.33.3? The EFS utils version we're using is 3.1.33
>
> We fixed the zombie stunnel issue in a later version, could you upgrade to latest version and see if it still persists?
>
> thanks

In which version exactly was it fixed?
We're still seeing this in 1.35.0:

# ./amazon-efs-mount-watchdog --version
./amazon-efs-mount-watchdog Version: 1.35.0

(I'm @paurosello's teammate)

@luca-rui

@Ashley-wenyizha Has there been any news on this topic?

(also a teammate of @paurosello and @whites11 😅)

@phmeier-nubank

I have observed zombie processes with 1.35.0 as well. Here is a patch that fixed it for me on a test system.

From 87c1f0169a003e7ef5a0297e1d7aaaacdb19e91b Mon Sep 17 00:00:00 2001
From: Philipp Meier <philipp.meier@nubank.com.br>
Date: Thu, 1 Feb 2024 16:32:07 +0100
Subject: [PATCH] Read subprocess status to prevent zombies

---
 src/watchdog/__init__.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/watchdog/__init__.py b/src/watchdog/__init__.py
index a419f80..05c23b0 100755
--- a/src/watchdog/__init__.py
+++ b/src/watchdog/__init__.py
@@ -1262,6 +1262,9 @@ def check_stunnel_health(
         # process after the timeout.
         #
         process.kill()
+    finally:
+        # read proc state to prevent zombies
+        process.wait()


 # Retrieve the nfs mountpoint with the port information in the mount option
@@ -1316,6 +1319,8 @@ def check_child_procs(child_procs):
                 proc.pid,
                 proc.returncode,
             )
+            # read proc state to prevent zombies
+            proc.wait()
             child_procs.remove(proc)


--
2.43.0

Use git am < this-file.path to apply.

To verify, I ran some tests, including killing and halting the stunnel process. Before the patch I could observe zombie processes.
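
For anyone who wants to see the mechanism in isolation, here is a minimal, Linux-only sketch. It is not efs-utils code: the helper names `proc_state`, `wait_for_state`, and `demo` are made up for illustration, and the `sleep` child merely stands in for stunnel. It assumes `/proc` is mounted and that the parent has not set `SIGCHLD` to ignore. A child killed with `kill()` should stay in state `Z` until the parent calls `wait()`, which is the call the patch adds.

```python
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter Linux process state (e.g. 'S', 'Z'), or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Format is "pid (comm) state ..."; comm may contain ")", so split on the last one.
    return data.rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the process reaches the wanted state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc_state(pid) == wanted:
            return True
        time.sleep(0.05)
    return proc_state(pid) == wanted


def demo():
    # Hypothetical stand-in for the stunnel child process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    child.kill()
    # The exit status has not been collected yet, so the kernel keeps a zombie entry.
    became_zombie = wait_for_state(child.pid, "Z")
    # wait() reads the exit status and lets the kernel drop the process entry.
    child.wait()
    reaped = proc_state(child.pid) is None
    return became_zombie, reaped


if __name__ == "__main__":
    became_zombie, reaped = demo()
    print("zombie before wait():", became_zombie)
    print("gone after wait():", reaped)
```

In a long-running watchdog, `poll()` would also collect the status without blocking. The patch uses `wait()`, which should return immediately there because the process has already exited or been killed.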
