Old EFS certificates not removed #124
Comments
Hey, thanks for the report. The certs are stored on We do have cleanup logic running in our watchdog (https://github.com/aws/efs-utils/blob/master/src/watchdog/__init__.py#L824-L834). If the file system is unmounted and then mounted again, the certs should be cleaned up then. Can you elaborate on the
At least on Debian-based systems, /var/run is a symlink to /run. Thus, /var/run/efs is effectively in the /run tmpfs filesystem.
That's correct for one filesystem mount. I have 3 EFS mounts on the machine, and together with some system files normally in /run they take up all 47MB that are available on the 512MB memory machine. This is output from the current run, taken after cleaning up the pem files. You can see it can hold up to 39M more pem files.
Yes.
I guess I was ambiguous. There was no umount attempt on the current 3 EFS mounts. They had been running for 6 months, after which the EFS watchdog failed to create new pem certificates and couldn't pass them to stunnel. Stunnel failed to re-establish the link, and the mounts became stale. It was also impossible to re-mount the EFS mounts. I guess you can reproduce the problem by filling up /run on a Debian machine with random data and waiting for the next re-keying attempt from the EFS watchdog.
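To check how close /run is to filling up before the next re-keying attempt, something like the following could be used. It is only a sketch: the function name `tmpfs_report` and the default paths are assumptions for illustration (the thread only establishes that the state lives under /var/run/efs, which resolves to /run on Debian), and it is not part of efs-utils.

```python
import shutil
from pathlib import Path


def tmpfs_report(mount_point="/run", state_dir="/var/run/efs"):
    """Summarize free space on mount_point and the pem files below state_dir.

    Both default paths are assumptions based on the Debian layout
    described in this thread; adjust them for other systems.
    """
    usage = shutil.disk_usage(mount_point)
    # Collect every regular *.pem file anywhere below the state directory.
    pem_files = [p for p in Path(state_dir).rglob("*.pem") if p.is_file()]
    pem_bytes = sum(p.stat().st_size for p in pem_files)
    return {
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "pem_count": len(pem_files),
        "pem_bytes": pem_bytes,
    }


# Example call on an affected machine (not executed here):
#   report = tmpfs_report()
#   print(report["pem_count"], "pem files,", report["free_bytes"], "bytes free")
```

Running this periodically would show the pem count growing between re-keying attempts while the free space on /run shrinks.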
Thanks, got your point. While we have someone investigating the issue, can you for now unmount the file system once a month so that the watchdog can clean up the state file directory?
Thanks for taking this seriously. I'll work around the issue for the time being.
We are running an EC2 instance with 512MB memory with 3 EFS mounts, using the EFS helper.
After 6 months of uptime, the mounts failed and the machine ran into a number of issues caused by a full /run filesystem.
du shows
13556 ./fs-3ac8f8f3.efs.ROTATED_OUT.20137+/certs
certs# ls -l |wc -l
3390
The directory holds hourly certificates for the last 6 months. There are 3 EFS mounts on the machine, so together they filled up the 47MB /run filesystem.
Please implement a garbage collector for the certs.
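As a rough illustration of what such a garbage collector might look like, here is a minimal sketch. It assumes the certs directory is a flat directory of *.pem files and that only the most recent ones are still needed; neither assumption is confirmed in this thread. The function `prune_old_certs` and its parameters are hypothetical and are not the efs-utils watchdog's actual cleanup logic.

```python
import time
from pathlib import Path


def prune_old_certs(certs_dir, keep_newest=2, min_age_seconds=3600, now=None):
    """Delete stale *.pem files from certs_dir and return the removed names.

    Hypothetical helper: keeps the keep_newest most recently modified
    files, and among the rest removes only those older than
    min_age_seconds.
    """
    now = time.time() if now is None else now
    # Newest first, so the slice below skips the files we want to keep.
    pem_files = sorted(
        Path(certs_dir).glob("*.pem"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = []
    for path in pem_files[keep_newest:]:
        if now - path.stat().st_mtime >= min_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

Keeping a couple of the newest files and applying a minimum age is a conservative choice, so a certificate that is still in use is unlikely to be deleted. Whether removing these files under a live mount is safe would need to be confirmed by the maintainers.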
OS: Debian 10.11 (has /var/run symlinked to /run on tmpfs)
EFS helper version: 1.30.2 (I checked, and the latest version doesn't have cleanup either)