Detect and handle stale locks #124

Open
Luzifer opened this issue May 15, 2016 · 6 comments

Comments

@Luzifer

Luzifer commented May 15, 2016

As a system administrator, I want to be able to trust my backup mechanism to keep running even after it has failed once, without having to check it manually every time.

Observed behavior

  • Backup script runs
  • Script gets killed for whatever reason and is unable to remove its lock
  • Subsequent backups fail with "lock held by XXXX"
  • No notification is sent to the Slack channel, even though it is configured

Expected behavior

  • The stale lock is detected because the registered PID is no longer running
  • The stale lock is removed and the backup process continues

OR

  • Lock is detected
  • Deadlock (PID no longer running) is detected
  • Slack channel gets notification about deadlock
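
For illustration, the PID check described in the first option could look roughly like the sketch below. It is not code from duplicity-backup.sh: `is_lock_stale` and `acquire_lock` are hypothetical helper names, and the sketch assumes the lock file stores the owner's PID on its first line (which the "lock held by 3124" message suggests, but which is not confirmed here).

```shell
#!/usr/bin/env bash
# Sketch only: is_lock_stale and acquire_lock are hypothetical helpers,
# not functions taken from duplicity-backup.sh.
# Assumption: the lock file contains the owner's PID on its first line.

# Return 0 if the lock file names a PID that is no longer running.
is_lock_stale() {
    local lockfile="$1" pid
    [ -f "$lockfile" ] || return 1        # no lock file, so nothing is stale
    pid="$(head -n 1 "$lockfile" 2>/dev/null)"
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;          # empty or non-numeric content: treat as stale
    esac
    # kill -0 delivers no signal; it only reports whether the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    # kill -0 also fails for a live process owned by another user,
    # so ask ps as a second opinion before declaring the lock stale.
    if ps -p "$pid" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Remove a stale lock if there is one, then try to take the lock.
acquire_lock() {
    local lockfile="$1"
    if is_lock_stale "$lockfile"; then
        echo "removing stale lock $lockfile" >&2
        rm -f "$lockfile"
    fi
    # With noclobber, the redirection fails if the file already exists.
    if ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null; then
        return 0
    fi
    echo "lock held by $(head -n 1 "$lockfile" 2>/dev/null)" >&2
    return 1
}
```

Two caveats with this approach: a PID can be reused by an unrelated process (for example after a reboot), in which case the lock would wrongly look alive, and there is a small race window between the stale check and the `rm` if two instances start at the same moment.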

Logs

# cat /var/log/duplicity/duplicity-2016-05-15_01-12.txt
--------    START DUPLICITY-BACKUP SCRIPT for docker01   --------

Attempting to acquire lock /var/log/duplicity/backup.lock
lock failed, could not acquire /var/log/duplicity/backup.lock
lock held by 3124
# ps aux | grep 3124
root      7661  0.0  0.0  11712   668 pts/3    S+   02:10   0:00 grep --color=auto 3124
#
@zertrin
Owner

zertrin commented May 15, 2016

You're right, this would be a very useful enhancement. Not sure I will be able to look into implementing it soon. Anyone feel free to propose a pull request before I do 😉

@jarondl

jarondl commented May 17, 2016

Regarding the second expected behavior, the script does send an email, but it does not use the other notification methods (e.g. Slack). @zertrin, maybe we should simply add send_notification next to email_logfile at https://github.com/zertrin/duplicity-backup/blob/b92d60f028dffb94dc3aff2cd674dce4d5a9f48c/duplicity-backup.sh#L436?
Actually, there are 10 occurrences of exit in the script; maybe they should be replaced by some notification-sending function? (At least if the configuration was correct enough to set it up.)
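
A rough sketch of what such a function could look like, in case it helps the discussion. `exit_with_notification` is a hypothetical name, and the two functions defined at the top are stand-ins so the snippet runs on its own; they are not the real `send_notification` and `email_logfile` from the script.

```shell
#!/usr/bin/env bash
# Sketch only. The two functions below are stand-ins for the script's own
# send_notification and email_logfile, whose real bodies are not shown here.
send_notification() { echo "notification sent"; }
email_logfile()     { echo "logfile emailed"; }

# Hypothetical replacement for bare `exit N` calls: notify first, then exit.
exit_with_notification() {
    local status="${1:-1}"
    # A failing notifier must not mask the original exit status.
    email_logfile || true
    send_notification || true
    exit "$status"
}
```

A call such as `exit_with_notification 2` would then take the place of a plain `exit 2`. Whether every one of the existing exit points should notify (for example, usage errors before the configuration is loaded) is a separate decision.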

@zertrin
Owner

zertrin commented May 17, 2016

I fully agree. I'll look into this sometime soon, since that's the easier part.

@jrbenito

@zertrin

I did just what @jarondl suggested above and nothing more. I have two enhancements in mind:

  1. Figure out a way for the notification to carry a message that identifies the error (in this case, the stale lock)
  2. Handling the stale lock would be nice, as suggested by @Luzifer

I left those two for later. However, regarding item 1, I haven't figured out the best way to do this; I think it may require refactoring send_notification so that it accepts an optional parameter. Any thoughts?
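
To make item 1 concrete, here is a minimal sketch of an optional parameter with a default. It is a stand-in that only shares its name with the script's `send_notification`; the message prefix and the default text are made up for the example, and the `echo` stands where the real function would post to Slack or another channel.

```shell
#!/usr/bin/env bash
# Sketch only: a stand-in with the same name as the script's
# send_notification, not its real body. The message format is hypothetical.
send_notification() {
    # Optional first argument; fall back to a generic text when it is
    # omitted, so existing calls without arguments keep working.
    local detail="${1:-backup finished}"
    echo "duplicity-backup: ${detail}"   # real code would send this to the configured channel
}

# Existing call sites stay unchanged:
send_notification
# Error paths can pass a reason:
send_notification "stale lock held by 3124"
```

Using `${1:-default}` keeps the change backward compatible, which matters if the function is already called from several places.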

@camjac251

How do you deal with rebooting the server you're backing up? Each time I reboot, it's halfway through a backup, so the next backup never starts because the lockfile still exists.

@zertrin
Owner

zertrin commented Sep 23, 2016

It doesn't happen to me, since my backups don't take that long and I never reboot around the time my backup is running.

Locking mechanisms are hard to get right and can be annoying. I still haven't found the time to implement a solution, but I welcome contributions that aim to do locking "the right way" (probably with a PID check somewhere).


5 participants