Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shared lock support #15

Merged
merged 1 commit into from Jan 24, 2018
Merged

Add shared lock support #15

merged 1 commit into from Jan 24, 2018

Conversation

theory
Copy link
Contributor

@theory theory commented Jan 11, 2018

The Windows version is a guess, really. Also, you'll need to fix the path in the test file to point to your repo again.

@theory
Copy link
Contributor Author

theory commented Jan 11, 2018

Closes #13.

Copy link
Member

@theckman theckman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theory Thank you for doing this. I added Windows CI and fixed some issues I discovered in the process. However, in doing so I ended up breaking the ability for your branch to be merged cleanly. I'm sorry about that.

When you have a moment, would you please rebase your branch against and squash the commit to a single one providing the details of the change? I usually take inspiration from this post.

@theory theory force-pushed the readLock branch 2 times, most recently from e8f761f to 8c6916e Compare January 12, 2018 15:04
@theory
Copy link
Contributor Author

theory commented Jan 12, 2018

Rebased and squashed.

@theory
Copy link
Contributor Author

theory commented Jan 12, 2018

Crap, I just noticed that setFh() creates or truncates the file. That doesn't seem right. For shared locks, in particular, I generally want to lock them because I'm going to read them. If flock truncates the file, there's nothing to read.

Creating the file might make sense, though. Thoughts?

@theory
Copy link
Contributor Author

theory commented Jan 12, 2018

Removed the file truncation and pushed a new squashed commit.

Copy link
Member

@theckman theckman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I finally managed to get a VirtualBox Windows VM spun up so I can dig in. I managed to find two things to fix which I think will make this build green.

When you have a moment can you please amend your commit with the fixes? If you're kinda busy, I'd be happy to add a second commit just so you retain the full authorship on your contribution.

flock_windows.go Outdated
// lock, the function will return false instead of waiting for the lock. If we
// get the lock, we also set the *Flock instance as being shared-locked.
func (f *Flock) TryRLock() (bool, error) {
return f.try(&f.r, winLockfileExclusiveLock)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the winLockfileSharedLock constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ffd5ea4.

flock_test.go Outdated
c.Check(locked, Equals, false)

// timeout
t.flock.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please call .Unlock() before locking. Not doing so breaks the tests on Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in ffd5ea4.

@theckman
Copy link
Member

@theory I missed your earlier comment. Removing the truncation sounds like a good approach. This package was originally written to provide an on-disk mutex for a job runner, so the file it locked wasn't related to the state of the job runner. This meant its contents weren't important so truncation was the default I went with.

Done by adding *RLock* variants of the Lock* methods and removing the file
truncation. The two sets of methods rely the same code by dispatching to
shared methods, only varying the lock type. The exception is RLocked(), which
checks its own, separate boolean. All the lock operations share the same
mutex, though, so any all operations can block each other. The docs have been
updated to reflet that fact. Also fixed the Windows Unlock() docs to refer to
UnlockFileEx() instead of syscall.LOCK_UN.
@theory
Copy link
Contributor Author

theory commented Jan 22, 2018

Hope this does the trick.

@theory
Copy link
Contributor Author

theory commented Jan 22, 2018

All checks passed yay!

@theckman theckman dismissed their stale review January 23, 2018 04:41

Changes were made.

Copy link
Member

@theckman theckman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theory I've started to think about this approach of using the shared lock function, and I think in its current form it has a bit of a race condition issue because we have only one Unlock() method but two different ways to unlock the mutex.

It seems that in some cases, the OS is smart enough to know whether or not a shared lock can become an exclusive lock and it may give the exclusive lock after giving us the shared lock. This is why the test was working on Linux, but stalling on Windows until we called .Unlock() before locking again.

The case that concerns me is when multiple things are using this, and maybe using different lock types. If one goroutine takes a shared lock, and something else takes an exclusive lock, it seems like the lock could transparently become an exclusive lock and then the shared locker could accidentally drop the exclusive lock without knowing it. Because lock() takes one of the locked boolean values, it never knows if we have an exclusive or shared lock and can't prevent the possibility of hitting this failure mode.

I'm not totally sure if this is possible, but I'm thinking it could potentially cause some unexpected behaviors / data corruption. What do you think?

Copy link
Member

@theckman theckman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written a quick Go program that hits a case where the code thinks it has an exclusive lock and it's released without it knowing. I apologize for the code being pretty rough:

I think we need to rethink this approach to avoid the locks being swapped-out underneath. I think being more explicit, with two different unlock functions, would help solve the issue. Maybe we should disallow taking two locks at the same time?

I'd love to hear what you think about this @theory.

@theory
Copy link
Contributor Author

theory commented Jan 23, 2018

It's a feature that you can switch between locks. From the flock manpage on macOS:

 A shared lock may be upgraded to an exclusive lock, and vice versa, simply by
 specifying the appropriate lock type; this results in the previous lock being
 released and the new lock applied (possibly after other processes have gained
 and released the lock).

There's more, as documented in this SO answer. I think the simplest thing to say is “don't share file locks across goroutines.“ It defeats the whole purpose, frankly, to have, say, two threads that both think they have an exclusive lock on a file.

I realize the description of this package is “Thread-safe file locking library in Go”, but given the shared file descriptor, it really isn't thread-safe, in the sense that it's not safe to share it across goroutines. This challenge exists even without adding shared-locks: If you create a flock object in one goroutine, then get an exclusive lock in another, both goroutines will think they have an exclusive lock.

@theory
Copy link
Contributor Author

theory commented Jan 23, 2018

Ignoring the threading problem, if you want to disallow switching between locks, perhaps we should just use a single flag regardless of the lock?

@theckman
Copy link
Member

theckman commented Jan 23, 2018

@theory Hm, yeah. This is a tough problem. Thank you for taking the time to think through this with me.

You're right that with a shared descriptor it isn't safe. To make it actually safe we'd probably need better accounting around locks, such as how many have been taken and what type they were. Maybe even handling the transition from a shared to exclusive lock. That would be a bit of a change in how this code works, and I won't ask for that here.

I think your change is good and can be merged as-is. I'll make a separate commit after merging to update some of the comments/docs to call out the sharp edges around sharing locks, because it's not a new problem and has existed before.

If issues come up with the current implementation, or I feel like scratching the itch, we can look at enhancing the accounting to be more correct.

@theckman theckman merged commit ffd5ea4 into gofrs:master Jan 24, 2018
theckman added a commit that referenced this pull request Jan 24, 2018
Add shared lock support

Signed-off-by: Tim Heckman <t@heckman.io>
@theckman
Copy link
Member

@theory Thank you again. Once the merge build finishes (and passes) I'll cut 0.4.0.

@theckman
Copy link
Member

@theory 0.4.0 is released.

@theory theory deleted the readLock branch January 24, 2018 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants