Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in FileCache that can result in empty body 200 response #332

Open
thatch opened this issue Apr 25, 2024 · 1 comment · May be fixed by #335
Open

Race condition in FileCache that can result in empty body 200 response #332

thatch opened this issue Apr 25, 2024 · 1 comment · May be fixed by #335
Labels

Comments

@thatch
Copy link

thatch commented Apr 25, 2024

I don't have a test case or suggested fix (yet), but documenting that this exists:

  1. Poetry (unlike pip) uses cachecontrol's FileCache with no modifications.
  2. If there is already a cached response, and two different processes (or threads too, probably) revalidate at the same time getting a 304, the first one goes through _cache_set and the second one can find no body (returning a response with empty body because _secure_open_write deletes before creating).

Potential ideas:

  • Check for None body, and try harder (maybe .get(locked=True). Pro: simple, targeted change that only affects the rare case, Con: api change, deletion can still race
  • Protect get (and delete) with the same lock that set uses, every time. Pro: simple, Con: performance, race can still return mismatched body (similar to The new SeparateBodyFileCache design makes updates/reads non-atomic #324 with SeparateBodyFileCache)
  • Stop using bespoke mkstemp that has a race, use os.replace. Pro: less code, file isn't missing during the set, Con: commit history doesn't make it obvious why this exists, seems defensive against problems in world-writable cache dirs or maybe a windows issue?
@woodruffw
Copy link
Member

  • Stop using bespoke mkstemp that has a race, use os.replace. Pro: less code, file isn't missing during the set, Con: commit history doesn't make it obvious why this exists, seems defensive against problems in world-writable cache dirs or maybe a windows issue?

This seems like a clean approach to me. I don't have any additional insight to add on why we don't already use os.replace, except that it might be because os.replace wasn't added until 3.3 and the older os.rename doesn't have the same consistent POSIX-esque semantics on Windows (whereas os.replace does).

@woodruffw woodruffw added the bug label Apr 25, 2024
thatch added a commit to thatch/cachecontrol that referenced this issue May 8, 2024
As explained in psf#332, there was previously a small window of time where the
file is deleted before its new contents get written.  Because reads don't
happen with the lock held, this resulted in an empty-body cache hit when it
shouldn't.
thatch added a commit to thatch/cachecontrol that referenced this issue May 8, 2024
As explained in psf#332, there was previously a small window of time where the
file is deleted before its new contents get written.  Because reads don't
happen with the lock held, this resulted in an empty-body cache hit when it
shouldn't.

Fixes psf#332
@thatch thatch linked a pull request May 8, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants