Limiting cache size? #185

Open
OrangeDog opened this issue Apr 12, 2018 · 5 comments

@OrangeDog
Contributor

It appears that neither DictCache nor FileCache provides any way to evict entries that aren't being accessed, either by age or by a maximum cache size.

Eviction is only possible with the Redis cache, and only by configuring expiration on the database externally.
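
To make the missing feature concrete, here is a minimal sketch of an in-memory cache that evicts by both entry count and age. `BoundedDictCache` and its parameters are hypothetical and not part of CacheControl; the sketch only assumes that a cache backend needs `get`, `set`, `delete` and `close` methods.

```python
import time
from collections import OrderedDict
from threading import Lock


class BoundedDictCache:
    """In-memory cache with LRU eviction and an optional maximum entry age.

    Hypothetical sketch: it mirrors the get/set/delete/close methods that
    CacheControl cache backends are assumed to expose, but is not part of
    CacheControl itself.
    """

    def __init__(self, max_entries=1000, max_age=None, clock=time.monotonic):
        self.max_entries = max_entries
        self.max_age = max_age  # seconds, or None to disable age-based eviction
        self._clock = clock
        self._lock = Lock()
        self._data = OrderedDict()  # key -> (stored_at, value)

    def get(self, key):
        with self._lock:
            item = self._data.get(key)
            if item is None:
                return None
            stored_at, value = item
            if self.max_age is not None and self._clock() - stored_at > self.max_age:
                del self._data[key]  # too old: evict and report a miss
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return value

    def set(self, key, value, expires=None):
        # `expires` is accepted for interface compatibility and ignored here.
        with self._lock:
            self._data[key] = (self._clock(), value)
            self._data.move_to_end(key)
            while len(self._data) > self.max_entries:
                self._data.popitem(last=False)  # drop the least recently used

    def delete(self, key):
        with self._lock:
            self._data.pop(key, None)

    def close(self):
        pass
```

If the interface assumption holds, an instance could be passed as the `cache` argument to `CacheControl`. Note that `max_age` here is the age of the cache entry, not the HTTP expiry of the stored response.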

@jaap3
Contributor

jaap3 commented May 22, 2018

When using the FileCache, a workaround might be to set up a cron job that deletes all "old" cache files:

find /path/to/requests_cache/ -type f -mtime +14 -delete && find /path/to/requests_cache/ -type d -empty -delete

Note that this does not check if the files are actually expired and also doesn't limit the actual disk usage, so your mileage may vary.
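
For setups without cron, roughly the same cleanup can be sketched in Python. `prune_cache_dir` is a hypothetical helper, not part of CacheControl. Like the `find` command, it looks only at modification time and does not know whether a response has actually expired. Unlike `find`, it never removes the root directory itself.

```python
import os
import time


def prune_cache_dir(root, max_age_days=14, now=None):
    """Delete files under `root` not modified for `max_age_days`, then empty dirs.

    Returns the number of files removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = 0
    # Walk bottom-up so directories are visited after their contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    os.remove(path)
                    removed += 1
            except OSError:
                pass  # file vanished or is not removable; skip it
        if dirpath != root:
            try:
                os.rmdir(dirpath)  # only succeeds if the directory is empty
            except OSError:
                pass
    return removed
```

This still does not limit total disk usage, so the same caveat applies.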

@ionrock
Contributor

ionrock commented May 22, 2018

My expectation is that when someone runs into this sort of problem, it is time to take better ownership of that cached data and consider using something like an external store. It would be nice to support many different types of caches along with their usage patterns, but as that gets really complicated, I've avoided it in CacheControl proper.

That said, I think it could be a good idea to have a separate caches package that includes different implementations and shares common code, such as a worker that examines a cache implementation for stale entries, allowing folks to focus on support for specific storage systems instead.

@jaap3 jaap3 mentioned this issue Jun 4, 2018
@jaap3
Contributor

jaap3 commented Jun 19, 2018

I did some testing, and it seems that python-diskcache can be used as a drop-in replacement for FileCache:

import requests

from cachecontrol import CacheControl
from diskcache import FanoutCache

class MyFanoutCache(FanoutCache):
    # Workaround until either grantjenks/python-diskcache#77 or #195 is fixed:
    # an empty diskcache cache is falsy, so force it to always be truthy.
    def __bool__(self):  # Python 3
        return True
    __nonzero__ = __bool__  # Python 2

cache = MyFanoutCache('./tmp', size_limit=2 ** 30, eviction_policy='least-recently-used')
session = CacheControl(requests.Session(), cache=cache)

Then you could periodically call cache.cull() to get the size back down.

However, it's not possible to remove expired items, because the cache itself is not aware of the expiry date of the response.
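
One way to call `cull()` periodically from within the same process is a small background thread. The helper below is a hypothetical sketch, not part of diskcache or CacheControl. It accepts any zero-argument callable, so `cache.cull` is just one possible target.

```python
import threading


def run_periodically(func, interval_seconds):
    """Call `func()` every `interval_seconds` on a daemon thread.

    Returns a threading.Event; set it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once `stop` is set.
        while not stop.wait(interval_seconds):
            func()

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Example (assumes `cache` is the MyFanoutCache instance created earlier):
# stop = run_periodically(cache.cull, 3600)
# ...
# stop.set()
```

An exception raised by `func` would end the thread, so a real deployment would likely want logging and error handling around the call.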

@tedivm

tedivm commented Jul 27, 2018

I got the FileCache working but decided to try the diskcache FanoutCache because I wanted the cull functionality. In testing, however, it appears that the FanoutCache is not actually being used: files (cache.db) are created in the appropriate directories, but they are never populated with data. I went back to the FileCache for now, as it is working fine.

@jaap3
Contributor

jaap3 commented Sep 11, 2018

@tedivm You are right. I checked, and while the cache.db files are being created, they never store any data. It turns out that the cache object from diskcache implements a __len__ method, which returns 0 if there are no cache entries, so an empty cache evaluates as falsy. CacheControl checks the incoming cache argument and falls back to DictCache if it's falsy.

I've created pull requests for both projects to correct this. (grantjenks/python-diskcache#77, #195)

In the meantime, if you're still interested in trying out the FanoutCache, you could subclass it and patch its boolean conversion behavior, for example:

class MyFanoutCache(FanoutCache):
    def __bool__(self):  # Python 3
        return True

    __nonzero__ = __bool__  # Python 2
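
The falsy fallback described above can be reproduced without either library. In this sketch all class and function names are hypothetical stand-ins, and the `cache or default` expression is an assumption about the kind of check involved, not a quote of CacheControl's source.

```python
class DefaultCache:
    """Hypothetical stand-in for the fallback cache (DictCache in the thread)."""


class SizedCache:
    """Hypothetical stand-in for a cache that defines __len__, like diskcache's."""

    def __init__(self):
        self._items = {}

    def __len__(self):
        return len(self._items)


class AlwaysTruthyCache(SizedCache):
    """Same workaround as above: force truthiness regardless of length."""

    def __bool__(self):
        return True


def choose_with_or(cache=None):
    # Truthiness check: an empty SizedCache is falsy, so it gets replaced.
    return cache or DefaultCache()


def choose_with_is_none(cache=None):
    # Identity check: only a missing argument triggers the fallback.
    return DefaultCache() if cache is None else cache
```

With the truthiness check, an empty `SizedCache` is silently swapped for the default, while `AlwaysTruthyCache` survives. The `is None` check keeps any cache object that was passed in, which is the kind of fix a library-side change would presumably make.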
