Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More speedup via mtime-base caching. #290

Merged
merged 6 commits into from
Mar 27, 2021
Merged

More speedup via mtime-base caching. #290

merged 6 commits into from
Mar 27, 2021

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Mar 8, 2021

Extension of #274.

anntzer and others added 3 commits February 21, 2021 19:06
Caching based on mtime is similar to the one done on importlib's
FileFinder.

Locally, on a large-ish environment, this speeds up repeated calls to
`distribution("pip")` ~10x.
@jaraco
Copy link
Member Author

jaraco commented Mar 27, 2021

In an attempt to troubleshoot, I added the following patch:

importlib_metadata fs-cache-v2 $ git diff -- importlib_metadata
diff --git a/importlib_metadata/__init__.py b/importlib_metadata/__init__.py
index 349de2e..f16d516 100644
--- a/importlib_metadata/__init__.py
+++ b/importlib_metadata/__init__.py
@@ -603,6 +603,7 @@ class FastPath:
 
     @functools.lru_cache()  # type: ignore
     def __new__(cls, root):
+        print("New FastPath for", root)
         return super().__new__(cls)
 
     def __init__(self, root):
@@ -636,6 +637,7 @@ class FastPath:
 
     @functools.lru_cache()
     def lookup(self, mtime):
+        print(self.lookup.cache_info())
         return Lookup(self)

And then running the cache test, it emits output like this:

perf run-test: commands[1] | python -m timeit -s 'import importlib_metadata; importlib_metadata.distribution("ipython")' -- 'importlib_metadata.distribution("ipython")'
New FastPath for .
CacheInfo(hits=0, misses=1, maxsize=128, currsize=0)
New FastPath for /Users/jaraco/code/public/importlib_metadata
CacheInfo(hits=0, misses=2, maxsize=128, currsize=1)
New FastPath for /Library/Frameworks/Python.framework/Versions/3.9/lib/python39.zip
CacheInfo(hits=0, misses=1, maxsize=128, currsize=0)
New FastPath for /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9
CacheInfo(hits=0, misses=2, maxsize=128, currsize=1)
New FastPath for /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/lib-dynload
CacheInfo(hits=0, misses=3, maxsize=128, currsize=2)
New FastPath for /Users/jaraco/code/public/importlib_metadata/.tox/perf/lib/python3.9/site-packages
CacheInfo(hits=0, misses=4, maxsize=128, currsize=3)
CacheInfo(hits=0, misses=5, maxsize=128, currsize=4)
CacheInfo(hits=0, misses=6, maxsize=128, currsize=5)
CacheInfo(hits=0, misses=1, maxsize=128, currsize=0)
CacheInfo(hits=0, misses=2, maxsize=128, currsize=1)
CacheInfo(hits=0, misses=3, maxsize=128, currsize=2)
CacheInfo(hits=0, misses=4, maxsize=128, currsize=3)
CacheInfo(hits=0, misses=5, maxsize=128, currsize=4)
CacheInfo(hits=0, misses=6, maxsize=128, currsize=5)
CacheInfo(hits=0, misses=1, maxsize=128, currsize=0)
CacheInfo(hits=0, misses=2, maxsize=128, currsize=1)
...
CacheInfo(hits=0, misses=4, maxsize=128, currsize=3)
CacheInfo(hits=0, misses=5, maxsize=128, currsize=4)
CacheInfo(hits=0, misses=6, maxsize=128, currsize=5)
CacheInfo(hits=0, misses=1, maxsize=128, currsize=0)
CacheInfo(hits=0, misses=2, maxsize=128, currsize=1)
CacheInfo(hits=0, misses=3, maxsize=128, currsize=2)
CacheInfo(hits=0, misses=4, maxsize=128, currsize=3)
200 loops, best of 5: 2.1 msec per loop

So even though only one FastPath is constructed for each root, the 'lookup' cache never has any hits. I've confirmed also that the mtime is unchanged from call to call. I just do not understand why the lookup is not being cached.

@jaraco
Copy link
Member Author

jaraco commented Mar 27, 2021

Aha. I've figured it out. The cache_clear is getting called when os.mtime raises OSError.

@jaraco
Copy link
Member Author

jaraco commented Mar 27, 2021

Looking at the alternate implementation in #274, I see the difference lies in how the lookup cache is cleared. In 274, the cache is per-FastPath, but in this PR, the cache on lookup, although it takes into account self in the key, when cleared gets cleared for all instances of FastPath.

@jaraco jaraco merged commit 6360916 into main Mar 27, 2021
@jaraco jaraco deleted the fs-cache-v2 branch March 27, 2021 15:32
jaraco added a commit that referenced this pull request Dec 21, 2023
Fixed handling of namespace packages in zip files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants