basename_natural: Improve performance #2943

Open
wants to merge 1 commit into
base: master
@mingmingrr mingmingrr commented Jan 28, 2024

ISSUE TYPE

  • Improvement/feature implementation
  • (minor) Breaking changes

RUNTIME ENVIRONMENT

  • Operating system and version: NixOS 24.05
  • Terminal emulator and version: WezTerm 20230712-072601-f4abf8fd
  • Python version: 2.6.9, 2.7.10, 3.11.6
  • Ranger version/commit: 136416c
  • Locale: en_US.UTF-8

CHECKLIST

  • The CONTRIBUTING document has been read [REQUIRED]
  • All changes follow the code style [REQUIRED]
  • All new and existing tests pass [REQUIRED]
  • Changes require config files to be updated
    • Config files have been updated
  • Changes require documentation to be updated
    • Documentation has been updated
  • Changes require tests to be updated
    • Tests have been updated

DESCRIPTION

Various optimizations to the basename_natural functions.

Breaking change: The regex was changed from \D to \D+ so that consecutive non-digits are grouped into a single element and fewer elements are generated in basename_list. As a result, the sort order changes: after a shared non-digit prefix, numbers now sort before non-digits. For example:

Original:
"a" < "a b" < "a#" < "a1" < "ab"
After changes:
"a" < "a1" < "a b" < "a#" < "ab"

Tests have been updated to reflect this. If this change is undesirable, I have a version that keeps the original behavior, but it is slightly slower and much more complex.
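
The ordering change can be reproduced outside ranger with a minimal sketch (Python 3). The `natural_key` helper and the `("0", number)` / `(text, 0)` tuple layout are assumptions made for illustration and are not copied from this PR's diff; only the two regex variants, \D and \D+, come from the description above.

```python
import re

# Per-character variant: every non-digit becomes its own element.
PER_CHAR = re.compile(r"\d+|\D")
# Grouped variant: a run of non-digits becomes a single element.
GROUPED = re.compile(r"\d+|\D+")


def natural_key(name, pattern):
    """Build a sort key from alternating text and number elements.

    Hypothetical helper: numbers are stored as ("0", value) and text as
    (text, 0), so tuples always compare string first, then integer.
    """
    key = []
    for token in pattern.findall(name):
        if token.isdecimal():
            key.append(("0", int(token)))
        else:
            key.append((token, 0))
    return key


NAMES = ["ab", "a1", "a#", "a b", "a"]

# Per-character split: " " and "#" sort before "0", so "a1" lands late.
print(sorted(NAMES, key=lambda n: natural_key(n, PER_CHAR)))
# ['a', 'a b', 'a#', 'a1', 'ab']

# Grouped split: "a1" starts with the element "a", which sorts before "a b".
print(sorted(NAMES, key=lambda n: natural_key(n, GROUPED)))
# ['a', 'a1', 'a b', 'a#', 'ab']
```

With the grouped pattern, "a b" is one element, so it is compared as a whole string against the "a" element of "a1", which is why "a1" moves ahead of it.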

MOTIVATION AND CONTEXT

Helps with #1173, but doesn't completely address it.

When the file sorting method is left at the default (natural sort), a large share of the time spent opening a large directory goes to sorting. Running ranger --profile on a directory with 76k files on an NVMe drive shows that 16 of the 24 seconds of startup time are spent generating the natural sort keys.

TESTING

Ran tests with make test_py on Python 3.11.

Results of profiling with ranger --profile on my /nix/store directory, which has 76k files (columns: number of calls, total time, cumulative time, function):

Original:
       1    0.002   24.282  fm.py:388(loop)
   76079   14.642   16.219  fsobject.py:167(basename_natural_lower)
After changes:
       1    0.002   10.809  fm.py:388(loop)
   76079    1.826    2.966  fsobject.py:167(basename_natural_lower)

Testing only basename_natural with timeit shows an improvement from 15.0s to 1.3s on the same directory. (The alternate version that keeps the original sorting behavior runs in 2.2s.)
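
For anyone who wants to run a similar comparison without a large directory at hand, here is a rough sketch using timeit on synthetic names (Python 3). Everything in it is an assumption for illustration: the `natural_key` helper, the name generator, and the counts are hypothetical and are not the code or data measured above, so the absolute timings will not match the figures in this PR.

```python
import re
import timeit

PER_CHAR = re.compile(r"\d+|\D")   # one element per non-digit character
GROUPED = re.compile(r"\d+|\D+")   # one element per run of non-digits


def natural_key(name, pattern):
    """Hypothetical key builder: ("0", number) for digits, (text, 0) otherwise."""
    return [
        ("0", int(tok)) if tok.isdecimal() else (tok, 0)
        for tok in pattern.findall(name)
    ]


def make_names(count):
    """Deterministic synthetic names mixing hex digits, letters, and versions."""
    return [
        "{:032x}-package-{}.{}.tar.gz".format(i * 2654435761, i % 7, i % 13)
        for i in range(count)
    ]


NAMES = make_names(2000)


def total_elements(pattern):
    """Total number of key elements generated across all names."""
    return sum(len(natural_key(name, pattern)) for name in NAMES)


def time_keys(pattern, repeat=5):
    """Seconds taken to build keys for all names, repeated `repeat` times."""
    return timeit.timeit(
        lambda: [natural_key(name, pattern) for name in NAMES],
        number=repeat,
    )


if __name__ == "__main__":
    for label, pattern in (("per-char", PER_CHAR), ("grouped", GROUPED)):
        print(label, total_elements(pattern), "elements,",
              round(time_keys(pattern), 3), "s")
```

The element count is the more portable number here: the grouped pattern produces fewer elements per name, which means fewer tuples to build and compare during sorting.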

IMAGES / VIDEOS
