Pre_Scan ignore takes more than two minutes for 2000 files even with configuring more processes #3745

Open
DoktorLenz opened this issue Apr 23, 2024 · 0 comments

Description

The pre_scan ignore step takes too much time to complete.
For 2000 files it takes more than two minutes.
With only 2000 files, the pre_scan should finish quickly.
The time scales up further if the scanned repository is larger.
Checking the processes, I noticed that the pre_scan ignore step does not use more than one process even though I specified 16.
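
For comparison, here is a rough sketch of what matching 2000 paths against an ignore pattern costs in plain Python on a single process. This is not ScanCode's actual pre_scan code: the `count_ignored` helper, the generated file list, and the use of `fnmatch` are all assumptions made for illustration only.

```python
import time
from fnmatch import fnmatchcase


def count_ignored(paths, patterns):
    """Return how many paths match at least one ignore pattern."""
    return sum(1 for p in paths if any(fnmatchcase(p, pat) for pat in patterns))


# Hypothetical file list standing in for a 2000-file checkout.
paths = [f"src/dir{i % 50}/file{i}.cs" for i in range(2000)]

start = time.perf_counter()
ignored = count_ignored(paths, ["**"])
elapsed = time.perf_counter() - start
print(f"{ignored} of {len(paths)} paths ignored in {elapsed:.4f}s")
```

If the pattern matching itself is this cheap, the two minutes are presumably spent somewhere else in the pre_scan step, but I have not profiled it.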

How To Reproduce

Scan a large repository like dotnet/runtime

git clone https://github.com/dotnet/runtime.git
cd runtime
scancode -cli --strip-root --timeout 120 --ignore ** --processes 16 --verbose --license-references --json-pp /temp/result.json

It will take a long time for scancode to finish. To shorten the reproduction, delete files from the repository until about 2000 files are left.
With about 2000 files, the run took 134s to complete, but the scan itself took only 0.5s; the rest was spent in the pre-scan step.
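
To make the trimming step repeatable, something like the following sketch could be used. The `trim_tree` helper is hypothetical and not part of ScanCode; it assumes that skipping the `.git` directory is acceptable and that it does not matter which files are kept. It deletes files permanently, so run it only on a throwaway clone.

```python
import os
from pathlib import Path


def trim_tree(root, keep=2000, skip_dirs=(".git",)):
    """Delete files under root until at most `keep` remain; return the number kept."""
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            files.append(Path(dirpath) / name)
    # Keep the first `keep` files in walk order and remove the rest.
    for path in files[keep:]:
        path.unlink()
    return min(len(files), keep)
```

Calling `trim_tree("runtime", keep=2000)` from the parent directory of the clone would leave about 2000 files outside `.git`; empty directories are left in place.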

System configuration

OS: Windows
ScanCode version: 32.1.0
Installation Method: "Installation as an application"

DoktorLenz added the bug label Apr 23, 2024