Enable/Disable archive scanning from commandline #2505

Open
0x736E opened this issue Feb 24, 2024 · 10 comments
Comments

@0x736E

0x736E commented Feb 24, 2024

Please review the Community Note before submitting

Description

Users should be able to enable or disable scanning file archives from the commandline.

There are situations where scanning file archives is not desirable at all. At present, the only way to effectively disable this behaviour is to set archive limits that no archive can satisfy (e.g. --archive-max-size=1B).

Additionally, #2506 demonstrates that file archive scanning is enabled inconsistently depending on the data source. As it currently stands, archive scanning is defined internally and is not configurable, meaning that some data sources (git) will produce results in scans while others will not. Users should be able to configure this behaviour.

Preferred Solution

A new commandline flag such as --no-archive which disables the default behaviour of scanning file archives.
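To make the request concrete, here is a minimal sketch in Go of how such a flag could gate archive handling in one place. It is not TruffleHog's implementation: the `--no-archive` flag, the `isArchive` and `shouldScan` helpers, and the extension list are all hypothetical, and the sketch uses the standard library `flag` package rather than whatever CLI parser the project uses. Detecting archives by file extension alone is also a simplifying assumption.

```go
package main

import (
	"flag"
	"fmt"
	"path/filepath"
	"strings"
)

// archiveExts is an illustrative, non-exhaustive set of archive extensions.
// Hypothetical: a real scanner would likely inspect file contents instead.
var archiveExts = map[string]bool{
	".zip": true, ".tar": true, ".gz": true, ".tgz": true, ".7z": true,
}

// isArchive reports whether path looks like an archive, judged by extension only.
func isArchive(path string) bool {
	return archiveExts[strings.ToLower(filepath.Ext(path))]
}

// shouldScan is the single gate that every data source would consult,
// so the toggle behaves the same way regardless of source.
func shouldScan(path string, noArchive bool) bool {
	return !(noArchive && isArchive(path))
}

func main() {
	// Hypothetical flag name taken from the proposal in this issue.
	noArchive := flag.Bool("no-archive", false, "skip file archives entirely")
	flag.Parse()

	// Remaining arguments are treated as paths to consider.
	for _, p := range flag.Args() {
		if shouldScan(p, *noArchive) {
			fmt.Println("scan:", p)
		} else {
			fmt.Println("skip:", p)
		}
	}
}
```

Run as `go run . --no-archive repo.zip main.go`, this prints `skip: repo.zip` followed by `scan: main.go`. Routing every source through one gate like `shouldScan` is what would keep the behaviour consistent across data sources.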

Additional Context

N/A

References

@rgmz
Contributor

rgmz commented Feb 24, 2024

Related: #2257

@0x736E
Author

0x736E commented Feb 24, 2024

Related: #2506

@0x736E 0x736E changed the title from "Disable archive scanning from commandline" to "Enable/Disable archive scanning from commandline" Feb 26, 2024
@clonsdale-canva
Contributor

clonsdale-canva commented Mar 4, 2024

Yes please. A few versions ago archive scanning changed, and it now consistently breaks all my scans. I tried the workaround of only scanning archives with a max size of 1B, but it still doesn't succeed. I can't use the latest version of trufflehog until I can fully skip scanning all archives.

@0x736E
Author

0x736E commented Mar 6, 2024

Can you please elaborate how your scans are still failing?

Perhaps also try setting the max timeout to 1ms:

--archive-max-timeout=1ms

@clonsdale-canva
Contributor

I haven't fully debugged the issue, since I'm running on GitHub Actions and it simply says it disconnects. Either the timeout or the max size is ineffective. It could be chewing up too many resources to process the archives.

@0x736E
Author

0x736E commented Mar 9, 2024

@clonsdale-canva It sounds like you are encountering another issue in addition to the archive scanning issues.

From my testing, archive scanning does not affect whether the scan succeeds or not, only the scope of what is scanned and the subsequent output as a result.

@dustin-decker dustin-decker added help wanted contributions welcomed Signal for help from the community! and removed help wanted contributions welcomed Signal for help from the community! labels Mar 9, 2024
@dustin-decker
Contributor

We are working on plans to centralize archive handling which will make it easy to toggle on/off for all sources.

@clonsdale-canva
Contributor

@0x736E Spent some time debugging and you're correct. I am running trufflehog on many repositories at once in a highly parallel system. Some change in v3.64.0 caused it to tip over the edge with resource exhaustion, which I mistakenly attributed to archive processing.

@brendan-wiz

I would love a flag to just skip archives altogether.

@ahrav
Collaborator

ahrav commented May 16, 2024

@dustin-decker, with all the recent changes to archive handling, I think we might be in a good position to support toggling archive scanning on/off holistically across all sources.
