Scan does not complete for files containing Unicode characters such as 北京朝阳区 #641
Comments
@KBiru Hi. Thank you for reporting this. What is the file type?
The file is plain XML, but it is encoded as UTF-8. Any UTF-8 file containing characters like the ones I mentioned breaks the scan, because detect-secrets decodes files using only the default encoding of the OS it runs on. On Windows, for example, it tries to decode with cp1252 even though the file is UTF-8.
@KBiru Can you give me an example snippet of this file? For example, trim the file down and sanitize it so that it contains no sensitive information but still causes the error, so I can attempt to reproduce this.
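The mismatch described above can be reproduced without detect-secrets. The following is a minimal sketch, assuming the problem comes down to reading UTF-8 bytes with the cp1252 codec; the sample XML line and the helper name `try_read` are made up for illustration and are not part of the tool.

```python
import os
import tempfile

# Hypothetical sample line; only the Chinese string comes from the report.
SAMPLE = '<city name="北京朝阳区"/>\n'


def try_read(path: str, encoding: str) -> str:
    """Return the decoded text, or a short description of the decode error."""
    try:
        with open(path, encoding=encoding) as handle:
            return handle.read()
    except UnicodeDecodeError as exc:
        return f"decode failed with {encoding}: {exc.reason}"


# Write the sample to a temporary file as UTF-8.
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w", encoding="utf-8") as handle:
    handle.write(SAMPLE)

# Reading with the matching encoding works; forcing cp1252 (the usual
# Windows default) fails because one UTF-8 byte of this string (0x9D)
# has no mapping in Python's cp1252 codec.
as_utf8 = try_read(path, "utf-8")
as_cp1252 = try_read(path, "cp1252")
os.remove(path)

print(as_utf8)
print(as_cp1252)
```

Passing `encoding=` explicitly to `open()` is what removes the dependence on the OS default here.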
@jpdakran I tried to reproduce the earlier behavior, but it now gives the following error - So, to be clear: the same content works if the file's encoding matches the system's default encoding, but raises an error if it differs -
What I wanted to ask is whether file encoding is handled dynamically, or whether this is not yet a feature of the tool.
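For reference, one common way to handle encodings "dynamically" is to try a short list of candidates in order. This is only a sketch of that general technique, not how detect-secrets reads files; the function name `decode_with_fallback` and the candidate list are assumptions chosen for illustration.

```python
from typing import Iterable, Tuple


def decode_with_fallback(
    data: bytes,
    encodings: Iterable[str] = ("utf-8", "cp1252"),
) -> Tuple[str, str]:
    """Decode bytes with the first candidate encoding that succeeds.

    Returns the decoded text and the name of the encoding that worked.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte value, so it never raises,
    # although the resulting text may not be meaningful.
    return data.decode("latin-1"), "latin-1"


text, used = decode_with_fallback("北京朝阳区".encode("utf-8"))
print(used, text)
```

Trying UTF-8 first is deliberate: it is strict enough that non-UTF-8 input is usually rejected, whereas single-byte codecs accept almost anything.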
[This is a follow-up to issue #626]
Hi Team,
The scan crashes for some files [without any errors or scan reports]. I went through each line and found that if the code contains a string such as 北京朝阳区, detect-secrets does not scan the file and exits without any errors. Is there a plugin or filter I should use to avoid this?
[Note: the file in question is known to contain a secret.]
In other words, are Unicode strings handled properly? Also, if I want a should_exclude_secret filter for certain Unicode regexes, how do I add it to the transient settings?
So far I have not been able to do it.
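As a rough illustration of the kind of filter being asked about, here is a sketch of a predicate that flags strings containing CJK characters. The regex range, the file path `path/to/unicode_filter.py`, and the settings dictionary are assumptions: the `file://...::function` form follows my understanding of the custom-filter convention in the detect-secrets documentation and should be checked against the installed version.

```python
import re

# CJK Unified Ideographs block; an assumed stand-in for "certain Unicode regexes".
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]+")


def should_exclude_secret(secret: str) -> bool:
    """Return True when the candidate secret contains CJK characters."""
    return bool(CJK_PATTERN.search(secret))


# Assumed shape of the settings entry that would point at this function if it
# were saved in a (hypothetical) file path/to/unicode_filter.py.
CUSTOM_FILTER_SETTINGS = {
    "filters_used": [
        {"path": "file://path/to/unicode_filter.py::should_exclude_secret"},
    ],
}

print(should_exclude_secret("北京朝阳区"))
print(should_exclude_secret("plain-ascii-value"))
```

A filter like this only decides whether a detected candidate is dropped; it would not by itself fix a file that fails to decode.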
Using the Python package of detect-secrets (Python 3.10)
detect-secrets version = 1.4.0
OS = Windows 10
Please let me know if you need any more information.
Thanks,
Bireswar