Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds logger filter for TOKENS when logging in DEBUG MODE #539

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

jtroussard
Copy link
Contributor

@jtroussard jtroussard commented Mar 23, 2024

This PR is a derivative of another #532

This PR addresses the same issue but aims to demonstrate a broader perspective on implementation. While the original PR highlighted a significant concern, the proposed solution here follows an existing patten of configuring the logger and leveraging the features within the logger itself. Following this pattern reinforces and encourages contributions in a more uniform and predictable way for future logger configurations.

This approach also enables a more comprehensive cleansing of sensitive information from logs, extending beyond headers to all areas of the code base.

The necessity to create a new PR stems from the considerable deviation in approach, which couldn't be effectively communicated within the framework of #532's PR discussion. To acknowledge the foundational work laid by the original proposal, this submission incorporates co-authoring commits, reinforcing a collaborative effort towards a scalable and secure logging strategy.

Relevant links

python logger docs

SO

Jacques Troussard and others added 2 commits March 23, 2024 17:02
Co-authored-by: Erlend vollset <erlendvollset@gmail.com>
@coveralls
Copy link

coveralls commented Mar 23, 2024

Coverage Status

coverage: 90.503% (+0.4%) from 90.097%
when pulling 54676d1 on 536-access-tokens-leaked-in-logs-when-using-debug-level
into 77905a4 on master.

@jtroussard
Copy link
Contributor Author

jtroussard commented Mar 23, 2024

@erlendvollset : Github wouldn't let me ping you in the reviewers windows so I'm doing it here

@XMoose25X : Thought you might want to take a peek/observe/comment/correct/whatever-your-heart-desires xD

@jtroussard jtroussard force-pushed the 536-access-tokens-leaked-in-logs-when-using-debug-level branch from 313fe40 to a77b392 Compare March 23, 2024 22:33
@jtroussard jtroussard marked this pull request as draft March 26, 2024 15:41
@jtroussard
Copy link
Contributor Author

Converting to a draft until I have time to thoroughly test the regex, and provide evidence.

Copy link

@erlendvollset erlendvollset left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for picking this up 🚀 I think this approach addresses the issue in a better way than in my PR - so I'm all for it! I left a few suggestions that you may want to take into consideration.

"""
if record.levelno == logging.DEBUG:
if self.mode == "MASK":
record.msg = re.sub(r'Bearer\s+([A-Za-z0-9\-._~+\/]+)', '[MASKED]', record.getMessage())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I would suggest compiling the regex pattern once to avoid doing that for every log message.
  2. I would suggest putting this behind an LRU cache as we'll be logging a lot of the same statements repeatedly. re.sub is expensive enough that we would want to avoid doing it each time.

if self.mode == "MASK":
record.msg = re.sub(r'Bearer\s+([A-Za-z0-9\-._~+\/]+)', '[MASKED]', record.getMessage())
elif self.mode == "SUPPRESS":
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only suppress log messages which match the above regex? Otherwise this equivalent to disabling the logger entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping a single space would suffice as a replacement? Or maybe just pass None to the msg?

environment variable.
"""
super().__init__()
self.mode = os.getenv('DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest prefixing the envvar with the library name to reduce likelihood of collisions with other environment variables.

Suggested change
self.mode = os.getenv('DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()
self.mode = os.getenv('REQUESTS_OAUTHLIB_DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this comment. In fact I was running into this issue in another project. This project then variable name pattern is clean. I'm going to borrow and apply this pattern there as well. 🙏 Thank you!

@jtroussard
Copy link
Contributor Author

@erlendvollset This is great feedback! I'll try and carve out some time this weekend to address these comments. Thanks again!

@jtroussard
Copy link
Contributor Author

jtroussard commented Apr 21, 2024

Pending TODOs

  1. verify regex
  2. compile regex pattern
  3. testing for regex pattern?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Access tokens leaked in logs when using DEBUG level
3 participants