Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid reloading root certificates to improve concurrent performance #6667

Merged
merged 4 commits into from May 15, 2024

Conversation

agubelu
Copy link
Contributor

@agubelu agubelu commented Mar 20, 2024

Reproducing the problem

Let's consider the following script. It runs a bunch of concurrent requests against a URL, both with certificate verification enabled and disabled, and outputs the time it takes to do it in both cases.

from time import time
from threading import Thread
import requests
import urllib3

urllib3.disable_warnings()

def do_request(verify):
    requests.get('https://example.com', verify=verify)

def measure(verify):
    threads = [Thread(target=do_request, args=(verify,)) for _ in range(30)]

    start = time()
    for t in threads: t.start()
    for t in threads: t.join()
    end = time()

    print(end - start)

measure(verify=True)
measure(verify=False)

What's the time difference between the two? It turns out it is highly dependent on your local configuration. On my local machine, with a relatively modern config (Python 3.12 + OpenSSL 3.0.2), the times are ~1.2s for verify=True and ~0.5s for verify=False.

It's a >100% difference, but we initially blamed it on cert verification taking some time. However, we observed even larger differences (>500%) in some environments, and decided to find out what was going on.

Problem description

Our main use case for requests is running lots of requests concurrently, and we spent some time bisecting this oddity to see if there was room for a performance optimization.

The issue is a bit more clear if you profile the concurrent executions. When verifying certs, these are the top 3 function calls by time spent in them:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
30/1    0.681    0.023    0.002    0.002 {method 'load_verify_locations' of '_ssl._SSLContext' objects}
30/1    0.181    0.006    0.002    0.002 {method 'connect' of '_socket.socket' objects}
60/2    0.180    0.003    1.323    0.662 {method 'read' of '_ssl._SSLSocket' objects}

Conversely, this is how the top 3 looks like without cert verification:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
30/1    0.233    0.008    0.001    0.001 {method 'do_handshake' of '_ssl._SSLSocket' objects}
30/1    0.106    0.004    0.002    0.002 {method 'connect' of '_socket.socket' objects}
60/2    0.063    0.001    0.505    0.253 {method 'read' of '_ssl._SSLSocket' objects}

In the first case, a full 0.68 seconds are spent in the load_verify_locations() function of the ssl module, which configures a SSLContext object to use a set of CA certificates for validation. Inside it, there is a C FFI call to OpenSSL's SSL_CTX_load_verify_locations() which is known to be quite slow. This happens once per request (hence the 30 on the left).

We believe that, in some cases, there is even some blocking going on, either because each FFI call locks up the GIL or because of some thread safety mechanisms in OpenSSL itself. We also think that this is more or less pronounced depending on internal changes between OpenSSL's versions, hence the variability between environments.

When cert validation isn't needed, these calls are skipped which speeds up concurrent performance dramatically.

Submitted solution

It isn't possible to skip loading root CA certificates entirely, but it isn't necessary to do it on every request. More specifically, a call to load_verify_locations() happens when:

  • A new urllib3.connectionpool.HTTPSConnectionPool is created.

  • On connection, by urllib3's ssl_wrap_socket(), when the connection's ca_certs or ca_cert_dir attributes are set (see the relevant code).

The first case doesn't need to be addressed anymore after the latest addition of _get_connection(). Since it now passes down pool_kwargs, this allows urllib3 to use a cached pool with the same settings every time, instead of creating one per request.

The second one is addressed in this PR. If a verified connection is requested, _urllib3_request_context() already makes it so that a connection pool using a SSLContext with all relevant certificates loaded is always used. Hence, there is no need to trigger a call to load_verify_locations() again.

You can test against https://invalid.badssl.com to check that verify=True and verify=False still behave as expected and are now equally fast.

I'd like to mention that there have been a few changes in Requests since I started drafting this, and I'm not sure that setting conn.ca_certs or conn.ca_certs = cert_loc in cert_verify() is even still needed, since I think that the logic could be moved to _urllib3_request_context() and benefit from using a cached context in those cases too.

@mm-matthias
Copy link

We've been hit by the slowness of load_verify_locations as well. After turning request.get into session.get performances improves a lot. But we still face the performance hit, because:

  • it happens on the first connection (vs. module init time). This creates noise in our live profiles.
  • after connections in the PoolManager/HTTPConnectionPool time out, the SSLContext is recreated over and over
  • HTTPConnectionPools are keyed by host/scheme/... which means a new SSLContext is created for each pool
  • we use session.get(proxies={...}) which leads to even more pools and SSLContext initializations (this part might not yet be covered by this PR)

Is there anything I can do to advance this PR?

@sigmavirus24
Copy link
Contributor

Given this breaks the behavior of the module for a whole class of users as it's written today, there's not much to do to advance it.

Even still, I'm pretty sure SSLComtext is not itself thread safe but I need to find a reference for that so loading it at the module will likely cause issues

@mm-matthias
Copy link

Even still, I'm pretty sure SSLComtext is not itself thread safe but I need to find a reference for that so loading it at the module will likely cause issues

Here's your quote: python/cpython#95031 (comment)

SSLContext is thread and async-safe.

Can't speak for the reliability of that quote though. Given that a lot of time is spent in pthread_rwlock it doesn't sound completely unbelievable though.

@mm-matthias
Copy link

mm-matthias commented May 5, 2024

Here is a more reliable reference from the openssl guys themselves about SSL_CTX. The info is from 2017 though.

EDIT: Here are the official docs for openssl.

Unfortunately the python SSLContext docs don't mention thread safety at all.

Found this in the Python 3.13.0a5 release docs:

ssl.SSLContext.cert_store_stats and ssl.SSLContext.get_ca_certs now correctly lock access to the certificate store, when the ssl.SSLContext is shared across multiple threads.

Asked for official clarification on the thread-safety of SSLContext here.

@tiran
Copy link
Contributor

tiran commented May 5, 2024

SSLContext is designed to be shared and used for multiple connections. It is thread safe as long as you don't reconfigure it once it is used by a connection. Adding new certs to the internal trust store is fine, but changing ciphers, verification settings, or mTLS certs can lead to surprising behavior. The problem is unrelated to threads and can even occur in a single-threaded program.

If you don't trust me, then please trust David Benjamin's statement on threading. He is the main lead behind BoringSSL and did a lot of TLS stuff in Chrome browser.

@mm-matthias
Copy link

@tiran Thank you very much for your detailed commentary!

Given this breaks the behavior of the module for a whole class of users as it's written today, there's not much to do to advance it.

@sigmavirus24 What exactly are you unhappy about with the current solution and what would an acceptable solution for you look like? Would you prefer something like introducing a new "ssl_context" parameter somehow that could be set by the library user and be set to something like None for backwards compatibility? Or would you not want to change the request api in any way?

@tiran
Copy link
Contributor

tiran commented May 5, 2024

See #2118

I recommend that you either use urllib3 directly or switch to httpx. Most of the secret sauce of requests is in urllib3. httpx has HTTP/2 support.

@sigmavirus24
Copy link
Contributor

@tiran Thank you very much for your detailed commentary!

Given this breaks the behavior of the module for a whole class of users as it's written today, there's not much to do to advance it.

@sigmavirus24 What exactly are you unhappy about with the current solution and what would an acceptable solution for you look like? Would you prefer something like introducing a new "ssl_context" parameter somehow that could be set by the library user and be set to something like None for backwards compatibility? Or would you not want to change the request api in any way?

Requests supports being run (with certifi) inside a zip file created by a tool like pants, pyinstaller, etc. The code here removes that support in loading the trust stores.

@sigmavirus24
Copy link
Contributor

Ah, I see that the PR was updated and moved the extraction. I think the last blocker is the context being "public" in how it is named. That will encourage people to modify it in a way we don't want to be supporting. People will attempt to modify it anyway but when things go wrong, it will be clear that they weren't intended to modify it.

To be clear, I expect people to use this to work around libraries that aren't written correctly that leverage requests but do not allow people to specify a trust store or customize a session. Either way, if we rename the default I'm in favor of merging this. Unless @nateprewitt has objections.

(And yes, the caveats around changing ciphers or loading other trust stores is what I recalled being a problem that has odd behavior but even so, making this private will alleviate that likely spike in issues.)

@agubelu
Copy link
Contributor Author

agubelu commented May 7, 2024

Thanks for the follow-up @sigmavirus24. I renamed the default context as requested, please let me know if you'd like any further changes.

Copy link
Contributor

@sigmavirus24 sigmavirus24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want a second set of eyes from @nateprewitt or @sethmlarson

Copy link
Member

@nateprewitt nateprewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, this seems reasonable to me. I was curious if we'd be better off doing this per-Adapter instance instead of globally but it seems like that may not be a concern with Christian's response.

@nateprewitt nateprewitt merged commit 9a40d12 into psf:main May 15, 2024
26 checks passed
@nateprewitt nateprewitt added this to the 2.32.0 milestone May 15, 2024
@cyberw
Copy link

cyberw commented May 15, 2024

Thank you so much!

This made a big improvement for my use case, which is load generation (https://github.com/locustio/locust), so lots of concurrent request.Sessions. Especially on Windows, where Python 3.12 was completely broken for for my users...

Any chance 2.32 will be upon us soon? If not, a pre-release version would also be very helpful.

@nateprewitt
Copy link
Member

We don't have solidified dates for 2.32.0 but I would wager it's sometime in mid-late June. As for a pre-release, time permitting I'll take a look at it but can't guarantee anything currently.

@agubelu
Copy link
Contributor Author

agubelu commented May 16, 2024

@cyberw I remember doing some profiling on Windows as well and finding out that it was particularly infamous there, because the certificates were being reloaded not one, but two or three times per request. I don't recall the specific reason since I ran most of the experiments on Linux afterwards, but I think it had to do with the CA store being a folder instead of a bundle file.

In any case, hopefully this mitigates the issue now that it's merged. If not, feel free to ping me and I'll try to follow up with something more specific for Windows.

@cyberw
Copy link

cyberw commented May 16, 2024

On Windows (3.12.2) the example (modified to run against github.com) takes ~1.5s without the fix and ~0.2 with it (which is actually faster than the verify=False version, at ~0.5s. Odd)

@nateprewitt nateprewitt mentioned this pull request May 20, 2024
@florianlink
Copy link

This patch has the side-effect, that it is no longer possible to pass in a custom ssl_context via the PoolManager.
This is a problem because now it does not seem possible anymore to enable/use a custom ssl_context to reduce the OpenSSL seclevel used by request. The following worked in 2.31.0 and stopped working now:

from urllib3.util import create_urllib3_context
from urllib3 import PoolManager
from requests.adapters import HTTPAdapter
from requests import Session

class AddedCipherAdapter(HTTPAdapter):
  def init_poolmanager(self, *args, **kwargs):
    print("using poolmanager")
    ctx = create_urllib3_context(ciphers="ALL:@SECLEVEL=0")
    kwargs['ssl_context'] = ctx
    return super(AddedCipherAdapter, self).init_poolmanager(*args, **kwargs)

# use our own post session with lowered seclevel adapter
def post(url, cert=None, data=None, verify=True, headers=None):
  with Session() as s:
    s.adapters.pop("https://", None)
    s.mount("https://", AddedCipherAdapter())
    return s.post(url, cert=cert, data=data,verify=verify, headers=headers)

This worked in 2.31.0 and stopped working, because custom 'ssl_context' is overwritten by default 'ssl_context' with this merge request.

I think it would be good to support a custom ssl_context again, since otherwise it does not seem to be possible to change the seclevel settings of python requests anymore and that is required if you are working with legacy code and old certificates that can't easily be renewed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants