OCSP support #242

Open
kmahar opened this issue Sep 21, 2020 · 6 comments

Comments

@kmahar

kmahar commented Sep 21, 2020

Hi SwiftNIO folks!

One thing I have been thinking about as we plan out rewriting mongo-swift-driver's internals in Swift based on NIO is TLS and specifically OCSP support.

As of MongoDB 4.4, the database server enables OCSP by default and supports OCSP stapling as well as the OCSP must-staple extension. Therefore, TLS library permitting, our drivers should now enable OCSP by default (specification).

It's not critical for us to support this yet, but as described here MongoDB Atlas (our DBaaS) is moving to use LetsEncrypt for its certs, which only supports OCSP.

It doesn't look to me like this capability is exposed via NIOSSL, although (I think? based on a quick search through the vendored source) it looks like BoringSSL supports it. Is this something you'd consider exposing?

@Lukasa
Contributor

Lukasa commented Sep 22, 2020

Howdy @kmahar! Thanks for the feature request.

I want to draw some delineations between some terms because I think we should clarify what exactly we’re talking about. That will let us guarantee we’re talking about the same things! I’ll add a few extra definitions for interested outsiders too. I think we want definitions for:

  • revocation checking

    This is the process of validating that an X.509 certificate has not been revoked by its issuer. There are lots of strategies for doing this. For non-browsers the major two mechanisms are CRLs and the various flavours of OCSP. Browsers have a few extra mechanisms that involve shipping giant databases: that’s not too practicable for us.

  • online revocation checking

    This is the process of checking that an X.509 certificate has not been revoked by means of a live network request. Traditionally this is how CRLs are fetched, and how OCSP responses are obtained. When the TLS handshake begins, the client will analyse the server’s certificates for extensions that provide URLs from which it can obtain CRLs (the CRL Distribution Points extension) or OCSP responses (the Authority Information Access, or AIA, extension). The client will then issue requests to those URLs and verify the responses to validate the certificate.

  • offline revocation checking

    This is the inverse of the above process: performing revocation checks without making extra network requests during the handshake. Browsers can do this with things like OneCRL. For us, the only option is OCSP stapling.

  • OCSP

    The Online Certificate Status Protocol. The more modern means of querying for revocation: allows clients to ask “is this certificate valid”, instead of asking “what are all the invalid certificates”.

  • CRLs

    Certificate Revocation Lists. These are the old-school way of finding out about revocation: a CA maintains a big-ol’ list of all revoked certs, and we go and download it. This is not great: the data involved is large, and it has awkward privacy concerns.

  • OCSP Stapling

    A TLS extension where the Server fetches the OCSP responses for its certificates ahead of time, and stores them. It then “staples” the OCSP response to the handshake, allowing the client to avoid needing to make its own separate OCSP request. This works because OCSP responses are cryptographically validated, so the server can’t tamper with them, and because they have short validity periods, so the server can’t serve you an old one.

  • OCSP Must-Staple

    This is an extension to the X.509 certificate format. When set, this extension says that servers must staple an OCSP response for the certificate to the TLS handshake, and that clients must validate it or distrust the certificate.
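The "short validity periods" point in the OCSP Stapling definition can be made concrete with a freshness check. This is a minimal sketch in Python, not NIOSSL or BoringSSL code: `StapledStatus` and `is_fresh` are hypothetical names, the response is assumed to be already parsed and signature-checked, and treating a missing `nextUpdate` as stale is a conservative assumption rather than something the thread specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class StapledStatus:
    """Hypothetical view of one OCSP single response (already signature-checked)."""
    cert_status: str                  # "good", "revoked", or "unknown"
    this_update: datetime             # when the responder produced this status
    next_update: Optional[datetime]   # when newer information is expected, if stated


def is_fresh(status: StapledStatus, now: datetime,
             clock_skew: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if `now` falls inside the response's validity window."""
    if status.this_update - clock_skew > now:
        return False  # response claims to come from the future
    if status.next_update is None:
        return False  # assumption: no stated bound is treated as stale
    return now <= status.next_update + clock_skew


now = datetime(2020, 10, 8, tzinfo=timezone.utc)
staple = StapledStatus("good", now - timedelta(days=1), now + timedelta(days=3))
print(is_fresh(staple, now))                       # inside the window
print(is_fresh(staple, now + timedelta(days=10)))  # well past next_update
```

A server replaying an old staple for a since-revoked certificate would be caught by this window check once `next_update` has passed, which is why the responder's validity period bounds how long a revocation can be hidden.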

With that set of definitions, let me explain my position on most of these things. Firstly, I think there is no value in adding support for online revocation checking. Online revocation checking suffers from the question of what to do in the face of outages in the OCSP responder. As these outages are reasonably frequent, and as users hate for their services to fail because the CA’s systems are broken, clients tend to have to “fail open”: if no response arrives, they treat the certificate as valid. This isn’t great because it makes hiding a revocation straightforward: an attacker only has to prevent the OCSP response from reaching the client.

OCSP stapling is a different beast. Because it requires that the server fetch the OCSP response, and the server may cache it (usually for a number of days), outages are less critical. However, OCSP stapling without Must-Staple is of pretty minimal utility because, again, there is an easy workaround if the cert is revoked: just stop serving the OCSP response.

This means the minimum feature set we’d need to implement would be: parsing and validating OCSP responses, OCSP stapling support, and OCSP Must-Staple support, on the client. We could definitely do these things. BoringSSL has most of the crypto functionality we need, so we’d just have to glue it together and produce appropriate APIs.

A subsequent extension to the work would be to build an OCSP responder whose primary purpose is to support the server side of OCSP stapling. This, combined with APIs to set the stapled OCSP response, would allow NIO servers to support Must-Staple TLS certificates as well. This is also an acceptable thing to do.

As a note, OCSP stapling is also falling out of favour due to the many infrastructural problems associated with it. The industry is trending towards just standardising on short-lived certificates with good infrastructure for rolling them. For that reason I’m wary of us spending too much time doing OCSP work if we can avoid it.

Nonetheless, I think supporting the client side of OCSP stapling and must-staple is probably a good idea.

@kmahar
Author

kmahar commented Sep 23, 2020

Thanks for the speedy and thorough response, @Lukasa!

I'm only just catching up on what exactly OCSP is this week, so the clear definitions are very helpful.

Some questions -

  1. Regarding online certificate verification: I definitely see how assuming success in the face of an OCSP responder outage is problematic. However, I'm not sure if I follow why not implementing online certificate revocation checking at all is better than implementing it in a more strict manner where a lack of response is treated the same as an invalid cert. Would this be bad because it's inconsistent with what other libraries do? Or bad because users might then choose to use this unreliable mechanism, and get annoyed with how often it fails?

  2. Say a client with support for what you describe above receives a certificate that is not marked must-staple, but does have a valid stapled response. This would be accepted by the client, right? (I don't think there's a reason for a client to prefer a must-staple certificate vs a non-must-staple certificate so long as a valid response is stapled to it, but I may be wrong.)

@Lukasa
Contributor

Lukasa commented Sep 23, 2020

Regarding online certificate verification: I definitely see how assuming success in the face of an OCSP responder outage is problematic. However, I'm not sure if I follow why not implementing online certificate revocation checking at all is better than implementing it in a more strict manner where a lack of response is treated the same as an invalid cert. Would this be bad because it's inconsistent with what other libraries do? Or bad because users might then choose to use this unreliable mechanism, and get annoyed with how often it fails?

The reason not to do it is because it's a lot of work for a strategy that doesn't really successfully defend the user. Users have to decide how they will configure this: will they allow an OCSP responder failure to prevent the TLS connection from completing? If they do, then the liveness of their system is now limited both by their own code and by the OCSP responder for every certificate in the server chain. If any of those OCSP servers is misbehaving, the handshake will fail. If they do not allow OCSP responder failure to prevent the connection, then they have no more security than they had before, but they have a slower TLS handshake.

Given that OCSP responder outages are reasonably common, this is a non-theoretical question. Note that if the OCSP responder is misbehaving, then no number of retries will fix the problem: the system is completely unavailable until the responder recovers and returns a valid OCSP response.
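To put a rough number on the liveness cost of strict ("hard-fail") online checking, here is a small worked example in Python. It is a sketch under stated assumptions: each certificate in the chain has its own responder, responders fail independently, and every responder has the same availability. The function name and the availability figure are illustrative, not measurements.

```python
def hard_fail_success_probability(responder_availability: float,
                                  chain_length: int) -> float:
    """Probability that every responder in the chain answers, so the handshake can proceed.

    Assumes independent responders with identical availability (an illustration only).
    """
    if not 0.0 <= responder_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if chain_length < 0:
        raise ValueError("chain_length must be non-negative")
    return responder_availability ** chain_length


# Three certificates to check, each responder assumed 99% available.
print(round(hard_fail_success_probability(0.99, 3), 6))  # prints 0.970299
```

Under these assumptions the handshake can never be more available than the least available responder, and every extra certificate in the chain lowers the ceiling further.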

OCSP works a bit better if you have system-wide caches which can hold on to older OCSP responses, including those originally fetched on behalf of other processes. We don't have much access to that kind of functionality on Linux, so adding on-by-default online OCSP validation just makes our systems fail more and take longer to perform TLS handshakes.

Say a client with support for what you describe above receives a certificate that is not marked must-staple, but does have a valid stapled response. This would be accepted by the client, right? (I don't think there's a reason for a client to prefer a must-staple certificate vs a non-must-staple certificate so long as a valid response is stapled to it, but I may be wrong.)

Sure, a stapled response would be accepted by the client even if the cert didn't have must-staple enabled. It's just not a high-value signal of validity because if the cert had been revoked a malicious server could simply choose not to send the OCSP response. That's why you need stapling and Must-Staple to get real value out of the system.
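The acceptance rule described here can be written down as a small decision function. This is a sketch in Python rather than a proposed NIOSSL API: `evaluate_stapling` is a hypothetical name, and `stapled_status` is assumed to be the result of an already validated (signature and freshness checked) stapled response, or `None` if the server sent nothing usable.

```python
from typing import Optional


def evaluate_stapling(must_staple: bool, stapled_status: Optional[str]) -> bool:
    """Decide whether to continue the handshake based on the stapled OCSP result.

    stapled_status: "good", "revoked", or None when no usable staple was received.
    """
    if stapled_status == "revoked":
        return False          # an explicit revocation always rejects
    if stapled_status == "good":
        return True           # accepted whether or not the cert is Must-Staple
    # No usable staple: only a Must-Staple certificate turns this into a failure.
    return not must_staple


# A server holding a revoked certificate simply omits the staple.
print(evaluate_stapling(must_staple=False, stapled_status=None))  # accepted: omission goes unnoticed
print(evaluate_stapling(must_staple=True, stapled_status=None))   # rejected: omission is itself an error
```

The two calls at the end show why stapling alone is a weak signal: without Must-Staple, withholding the response is indistinguishable from a server that never stapled at all.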

@kmahar
Author

kmahar commented Sep 23, 2020

Thanks very much for those clarifications! All you've said makes a lot of sense. I believe for our purposes in the driver what you propose supporting on the client side would be sufficient.

@Lukasa
Contributor

Lukasa commented Oct 8, 2020

Ok, with further digging the shape of this has come into view. The TL;DR is that most of this is easy, and some of it is annoying.

The easy bits are finding the URL from which to request an OCSP response, requesting stapled OCSP responses from the server, attaching stapled OCSP responses as the server, and retrieving the stapled response. That's all easy-peasy.

Unfortunately the hard part is doing anything useful with the OCSP responses. BoringSSL has removed the OCSP_REQUEST/OCSP_RESPONSE structures as well as their associated ASN.1 parsing code, so we'll have to bring them back in some form. We probably don't want their full complexity, of course, but it's a bit sad to have to do that. For expediency reasons we'll probably do this by just calling the BoringSSL ASN.1 code, though we could potentially investigate the limited Swift ASN.1 code in Swift Crypto for completeness.
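To give a feel for the parsing work involved, here is a minimal sketch in Python that reads only the outermost field of a DER-encoded OCSPResponse, the `responseStatus` defined in RFC 6960. It is not the BoringSSL or Swift Crypto approach mentioned above: `read_tlv` and `ocsp_response_status` are hypothetical helpers, only single-byte tags are handled, and nothing inside `responseBytes` (the signed part that actually matters) is parsed or verified.

```python
# responseStatus values from RFC 6960 (value 4 is not used).
OCSP_RESPONSE_STATUS = {
    0: "successful",
    1: "malformedRequest",
    2: "internalError",
    3: "tryLater",
    5: "sigRequired",
    6: "unauthorized",
}


def read_tlv(data: bytes, offset: int = 0):
    """Read one DER tag-length-value triple; return (tag, value, next_offset)."""
    if offset + 2 > len(data):
        raise ValueError("truncated DER header")
    tag = data[offset]
    first = data[offset + 1]
    offset += 2
    if first < 0x80:
        length = first                      # short form: length fits in 7 bits
    else:
        count = first & 0x7F                # long form: next `count` bytes hold the length
        if count == 0 or offset + count > len(data):
            raise ValueError("unsupported or truncated DER length")
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    end = offset + length
    if end > len(data):
        raise ValueError("truncated DER value")
    return tag, data[offset:end], end


def ocsp_response_status(der: bytes) -> str:
    """Return the name of the responseStatus in a DER-encoded OCSPResponse."""
    tag, body, _ = read_tlv(der)
    if tag != 0x30:                         # SEQUENCE
        raise ValueError("expected SEQUENCE")
    tag, value, _ = read_tlv(body)
    if tag != 0x0A or len(value) != 1:      # ENUMERATED, one byte
        raise ValueError("expected ENUMERATED responseStatus")
    return OCSP_RESPONSE_STATUS.get(value[0], "unknown")


# SEQUENCE { ENUMERATED 3 }: a response carrying only a status and no responseBytes.
print(ocsp_response_status(bytes.fromhex("30030a0103")))  # prints tryLater
```

Everything beyond this first field (the responder ID, the per-certificate statuses, the signature and any embedded certificates) needs the fuller ASN.1 structures discussed above.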

@Lukasa
Contributor

Lukasa commented Oct 8, 2020

Note that we can keep our Darwin-based evaluator working with stapled OCSP responses using SecTrustSetOCSPResponse.


2 participants