Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexepected Exception with SSLHandler #13993

Open
sbernard31 opened this issue Apr 22, 2024 · 10 comments
Open

Unexepected Exception with SSLHandler #13993

sbernard31 opened this issue Apr 22, 2024 · 10 comments

Comments

@sbernard31
Copy link

(I'm using netty 4.1.107.Final)

I configure SSLHandler with clientAuth(ClientAuth.REQUIRE).

And sometime I face this kind of exception :

java.lang.IllegalStateException: Unable to get Principal
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapsTcpTransportResolver.apply(CoapsTcpTransportResolver.java:52)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapTcpTransportResolver.apply(CoapTcpTransportResolver.java:1)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.TransportContextHandler.attachContextToChannel(TransportContextHandler.java:52)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.TransportContextHandler.channelActive(TransportContextHandler.java:42)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:260)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:238)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelActive(ChannelInboundHandlerAdapter.java:69)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.NettyCoapTcpTransport$ChannelRegistry$1.userEventTriggered(NettyCoapTcpTransport.java:117)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:398)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376)
	at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368)
	at io.netty.handler.ssl.SslHandler.setHandshakeSuccess(SslHandler.java:1940)
	at io.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1021)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1511)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
Caused by: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
	at sun.security.ssl.SSLSessionImpl.getPeerPrincipal(SSLSessionImpl.java:723)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapsTcpTransportResolver.apply(CoapsTcpTransportResolver.java:50)
	... 35 common frames omitted

In my custom code, I try to get remote peer principal but it seems it is null and so raise an SSLPeerUnverifiedException.

But how it is possible, because looking at the stacktrace we can see that this happens on :

  • io.netty.handler.ssl.SslHandler.setHandshakeSuccess(SslHandler.java:1940)
  • and client auth is Required so how could we face a situation where there is no principal/certificate.

To be fully complete, I see this issue in my unit tests sometime but this sounds not really reproductible each time. (Does it smell the race condition ? )

I'm new netty user so maybe I did something wrong. Just in case I put a link to my code : https://github.com/eclipse-leshan/leshan/tree/java-coap-tcp-tls/leshan-tl-javacoap-server-coaptcp/src/main/java/org/eclipse/leshan/transport/javacoap/server/coaptcp/transport

Not directly link but I also face this kind of issue sporadically :

java.lang.NullPointerException: null
   at io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:811)
   at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:803)
   at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:2074)
   at io.netty.handler.ssl.SslHandler.closeOutbound0(SslHandler.java:678)
   at io.netty.handler.ssl.SslHandler.access$500(SslHandler.java:171)
   at io.netty.handler.ssl.SslHandler$1.run(SslHandler.java:667)
   at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
   at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
   at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
   at java.lang.Thread.run(Thread.java:750)
@chrisvest
Copy link
Contributor

I see this issue in my unit tests sometime but this sounds not really reproductible each time. (Does it smell the race condition ? )

It could also be session resumption that sometimes does, or does not happen. Something to look out for. Netty 4.1.108 had a number of fixes to the TLS related code, so try upgrading and see if it helps. If it doesn't, you'll at least get the latest line numbers in the stack traces.

@sbernard31
Copy link
Author

I tested with last Netty version I found : 4.1.109.Final.

I still face the issue here is the stack traces :

java.lang.IllegalStateException: Unable to get Principal
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapsTcpTransportResolver.apply(CoapsTcpTransportResolver.java:52)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapTcpTransportResolver.apply(CoapTcpTransportResolver.java:1)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.TransportContextHandler.attachContextToChannel(TransportContextHandler.java:52)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.TransportContextHandler.channelActive(TransportContextHandler.java:42)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:260)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:238)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:231)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelActive(ChannelInboundHandlerAdapter.java:69)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.NettyCoapTcpTransport$ChannelRegistry$1.userEventTriggered(NettyCoapTcpTransport.java:117)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:398)
	at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376)
	at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368)
	at io.netty.handler.ssl.SslHandler.setHandshakeSuccess(SslHandler.java:1940)
	at io.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1021)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1511)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
Caused by: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
	at sun.security.ssl.SSLSessionImpl.getPeerPrincipal(SSLSessionImpl.java:723)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.CoapsTcpTransportResolver.apply(CoapsTcpTransportResolver.java:50)
	... 35 common frames omitted

(and for completeness the not related one 👇 )

java.lang.NullPointerException: null
	at io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:811)
	at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:803)
	at io.netty.handler.ssl.SslHandler.flush(SslHandler.java:2074)
	at io.netty.handler.ssl.SslHandler.closeOutbound0(SslHandler.java:678)
	at io.netty.handler.ssl.SslHandler.access$500(SslHandler.java:171)
	at io.netty.handler.ssl.SslHandler$1.run(SslHandler.java:667)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)

(tested with java 8, 11, 17)

It could also be session resumption that sometimes does, or does not happen.

You're talking about TLS session resumption/abbreviated hanshake ? When a CLIENT_HELLO session ID is not empty, right ?
(not TLS renegociation ? by the way I didn't find any good way to disable renegociation)

Using wireshark, I checked if one of my test is using Session Resumption and I found only 1.
I comment that test and relaunch all tests, then not more Session Resumption in wireshark and still faces both exception.
So it does not seem to be session resumption related.

What had you in mind exactly because even if session resumption is used. If client authentication is required then an handshake completion success (abbreviated or full handshake) should still lead to a principal/certificate associate to remote peer ? or I miss something ?

@chrisvest
Copy link
Contributor

In TLSv1.3 session resumption can also look like a very long pre_shared_key, rather than a session_id.

@sbernard31
Copy link
Author

sbernard31 commented Apr 23, 2024

Currently my tests are about TLSv1.2.

@chrisvest
Copy link
Contributor

Does the test work if you temporarily switch to the InsecureTrustManagerFactory?

@sbernard31
Copy link
Author

sbernard31 commented Apr 23, 2024

Actually, all my tests currently pass but I get those exception in log that I can not understand.
When I launch test 1 by 1 unable to reproduce it.

I can not really use InsecureTrustManagerFactory because some of my test is about checking that server doesn't allow client with bad credentials, so lot of tests will failed if I use it and so I'm not sure we will see anything.

Do you have any idea when an handshake could be successful but there is no principal/certificate for remote peer when using ClientAuth.REQUIRE ?

@chrisvest
Copy link
Contributor

With ClientAuth.REQUIRE the server will definitely ask the client to send a list of certificates, but the list could be empty if the client is cheeky. It's on the trust manager to reject the handshake with such clients by throwing a CertificateException.

I don't know if that scenario comes out as "peer not authenticated" when asking for the principal, if the trust manager did not reject the handshake.

@chrisvest
Copy link
Contributor

When I launch test 1 by 1 unable to reproduce it.

This also sounds like it could be a test isolation failure. Where there's somehow some mutable state shared between tests that shouldn't be.

@sbernard31
Copy link
Author

With ClientAuth.REQUIRE the server will definitely ask the client to send a list of certificates, but the list could be empty if the client is cheeky.
I don't know if that scenario comes out as "peer not authenticated" when asking for the principal, if the trust manager did not reject the handshake.

I'm pretty sure that my client send at least 1 certificate in all my tests.
But I can try to see what happens if I try a client to connect with an empty cert chain.

It's on the trust manager to reject the handshake with such clients by throwing a CertificateException.

Yep lot of my test execute code where trust manager raise a CertificateException and it seems to works well. This generally lead to a SslHanshakeException which is expected (and no hanshake success)

This also sounds like it could be a test isolation failure. Where there's somehow some mutable state shared between tests that shouldn't be.

I understand that you think about that.
I'm pretty sure I did not share state between my test.
I create a new client and a new server for each tests.
I run my test with maven/surefire with :

<parallel>classes</parallel>
<threadCount>1</threadCount>

which limits interaction between tests. (all my test about netty are in 1 class)

@sbernard31
Copy link
Author

sbernard31 commented Apr 24, 2024

I did the test about client sending empty cert chain with a trustmanager which allow any certificate but with ClientAuth.REQUIRE

I get this stacktrace at server side:

io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Empty server certificate chain
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:500)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)
Caused by: javax.net.ssl.SSLHandshakeException: Empty server certificate chain
	at sun.security.ssl.Alert.createSSLException(Alert.java:131)
	at sun.security.ssl.Alert.createSSLException(Alert.java:117)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:318)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:274)
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:265)
	at sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:390)
	at sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:375)
	at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
	at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:981)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:968)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:915)
	at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1651)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1497)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
	... 17 common frames omitted

which seems to be the expected behavior. (Maybe just a little SSLHandshakeException message should be : "Empty client certificate chain" but looking at OpenJDK code and it seems this is a bug on their side) :

        @Override
        public void consume(ConnectionContext context,
                ByteBuffer message) throws IOException {
            // The consuming happens in handshake context only.
            HandshakeContext hc = (HandshakeContext)context;

            // clean up this consumer
            hc.handshakeConsumers.remove(SSLHandshake.CERTIFICATE.id);

            T12CertificateMessage cm = new T12CertificateMessage(hc, message);
            if (hc.sslConfig.isClientMode) {
                if (SSLLogger.isOn && SSLLogger.isOn("ssl,handshake")) {
                    SSLLogger.fine(
                        "Consuming server Certificate handshake message", cm);
                }
 /// ===> same code is used when we consume client or server certificate
                onCertificate((ClientHandshakeContext)context, cm);
            } else {
                if (SSLLogger.isOn && SSLLogger.isOn("ssl,handshake")) {
                    SSLLogger.fine(
                        "Consuming client Certificate handshake message", cm);
                }
 /// ===> same code is used when we consume client or server certificate
                onCertificate((ServerHandshakeContext)context, cm);
            }
        }

        private void onCertificate(ServerHandshakeContext shc,
                T12CertificateMessage certificateMessage )throws IOException {
            List<byte[]> encodedCerts = certificateMessage.encodedCertChain;
            if (encodedCerts == null || encodedCerts.isEmpty()) {
                // For empty Certificate messages, we should not expect
                // a CertificateVerify message to follow
                shc.handshakeConsumers.remove(
                        SSLHandshake.CERTIFICATE_VERIFY.id);
                if (shc.sslConfig.clientAuthType !=
                        ClientAuthType.CLIENT_AUTH_REQUESTED) {
                    // unexpected or require client authentication
// ==> this check seems to be about handling client certificate message so error message seems wrong.
// ==> "unexpected" or "requred" client auth should not lead to same message.
                    throw shc.conContext.fatal(Alert.BAD_CERTIFICATE,
                        "Empty server certificate chain");
                } else {
                    return;
                }
            }

I also double check with wireshark that my client Certificate message is really empty (and my server certificate message not)

With ClientAuth.REQUIRE the server will definitely ask the client to send a list of certificates, but the list could be empty if the client is cheeky. It's on the trust manager to reject the handshake with such clients by throwing a CertificateException.

I don't know if that scenario comes out as "peer not authenticated" when asking for the principal, if the trust manager did not reject the handshake.

So it seems it does not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants