Client never recovers/disconnects lost connection from server restart #132

hjames9 · 2019-10-27T22:54:21Z

If a client is continuously sending messages on a properly established DTLS connection and the server dies and restarts, the client never detects that the server is down and never attempts to reestablish the connection.

Here's an example of a client:

package main

import (
    "log"
    "net"
    "time"

    "github.com/pion/dtls"
)

func main() {
    addr := &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 6553}
    certificate, privateKey, err := dtls.GenerateSelfSigned()
    if err != nil {
        log.Fatal(err)
    }

    config := &dtls.Config{
        Certificate:          certificate,
        PrivateKey:           privateKey,
        ExtendedMasterSecret: dtls.RequireExtendedMasterSecret,
        ConnectTimeout:       dtls.ConnectTimeoutOption(30 * time.Second),
        InsecureSkipVerify:   true,
    }   

    for {
        conn, err := dtls.Dial("udp", addr, config)
        if err != nil {
            log.Fatal(err)
        }
        log.Println("Established connection...")

        for {
            message := []byte("Sup youngin...")
            amt, err := conn.Write(message)
            if err != nil {
                log.Println(err)
                break
            }
            log.Printf("Successfully wrote %d bytes", amt)

            time.Sleep(3 * time.Second)
        }

        // Close explicitly: a defer inside this loop would not run until main returns.
        conn.Close()
    }
}

Here's an example of a server receiving the message:

package main

import (
    "log"
    "net"
    "time"

    "github.com/pion/dtls"
)

func main() {
    addr := &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 6553}
    certificate, privateKey, err := dtls.GenerateSelfSigned()
    if err != nil {
        log.Fatal(err)
    }

    config := &dtls.Config{
        Certificate:          certificate,
        PrivateKey:           privateKey,
        ExtendedMasterSecret: dtls.RequireExtendedMasterSecret,
        ConnectTimeout:       dtls.ConnectTimeoutOption(30 * time.Second),
        InsecureSkipVerify:   true,
        ClientAuth:           dtls.RequireAnyClientCert,
    }   

    listener, err := dtls.Listen("udp", addr, config)
    if err != nil {
        log.Fatal(err)
    }
    defer func() {
        listener.Close(5 * time.Second)
    }()

    for {
        conn, err := listener.Accept()
        if err != nil {
            // Log and keep accepting; log.Fatal would exit before the continue.
            log.Println(err)
            continue
        }

        go communicateData(conn)
    }
}

func communicateData(conn net.Conn) {
    // Defer once, outside the loop, so the connection is closed exactly once.
    defer conn.Close()

    buffer := make([]byte, 1024)
    for {
        amt, err := conn.Read(buffer)
        if err != nil {
            break
        }
        log.Printf("%s\n", string(buffer[:amt]))
    }
}

I would expect some way for the client to realize that it has to resend a message because the previous DTLS session is broken, but currently reading and writing on the session still succeed.

Your environment.

Version: Latest GIT tag


daenney · 2019-10-28T09:46:11Z

UDP doesn't really have connections, so we'd basically need a client-side mechanism configured with a timeout, after which we consider the server to be gone and attempt to re-establish a DTLS session. I'm not sure this needs to be part of the DTLS library. This is a reality of using UDP that you need to handle yourself, as what we consider to be a broken "connection" in UDP is highly application-specific.

You need to do something like this in your client:

deadline := time.Now().Add(*timeout)
conn.SetReadDeadline(deadline)

// conn is a net.Conn here, so read with Read rather than ReadFrom.
nRead, err := conn.Read(buffer)
if err != nil {
    ...
}

This only works if your client expects to always get a reply though, which is not necessarily the case with UDP.

daenney · 2019-10-28T09:55:12Z

I would imagine that if the server restarts you'd be getting errors though, as the TLS session is no longer valid. Though I suppose you'll only ever see that if you call Read() on the conn, because that's the point where we try to decrypt. If the server just dies, as in goes away but never comes back, you'd have to use a timeout on the conn or have some application-level ping-pong mechanism that lets your client detect the server is gone.

daenney · 2019-10-28T11:00:06Z

Turns out I'm wrong on being able to detect a restart. DTLS implementations silently ignore invalid records, so there's no way to detect that.

Unlike TLS, DTLS is resilient in the face of invalid records (e.g.,
invalid formatting, length, MAC, etc.). In general, invalid records
SHOULD be silently discarded, thus preserving the association;
however, an error MAY be logged for diagnostic purposes.
Implementations which choose to generate an alert instead, MUST
generate fatal level alerts to avoid attacks where the attacker
repeatedly probes the implementation to see how it responds to
various types of error. Note that if DTLS is run over UDP, then any
implementation which does this will be extremely susceptible to
denial-of-service (DoS) attacks because UDP forgery is so easy.
Thus, this practice is NOT RECOMMENDED for such transports.

Basically, DTLS only introduced reliability for the purpose of being able to complete a (modified) TLS handshake over a datagram transport. After that, it's back to being as reliable as the transport itself, so in the case of UDP not at all.

You'll need an application-level mechanism, i.e., a request-response cycle that you expect to complete within a certain window of time, to detect an issue like a server having gone away or having restarted and lost the session.

hjames9 · 2019-10-28T11:50:26Z

Got it, makes sense. Thanks for the analysis. Agreed that this should be handled more on the application side, given the nature of DTLS/UDP. However, do you think the library should expose more protocol details to aid in that? For instance, should the sequence numbers on the DTLS packets be exposed so that the application can detect gaps, or should the library provide callbacks for its own gap detection? It seems like the sequence number was designed to aid in these scenarios.

hjames9 · 2019-10-28T12:04:00Z

@daenney Also, do you think alerting (generating and receiving) should be exposed at all?

Sean-Der · 2019-11-10T09:28:07Z

Hey @hjames9

We can expose statistics if that is helpful! I think that would be a great contribution, though I am not sure yet exactly what that would look like. Copying what crypto/tls or OpenSSL do would be a great start.

If not we can start with basics like bytes transmitted/received.

One thing you could do right now is set a custom logger and look at all the log messages we emit. You can pass your own Logger instance to the constructor and then process every message we generate (and react to them).

hjames9 · 2019-11-10T16:42:26Z

@daenney Understood your explanation on alerting, i.e. it should only be diagnostic and should be fatal for the session if used, but in general it is not a best practice.

@Sean-Der Keeping live statistics might be helpful in this situation. I'll educate myself on how crypto/tls handles it and see if anything useful can be replicated.

Thanks!

hjames9 · 2020-05-11T04:44:48Z

@Sean-Der Can you point me to where an API like this exists in crypto/tls? I can try and replicate something similar here.

tigersean · 2021-03-04T01:44:55Z

ref: https://www.cs.ru.nl/bachelors-theses/2018/Niels_Drueten___4496604___Security_analysis_of_DTLS_1_2_implementations.pdf
page 14:
"Closing a DTLS connection the implementation should send a close notify alert. This way, the peer knows the connection is ending"

Sean-Der · 2021-03-05T03:25:36Z

Hey @tigersean

We close when we get a Close Alert

We also send a Close Alert when the user closes the connection.

Sean-Der · 2024-05-21T16:54:36Z

I don't believe we have anything actionable here. I am going to resolve, but feel free to re-open if you believe more should be done.

Users can detect timeouts, and it is up to them how long until the application has timed out.

igolaizola mentioned this issue Jun 16, 2020

Hybrid DTLS client: server goes down #266

Closed

kegsay mentioned this issue Mar 18, 2021

Add AcceptFilterWithResponse to ListenConfig pion/udp#36

Open

Sean-Der closed this as not planned May 21, 2024
