Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client never recovers/disconnects lost connection from server restart #132

Closed
hjames9 opened this issue Oct 27, 2019 · 11 comments
Closed

Client never recovers/disconnects lost connection from server restart #132

hjames9 opened this issue Oct 27, 2019 · 11 comments

Comments

@hjames9
Copy link
Contributor

hjames9 commented Oct 27, 2019

If a client is continuously sending messages on a properly established DTLS connection, if the server dies and restarts, the client never detects the server is down or attempt to reestablish the connection.

Here's an example of a client:

package main

import (
    "log"
    "net"
    "time"

    "github.com/pion/dtls"
)

func main() {
    addr := &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 6553}
    certificate, privateKey, err := dtls.GenerateSelfSigned()
    if err != nil {
        log.Fatal(err)
        return
    }   

    config := &dtls.Config{
        Certificate:          certificate,
        PrivateKey:           privateKey,
        ExtendedMasterSecret: dtls.RequireExtendedMasterSecret,
        ConnectTimeout:       dtls.ConnectTimeoutOption(30 * time.Second),
        InsecureSkipVerify:   true,
    }   

    for {
        conn, err := dtls.Dial("udp", addr, config)
        if err != nil {
            log.Fatal(err)
            return
        }
        defer conn.Close()
        log.Println("Established connection...")

        for {
            message := []byte("Sup youngin...")
            amt, err := conn.Write(message)
            if err != nil {
                log.Println(err)
                break
            }
            log.Printf("Successfully wrote %d bytes", amt)

            time.Sleep(3 * time.Second)
        }
    }   
}

Here's an example of a server receiving the message:

package main

import (
    "log"
    "net"
    "time"

    "github.com/pion/dtls"
)

func main() {
    addr := &net.UDPAddr{IP: net.ParseIP("127.0.0.1"), Port: 6553}
    certificate, privateKey, err := dtls.GenerateSelfSigned()
    if err != nil {
        log.Fatal(err)
        return
    }   

    config := &dtls.Config{
        Certificate:          certificate,
        PrivateKey:           privateKey,
        ExtendedMasterSecret: dtls.RequireExtendedMasterSecret,
        ConnectTimeout:       dtls.ConnectTimeoutOption(30 * time.Second),
        InsecureSkipVerify:   true,
        ClientAuth:           dtls.RequireAnyClientCert,
    }   

    listener, err := dtls.Listen("udp", addr, config)
    if err != nil {
        log.Fatal(err)
        return
    }   
    defer func() {
        listener.Close(5 * time.Second)
    }() 

    for {
        conn, err := listener.Accept()
        if err != nil {
            log.Fatal(err)
            continue
        }

        go communicateData(conn)
    }   
}

func communicateData(conn net.Conn) {
    buffer := make([]byte, 1024)
    for {
        defer conn.Close()
        amt, err := conn.Read(buffer)
        if err != nil {
            break
        }
        log.Printf("%s\n", string(buffer[:amt]))
    }
}

Would expect some way for the client to realize that it has to resend a message because the previous DTLS session is broken, but currently reading and writing on the session is still successful.

Your environment.

  • Version: Latest GIT tag
@daenney
Copy link
Member

daenney commented Oct 28, 2019

UDP doesn't really have connections, so we'd basically have to have a mechanism client side that's configured with a value for a timeout, after which we consider the server to be gone and attempt to re-establish a DTLS session. I'm not sure this needs to be part of the DTLS library, this is a reality of using UDP that you need to handle yourself, as what we consider to be a broken "connection" in UDP is highly application specific.

You need to do something like this in your client:

deadline := time.Now().Add(*timeout)
conn.SetReadDeadline(deadline)

nRead, addr, err := conn.ReadFrom(buffer)
if err != nil {
    ...
}

This only works if your client expects to always get a reply though, which is not necessarily the case with UDP.

@daenney
Copy link
Member

daenney commented Oct 28, 2019

I would imagine that if the server restarts you'd be getting errors though, as the TLS session is no longer valid. Though I suppose you'll only ever see that if you call Read() on the conn b/c that's the point where we try to decrypt. If the server just dies, as in goes away but never comes back, you'd have to use to a timeout on the conn or have some application level specific ping-pong mechanism that lets your client detect the server is gone.

@daenney
Copy link
Member

daenney commented Oct 28, 2019

Turns out I'm wrong on being able to detect a restart. DTLS implementations silently ignore invalid records, so there's no way to detect that.

Unlike TLS, DTLS is resilient in the face of invalid records (e.g.,
invalid formatting, length, MAC, etc.). In general, invalid records
SHOULD be silently discarded, thus preserving the association;
however, an error MAY be logged for diagnostic purposes.
Implementations which choose to generate an alert instead, MUST
generate fatal level alerts to avoid attacks where the attacker
repeatedly probes the implementation to see how it responds to
various types of error. Note that if DTLS is run over UDP, then any
implementation which does this will be extremely susceptible to
denial-of-service (DoS) attacks because UDP forgery is so easy.
Thus, this practice is NOT RECOMMENDED for such transports.

Basically, DTLS only introduced reliability for the purpose of being able to complete a (modified) TLS handshake over a datagram transport. After that, it's back to being as reliable as the transport itself, so in the case of UDP not at all.

You'll need an application level mechanism, i.e a request-response cycle that you expect to complete within a certain window of time, to detect an issue like a server having gone away or having restarted and lost the session.

@hjames9
Copy link
Contributor Author

hjames9 commented Oct 28, 2019

Got it, makes sense. Thanks for analysis. Agreed that it seems that this should be more handled by the application side as per the nature of DTLS/UDP. However, do you think the library should expose more protocol details to aid in that? For instance should the sequence numbers on the DTLS packets be exposed in order for the application to detect gaps or maybe provide callbacks for a library gap detection? It seems like the sequence number was designed to aid in these scenarios.

@hjames9
Copy link
Contributor Author

hjames9 commented Oct 28, 2019

@daenney Also, do you think alerting (generating and receiving), should be exposed at all?

@Sean-Der
Copy link
Member

Sean-Der commented Nov 10, 2019

Hey @hjames9

We can expose statistics if that is helpful! I think that would be a great contribution, I am not sure what that would look exactly like yet though. If we can copy things that crypto/tls or OpenSSL do would be a great start.

If not we can start with basics like bytes transmitted/received.

One thing you could do right now is set a custom logger and look at all the log messages we emit. You can pass in the constructor your own Logger instance and then process every message we generate (and react to them)

@hjames9
Copy link
Contributor Author

hjames9 commented Nov 10, 2019

@daenney Understood your explanation on alerting i.e. should only be diagnostic and should be fatal for the session if used, but in general is not a best practice.

@Sean-Der Keeping live statistics might be helpful in this situation. I'll educate myself on how crypto/tls handles it and see if anything useful can be replicated.

Thanks!

@hjames9
Copy link
Contributor Author

hjames9 commented May 11, 2020

@Sean-Der Can you point me to where an API like this exists in crypto/tls? I can try and replicate something similar here.

@tigersean
Copy link

ref: https://www.cs.ru.nl/bachelors-theses/2018/Niels_Drueten___4496604___Security_analysis_of_DTLS_1_2_implementations.pdf
page 14:
"Closing a DTLS connection the implementation should send a close notify alert. This way, the peer knows the connection is ending"

@Sean-Der
Copy link
Member

Sean-Der commented Mar 5, 2021

Hey @tigersean

We close when we get a Close Alert

We also send a Close Alert when the user closes the connection.

@Sean-Der
Copy link
Member

I don't believe we have anything actionable here. I am going to resolve, but feel free to re-open if you believe more should be done.

Users can detect timeouts, and it is up to them how long until the application has timed out.

@Sean-Der Sean-Der closed this as not planned Won't fix, can't repro, duplicate, stale May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants