Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add handshake callback/monitor #463

Closed
wants to merge 1 commit into from

Conversation

alvarowolfx
Copy link
Contributor

@alvarowolfx alvarowolfx commented May 11, 2022

Description

This issue is a draft and I wanted to discuss a need that we had here in our project. We use the DTLS library in the IoT context, for IoT devices to authenticate to our system. One issue that we started seeing is that the auth/handshake can fail in some cases, and we want to keep track of when that happens. We have 3 major scenarios:

  • Devices on a constrained network and might take too long to finish the handshake ( basically times out )
  • The device informed the wrong set of credentials ( for now just PSK ID and PSK, but later we want to use certificates )
    • Right now we can identify when the PSK ID is wrong, but when the PSK is not, even though I see crypto errors internally in the library, I could not capture that and It still falls under the previous "timeout" category.

With this PR, I added a way to keep track of the Handshake as it progress, tracking the current Flight, Handshake State, and the connection itself ( so we can trace back to which client/device is using)

OnHandshakeState: func(conn *dtls.Conn, f dtls.FlightVal, s dtls.HandshakeState, err error) {
			fmt.Printf("Handshake state %d: %s\n", f, s)
			if err != nil {
				pskID := conn.ConnectionState().IdentityHint
				fmt.Printf("Handshake error for device `%s`: %s\n", pskID, err)
			}
		},

In our use case here, we save those error states as audit logs in our system so users can know when they have authentication errors on their system.

I ended up having to make the types and constants related FlightVal and HandshakeState public to be shared on this function, so this generated many changes to the codebase.

Again, I'm open to suggestions on better ways to track handshake errors and wanted to open this draft preview to see if this makes sense and can be merged or if we can do this differently.

Reference issue

No related issue

@codecov
Copy link

codecov bot commented May 11, 2022

Codecov Report

Merging #463 (05750e8) into master (2a9c68d) will decrease coverage by 0.05%.
The diff coverage is 85.24%.

@@            Coverage Diff             @@
##           master     #463      +/-   ##
==========================================
- Coverage   75.76%   75.70%   -0.06%     
==========================================
  Files          96       96              
  Lines        5586     5611      +25     
==========================================
+ Hits         4232     4248      +16     
- Misses       1022     1028       +6     
- Partials      332      335       +3     
Flag Coverage Δ
go 75.73% <85.24%> (-0.06%) ⬇️
wasm 66.09% <84.42%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
config.go 100.00% <ø> (ø)
flight5bhandler.go 72.72% <0.00%> (ø)
conn.go 79.86% <67.85%> (-0.84%) ⬇️
handshaker.go 75.95% <80.00%> (ø)
flight.go 92.30% <100.00%> (ø)
flight0handler.go 77.41% <100.00%> (ø)
flight1handler.go 87.85% <100.00%> (ø)
flight2handler.go 79.54% <100.00%> (ø)
flight3handler.go 76.29% <100.00%> (ø)
flight4bhandler.go 71.55% <100.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2a9c68d...05750e8. Read the comment docs.

@daenney
Copy link
Member

daenney commented May 11, 2022

Thanks for starting a discussion around this plus the code. I think this is a useful thing to have in many scenarios and we should have a way to make the internals more accessible for debugging purposes like this.

As you noted you had to make a number of things public now which does increase our API surface quite a bit, for things that otherwise really are internal to the library. That's something I'm going to have to mull over for a bit, because once we do this we can't easily go back without needing to release a v3.

I'm hoping @at-wat and @Sean-Der will have a look too over the next few weeks, maybe come up with feedback/suggestions of their own.

@stv0g stv0g changed the title feat: add handshake callback/monitor Add handshake callback/monitor Jan 24, 2023
@stv0g stv0g added the enhancement New feature or request label Jan 24, 2023
@Sean-Der
Copy link
Member

Sean-Der commented Jun 1, 2024

Hi @alvarowolfx

Sorry I didn't respond to this sooner. I have had this same question/problem and this is how I ended up solving it/my conclusion.

  • With pion/webrtc I care about particular events. Usually as a result of a bug/issue happening and we start tracking it.
  • I am able to capture the data by providing my own logger. Most things are logged that we care about.
  • I don't want to expose more APIs because it makes the library harder to use for the majority of users.

I think the same sentiment applies to this library. If a highly technical user (as yourself) needs something I would say pkg/protocol and using a custom logger gets you everything. However providing more APIs tends to make the internals brittle/confusing.

If you disagree I am happy to re-visit this though! Thanks for taking the time to write this up, sorry I have been away from the project so long.

@Sean-Der Sean-Der closed this Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants