-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MQTT: QoS2 tests and fixes #4406
Conversation
server/mqtt.go
Outdated
@@ -3859,6 +3859,7 @@ func (c *client) mqttProcessPubRec(pi uint16) error { | |||
|
|||
sess.mu.Lock() | |||
if sess.c != c { | |||
sess.mu.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the new test you added would flap on my computer. What fixed it was to move the "jsa.storeMsg" under the lock and I think I understand why: otherwise it would race between the pubRelCb that would update the pending map and when a bit lower you lookup the ackSubject it would use the "wrong" one.
We may need to have another call to discuss better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kozlovic I think I understand the issue you're describing. 2/5 I don't like placing the storeMsg
roundtrip under the session lock as the solution.
There may be a way to fix this by re-arranging the order of execution/error handling - the only reason the storeMsg
call is first in the function is because we must respond, but this can be done explicitly in the earlier error return paths, and then after the Unlock
when in-memory map is already prepared for mqttDeliverPubRelCb
. Unfortunately, we may then be racing with another mqttDeliverMsgCbQoS12
. I don't like going down that path.
A more verbose but 1/5 cleaner solution is to separate the PI <> ACK mappings for PUBRELs altogether. I was trying to avoid it, but perhaps wrongly. Will try this now.
Since I have not seen the new tests flapping (yet?), can you please add the exact go/CPU/OS environment and the go test command? Then again, since we've ran into this, I should probably add an explicit wrapper function and a test that would forcefully wait for the PUBREL delivery attempt before returning from "storeMsg".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(as for a call - please see Slack)
Adds QoS2 related tests and 2 uncovered bug fixes to #4349.
Still WIP/to come:
@kozlovic I'd really appreciate suggestions on
[MQTT-3.10.4-3]
, see comments inTestMQTTQoS2InflightMsgsDeliveredAfterUnsubscribe