the mqtt client has been hanging out,by dead lock #473
Comments
I assume the code you have published connects successfully? If the initial connection fails, I believe you may have issues (any attempt to use the client before the connection is up will cause an error; it is probably better to use the built-in ConnectRetry functionality). Other than that, I can't see any issue with the code provided. However, I'm guessing you are publishing/subscribing at some point? Please show that code (working code would be ideal) and more of the logs (the activity before "pingresp not received"; that is where the problem shows up, but the cause is likely to be earlier in the log). This issue is usually due to a handler function that blocks (does not return) within the user's code (I'm not saying this is the case here; currently there is insufficient information to allow much analysis).
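One common way to keep a handler from blocking is to have it only enqueue the message and return, leaving the slow work to a separate goroutine. The following is a minimal sketch of that pattern using only the Go standard library; the `message` and `worker` types and the `handle` method are hypothetical names for illustration and are not part of the MQTT client's API.

```go
package main

import (
	"fmt"
	"sync"
)

// message is a hypothetical stand-in for an incoming MQTT message.
type message struct {
	topic   string
	payload string
}

// worker processes queued messages in its own goroutine, so slow processing
// never stalls the caller of handle. A single worker keeps messages in order.
type worker struct {
	queue chan message
	wg    sync.WaitGroup
	mu    sync.Mutex
	seen  []string
}

func newWorker(size int) *worker {
	w := &worker{queue: make(chan message, size)}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		for m := range w.queue {
			w.mu.Lock()
			w.seen = append(w.seen, m.payload)
			w.mu.Unlock()
		}
	}()
	return w
}

// handle is what a message handler would call: it only enqueues and returns.
// It reports false when the queue is full instead of blocking.
func (w *worker) handle(m message) bool {
	select {
	case w.queue <- m:
		return true
	default:
		return false
	}
}

// close stops the worker once the queue is drained and returns the payloads
// it processed, in processing order.
func (w *worker) close() []string {
	close(w.queue)
	w.wg.Wait()
	return w.seen
}

func main() {
	w := newWorker(8)
	for _, p := range []string{"a", "b", "c"} {
		w.handle(message{topic: "demo", payload: p})
	}
	fmt.Println(w.close()) // prints [a b c]
}
```

Dropping a message when the queue is full is only one possible policy; whether to drop, block with a timeout, or grow the queue depends on the application.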
I have a similar issue. I haven't had time to debug it further, but I don't seem to receive any data from subscriptions, and I get this debug log from the MQTT client
Then the message that I publish fails after a while with a
Disconnecting the MQTT client also hangs. I get the same behavior when I terminate my application with Ctrl-C, where in the signal handler I call a
When I switch back to 1.2.0, it all just works again.
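@bartmeuris - please raise this as a separate issue (without additional information it's not possible to ascertain whether these have the same cause, so it's simpler to assume they differ). Given that your code was working with 1.2.0, it's most likely that your issue is due to the change mentioned in the release notes: "Note that this commit changes internal message channels from buffered to unbuffered and may impact users who publish from within a message handler (the documentation has been updated to highlight the issue; running potentially blocking operations within a message handler has always been problematic)."
The section of the logs you included (from the Ctrl-C press) gives some indication that this may be the case (there is no "matchAndDispatch exiting"). Unfortunately you have not included enough of the log to draw any further conclusions ("pingresp not received, disconnecting" is a symptom; the cause will most likely be earlier in the logs).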
Firstly, I want to say thank you to @bartmeuris and @MattBrittan. I've read your dialogue in #474 and learned a lot about blocking on the client's channels. I think the channel buffering may also be what makes my client hang. Here is my subscription code:
z.MqttClient.SubMsgWithTopicAndQoS(func(client mqtt.Client, message mqtt.Message) {
This client is just a subscriber that subscribes to a topic named by channel, and my handler function runs in a goroutine. I looked for the reason by reviewing the MQTT client code and found that it is blocked here:
Secondly, my scenario needs messages to be processed in order, so I suppose SetOrderMatters needs to keep its default value of true. I'd appreciate your suggestions, @MattBrittan.
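Can you please show your wrapper function (note that it's very difficult to diagnose these issues without a full example and/or full logs). As you appear to be calling the handler in a goroutine, the comments regarding deadlocks should not apply. Note that the main impact of calling SetOrderMatters(true) (the default) is to avoid the use of goroutines, thus ensuring that messages are processed in order (the ACK is only sent when the handler exits); so, as it stands, your code may end up processing multiple messages simultaneously. I would also note that none of this actually guarantees in-order delivery; that will come down to your broker configuration (I generally accept the messages in any order and reassemble them before processing, because this is significantly faster over dubious network links).
"I've added some debug logging locally, and only this c.workers.Done() is not run, while the other two c.workers.Done() calls run normally. So I was wondering how this happens and how to reproduce it on a local network."
So this just means that matchAndDispatch is not exiting (which I had assumed was the case based on the snippet of logs you provided). There was a separate issue which has now been fixed (released v1.3.1 yesterday); I think your problem may be different, but it is definitely worth trying that fix (I was unable to duplicate the issue myself but could see how it could potentially happen). In order to look into this further I'm really going to need full logs (a minimal example would be even better). It definitely looks like there is a deadlock somewhere, but I don't currently have enough information to investigate. Note that I'm heading into the mountains tomorrow for a couple of weeks and will have no internet access.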
@MattBrittan Hi Matt, thank you for your answer. I will try the new client version; here are the full logs of the deadlock. As I said, my programs using MQTT now run in different countries (the subscribing client is in country A, and the publishing client and broker are in another), and the blocking appears only occasionally. When running on my local laptop, the connection and message passing work normally, so I can only provide the logs and can't give you a demo of this.
Thanks - unfortunately I'm not going to have time to look at this in detail before heading away. It looks like your issue is here:
2020/12/28 18:55:02 [DEBUG] [net] logic waiting for msg on ibound
2020/12/28 18:55:02 [DEBUG] [net] startIncoming Received Message
2020/12/28 18:55:02 [DEBUG] [net] startIncomingComms: got msg on ibound
2020/12/28 18:55:02 [DEBUG] [net] startIncomingComms: received publish, msgId: 0
2020/12/28 18:55:02 [DEBUG] [client] enter Publish
2020/12/28 18:55:02 [DEBUG] [client] sending publish message, topic: stf-692AFE13-0877-5F58-8B5B-949EEACFAE61/5c922e1d
2020/12/28 18:55:02 [DEBUG] [client] #####Publish c.obound: #######
The fact that
I am guessing that you are triggering a Publish based on an incoming message. This should be fine if the handler is in a goroutine but will cause a deadlock if not (your log is what I would expect to see in that case).
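Why publishing from inside a synchronous handler can hang, and why moving the publish into a goroutine avoids it, can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions: a single loop goroutine both delivers inbound messages and drains an unbuffered outbound channel. The real client's internals are more involved, and the `client`, `publish`, `loop` and `replyDelivered` names here are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// client is a toy model, not the real MQTT client.
type client struct {
	inbound  chan string
	outbound chan string // unbuffered: a publish needs the loop to be receiving
	sent     chan string // records what the loop managed to send
}

func (c *client) publish(msg string) { c.outbound <- msg }

// loop delivers inbound messages to handler and drains outbound publishes.
// While handler is running, nothing reads from outbound.
func (c *client) loop(handler func(string)) {
	for {
		select {
		case m := <-c.inbound:
			handler(m)
		case m := <-c.outbound:
			c.sent <- m
		}
	}
}

// replyDelivered reports whether a reply published from the handler gets
// through within 200ms. With async=false the handler publishes directly.
func replyDelivered(async bool) bool {
	c := &client{
		inbound:  make(chan string),
		outbound: make(chan string),
		sent:     make(chan string, 1),
	}
	handler := func(m string) {
		if async {
			go c.publish("re: " + m) // handler returns at once
			return
		}
		c.publish("re: " + m) // blocks: the loop is busy running this handler
	}
	go c.loop(handler)
	c.inbound <- "hello"

	select {
	case <-c.sent:
		return true
	case <-time.After(200 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println(replyDelivered(false)) // prints false
	fmt.Println(replyDelivered(true))  // prints true
}
```

In the synchronous case the loop goroutine stays blocked forever, which in a real client would also stop later messages from being processed.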
Sorry for not reviewing this again sooner. I believe that my previous analysis stands (a handler in your code is probably blocking). I have updated the documentation (in @master) to clarify this. Please try calling
Closing this issue due to inactivity (it appears likely that the issue is in the handlers but there is insufficient info to confirm this). No idea what the comment from @Suparerk8866 is...
Hi guys, I am using this client SDK, paho version 1.3.0; the broker is mosquitto 2.0.0.
This is my client code:
mosquittoClient := &MosquittoClient{
After a while, the MQTT connection is lost, without any logs from the function set via SetConnectionLostHandler. I checked the debug logs.
The logs are:
2020/12/20 21:08:26 [CRIT] [pinger] pingresp not received, disconnecting
2020/12/20 21:08:26 [DEBUG] [client] internalConnLost called
2020/12/20 21:08:26 [DEBUG] [client] stopCommsWorkers called
2020/12/20 21:08:26 [DEBUG] [client] startCommsWorkers output redirector finnished
2020/12/20 21:08:26 [DEBUG] [net] outgoing waiting for an outbound message
2020/12/20 21:08:26 [DEBUG] [net] outgoing waiting for an outbound message
2020/12/20 21:08:26 [DEBUG] [client] internalConnLost waiting on workers
2020/12/20 21:08:26 [DEBUG] [client] stopCommsWorkers waiting for workers
Could anybody help me understand how the deadlock arises and how to fix it? @MattBrittan, following your advice in #453, I have raised a new issue and would appreciate your help.