Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BE469ZP Locks - S2 Issues? #3842

Closed
3 of 11 tasks
kars85 opened this issue Dec 3, 2021 · 13 comments
Closed
3 of 11 tasks

BE469ZP Locks - S2 Issues? #3842

kars85 opened this issue Dec 3, 2021 · 13 comments
Assignees
Projects

Comments

@kars85
Copy link

kars85 commented Dec 3, 2021

Is your problem within Home Assistant (Core or Z-Wave JS Integration)?

YES, BUT a Home Assistant developer has told me to come here

Is your problem within ZWaveJS2MQTT?

NO, my problem is NOT within ZWaveJS2MQTT

Checklist

Describe the bug

What causes the bug?
Sending an lock/unlock command to any of my 4 Schlage BE469ZP locks
What do you observe?
Unique, yet repeatable (per lock) errors that seem to revolve around S2 security
What did you expect to happen?
Appears that some commands cause the driver to crash, other times commands execute with a variety of errors

2021-12-03T20:34:15.060Z DRIVER « [Node 010] [REQ] [BridgeApplicationCommand] │ RSSI: -84 dBm └─[Security2CCMessageEncapsulation] [INVALID] error: Duplicate command

2021-12-03T20:33:38.308Z CNTRLR « [Node 015] received S2 nonce, not sure what to do with it

2021-12-03T20:34:27.029Z DRIVER Unexpected error during SerialAPI command: ZWaveError: Security S2 CC requires the receiver's nonce to be sent! (ZW1401) at Security2CCMessageEncapsulation.serialize (/usr/src/app/node_modules/zw ave-js/src/lib/commandclass/Security2CC.ts:913:10) at SendDataBridgeRequest.serialize (/usr/src/app/node_modules/zwave-js/src /lib/controller/SendDataBridgeMessages.ts:125:37) at getSerialAPICommandMachineConfig (/usr/src/app/node_modules/zwave-js/sr c/lib/driver/SerialAPICommandMachine.ts:143:18) at createSerialAPICommandMachine (/usr/src/app/node_modules/zwave-js/src/l ib/driver/SerialAPICommandMachine.ts:385:3) at executeSerialAPICommand (/usr/src/app/node_modules/zwave-js/src/lib/dri ver/CommandQueueMachine.ts:276:43)

Device information

Manufacturer: Schlage
Model name: BE469ZP
Node ID in your network: 10,12,15,32 (silly logging enabled, with these 4 nodes specified

How are you using node-zwave-js?

  • zwavejs2mqtt Docker image (latest)
  • zwavejs2mqtt Docker image (dev)
  • zwavejs2mqtt Docker manually built (please specify branches)
  • ioBroker.zwave2 adapter (please specify version)
  • HomeAssistant zwave_js integration (please specify version)
  • pkg
  • node-red-contrib-zwave-js (please specify version, double click node to find out)
  • Manually built from GitHub (please specify branch)
  • Other (please describe)

Which branches or versions?

zwavejs2mqtt: 6.0.2
zwave-js: 8.8.2

Did you change anything?

no

If yes, what did you change?

No response

Did this work before?

Don't know, this is a new device

If yes, where did it work?

No response

Attach Driver Logfile

zwavejs_2021-12-03.log

Here is a clean log capturing driver startup, followed by a one-by-one lock/unlock command 1 time to each individual lock.

@zwave-js-bot zwave-js-bot added this to Needs triage in Triage Dec 3, 2021
@kars85
Copy link
Author

kars85 commented Dec 3, 2021

I realized I was not on the latest 8.8.3 build, so here is a new log with an updated container with the latest image pulled. Looks much cleaner. The only node (32) in question is still throwing some errors, but is also on the fringe of my network. Is there any indication in the log that would support a signal strength issue here, where a repeater is necessary?

zwavejs_2021-12-03.log

This node should be going through mains powered node 21, which is only 25 feet away. Instead it's routing through node 25 which is close to 100ft. I have tried individually heals, as well as network wide heals, but no change in routing.

Snag_7abe0bb3

Snag_7abdabc1

@AlCalzone
Copy link
Member

The 8.8.2 log is actually pretty valuable (I hope) in understanding what's wrong. I've asked in the other issues related to this multiple times, but all I got was prose 👍🏻

Give me a few minutes to look at your logs.

@AlCalzone
Copy link
Member

The 8.8.3 log looks like there connection to node 32 is not the best. Unfortunately I also had to revert the change that parses transmission statistics, but this is the gist:

  • 2021-12-03T20:59:54.556Z: Node 32 sends NotificationCCReport, requests a confirmation
  • 2021-12-03T20:59:54.559Z: Driver sends confirmation
  • In the meantime, Node 32 re-sends the notification roughly every 70ms, although this is more likely a re-transmission from the network/routing
  • 2021-12-03T20:59:58.761Z: The confirmation has reached Node 32 and the confirmation of that has reached the driver.

The 8.8.2 log does explain what's failing. Essentially it is caused by the suboptimal mesh and the re-transmissions, but the driver needs to handle that better:

  • 2021-12-03T20:34:10.416Z: Unlock command to node 10
  • 2021-12-03T20:34:10.428Z: Node 10 reports the new state, requesting confirmation. The transmission report is not received yet though, so the driver may not respond yet
  • 2021-12-03T20:34:15.059Z: The same request is received a few more times
  • 2021-12-03T20:34:16.029Z: Node 10 sends a NotificationCCReport, again requesting confirmation. The transmission report is still not received yet, so the driver may still not respond yet

Because the NotificationCCReport was received, the shared encryption state between the driver and the node has advanced. The first message from node 10 is still being retransmitted, only now it cannot be decoded. As a result the driver thinks the node and driver are out of sync and resets the encryption state.

  • 2021-12-03T20:34:27.015Z: The transmission report is finally received after 16 seconds and 17 attempts.
  • 2021-12-03T20:34:27.026Z: The driver tries to send the confirmation for the first received message from 2021-12-03T20:34:10.428Z. At the time that message was created, the encryption state was still okay, but now it is missing, so serializing it fails.

@AlCalzone AlCalzone self-assigned this Dec 3, 2021
@kars85
Copy link
Author

kars85 commented Dec 3, 2021

That's very informative. Thanks for the response. I initiated heals like I mentioned, but had to walk away. The new log has those appended to it, starting at 2021-12-03T21:05:28.77
zwavejs_2021-12-03.log
2.

A large amount are failing to heal, but to avoid scope creep I'll disregard for the time being.
image

@AlCalzone
Copy link
Member

This node should be going through mains powered node 21, which is only 25 feet away. Instead it's routing through node 25 which is close to 100ft. I have tried individually heals, as well as network wide heals, but no change in routing.

Don't trust that map too much. It shows neighbors, not the actual routes. Also, judging from your previous log, you might be running into the same issues, others seem to have been having with the 700-series sticks. They don't always update the neighbors correctly, healing fails and they tend to take a long time sending in busy networks:
#3810
#3477
https://forums.homeseer.com/forum/homeseer-products-services/homeseer-z-wave-products/smartstick/1483440-does-anyone-have-a-solid-working-g3-system-at-this-time?p=1510687#post1510687

@kars85
Copy link
Author

kars85 commented Dec 3, 2021

In reading some of those issues/threads, is your conclusion that the underlying issue is something manufacturer's need to address in a firmware release, or is this something that the zwave-js platform will ultimately have to work around (if that's even possible?). It seems like 700 series have been around long enough and in a mature enough state that an issue like this should've reared it's head awhile ago, or is this routing/healing issue just being uncovered because of wider 700-series adoption of devices by the general public and is more "visible" to end users?

I guess, in short, what do you think is the short/long term approach for folks running 700 series devices?

@AlCalzone
Copy link
Member

That's something Silabs need to address in a firmware update, since all 700 sticks use the same firmware. I don't think this is an end device issue.
From the driver side, we cannot work around it. The stick controls the actual sending/routing, we only tell the stick what to send. We could abort the sending if it takes too long (like HomeSeer does), but that doesn't help if you want to send the messages.

what do you think is the short/long term approach for folks running 700 series devices sticks?

Short term: If you don't have too many devices, are affected by this and have a spare 500 series controller, rebuild the network with that.
Long term: wait for Silabs to update the firmware.

I'm working on an NVM migration tool that will allow you to more easily switch between 500 and 700 series sticks without losing the network information. But finalizing that will take another couple of days.

@kars85
Copy link
Author

kars85 commented Dec 4, 2021

Got it. Has to be frustrating knowing you're hamstrung with a buggy platform and can't work around it. Fingers crossed the other Z-wave players have reached out to Silabs, too, regarding this and have their attention. As I understand, 700 series platforms/networks are in between a rock and a hard place.

Have you heard any buzz around this/these issue(s) gaining attention at the firmware level?

I'll be doing some decision making on if I drop down to the Gen5 stick I have and rebuilding, or sit tight on what you come up with for an NVM backup option.

Thanks for all your work on this - I'll stop clogging up the Issues/Slack channel with noise until the dust settles a bit!

@kars85
Copy link
Author

kars85 commented Dec 5, 2021

I upgraded the FW of my Zooz stick to the latest 7.16 here. My network is no longer throwing errors that I can see when tailing, but I've attached the latest 16 hours regardless in case there is something I am missing.

Heals are 100% successful, and my network has been responsive through a nighttime and morning routine. Manual operations are responsive to to the edge device (node 32) referenced in this issue that was having timeout/S2 problems.
zwavejs_2021-12-04.log
zwavejs_2021-12-05.log

@AlCalzone
Copy link
Member

I don't see much in the log that's not related to healing, but good to hear. I do know of others who still have issues, even with 7.16.3.

@OmenWild
Copy link

OmenWild commented Dec 7, 2021

Hi @AlCalzone, not to be a pest, but I was wondering if there is somewhere I can watch for the NVM migration tool you mentioned? I have a 45 node network (with 15 more to add) on a Zooz 700 series controller that is ... problematic. I would love to roll to a 500 series controller I have on hand (a Z-Wave.Me), and your migration tool sounds easier than excluding and re-adding all my devices.

@AlCalzone
Copy link
Member

There is an open PR but not sure if that helps you.

@AlCalzone
Copy link
Member

Duplicate of #3906

@AlCalzone AlCalzone marked this as a duplicate of #3906 Dec 15, 2021
Triage automation moved this from Needs triage to Closed Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Triage
Closed
Development

No branches or pull requests

3 participants