dth930 Posted January 15, 2019 For quite some time, my ISY was working great, so I haven't looked at it in several months. Recently, I had a few 2476Ds fail, so I replaced them with 2477Ds. That seemed to go fine... the new dimmers are working. The weird thing is that when I load the admin console, the ISY fails to communicate with several dimmers, and I'm not talking about the ones I replaced (those seem to work fine). They come up with a red ! in the device list. However, if I query them, I can turn them on and off. When I do that, the red ! changes to the green 1011. I've tried making updates to those individual devices and I've also restored all devices, but neither of those actions seems to help. I still get lots of communication failures when loading the admin console. Everything appears to be working fine, but I'm concerned because I've never seen this before. I'm using a 994 with firmware 4.7.3. Any advice on how to troubleshoot? - Dave
kclenden Posted January 15, 2019 Try opening the Event Viewer (Tools->Diagnostics->Event Viewer). Set it to Level 3. Then select one of your dimmers and perform a "Query". In the Event Viewer you should see a [Std-Direct Ack] record that displays "Max Hops" and "Hops Left". Are those two values the same? If not, what are they? Do the "Query" several times for several different devices. If "Hops Left" is regularly less than "Max Hops", it means that there are consistent communication issues.
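The check described above (query several times, compare "Hops Left" with "Max Hops") can be tallied with a small script. This is only a sketch: the readings are typed in by hand from the Event Viewer, `summarize_hops` is a made-up helper rather than anything provided by the ISY, and the 50% cutoff for "regularly" is an arbitrary assumption.

```python
from collections import Counter


def summarize_hops(samples, max_ok_fraction=0.5):
    """Summarize (max_hops, hops_left) pairs copied by hand from the Event Viewer.

    max_ok_fraction is an assumed cutoff for "regularly": if more than this
    fraction of queries arrive with Hops Left below Max Hops, flag an issue.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Count the queries where the ACK did not arrive on the first transmission.
    degraded = sum(1 for max_hops, hops_left in samples if hops_left < max_hops)
    histogram = Counter(hops_left for _, hops_left in samples)
    fraction = degraded / len(samples)
    return {
        "histogram": dict(histogram),
        "degraded_fraction": fraction,
        "consistent_issue": fraction > max_ok_fraction,
    }


# Hypothetical readings: Max Hops is always 3, Hops Left varies per query.
readings = [(3, 2), (3, 1), (3, 2), (3, 0), (3, 1)]
report = summarize_hops(readings)
print(report)
```

A histogram that clusters at low "Hops Left" values across several devices points at the powerline or the PLM rather than at one dimmer.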
larryllix Posted January 15, 2019 The 1011 means there is binary code still to be written to your device. If the device is battery operated, it must be put into linking mode first. Then you can right-click on it and select "Write Updates". As kclenden posted above, you need good comms first or things get messed up. Edited January 15, 2019 by larryllix
dth930 Author Posted January 15, 2019 I just did some testing and Max Hops is always 3 and Hops Left is always 1 or 2. I've never seen 3. I saw 0 a few times too. These are all wired devices. None are battery powered. Most of the devices are within 10 feet of each other. What are the next steps? Replace the PLM? - Dave
Techman Posted January 15, 2019 57 minutes ago, dth930 said: I just did some testing and Max Hops is always 3 and Hops Left is always 1 or 2. I've never seen 3. I saw 0 a few times too. These are all wired devices. None are battery powered. Most of the devices are within 10 feet of each other. What are the next steps? Replace the PLM? - Dave If you're seeing 1 or 0 hops then you have a communication issue between the PLM and your device. Take a look at this troubleshooting guide: https://wiki.universal-devices.com/index.php?title=INSTEON:_Troubleshooting_Communications_Errors
dth930 Author Posted January 15, 2019 I'm a little unclear on what "Hops Left" means. What indicates successful communication and what indicates failure? Does Max Hops=3 & Hops Left=0 mean it tried to communicate 3 times and failed? And Max Hops=3 & Hops Left=3 means it communicated on the first try? Is it even possible for both to be the same? And when you say that 1 or 0 hops means communications issues, are you talking about the Hops Left value or the difference between Max Hops and Hops Left? Also - I have a mixture of dual-band and legacy devices. Would that contribute to the problem? Thanks for the clarification. - Dave
Techman Posted January 15, 2019 2 hours ago, dth930 said: I'm a little unclear on what "Hops Left" means. What indicates successful communication and what indicates failure? Does Max Hops=3 & Hops Left=0 mean it tried to communicate 3 times and failed? And Max Hops=3 & Hops Left=3 means it communicated on the first try? Is it even possible for both to be the same? And when you say that 1 or 0 hops means communications issues, are you talking about the Hops Left value or the difference between Max Hops and Hops Left? Also - I have a mixture of dual-band and legacy devices. Would that contribute to the problem? Thanks for the clarification. - Dave Hops is basically the number of attempts the device makes to communicate. The default is "3", a "1" indicates that it took 2 attempts and you have fair or poor communication, a "0" is a failure. If you have older single-band devices in your system, then upgrading them to dual band will improve your communications. You also want to make sure that the two sides of your powerline are bridged. The most effective way to bridge the powerline is by using plug-in dual-band devices, one on each side of the powerline. You can then confirm that by using the 4 tap test on the dual-band devices. You should also make sure that you don't have any electrical appliances creating noise on the powerline, which can interfere with the Insteon signal. Devices such as TVs, UPSs, computers, etc. should be isolated from the PLM using FilterLincs. Read the link I sent you regarding troubleshooting, as it offers a lot of solutions to powerline issues. What is the model number of your PLM? Edited January 15, 2019 by Techman
larryllix Posted January 15, 2019 2 hours ago, dth930 said: I'm a little unclear on what "Hops Left" means. What indicates successful communication and what indicates failure? Does Max Hops=3 & Hops Left=0 mean it tried to communicate 3 times and failed? And Max Hops=3 & Hops Left=3 means it communicated on the first try? Is it even possible for both to be the same? And when you say that 1 or 0 hops means communications issues, are you talking about the Hops Left value or the difference between Max Hops and Hops Left? Also - I have a mixture of dual-band and legacy devices. Would that contribute to the problem? Thanks for the clarification. - Dave Hops are not retries! When an Insteon transmission is heard, all capable devices synchronously repeat that transmission. When another device hears this, one hop is completed. This allows Insteon signals to travel like this:
Initiating device ---->hop----> second device ---->hop----> third device ---->hop----> PLM
or in reverse order, PLM ----> end device.
This is not counted as a hop:
Initiating device ----> PLM
On top of all that, if an ACK is not heard back or a NAK is received, Insteon devices will/may try again with repeats of the whole process. Zero hops left is not a failure. It means it took three intermediate devices to get there. If it failed, ISY wouldn't know about the attempt and you would not likely see any recording of an event. Edited January 15, 2019 by larryllix
dth930 Author Posted January 15, 2019 Sorry... I don't think I asked my question about hops clearly. I understand what a hop is. My question is about what "Max Hops" and "Hops Left" mean. Are Max Hops the max hops taken for the message or the max hops allowed by the system? Are Hops Left the difference between max hops and hops taken? In other words, hops left=2 indicates a direct connection from the PLM to the device (message used 1 hop), though hops left=2,1,0 are all successful, in decreasing order of communication performance. My PLM identifies itself as 9.37.6c v61. Is that new, old or problematic? I suspect it's on the older side, but I'm not sure what improvements I could expect by upgrading. On an interesting note, when I queried my PLM to get the version, the admin console suddenly started writing to all of the devices that were waiting for an update. Is it possible that the PLM was locked up and querying got it working again? - Dave
larryllix Posted January 15, 2019 51 minutes ago, dth930 said: Sorry... I don't think I asked my question about hops clearly. I understand what a hop is. My question is about what "Max Hops" and "Hops Left" mean. Are Max Hops the max hops taken for the message or the max hops allowed by the system? Are Hops Left the difference between max hops and hops taken? In other words, hops left=2 indicates a direct connection from the PLM to the device (message used 1 hop), though hops left=2,1,0 are all successful, in decreasing order of communication performance. My PLM identifies itself as 9.37.6c v61. Is that new, old or problematic? I suspect it's on the older side, but I'm not sure what improvements I could expect by upgrading. On an interesting note, when I queried my PLM to get the version, the admin console suddenly started writing to all of the devices that were waiting for an update. Is it possible that the PLM was locked up and querying got it working again? - Dave I believe the Max Hops are set by the system and used by every device. The retries are set by each device, and on many newer devices they can be set from the ISY options for that device. If the Max Hops bits are set to 3 and ISY shows Hops Left as 3, then no hops were used before reception by the PLM. That is a direct link between device and PLM. If the Max Hops bits are set to 3 and ISY shows Hops Left as 0, then all the hop possibilities were used up. Devices stop repeating these messages so the system doesn't bog down. If the Max Hops bits are set to 3 and the PLM never received the message, it shouldn't show up as an event in the logs. I believe it doesn't matter how many hops were used: the receiving end has to wait until all three hops have gone by (messages ignored) before it can respond with an ACK. If retries are involved, the time it takes gets compounded. Now if the ACK has problems getting back as well, Insteon gets real slow. This is why a noise-free power grid system is so important.
Insteon usually gets there through a lot of garbage, but it can get real slow and ISY usually gets the blame. I just found two GDOs making noise (one since the beginning of time, one newer is noisier and gave the clue) and now my system is flying again!!!! My very first Insteon filterLincs ordered! Edited January 15, 2019 by larryllix
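The "Max Hops bits" bookkeeping above can be sketched in a few lines of Python. This is an illustration only, assuming the commonly documented INSTEON message-flags layout (bits 1-0 hold Max Hops, bits 3-2 hold Hops Left); `decode_hops` and `hops_used` are made-up names, not part of the ISY or any Insteon library.

```python
def decode_hops(flags):
    """Split a message-flags byte into (max_hops, hops_left).

    Assumed layout: bits 1-0 = Max Hops, bits 3-2 = Hops Left.
    """
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one byte")
    max_hops = flags & 0b11          # bits 1-0
    hops_left = (flags >> 2) & 0b11  # bits 3-2
    return max_hops, hops_left


def hops_used(flags):
    """Number of repeats that happened before the message was first heard."""
    max_hops, hops_left = decode_hops(flags)
    return max_hops - hops_left


# 0x2F: Hops Left = 3, Max Hops = 3, heard directly with no repeats.
# 0x2B: Hops Left = 2, Max Hops = 3, heard on the first repeat.
# 0x23: Hops Left = 0, Max Hops = 3, heard only on the last possible repeat.
for flags in (0x2F, 0x2B, 0x23):
    print(hex(flags), decode_hops(flags), hops_used(flags))
```

Under that reading, a message that never arrives produces no flags byte at all, which matches the point that a complete failure does not show up as a "Hops Left" value.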
kclenden Posted January 16, 2019 Insteon is a mesh network. This means that in theory the more devices you have, the more reliable the network is. This reliability comes from the fact that most devices on the network repeat all commands on the network. For the devices to all repeat commands without talking over each other, there have to be some rules. That is where "Hops" come in. When a device sends out a command, it specifies how many "Hops" are allowed for the command. This tells all devices on the network how many times to repeat the command. When a device receives a command, it subtracts one from "Hops Left" (which starts out equal to "Max Hops") and repeats the command. Devices keep repeating the command like this until "Hops Left" reaches 0, at which point they stop repeating it. While devices are repeating commands, they can't send out any commands of their own. Additionally, even devices that don't repeat commands (battery operated devices) aren't allowed to send out commands until the other devices are done repeating the original command. So a command sent out with "Max Hops" equal to 3 will use up four times as much of the network's time as a command sent out with "Max Hops" equal to 0. But that command will also be more likely to reach its intended target because each repeater adds a little power to the signal. What the Event Viewer log shows you is the earliest point at which the device received a command. If it received the command when the originator initially sent it, the "Hops Left" will be 3. If it didn't hear the original command, but did hear it the first time other devices repeated the command, then "Hops Left" will be 2. And so on. As larryllix says, a "Hops Left" of 0 means the device heard the command, but not until the very last possible moment. Since you're seeing some "Hops Left" of 0, it means that there are likely times when commands are just lost and don't appear in the Event Viewer log at all.
You definitely have communication issues. They could be the result of a failing PLM, but they also could be caused by noise or signal suckers on the powerline. The link (https://wiki.universal-devices.com/index.php?title=INSTEON:_Troubleshooting_Communications_Errors) posted by Techman is as good a place as any to start to understand some of the things that can hurt and help the powerline network. Edited January 16, 2019 by kclenden
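The timing rules in this explanation can be put into a small model. It is a sketch under stated assumptions: every transmission (the original plus each repeat) is treated as one time slot, and both function names are invented for illustration.

```python
def slots_used(max_hops):
    """Time slots a command occupies: the original send plus one per repeat."""
    if max_hops < 0:
        raise ValueError("max_hops cannot be negative")
    return max_hops + 1


def hops_left_at_first_reception(max_hops, first_heard_slot):
    """Hops Left the receiver would report for a command.

    first_heard_slot: 0 = heard the original send, 1 = heard the first repeat,
    and so on. None (or a slot past the last repeat) means the command was
    never heard, so nothing would appear in the log.
    """
    if first_heard_slot is None or first_heard_slot > max_hops:
        return None
    if first_heard_slot < 0:
        raise ValueError("first_heard_slot cannot be negative")
    return max_hops - first_heard_slot


# A Max Hops = 3 command ties up the network four times as long as Max Hops = 0.
print(slots_used(3), slots_used(0))

# Hops Left reported for each slot in which the command could first be heard.
for slot in (0, 1, 2, 3, 4):
    print(slot, hops_left_at_first_reception(3, slot))
```

Slot 4 returns `None` in this model, which corresponds to a command that is lost entirely and never shows up in the Event Viewer.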
dth930 Author Posted January 21, 2019 Thanks. It now completely makes sense to me and is helpful in interpreting the data from the event log. I ended up replacing all of my 2476Ds with 2477Ds and now I consistently get hops left=2 and an occasional 1. I probably still have some noise in the system, but everything is working reliably and is much better than it was. - Dave