IndyMike Posted December 20, 2007

Hello learned members,

I've recently made a number of additions to my Insteon system and have been "re-tuning" things. After adding the devices (5 Icon switches and 2 ApplianceLincs) I was receiving multiple communication errors. After days of troubleshooting, things appear to be back under control.

I am, however, curious about the process the ISY uses for querying devices. Specifically, does the ISY change the Max Hop Count in an effort to "tune" communication?

My test equipment consists of a SwitchLinc wired to a power cord. I move the device from room to room in order to locate "weak" zones.

Observations:

1) When the system is working properly, a "query all" will be completed in a single pass (~1% increments on the status bar, 30 devices).

2) When the system is having problems communicating, the "query all" can take up to 5 passes. Subsequent query passes are much quicker than the first. Are these retries on specific devices?

3) Devices that are flagged (error window) are not always listed as being "offline" in the My Lighting window. Did these devices communicate properly on a subsequent (pass 2 or 3) query?

4) If a device is intentionally removed, it will be diagnosed on the 1st query pass and no additional passes are performed. Is this interpreted as a "hard" failure, and thus retries are not performed?

5) After encountering a query device failure, repeating the query with all devices responding properly will require multiple passes. If a third query is performed, it will typically be a single pass.

If I could get a better understanding of how the ISY is performing the queries, the process would be a lot simpler the "next time" I break my system.

Thanks,
IndyMike
Michel Kohanim Posted December 21, 2007

Hello IndyMike,

Good questions! First of all, we do not change the hop count, as doing so previously made things much worse. We leave it at 3. Please find my comments below.

With kind regards,
Michel

1) When the system is working properly a "query all" will be completed in a single pass (~1% increments on the status bar, 30 devices).

True. If a device fails, ISY tries 3 times, so the progress bar is going to be off because its predictions are no longer true.

2) When the system is having problems communicating, the "query all" can take up to 5 passes. Subsequent query passes are much quicker than the first. Are these retries on specific devices?

Here's how the progress bar works:

1. Receives system busy from ISY
2. Asks ISY for an approximation of how long it thinks it's going to be busy
3. Starts the progress bar and counts down. If, in the meantime, ISY sends a Ready message, the progress bar disappears
4. If the progress bar reaches the limit and ISY is still busy, it asks ISY for another approximation (restarting the progress bar). If a device has failed, ISY now approximates the number of remaining operations + some heuristics

This is why the progress bar keeps restarting every time you have one device that does not respond.

3) Devices that are flagged (error window) are not always listed as being "offline" in the my lighting window. Did these devices communicate properly on a subsequent (pass 2 or 3) query?

You will get the Error dialog on the FIRST error (three attempts). If the subsequent attempts at other operations succeed, ISY clears the error bit on the node, but you have already gotten the first error. To us, if a device does not respond to three attempts, that's a flag that something is wrong. So, for a query, ISY does the following:

a. Get On Level
b. Get Ramp Rate
c. Get Status

So, assume getting On Level failed but getting Ramp Rate succeeded.
In this case, you will get the error dialog but you will not have the red exclamation mark beside the node, since Get Ramp Rate succeeded. If you see the error dialog, please do assume that something is not normal.

4) If a device is intentionally removed it will be diagnosed on the 1st query pass and no additional passes are performed. Is this interpreted as a "hard" failure and thus retries are not performed?

If a device is removed intentionally, here's what happens: Get On Level fails but, who knows (see above), maybe Get Ramp Rate succeeds. Or maybe On Level and Ramp Rate fail but getting the status succeeds. In short, we only give up during crawl and programming, since writing to devices that do not respond = disaster. For a query, we do not give up, because one of the 3 may succeed. Read operations are optimistic (something good may eventually come about) while write operations are very pessimistic (if an operation fails 3 times, stop ==> request failed).

5) After encountering a query device failure, repeating the query with all devices responding properly will require multiple passes. If a third query is performed it will typically be a single pass.

Please see above ...
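The retry behavior described in this reply can be sketched as a small Python model. This is only an illustration built from the post, not ISY firmware: `send`, `QueryResult`, `query_device`, `write_device`, and the operation name strings are all hypothetical, and the rule that a later failed step sets the error bit again is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_ATTEMPTS = 3  # per the reply: three attempts before an operation counts as failed

# The three reads listed for a query, in order (names are illustrative).
QUERY_STEPS = ("get_on_level", "get_ramp_rate", "get_status")


@dataclass
class QueryResult:
    error_dialog_shown: bool = False  # raised on the first failed step, never withdrawn
    node_error_bit: bool = False      # red exclamation mark; cleared by a later success
    attempts: int = 0                 # messages sent, a rough proxy for elapsed time


def try_operation(send: Callable[[str], bool], op: str, result: QueryResult) -> bool:
    """Send one operation up to MAX_ATTEMPTS times; stop at the first success."""
    for _ in range(MAX_ATTEMPTS):
        result.attempts += 1
        if send(op):
            return True
    return False


def query_device(send: Callable[[str], bool]) -> QueryResult:
    """Optimistic read: run every step even if an earlier one failed."""
    result = QueryResult()
    for op in QUERY_STEPS:
        if try_operation(send, op, result):
            result.node_error_bit = False  # a later success clears the node flag
        else:
            result.error_dialog_shown = True
            result.node_error_bit = True
    return result


def write_device(send: Callable[[str], bool], ops: List[str]) -> bool:
    """Pessimistic write: abort the request at the first operation that fails."""
    scratch = QueryResult()  # only used to count attempts
    for op in ops:
        if not try_operation(send, op, scratch):
            return False  # request failed, remaining writes are skipped
    return True
```

Under these assumptions, a device that never answers Get On Level but answers the other two reads ends with `error_dialog_shown` set and `node_error_bit` clear after 5 messages, which matches the "error dialog but no red exclamation mark" case. A removed device costs 9 messages instead of 3, which is consistent with the first estimate for the progress bar running out.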
IndyMike Posted December 21, 2007

Michel,

Thank you for the detailed reply. I can see that I was "off" in a few of my assumptions. This will definitely help future troubleshooting expeditions.

IM
This topic is now archived and is closed to further replies.