
Intermittently unable to communicate with ISY


DaveStLou


Posted (edited)

Issue: My ISY suddenly stops communicating over its wired network connection. I've tried replacing the network cable but no change. Interestingly, ISY begins flashing that it's disconnected only when the cable is physically unplugged. So I assume, it thinks it's connected.

Lights appear normal on ISY but nothing is blinking either at the network jack or on the front (until I unplug it anyway). I've had this happen a couple of times in the last few days. It's happened previously but I haven't taken time to figure out why. I power cycle my ISY and everything returns to normal.

I've attached today's portion of my error log.

As a first step, I have three meshed Asus routers, so I moved ISY from the one in my office to the main router attached directly to my cable modem.

Any other thoughts, ideas or troubleshooting steps to take?

ISY Error Log.v5.3.4__Fri 2021.09.17 02.50.46 PM.zip

Edited by DaveStLou
Posted

To be clear, when you say you can’t communicate, do you mean when accessing it via the AC? When the controller is in this state, can you ping it on the network? Does it remain visible on the network router?
Posted (edited)
5 hours ago, Teken said:

To be clear, when you say you can’t communicate, do you mean when accessing it via the AC? When the controller is in this state, can you ping it on the network? Does it remain visible on the network router?

I can't communicate via the AC or the UD Mobile app. Also polyglot nodes stop working. ISY is still managing all Insteon and Zwave devices and programs are functioning, except obviously any dependent on polyglot states or the Portal.

I cannot ping it, and it disappeared from the router's list of connected devices.

4 hours ago, Techman said:

Have you tried rebooting your router?

Yep, that's how I've cleared it many times in the past. What I'm trying to figure out is why it's happening.

It's as if the router is blocking it or removing it but I have no such rules, scripts or policies on the router (that I can think of) that would cause that to happen. Very odd. 
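One way to narrow down when the drops happen is to log them from another machine on the LAN. A minimal Python sketch, assuming the ISY normally answers on TCP port 80; the function names, the log file name, and the address passed in are placeholders for illustration, not anything provided by the ISY:

```python
import socket
import time
from datetime import datetime


def is_reachable(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def format_event(up: bool, when: datetime) -> str:
    """Build one log line describing a change in reachability."""
    state = "UP" if up else "DOWN"
    return f"{when.isoformat(timespec='seconds')} ISY {state}"


def watch(host: str, interval: float = 60.0,
          log_path: str = "isy_reachability.log") -> None:
    """Poll forever and append a line only when reachability changes."""
    last = None
    while True:
        up = is_reachable(host)
        if up != last:
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(format_event(up, datetime.now()) + "\n")
            last = up
        time.sleep(interval)
```

Running `watch("192.168.1.50")` (a made-up address) on an always-on PC would append a timestamped line each time the ISY drops off or comes back, which can then be lined up against the router's own log.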

Edited by DaveStLou
Posted

When did this start and what is the frequency of the problem of not being able to connect? What has changed in the network if any?

How is it now being directly connected to the main router?

Posted
15 hours ago, DaveStLou said:

I can't communicate via the AC or the UD Mobile app. Also polyglot nodes stop working. ISY is still managing all Insteon and Zwave devices and programs are functioning, except obviously any dependent on polyglot states or the Portal.

I cannot ping it, and it disappeared from the router's list of connected devices.

Yep, that's how I've cleared it many times in the past. What I'm trying to figure out is why it's happening.

It's as if the router is blocking it or removing it but I have no such rules, scripts or policies on the router (that I can think of) that would cause that to happen. Very odd. 

Sounds more like a router or firewall issue. Did you check whether any firmware updates are available for your router?

Posted
4 hours ago, Teken said:

When did this start and what is the frequency of the problem of not being able to connect? What has changed in the network if any?

How is it now being directly connected to the main router?

I'm not sure how long it's been going on but I'd guess about once every three to four weeks over the last three or four months. This week it happened twice but I haven't determined a pattern.

It was wired to an Asus RT-AC88U which is hard wired to my main router, an Asus RT-AX88U using "AI-Mesh".

1 hour ago, Techman said:

Sounds more like a router or firewall issue. Did you check whether any firmware updates are available for your router?

I agree, it seems like a router issue. I use Asuswrt-Merlin firmware which I update as released. I am on stable version 386.3_2.

Short of a bug which I haven't found anything on, my leading theory is there may be a hardware issue in the RT-AC88U. That's why I moved it to my RT-AX88U to see if that makes a difference.

 

Posted

Anything else on the network dropping off?!? If so that would really point to the router / mesh combo.

Have you assigned a static IP address or reserved a DHCP address locked to the MAC address of the controller?

If it’s just plain DHCP with no reservation, hardware dropping off or being unable to obtain a new DHCP lease is very common. This is more so where a network appliance does not respect the standards, under which it must release and renew its IP address upon coming back online and being seen by the router.

When such a problem exists, it can create an IP conflict. Best case, one or two devices can’t be seen or operate. Worst case, it literally takes down the entire network!
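A quick way to look for this kind of conflict is to compare the MAC/IP pairs the router reports. A rough Python sketch, assuming those pairs can be copied by hand out of the router's client list or ARP table; the function name and the sample addresses are invented for illustration:

```python
from collections import defaultdict


def find_ip_conflicts(entries):
    """Return {ip: [macs]} for every IP claimed by more than one MAC.

    entries is an iterable of (mac, ip) string pairs.
    """
    by_ip = defaultdict(set)
    for mac, ip in entries:
        # Normalise case so the same MAC is not counted twice.
        by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in by_ip.items() if len(macs) > 1}


# Hypothetical data: two different devices both claim 192.168.1.20.
sample = [
    ("AA:BB:CC:00:00:01", "192.168.1.20"),
    ("aa:bb:cc:00:00:02", "192.168.1.20"),
    ("aa:bb:cc:00:00:03", "192.168.1.30"),
]
conflicts = find_ip_conflicts(sample)
```

An empty result only means no conflict was visible in that snapshot; a device that has already dropped off would not appear in the list at all.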

Posted
On 9/18/2021 at 3:04 PM, Teken said:

Anything else on the network dropping off?!? If so that would really point to the router / mesh combo.

Have you assigned a static IP address or reserved a DHCP address locked to the MAC address of the controller?

If it’s just plain DHCP with no reservation, hardware dropping off or being unable to obtain a new DHCP lease is very common. This is more so where a network appliance does not respect the standards, under which it must release and renew its IP address upon coming back online and being seen by the router.

When such a problem exists, it can create an IP conflict. Best case, one or two devices can’t be seen or operate. Worst case, it literally takes down the entire network!

I have had trouble with Polisy on the network too, although not as frequently, and I haven't traced it to the same problem. They both have manual IPs set, but I noticed I also had DHCP reservations for the same IPs. Since they're not in the DHCP-assigned range, the reservations are at minimum unnecessary, and they may cause trouble when the IP lease runs out.

All is quiet now but tomorrow when I have the house to myself, I intend to take everything down and do a clean restart starting with the cable modem for thoroughness and all of the routers.

Thanks for brainstorming with me @Teken and @Techman

Posted

It’s best practice to either assign a Static IP or reserve the same based on the MAC address but not both.

Many routers allow a reservation within the DHCP pool, which should not be used. It should be outside the DHCP pool to avoid an IP conflict. Some custom firmware, such as Tomato and others, depending on the version, had a bug that duplicated the IP address, and an obvious network conflict would ensue.
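The two rules above (static or reserved but not both, and keep fixed addresses out of the pool) can be checked mechanically. A small Python sketch using the standard `ipaddress` module; the device records, the pool boundaries, and the function names are hypothetical and would have to be filled in from your own router settings:

```python
import ipaddress


def in_dhcp_pool(address, pool_start, pool_end):
    """Return True if address lies within the inclusive DHCP pool range."""
    ip = ipaddress.ip_address(address)
    return ipaddress.ip_address(pool_start) <= ip <= ipaddress.ip_address(pool_end)


def audit_addresses(devices, pool_start, pool_end):
    """Return a list of warnings for devices that break either rule.

    Each device is a dict with 'name', 'static_ip', and 'reserved_ip';
    either address may be None.
    """
    warnings = []
    for dev in devices:
        if dev["static_ip"] and dev["reserved_ip"]:
            warnings.append(
                f"{dev['name']}: has both a static IP and a DHCP reservation"
            )
        if dev["static_ip"] and in_dhcp_pool(dev["static_ip"], pool_start, pool_end):
            warnings.append(
                f"{dev['name']}: static IP {dev['static_ip']} is inside the DHCP pool"
            )
    return warnings


# Made-up example: pool is 192.168.1.100 through 192.168.1.200.
devices = [
    {"name": "ISY", "static_ip": "192.168.1.10", "reserved_ip": "192.168.1.10"},
    {"name": "Camera", "static_ip": "192.168.1.120", "reserved_ip": None},
]
report = audit_addresses(devices, "192.168.1.100", "192.168.1.200")
```

This only checks the bookkeeping; it cannot tell whether a particular router firmware mishandles reservations.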

This was famously seen on no fewer than three generations of Linksys routers! Other things to consider while you’re rebooting all the network hardware: clear all web and Java caches, and complete an IP and DNS refresh/renew on all systems so each obtains the latest IP address, whether assigned or leased.

Good luck!
Posted
12 hours ago, Teken said:

Many routers allow a reservation within the DHCP Pool which should not be used.

I've seen that on some other routers I've worked with. Never understood it.

12 hours ago, Teken said:

Other things to consider while you’re rebooting all the network hardware: clear all web and Java caches, and complete an IP and DNS refresh/renew on all systems so each obtains the latest IP address, whether assigned or leased.

That's my plan!

Posted
33 minutes ago, DaveStLou said:

I've seen that on some other routers I've worked with. Never understood it.

That's my plan!

Most routers will not allow a reservation outside of the allocated IP address pool. The IP addresses they manage are only inside the allocation pool, and some will not even allow access to other IP addresses outside of their IP management range. They are designed to work that way.

AiMesh routers have a lot of problems. I went through hell and back with my AC68u for years. When I replaced it with an AX92u router, more problems started. Then I discovered the AC68u router couldn't handle more than 51 devices. It would just forget some devices, and my ISY and Polisy were both found operating with IP addresses of 0.0.0.0 once after a total failure of function.

In a desperate attempt to fix this, I added a second AX92u router to my AiMesh and things got even worse. After swapping routers around a few times, I discovered my Roku Stick+ would not stay connected to my 5GHz band. Using other equipment, I then discovered that most mesh routers turn the signal magnitude down to about 10-20% of what a router puts out. I contacted ASUS and paid the $40 shipping to have one router repaired. It came back as a different unit and things have worked well since. 5GHz is basically garbage but the WiFi6 6GHz is fabulous. Magnitudes are still way down from whole-home routers, but the 1200Mbps (even on 2.4GHz) and 2400Mbps connections are nice. The AC68u is being scrapped, as an antenna finally broke off and it will not mesh anymore, which I cannot be bothered with. My old Netgear router runs circles around both my AX92u units for my 35+ Magic Home lightbulbs and RGBW strips. They never disconnect now.

Strange thing about these routers is when a WAN problem would happen they seemed to shut down the WiFi systems. When a bad router acted up the other router would disconnect from the WAN. It was very hard to pin down and I never really did.

I suggest you find out which router is causing you problems (or best guess) and contact ASUS. They have put out some real bad models and they know it.

Since then I have contacted a few router companies and tried to find out how many connections their models can support. After a lot of garbage answers about WiFi etc., I have discovered... they don't know. Support people don't even know what you are talking about. :(

Posted (edited)
1 hour ago, larryllix said:

Then I discovered the AC68u router couldn't handle more than 51 devices. It would just forget some devices, and my ISY and Polisy were both found operating with IP addresses of 0.0.0.0 once after a total failure of function.

While my list is slightly below that limit, I do believe that's a factor.

1 hour ago, larryllix said:

I suggest you find out which router is causing you problems (or best guess) and contact ASUS. They have put out some real bad models and they know it.

Since then I have contacted a few router companies and tried to find out how many connections their models can support. After a lot of garbage answers about WiFi etc., I have discovered... they don't know. Support people don't even know what you are talking about. :(

My AC88U is my prime suspect. I actually bought the AX88U to replace it, since I had some odd problems I just didn't have time to troubleshoot in the past. After Merlin started supporting AiMesh, I put it back into use, probably to my long-term detriment. Now that I'm retired, I have some additional time to explore and troubleshoot. Appreciate the thoughts and insight @larryllix.

Edited by DaveStLou
Posted (edited)
7 hours ago, DaveStLou said:

While my list is slightly below that limit, I do believe that's a factor.

My AC88U is my prime suspect. I actually bought the AX88U to replace it, since I had some odd problems I just didn't have time to troubleshoot in the past. After Merlin started supporting AiMesh, I put it back into use, probably to my long-term detriment. Now that I'm retired, I have some additional time to explore and troubleshoot. Appreciate the thoughts and insight @larryllix.

I used WRT- ??? in my ASUS AC68u, which seemed to work well for a while. AFAIK they started adding features that used up more NVRAM, and then it began to complain about NVRAM shortages. I began trimming names and all kinds of silly things, and that appeared to alleviate the warnings... for a while, but it kept getting worse. I finally went back to the ASUS software, which worked much better for a while but then grew more and more unreliable. Since then I have read that the version of the AC68u I bought on sale only had 64KB of NVRAM installed, and they secretly changed the newer releases to 128KB. Grrrrrr.....

When I bought the AX92u, it was to prove whether the router was bad or not, but it brought other problems. Then a second one was ordered and the problems got more complex. A third one came and it went back to Amazon. It was probably fine. All has seemed good for the last 2-3 months since the replacement unit came back from ASUS repair, except that the older 5GHz band magnitude is so low on all the ASUS routers.

I suspect the 5GHz lower band may be slightly off frequency also and the Roku Stick+ was off frequency the other way??

I have also found many WiFi devices do not like to change routers and must be rebooted after changing connections.  My WiFi RGBCW bulbs would not even connect to the same SSID and password on 2.4GHz. I had to start over.

Edited by larryllix
Posted

One thing to consider, which isn’t directly related to the problem at hand but has plagued many routers in the past, is how they handle friendly names. Some absolutely will freak out if you exceed the number of characters allowed.

Others will drop off or won’t update correctly if special characters or spaces are used. As odd as this sounds, a few in the past were bricked: when the router rebooted and saw a label containing the exact same model number as part of the friendly name, extremely bad coding told the router to shut down or loop!

Posted

Take one, I shut down routers and devices and restarted but it did not go well. "Network storm" is a term a guy I used to work with used to use - devices were slow or were not able to connect. So, take two I brought up things one at a time.

Everything was fine until I brought my Blue Iris server online. Very long story short, I found that somehow all of my programs had become enabled, including many that are only meant to be called by other programs. The Blue Iris connection was that ISY was rapidly cycling through active profiles, which in turn impacted ISY, as the changes were triggering other programs. Fortunately, I prefix programs that are supposed to be disabled with a "-", so they were easy to identify once I knew what I was looking for.

I don't know if that was the root cause of the original issue, unless the network traffic caused my own LAN-based denial of service. Or it may have pushed my questionable router over the edge.

Anyway, this evening it appears all is well.

Posted
9 hours ago, DaveStLou said:

Take one, I shut down routers and devices and restarted but it did not go well. "Network storm" is a term a guy I used to work with used to use - devices were slow or were not able to connect. So, take two I brought up things one at a time.

Everything was fine until I brought my Blue Iris server online. Very long story short, I found that somehow all of my programs had become enabled, including many that are only meant to be called by other programs. The Blue Iris connection was that ISY was rapidly cycling through active profiles, which in turn impacted ISY, as the changes were triggering other programs. Fortunately, I prefix programs that are supposed to be disabled with a "-", so they were easy to identify once I knew what I was looking for.

I don't know if that was the root cause of the original issue, unless the network traffic caused my own LAN-based denial of service. Or it may have pushed my questionable router over the edge.

Anyway, this evening it appears all is well.

If you have the AiMesh automatic load balancing enabled, it may take hours to move devices around between bands and nodes. It makes the routers very busy. There are rules to set up that control all this device movement, and some devices are not capable of it. I finally turned off the load balancing that switches devices between bands. Many of my devices would not "heal" and stayed disconnected. The automatic switching between router nodes can also be a problem for many devices. In order to switch a device's connection, the connected node must refuse to talk to the device. That causes the device to disconnect and reconnect to the other node. I did a lot of experimenting until slightly less than half of my WiFi devices switched nodes and ended up at a better physical distance from the node they settled on. The thresholds are now set at about -55dBm to -60dBm before devices are forced to switch.

I found ASUS routers would never properly reset things from a soft reboot. Power cycling them was the only way to ensure they didn't act up within 24 hours each time.

In the end I found ASUS routers had many more features than any router I found for double the price. If others were cheaper, I would have dumped the ASUS units. Reading forums, it seems all brands have problems with this mesh concept. The lower magnitudes spare the neighbours' WiFi signals but don't seem to help the owner much. My nearest neighbour, about 500' away, can overpower my mesh router signal strength inside my own house. I play the channels, avoiding his channel usage. This usually means using the newer channels that older routers cannot use.

I hope you understand the 5GHz radar avoidance system (DFS?) if you are under any flight paths or near an airport. That has caused many people problems with routers disconnecting on the 5GHz band. It seems many of the new routers are now avoiding those middle 5GHz frequencies.
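For anyone wanting to check whether a router has landed on one of those "middle" channels, here is a tiny Python sketch. It assumes the US (FCC) channel plan, where channels 52-64 and 100-144 require DFS; other regions differ, and the function names are just for illustration:

```python
# Inclusive 5GHz channel ranges that require DFS under the US (FCC) plan.
# This is an assumption about region; check your own regulatory domain.
DFS_RANGES = ((52, 64), (100, 144))


def is_dfs_channel(channel: int) -> bool:
    """Return True if the 5GHz channel number falls in a DFS range."""
    return any(low <= channel <= high for low, high in DFS_RANGES)


def non_dfs(channels):
    """Filter a list of 5GHz channel numbers down to the non-DFS ones."""
    return [ch for ch in channels if not is_dfs_channel(ch)]
```

If the router sits on a DFS channel and detects radar, it has to vacate that channel, which clients see as a disconnect. Pinning the 5GHz radio to a channel that `non_dfs` keeps, such as 36 or 149, sidesteps that.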

Posted
10 hours ago, DaveStLou said:

Very long story short, I found that somehow all of my programs had become enabled, including many that are called.

Did you, by chance, upgrade ISY to 5.x sometime 3-4 months ago (when the problem started)? I noticed when I went from 4.9 to 5.3.4 all my programs became enabled. Just wondering if there was any update during that time to the ISY that went un-noticed then to start the issues you were facing. 

Glad it's seeming normal again. 

Consumer-grade routers can often cause lots of unexpected problems as we add more WiFi-connected devices. While they may be low cost and easy to set up, troubleshooting router issues is very difficult for most general home users.

Posted
12 hours ago, DaveStLou said:

Very long story short, I found that somehow all of my programs had become enabled, including many that are called.

I've never had this issue with disabled programs, but after reading others indicating it can happen, over the summer I created a Run at Startup program that disables all programs that are supposed to be disabled. To locate the disabled programs, I created a text backup of my programs by copying the entire "My Programs" folder to the clipboard, pasting it into Notepad, and then searching for all instances of "[Not Enabled]".

Then I had to rename some programs because all names must be unique. For example, I previously had a number of programs all named FrontDoor. Since the file path or folders aren't used with the Disable Program command in a program, I had to rename these: FrontDoor.open, FrontDoor.heartbeat, FrontDoor.LowBatt, etc. Then I created a single Run at Startup program that disables all the programs that should already be disabled. It then evolved into being the only Run at Startup program, and its last statement starts the next program that should run at startup.
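The Notepad search can also be scripted. A rough Python sketch, assuming the pasted text contains one header line per program in the form `Name - [ID xxxx][Parent xxxx][flags...]`; that header format, the function names, and the regular expression are assumptions and may need adjusting for other firmware versions:

```python
import re
from collections import Counter

# Matches a program header line and captures its name and trailing flags.
HEADER = re.compile(
    r"^(?P<name>.+?) - \[ID [0-9A-Fa-f]+\]\[Parent [0-9A-Fa-f]+\]"
    r"(?P<flags>(?:\[[^\]\n]*\])*)\s*$",
    re.MULTILINE,
)


def find_disabled_programs(export_text):
    """Return the names of programs whose header carries [Not Enabled]."""
    names = []
    for match in HEADER.finditer(export_text):
        if "[Not Enabled]" in match.group("flags"):
            names.append(match.group("name").strip())
    return names


def duplicate_names(names):
    """Return names that occur more than once, since Disable Program
    identifies a program by name only."""
    return sorted(name for name, count in Counter(names).items() if count > 1)
```

Feeding the clipboard text to `find_disabled_programs` gives the list to put in the startup program, and `duplicate_names` points out which programs need renaming first.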

Posted
2 hours ago, MrBill said:

I've never had this issue with disabled programs, but after reading others indicating it can happen, over the summer I created a Run at Startup program that disables all programs that are supposed to be disabled. To locate the disabled programs, I created a text backup of my programs by copying the entire "My Programs" folder to the clipboard, pasting it into Notepad, and then searching for all instances of "[Not Enabled]".

Then I had to rename some programs because all names must be unique. For example, I previously had a number of programs all named FrontDoor. Since the file path or folders aren't used with the Disable Program command in a program, I had to rename these: FrontDoor.open, FrontDoor.heartbeat, FrontDoor.LowBatt, etc. Then I created a single Run at Startup program that disables all the programs that should already be disabled. It then evolved into being the only Run at Startup program, and its last statement starts the next program that should run at startup.

Similar efforts here. To keep things more obvious, I add the letters LD, for Leave Disabled, at the end of each program name that should be disabled. Also have a startup program that manually disables them all at startup. By using the LD, I can find them all quickly and check them etc. 

This is the program. ....The trick is to keep it up to date! 

 

 

Startup Disabled Programs LD - [ID 011E][Parent 01EB][Not Enabled][Run At Startup]

If
   - No Conditions - (To add one, press 'Schedule' or 'Condition')
 
Then
        Disable Program 'Alarm Turn Off LD'
        Disable Program 'Alert from Front MD LD'
        Disable Program 'Announce Front Doorbell LD'
        Disable Program 'Announce Rear Doorbell LD'
        Disable Program 'Awning position checkClosed LD'
        Disable Program 'Awning position checkOpen LD'
        Disable Program 'Rain Test Bedroom Door'
        Disable Program 'Dog In Sensor LD'
        Disable Program 'Dog Out Sensor LD'
        Disable Program 'Dog out for 1 hour LD'
        Disable Program 'Dogs Going Out LD'
        Disable Program 'Dogs Home Alert LD'
        Disable Program 'Dogs Home Stop Alert LD'
        Disable Program 'Doorbell Chimes and Actions LD'
        Disable Program 'Entrance way On LD'
        Disable Program 'Rain Test Excersize Door'
        Disable Program 'Exercise Cans Flash LD'
        Disable Program 'Rain Test Front Door'
        Disable Program 'Front Lights Only LD'
        Disable Program 'Front Lights and Spotlights LD'
        Disable Program 'Garage Back Door Test LD'
        Disable Program 'Garage Mid Auto Shut LD'
        Disable Program 'Garage Mid Count LD'
        Disable Program 'Garage North Auto Shut LD'
        Disable Program 'Garage North Count LD'
        Disable Program 'Garage South Auto Shut LD'
        Disable Program 'Garage South Count LD'
        Disable Program 'Garbage open middle door LD'
        Disable Program 'Garbarge Exit LD'
        Disable Program 'Heat Matt Alert reset LD'
        Disable Program 'Heat Matt on Morning 2 Hours LD'
        Disable Program 'Jacuzzi tub mood End Action LD'
        Disable Program 'Lock Mud Room Door LD'
        Disable Program 'Lock Rear Garage Door LD'
        Disable Program 'Mbath Shower Cans Flash LD'
        Disable Program 'Mbed Doug reading 2 LD'
        Disable Program 'Mbed Doug reading 3 LD'
        Disable Program 'Mbed Doug reading 4 LD'
        Disable Program 'Mbed Doug reading 5 LD'
        Disable Program 'Mbed Spot LD'
        Disable Program 'Mbed Wendy reading 2 LD'
        Disable Program 'Mbed Wendy reading 3 LD'
        Disable Program 'Mbed Wendy reading 4 LD'
        Disable Program 'Mbed Wendy reading 5 LD'
        Disable Program 'Pole Barn Lights LD'
        Disable Program 'Ring Chimes LD'
        Disable Program 'SP Cans Flash LD'
        Disable Program 'Set 00 LD'
        Disable Program 'Set 15 LD'
        Disable Program 'Set 30 LD'
        Disable Program 'Set 45 LD'
        Disable Program 'Set 500 LD'
        Disable Program 'Set 600 LD'
        Disable Program 'Set 700 LD'
        Disable Program 'Set 800 LD'
        Disable Program 'Alarm Snooze LD'
        Disable Program 'Visitor 1 Rearm system Away LD'
        Disable Program 'Visitor 2 Lights on LD'
        Disable Program 'X10 Codes LD'
 
Else
   - No Actions - (To add one, press 'Action')
 
Leave disabled.. called by Startup
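Keeping a list like this up to date is easier with a cross-check. A small Python sketch, assuming the program text is copied out as shown above and that a separate list of all program names is available; the function names and the `" LD"` suffix default are illustrative only:

```python
import re

# Captures the quoted name from each "Disable Program '...'" action line.
DISABLE_LINE = re.compile(r"^\s*Disable Program '(?P<name>.+)'\s*$", re.MULTILINE)


def disabled_by_startup(program_text):
    """Return the program names the startup program disables, in order."""
    return [m.group("name") for m in DISABLE_LINE.finditer(program_text)]


def audit_ld(program_text, all_names, suffix=" LD"):
    """Compare the startup list against the naming convention.

    Returns (missing, unmarked):
      missing  - names ending in suffix that the startup program never disables
      unmarked - names the startup program disables that lack the suffix
    """
    listed = set(disabled_by_startup(program_text))
    missing = sorted(n for n in all_names if n.endswith(suffix) and n not in listed)
    unmarked = sorted(n for n in listed if not n.endswith(suffix))
    return missing, unmarked
```

Note that the startup program itself carries the LD suffix, so it will show up under `missing` unless it is left out of `all_names`. Entries such as 'Rain Test Bedroom Door' would appear under `unmarked`.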
 

Posted
8 hours ago, larryllix said:

I found ASUS routers would never properly reset things from a soft reboot. Power cycling them was the only way to ensure they didn't act up within 24 hours each time.

I hope you understand the 5GHz radar avoidance system (DFS?) if you are under any flight paths or near an airport. That has caused many people problems with routers disconnecting on the 5GHz band. It seems many of the new routers are now avoiding those middle 5GHz frequencies.

I have found hard reboot is the only sure way to reset my Asus routers. On the wifi side, I have been generally satisfied with AiMesh experience and dual-band smart switching. Thankfully, I am not near any airport or regular flight paths, so I haven't experienced that.

7 hours ago, Geddy said:

Did you, by chance, upgrade ISY to 5.x sometime 3-4 months ago (when the problem started)? I noticed when I went from 4.9 to 5.3.4 all my programs became enabled. Just wondering if there was any update during that time to the ISY that went un-noticed then to start the issues you were facing. 

Glad it's seeming normal again. 

Consumer-grade routers can often cause lots of unexpected problems as we add more WiFi-connected devices. While they may be low cost and easy to set up, troubleshooting router issues is very difficult for most general home users.

I am on 5.3.4 but have been on it for some time. When this happened, I recalled having read about this issue on the forum but haven't had time to double back.

Agree 100% about the lack of help in troubleshooting. There's just enough there to do things we never thought possible years ago, but not the tools professionals use to keep those features in tune.

6 hours ago, MrBill said:

I've never had this issue with disabled programs, but after reading others indicating it can happen, over the summer I created a Run at Startup program that disables all programs that are supposed to be disabled. To locate the disabled programs, I created a text backup of my programs by copying the entire "My Programs" folder to the clipboard, pasting it into Notepad, and then searching for all instances of "[Not Enabled]".

3 hours ago, dbwarner5 said:

Similar efforts here. To keep things more obvious, I add the letters LD, for Leave Disabled, at the end of each program name that should be disabled. Also have a startup program that manually disables them all at startup. By using the LD, I can find them all quickly and check them etc.

These are great ideas! I will definitely implement.

Thanks all!

This topic is now closed to further replies.