Sirmeili Posted November 25, 2023
This just started happening: my Polisy goes non-responsive about once a day. It's quite frustrating. One thing I've noticed when I do get to the logs is that they show "java.net.SocketTimeoutException" errors over and over. I can still SSH into the device, and nothing seems to be overwhelming the system, but I can't access the ISY Admin Console. I have zero nodes installed and basically just use this as an ISY for my Home Assistant install. It had been working for months without an issue until recently. Any ideas? I will also open a ticket, but thought the forums might offer faster responses on the holiday weekend.
gregkinney Posted November 25, 2023
Yes, I have had the same issue. Below is the text from my ticket a couple of weeks ago. It was supposedly fixed in 5.7.1; however, I am still having disconnections. I have opened another ticket, but Michel thinks that it's just portal network connectivity.

Michel Kohanim replied 17 days ago:
Hi Greg, Recent updates to Home Assistant have caused dramatic bombardment from HA to Polisy, causing it to either crash or treat it as a DoS attack. We are currently working on a solution that will handle these attacks more gracefully. We'll hopefully have a solution by the end of the week.

Greg Kinney replied 17 days ago:
Yes I have Home Assistant

Michel Kohanim replied 17 days ago:
Hi Greg, do you have any of the following:
1. Home Assistant
2. Home Bridge
3. ELK or Harmony Node servers
Sirmeili (Author) Posted November 25, 2023
OK, so it's HA doing it to the Polisy? Is there any way you know of that I can validate from my logs that this is the same issue?
gregkinney Posted November 25, 2023
I did not validate anything with my logs so I don't know.
Sirmeili (Author) Posted November 25, 2023
4 minutes ago, gregkinney said: "I did not validate anything with my logs so I don't know."

I looked at the HA logs and I did see lots of these, so it could be on the HA side, but I'm not sure, as I never looked before.

2023-11-25 11:57:12.651 DEBUG (MainThread) [pyisy.events.websocket] Starting websocket connection.
2023-11-25 11:57:12.654 ERROR (MainThread) [pyisy.events.websocket] Unexpected websocket error Session is closed
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/pyisy/events/websocket.py", line 218, in websocket
    async with self.req_session.ws_connect(
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 1141, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 779, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 400, in _request
    raise RuntimeError("Session is closed")
RuntimeError: Session is closed

I did install the HACS UD ISY/IoX plugin (which is just a newer version of the one in HA, using a beta version of pyisy). I'll see if that is any better.
gregkinney Posted November 25, 2023
Please report back and let me know if it helped 🤞
Sirmeili (Author) Posted November 25, 2023
57 minutes ago, gregkinney said: "Please report back and let me know if it helped 🤞"

Will do. So far so good, and I don't see the above errors in the HA logs, but then again, I don't know whether they occurred because the Polisy freaked out. I should know in the next 24 hours and I'll report back.
Sirmeili (Author) Posted November 26, 2023
So, just a quick update. As of this morning the Polisy is acting fine (both PG3x and the ISY). I'm not seeing any of the java.net.SocketTimeoutException errors in the Polisy logs. I'm still not going to say it's 100% working, and I'll continue to monitor, but things are looking up.

Some notes if you are using Home Assistant and want to install the beta version of the "Universal Devices ISY/IoX" integration from HACS:
1. It is a replacement/overwrite. That is to say, you only have to install it from HACS and nothing else. It will put itself in the custom_components directory, and upon restart HA will use that version instead of the included one (so easy install/backout).
2. If you are using any Events in HA for ISY devices, the event data has changed, and that broke a lot of my automations. It adds the address into the event data. It's a pretty easy fix, but something to be aware of.
3. As far as I can tell, except for #2, everything else from a device/entity perspective is the same, and all my dashboards and automations (except the trigger issue in #2) are working as expected.

I'll report back tomorrow if it's still working, or sooner if I notice it breaks again.
Sirmeili (Author) Posted November 27, 2023
Well, not great news. I woke up and, while HA can still control the ISY:
- The Polisy is slow to react.
- I get duplicate commands (as if HA thought the ISY didn't respond and tried again, or it all happened on the ISY side).
- Logging into the PG3x web interface constantly kicks me back out to the login screen; before it does, it says there is no ISY detected.

I have no idea where to see how HA might be overloading the ISY. I've kept the Event Viewer on and I don't see a bunch of excessive commands or anything being sent to it. I've also tried rebooting from the ISY admin interface and from the phone since this started happening, but all I get is an ISY with no lights on. I have to power cycle it manually to get it to work.

Looking at the HA logs, I noticed that I was getting heartbeat messages from the ISY on Saturday, but they stopped shortly after my last restart (within an hour or two), and after that I would get these:

2023-11-25 15:40:15.170 WARNING (MainThread) [pyisyox.events.websocket] Websocket disconnected unexpectedly with code: 0

I've also seen this here and there:

2023-11-26 04:13:31.099 ERROR (MainThread) [pyisyox.events.websocket] Error during receive Received frame with non-zero reserved bits

After about 24 hours I start to see this:

2023-11-26 13:54:06.099 WARNING (MainThread) [pyisyox] Timeout while trying to connect to the ISY.

That goes on until this morning, when it was not as responsive. Note that until this morning, HA didn't seem to have many issues talking to the ISY, but obviously in the background it did. I still don't know if this is on the ISY or the HA side, but I can tell you that the code on the HA side for this integration does not seem to have changed since earlier this year. It uses websockets, so I'm not sure whether something changed in the base HA code, but the integration itself hasn't changed. At this time, I'm not 100% sure the "beta" is the way to go.
If anyone can tell me where I can see the logs that point to HA hammering the ISY, I would love to look at them so I can go back to HA and see if it can be fixed over there. Otherwise, the HA logs don't show a bunch of communication back to the ISY until the ISY starts to time out.
Edited November 27, 2023 by Sirmeili
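For anyone wanting to put numbers on the Home Assistant side of this, here is a minimal Python sketch that counts pyisy/pyisyox log entries per hour and level, assuming the log lines look like the ones quoted above. The function names (summarize, summarize_file) are made up for illustration, and the log file location is an assumption that depends on your install. It only shows when HA started seeing disconnects and timeouts; it cannot by itself prove that HA is flooding the ISY.

```python
import re
from collections import Counter

# Assumed line format, based on the log lines quoted in this thread, e.g.:
# 2023-11-26 13:54:06.099 WARNING (MainThread) [pyisyox] Timeout while trying to connect to the ISY.
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d{3} "
    r"(?P<level>[A-Z]+) \([^)]*\) \[(?P<logger>pyisy(?:ox)?[\w.]*)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Count pyisy/pyisyox log entries per (hour, level).

    Lines from other loggers and traceback continuation lines are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            hour = f"{match['date']} {match['hour']}:00"
            counts[(hour, match["level"])] += 1
    return counts


def summarize_file(path):
    """Hypothetical helper: print one row per (hour, level) for a log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for (hour, level), count in sorted(summarize(handle).items()):
            print(f"{hour}  {level:<8} {count}")
```

Calling summarize_file with the path to your Home Assistant log (often home-assistant.log in the config directory, but check your own setup) prints the hourly counts, which makes it easier to see when the warnings begin relative to a Polisy restart.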
gregkinney Posted November 27, 2023
Michel said he was making changes on the ISY side to be able to handle HA so I would keep to that logic at the moment. I'm excited for you to share all of the above with them in your ticket. Please let me know what happens, I'm going to wait to do anything until you hopefully get somewhere with the above info.
Sirmeili (Author) Posted November 27, 2023
3 hours ago, gregkinney said: "Michel said he was making changes on the ISY side to be able to handle HA so I would keep to that logic at the moment. I'm excited for you to share all of the above with them in your ticket. Please let me know what happens, I'm going to wait to do anything until you hopefully get somewhere with the above info."

Yeah, I'm still waiting on a first contact from my ticket (holiday weekend, so I understand). What are you doing at the moment to keep everything running? This is killing my WAF. Are you waiting for it to fail, or are you doing some kind of automated reboot?
gregkinney Posted November 27, 2023
I'm not confident I'm having serious issues anymore. Before updating to 5.7.1, I would notice 2-4 crashes per day, and I could confirm it really was crashing because it would be 5-10 minutes before things would respond again. After updating to 5.7.1, I'm getting notifications that it has disconnected and reconnected 1-2 times per day; however, I don't think it is crashing this time. Things respond immediately after I get the notifications. It might be exactly what Michel said - that it just loses connectivity with the portal temporarily. So we might be having different issues, I'm not sure. Are you on 5.7.1?
Edited November 27, 2023 by gregkinney
gregkinney Posted November 27, 2023
Thought I would share my updated ticket:

Michel Kohanim replied 22 minutes ago:
Hi Greg, the first thing you need to do is to disable HA for a day and see whether or not you still get these issues. As far as I remember, your HA was bombarding IoX with traffic. Although we may have fixed the crash, it does not mean that Polisy can handle the traffic.

Greg Kinney replied 31 minutes ago:
Yes the last runtimes are correct. Yes I am on 5.7.1. When I was on 5.7.0 and I would get a disconnect/reconnect notice on my phone from the UD app, it would be 5-10 minutes before any devices would respond because it was loading everything (I assume it had restarted or crashed). Now on 5.7.1, when I get a disconnect/reconnect notice, my devices are still immediately responsive. So if it's a network related issue, is that on my end or on your end?

Michel Kohanim replied 56 minutes ago:
Thanks Greg. So, the last/next runtimes are correct? If so, the issue is network related. What's your IoX version? Is it really 5.7.1?
PB11 Posted November 29, 2023
I'm also finding my Polisy going offline each morning. I don't have HA on my network. This is something new for me since the 5.7.1 update.
[Screenshot: error from UD Mobile]
Edited November 29, 2023 by PB11
PB11 Posted November 29, 2023
Further to the above, the Polisy does show up and initiates a connection via the Admin Console, but then hangs on "starting Subscription". I have confirmed the portal license is active.
gregkinney Posted November 29, 2023
@PB11 Did you start a ticket? Please do, so that hopefully something gets done about this.
PB11 Posted November 29, 2023
1 hour ago, gregkinney said: "@PB11 Did you start a ticket? Please do so hopefully something gets done about this."

Hi @gregkinney, I did start a ticket and will let you know how it goes.
PB11 Posted November 30, 2023
So I heard from UD support. After requesting my system logs, they suggested I run the update once again. This would be the third time running the update for 5.7.1. Third time seems to be the charm, as I woke up this morning with the Polisy still online. I'll wait a few days before claiming this is the final fix, as the problem was somewhat sporadic.
gregkinney Posted December 10, 2023
I was able to fix the problem by upgrading to PG3x. Well, at least so far. 3 days now and no disconnects.
Sirmeili (Author) Posted December 12, 2023
I just wanted to give an update. This was fixed by updating to a newer version at the request of Michel from UD. I have been up for the past week and a half without issues.
apostolakisl Posted December 13, 2023
I had some issues with my Polisy that quite possibly fall into this same category, though they seem to have been fixed. I use Automation Shack's Nodelink program, and per Michel it was overwhelming the Polisy with open sockets, causing the ISY to reboot and then, in a later firmware, causing it to freeze. Reboots were about once a day and were preferable to freezing. After I sent a bunch of logs to Michel and had him remote into my machine, it appears that a few ISY updates later the problem was solved. I do not believe Nodelink has had any changes in a long time, perhaps years, so it would appear that at some point a change in the ISY was the source of the compatibility issue.
gregkinney Posted December 13, 2023
@apostolakisl very interesting. I also use Nodelink. Wonder if that's what it was for me.