Jump to content

Programs randomly running at the same time each hour throughout the day?


Recommended Posts

Posted (edited)

It's also strange that a bunch of IR commands seem to fire when the issue occurs. Might have to pull the IR dongle and see if that makes any difference. Debugging this issue is SO slow because I can only try one thing at a time and then it can take a day or two before the issue re-occurs. And that in itself is strange; some days it's fine and others it can happen two or three times? But most often it seems to happen around 1am.

Edited by arzoo
Posted
1 hour ago, arzoo said:

I removed the battery from the suspected sensor (14 72 B6 1) but the problem triggered again last night at 12:52. The fact that it's always around the same time has me thinking what programming I might have in place that's the culprit. The only programs I'm running with an hourly trigger are for monitoring the heartbeat signal on four Insteon water sensors. When the water sensors are working correctly, they send a heartbeat once about every 24 hours. I have programs that check every hour, incrementing a counter. If the counter exceeds 24 then I assume the battery is low. Anyway, I'll disable those programs and see if it makes any difference.

1 possibility down and xx to go...  I'm sure it doesn't feel like it, but this is progress.

Please also try the following:

  1. Disable Sensor 14.7B.DF.  It is also triggering repeatedly and calling similar programs.
  2. You have many programs.  I'm focused on the one that's turning Scene 25 On and then Immediately OFF (the RED highlighted entries above).   This should never happen.  Please try to locate this program and correct it or disable it.
  3. Identify device 50.AD.58 and the programs that it triggers.  The program activity is also suspicious in that multiple on/off scenes are being executed.
  4. Do try disabling the periodic programs you mentioned above.
  5. Look at your Log file (/tools/log) to see if you have any patterns that show up @12:52.
  6. Try disabling all programs (put a condition on the folder to disable it).  If the problem still shows up you've learned a lot about what it isn't.

As you disable sensors and things, feel free to post additional logs.  Hopefully things will begin to make sense.

Please also post any programs that you suspect may be an issue. 

  • Like 2
Posted
50 minutes ago, IndyMike said:

1 possibility down and xx to go...  I'm sure it doesn't feel like it, but this is progress.

Please also try the following:

  1. Disable Sensor 14.7B.DF.  It is also triggering repeatedly and calling similar programs.
  2. You have many programs.  I'm focused on the one that's turning Scene 25 On and then Immediately OFF (the RED highlighted entries above).   This should never happen.  Please try to locate this program and correct it or disable it.
  3. Identify device 50.AD.58 and the programs that it triggers.  The program activity is also suspicious in that multiple on/off scenes are being executed.
  4. Do try disabling the periodic programs you mentioned above.
  5. Look at your Log file (/tools/log) to see if you have any patterns that show up @12:52.
  6. Try disabling all programs (put a condition on the folder to disable it).  If the problem still shows up you've learned a lot about what it isn't.

As you disable sensors and things, feel free to post additional logs.  Hopefully things will begin to make sense.

Please also post any programs that you suspect may be an issue. 

Will do, and again, thanks for all your help with this!

Posted

@arzoo

Not sure if this has been previously mentioned, but in the program summary tab look for programs that ran or finished about 52 minutes past the hour. My guess is that it's one of your heartbeat prorgrams that runs hourly.

Posted
6 minutes ago, Techman said:

@arzoo

Not sure if this has been previously mentioned, but in the program summary tab look for programs that ran or finished about 52 minutes past the hour. My guess is that it's one of your heartbeat prorgrams that runs hourly.

That was one of the first things I checked and none of those four programs ran at 52 past. Their last run time was 55 past which seems close but probably just a coincidence. I'm pretty sure these programs are not the issue but it can't hurt to rule them out. The other thing I should mention is none of my programs had been updated prior to this issue starting.

Posted

It could be a combination of the low battery in the motion sensor and one or more running programs. You should disable all the running programs in an effort to narrow things down

Posted
1 hour ago, arzoo said:

@IndyMike You mention scene # and group #, how do I identify which scenes by the #?

Unfortunately I don't have an easy answer there. There may be an easy way to interrogate the ISY for group / scene numbers. I don't know what that is. 

When I have done this in the past, I have downloaded the contents of the PLM and searched for the group / scene numbers there. It's rather painful.

Posted

If you run the Topology report, The scenes, along with their scene number and the members of the scene can be found at the bottom of the report

Posted (edited)
12 hours ago, Techman said:

If you run the Topology report, The scenes, along with their scene number and the members of the scene can be found at the bottom of the report

Not quite correct, scene numbers in the Event log and the plm are 2-digit hex (0 - FF, 0 to 255 dec).  But you did give me an idea...

If you use the rest interface to query the isy you can show the scene members in XLM format - https://wiki.universal-devices.com/ISY_Developers:API:REST_Interface

The specific command I used in my browser was - "xxx.xxx.xx.xx/rest/nodes/scenes" where the "xxx.xxx.xx.xx" is your ISY ip address.  This should display all of the scenes in your device.

The following is a small section of the scene listing for my ISY.  I believe the <device group> number corresponds to the event viewer/ PLM group numbers.  The <address> corresponds to the # listed in the topology report. 

Ex from my topology report - address 50269 shows 3 of my devices as scene members.  Address 50269 in the reset scene xml shows the same members and a <device group> of 25.

@arzoo, you need to reverse this process.

  1. Access the rest scene listing from your browser as I showed above - xx.xx.xx/rest/nodes/scenes.  You will be prompted to login to the Eisy.
  2. Search for the "<device group>25<device group>" "<deviceGroup>37<deviceGroup>" entry.  Correction: the scene #25 in the ISY Event Viewer is in Hexadecimal.  The deviceGroup entries in the Rest scene output are in Decimal ( 25 Hex = 37 Dec).
  3. Note the <address> entry associated with the above <device group>
  4. Search your topology report for the <address> that you noted.  This should be the scene that is being activated.

 

Security / Security Front Entry (50269):

Security / BSMT KPL A House Secure / BSMT KPL C - Front Entry: Responder
2nd Floor / Master Fan KPL A / Master Fan KPL C - Front Entr: Responder
Security / Stair KPL A House Secure / Stair KPL - D Front: Responder

 

REST XML Output:

<group flag="132" nodeDefId="InsteonDimmer">
<address>50269</address>
<name>Security Front Entry</name>
<family>6</family>
<parent type="3">22260</parent>
<deviceGroup>25</deviceGroup>
<pnode>50269</pnode>
<ELK_ID>M06</ELK_ID>
<members>
<link type="0">19 21 42 4</link>
<link type="0">C 8A E3 4</link>
<link type="0">19 21 FF 3</link>
</members>
</group>
<group flag="4" nodeDefId="InsteonDimmer">
<address>1455</address>
<name>SC 2nd Floor Status</name>
<family>6</family>
<parent type="3">44910</parent>
<deviceGroup>60</deviceGroup>
<ELK_ID>J09</ELK_ID>
<members>
<link type="0">29 4C 9F 4</link>
<link type="0">58 CB 74 6</link>
<link type="0">58 D0 2B 3</link>
<link type="0">1C DE 23 4</link>
<link type="0">1D FD 42 3</link>
</members>
</group>
<group flag="4" nodeDefId="InsteonDimmer">
<address>24859</address>
<name>SC 2nd Floor</name>
<family>6</family>
<parent type="3">44910</parent>
<deviceGroup>61</deviceGroup>
<ELK_ID>J08</ELK_ID>
<members>
<link type="0">29 53 3C 1</link>
<link type="0">29 53 3C 2</link>
<link type="0">29 53 3C 3</link>
<link type="0">29 53 3C 4</link>
<link type="0">14 78 B2 1</link>
<link type="0">14 6C F0 1</link>
<link type="0">1B F4 54 1</link>
<link type="0">1F BD EE 1</link>
<link type="0">41 D 9E 1</link>
<link type="0">1A 4F 6B 1</link>
<link type="0">58 D0 2B 1</link>
<link type="0">58 D0 2B 2</link>
<link type="0">58 D0 2B 4</link>
<link type="0">F D7 F1 1</link>
<link type="0">29 53 3C 5</link>
<link type="0">29 53 3C 6</link>
<link type="0">19 21 FF 1</link>
<link type="0">19 21 FF 2</link>
<link type="0">1D FD 42 1</link>
<link type="0">1D FD 42 2</link>
<link type="0">1D FD 42 4</link>
<link type="0">1D FD 42 6</link>
<link type="0">1D FD 42 8</link>
</members>
</group>

 

Edited by IndyMike
Correction
Posted (edited)

I'll try to figure out the device/scene groups tonight. But I just don't understand the timing of this issue; it only occurs at 52 minutes past the hour and at random hours (although most often after midnight and again about 12 hours later). If it were a program doing this it would be at a consistent time and if it were a faulty device (motion sensor for example), it would be totally random. I'm really beginning to panic that this can't be resolved or maybe the EISY is the cause. Or could it be the PLM?

Anyway, for now I've disabled all programs (set if 1=0 IF condition at top level folder). Without any programs running, I'll have to check the event viewer to see if the problem occurs tonight.

I'll post new logs tomorrow.

Edited by arzoo
Posted (edited)
25 minutes ago, arzoo said:

Anyway, for now I've disabled all programs (set if 1=0 IF condition at top level folder). Without any programs running, I'll have to check the event viewer to see if the problem occurs tonight.

You're on the right track with this approach. To optimize it it,  I would set that same 1=0 logic on all your top level program folders. Then step through turning them on from the top down, either

  • Folders with many programs... or
  • Folders with time based programs in it

The PLM could be partially to blame, when flaky Insteon light activity happens out of the blue, it can be the sign of a failing PLM. How old is the PLM? 

However, the PLM and most Insteon devices do not have timers/clocks in them so it doesn't fit the 52 minute problem. The only Insteon device with a timer that I'm aware of is the timer key pad, where the buttons had different count down times on them... you don't have one of those do you? They're long out of production. EDIT. MS's have timers too, they are suspect as well

Edited by paulbates
Add MS as a timer device
Posted
46 minutes ago, paulbates said:

You're on the right track with this approach. To optimize it it,  I would set that same 1=0 logic on all your top level program folders. Then step through turning them on from the top down, either

  • Folders with many programs... or
  • Folders with time based programs in it

The PLM could be partially to blame, when flaky Insteon light activity happens out of the blue, it can be the sign of a failing PLM. How old is the PLM? 

However, the PLM and most Insteon devices do not have timers/clocks in them so it doesn't fit the 52 minute problem. The only Insteon device with a timer that I'm aware of is the timer key pad, where the buttons had different count down times on them... you don't have one of those do you? They're long out of production. EDIT. MS's have timers too, they are suspect as well

I assume by setting 1=0 on the top level folder (My Programs) that all programs are disabled, right?

I don't have any timer key pads but I do have multiple motion sensors, some of which are the older model. I guess I need to pull the batteries on all of them.

The PLM is pretty old also and every so often I have to reload the link table (from the ISY), but it seems to function ok.

Posted
31 minutes ago, arzoo said:

The PLM is pretty old also and every so often I have to reload the link table (from the ISY), but it seems to function ok.

That one is a bit of an OUCH.  Devices will sometimes loose their link tables if you have a voltage upset while the device (EEPROM) is being updated.

If you haven't had power glitches, and the device is still dumping it's memory, it's more than likely a failing power supply.

I'm not a fan of firing the replacement parts cannon.  Keep trying to isolate things, but remember this symptom.  You may get to a point where nothing else makes sense.

Posted (edited)
4 hours ago, arzoo said:

I assume by setting 1=0 on the top level folder (My Programs) that all programs are disabled, right?

Right, the suggestion was to move that down a level.. set all but one of the program folders (assuming you have your programs in folders) methodically let each folder run for whatever time you think is right and have no problems, the uncheck another... rinse and repeat (after removing 1=0 from the top level)

However, your other answer convinces me that the problem is not programs and barking up the wrong tree

4 hours ago, arzoo said:

I don't have any timer key pads but I do have multiple motion sensors, some of which are the older model. I guess I need to pull the batteries on all of them.

Can you get to all of the MS easily? You'll want to, one at a time, press and hold the set button and then check the options for each especially the timers... looking for 52 minutes or any value that looks suspicious. A factor reset and restore device is in order...

but hold on to that thought....

 

4 hours ago, IndyMike said:

I'm not a fan of firing the replacement parts cannon.  Keep trying to isolate things, but remember this symptom.  You may get to a point where nothing else makes sense.

+1 to what @IndyMike said. PLMs can suffer a long and agonizing death. Continuing to refresh the PLM's link table is a road to nowhere and likely explains the continued degreadation of ISY functioning. 

Here's my suggestion:

  1. Order/receive a new PLM
  2. Find an iox backup from before the most recent events happened
  3. Restore that backup
  4. Swap the old PLM with new, following the PLM replacement procedure
    1. Except... skip step 1, backup, that list, my guess is that the current state PLM/Links table is too far gone
    2. This will take a while and you have to individually do each wireless only device by placing it in set mode
  5. Look for any devices with the green "io" next to them and do a "restore device" until they're all clear.
  6. Test as best you can to address individual device problems. Factory resetting / restore device may be in order for some with corrupt data tables.
  7. Check all of the settings of each MS. My guess is the MS are ok, but have corrupt tables... hard to say
Edited by paulbates
  • Like 1
Posted
22 hours ago, paulbates said:

Here's my suggestion:

  1. Order/receive a new PLM
  2. Find an iox backup from before the most recent events happened
  3. Restore that backup
  4. Swap the old PLM with new, following the PLM replacement procedure
    1. Except... skip step 1, backup, that list, my guess is that the current state PLM/Links table is too far gone
    2. This will take a while and you have to individually do each wireless only device by placing it in set mode
  5. Look for any devices with the green "io" next to them and do a "restore device" until they're all clear.
  6. Test as best you can to address individual device problems. Factory resetting / restore device may be in order for some with corrupt data tables.
  7. Check all of the settings of each MS. My guess is the MS are ok, but have corrupt tables... hard to say

I think I may have a new PLM in my spares collection - otherwise I'll have to purchase. I do also have iox backups (after every program change I've made). But before I move forward with that, here's my latest debug results;

With all programs disabled (1=0 at top level folder), I don't think there's been any crazy stuff happening at 52 past the hour. If that's the case, then it would seem to imply that a program or programs are at fault. Or maybe a program interacting with a faulty device?

Next up would be to re-enable all programs and disable just a single folder to try and further isolate the issue as @IndyMike suggested. Or does it make more sense to just replace the PLM as @paulbates suggested?

The log I'm posting here includes data from the 7th at 12am thru this morning (about 32 hours). Looks like at 2:52am and 7:52am on the 7th this issue occurred. At around 8am on the 7th was when I disabled all the programs, and from then on it seems clear. Can anyone take a look at the log and confirm what I'm seeing? Thanks!

ISY-Events-Log.v5.7.1.txt

Posted

It really appears to be a program that's causing your problem. Before you start replacing anything you should disable blocks of programs to try to narrow down the possible offending program.

Keep in mind that if a program is disabled and another program calls up a statement in the disabled program, that statement will run even though the program is disabled.

  • Like 1
Posted

@arzoo,

A number of items from your log:

  1. No evidence of your motion sensors triggering.  The program that was calling scene 25H is no longer active.
  2. You appear to rest activity that is firing every 5 minutes.  I'm not sure what this is, and cannot say that it's related.
    Wed 08/07/2024 02:57:23 AM : Create REST U7 [/rest/profiles/ns/1/connection]
    Wed 08/07/2024 02:57:23 AM : Create REST U7 [/rest/profiles/ns/0/connection]
    Wed 08/07/2024 03:02:21 AM : Create REST U7 [/rest/profiles/ns/1/connection]
    Wed 08/07/2024 03:02:21 AM : Create REST U7 [/rest/profiles/ns/0/connection]

     

  3. I concur that you had some "events" @2:52 and 7:52.  I do not see and obvious trigger for the events.  There was no incoming Insteon traffic at that time.
  4. Disabling the programs did appear to work for the time period you specified.  I'm not sure that's long enough to be conclusive.

 

Aside from the above, I believe it was @paulbates (not me) that suggest you try disabling program sub folders to further isolate things.  I think that's an excellent idea.  I would just caution that you wait "long enough" with all programs disabled be sure that it's a program causing the issue. 

@Techman mentioned that disabled programs would still run if executed by another program.  This is exactly correct.  As always there is a BUT.  I you disable the program FOLDER, you cannot execute a program regardless of how it is called.

In other words, disabling a program folder is a safe and conclusive method to prevent it from running.

Lastly, I'm not sure if we ever discussed the "crazy stuff" that happens during the event.  Could you describe this a bit?  Is it always the same devices? 

There is some activity in the logs at the end of the event.  The ISY is sending out a series of On/Off commands to multiple scenes.  I don't see an obvious trigger for the event.

Posted
18 hours ago, IndyMike said:

@arzoo,

A number of items from your log:

  1. No evidence of your motion sensors triggering.  The program that was calling scene 25H is no longer active.
  2. You appear to rest activity that is firing every 5 minutes.  I'm not sure what this is, and cannot say that it's related.
    Wed 08/07/2024 02:57:23 AM : Create REST U7 [/rest/profiles/ns/1/connection]
    Wed 08/07/2024 02:57:23 AM : Create REST U7 [/rest/profiles/ns/0/connection]
    Wed 08/07/2024 03:02:21 AM : Create REST U7 [/rest/profiles/ns/1/connection]
    Wed 08/07/2024 03:02:21 AM : Create REST U7 [/rest/profiles/ns/0/connection]

     

  3. I concur that you had some "events" @2:52 and 7:52.  I do not see and obvious trigger for the events.  There was no incoming Insteon traffic at that time.
  4. Disabling the programs did appear to work for the time period you specified.  I'm not sure that's long enough to be conclusive.

 

Aside from the above, I believe it was @paulbates (not me) that suggest you try disabling program sub folders to further isolate things.  I think that's an excellent idea.  I would just caution that you wait "long enough" with all programs disabled be sure that it's a program causing the issue. 

@Techman mentioned that disabled programs would still run if executed by another program.  This is exactly correct.  As always there is a BUT.  I you disable the program FOLDER, you cannot execute a program regardless of how it is called.

In other words, disabling a program folder is a safe and conclusive method to prevent it from running.

Lastly, I'm not sure if we ever discussed the "crazy stuff" that happens during the event.  Could you describe this a bit?  Is it always the same devices? 

There is some activity in the logs at the end of the event.  The ISY is sending out a series of On/Off commands to multiple scenes.  I don't see an obvious trigger for the event.

I'm unsure what's causing the rest activity; maybe the UD Mobile app on my phone?

I agree, I may not have disabled all programs for long enough but the problem has been happening pretty consistently at least one time every morning at 12:52 or 1:52 or 2:52am, so an overnight test seems valid.

I re-enabled all but two folder programs (one for alarm stuff and one for alexa stuff) and the problem happened again, so I disabled the alarm folder (leaving only alexa folder enabled) and the problem did not occur over night. I realize these are very short test periods.

What I've done now is enable all programs folders except for the alarm programs. I'll continue to run in this configuration and see how it goes. I've already looked over the programs in that folder and I don't see how any of them could cause the x:52 activity, so that's still a mystery.

You asked what happens when the problem occurs - the crazy stuff; I have a set of programs that function as an addition to my actual house alarm system. Were the main house alarm to trigger, these programs do stuff like flash all the lights (inside and outside), turn on additional Insteon sirens, beep at all the keypads, and send text messages. The text messages were the first indication that something was not right - I started getting a text message (at 52 minutes past) that the alarm mode was disabled and the Insteon keypad in our bedroom would beep (waking me up). As you've mentioned, the log also indicates a bunch of other Insteon scenes and switches are turning on and off, but I've not actually seen any physical indication of this. There's also indication that IR commands are firing (from our home theater remote I send IR commands to the EISY to activate lighting scenes), but here again I've not seen any physical indication of this. And based on what you've said prior, it seems like there are no triggers for all these events when the problem occurs?

Posted

It appears to be one of your programs that runs hourly that's causing your issues, there's no other explanation.

Take a close look at each of your programs for your alarm

Posted (edited)

Does the event viewer (level 3) show when a program is run? I've manually kicked off a few programs and all I see in the event viewer is something like this:

Fri 08/09/2024 02:17:56 PM : [            Time] 14:18:14 0(0)
Fri 08/09/2024 02:18:26 PM : [            Time] 14:18:44 0(0)
Fri 08/09/2024 02:18:55 PM : [            Time] 14:19:14 0(0)

The reason I ask is because when sorting the Program Summary by Last Run Time, I see a program that ran at 2:18:03pm but there's no reason that program should have triggered. Looking at the event log, the only entries around that time are the ones above. The program ID is 00D3 which does not show up in the event log at all? So I'm really confused why this program ran?

In fact, my event viewer is full of these empty "Time" entries. What do they indicate?

Screenshot 2024-08-09 145425.png

Edited by arzoo
Posted (edited)
42 minutes ago, arzoo said:

Does the event viewer (level 3) show when a program is run? I've manually kicked off a few programs and all I see in the event viewer is something like this:

Fri 08/09/2024 02:17:56 PM : [            Time] 14:18:14 0(0)
Fri 08/09/2024 02:18:26 PM : [            Time] 14:18:44 0(0)
Fri 08/09/2024 02:18:55 PM : [            Time] 14:19:14 0(0)

The reason I ask is because when sorting the Program Summary by Last Run Time, I see a program that ran at 2:18:03pm but there's no reason that program should have triggered. Looking at the event log, the only entries around that time are the ones above. The program ID is 00D3 which does not show up in the event log at all? So I'm really confused why this program ran?

In fact, my event viewer is full of these empty "Time" entries. What do they indicate?

Screenshot 2024-08-09 145425.png

Unfortunately the answer is NO.  If a program is triggered by a Insteon device you can see the logic path the ISY uses in the IF section, and any device communication that is sent in the Then/Else section. 

The "time" entries are normally an indication that something is being executed on the ISY, but has not resulted in an Insteon communication.

Can you post one of the suspect programs?

As an example, if I manually execute the IF clause on the following program, it does not generate any Insteon communication.  It does show a time marker in the event viewer.  The same would be true when a program is triggered and executes the "else" path with no statements.

 

image.png.ae5ec507a4610b3244b777a35ea317ab.png

Edited by IndyMike
Posted
1 hour ago, IndyMike said:

Unfortunately the answer is NO.  If a program is triggered by a Insteon device you can see the logic path the ISY uses in the IF section, and any device communication that is sent in the Then/Else section. 

The "time" entries are normally an indication that something is being executed on the ISY, but has not resulted in an Insteon communication.

Can you post one of the suspect programs?

As an example, if I manually execute the IF clause on the following program, it does not generate any Insteon communication.  It does show a time marker in the event viewer.  The same would be true when a program is triggered and executes the "else" path with no statements.

Here's the program and log from when it ran last (4:26:50pm). There no reason this program should run. One of the motion sensors (54.55.53) in the IF did trigger right before, but the rest of the IF clause (time between 1am - 6am and the variable MBRKeypadBacklightIsOff=0) should resolve to false. So I'm totally confused.

Screenshot 2024-08-09 163515.png

Screenshot 2024-08-09 163655.png

Screenshot 2024-08-09 164150.png

Posted

Or maybe the program fires every time one of the motion sensors triggers and it's just running the empty ELSE?

  • Like 1
Posted
1 minute ago, arzoo said:

Or maybe the program fires every time one of the motion sensors triggers and it's just running the empty ELSE?

Bingo - your program triggered at 4:26 PM but evaluated to false.  It did not modify the backlight on your KPL.

Next run time is correctly shown as 1:00 am. 

  • Like 1
Guest
This topic is now closed to further replies.

×
×
  • Create New...