
Multiple switch events?



Hi,

 

I've got a SwitchLinc dimmer (v35) that works fine for local operation and when responding to scene commands. But a program I wrote which triggers on its ON and OFF control actions behaves as if there are spurious control messages. And looking at the switch in the Event Viewer seems to confirm this.

 

I'm a beginner when it comes to deciphering the Insteon messages. So can someone help me decode what's happening? Here is the event viewer output (level 3) for a single "ON" press of the dimmer paddle. The SWL is device 0F.AF.E6 and the ISY PLM is 0F.44.FC:

 

2009/03/28 15:04:01 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)
2009/03/28 15:04:01 : [standard-Group][0F.AF.E6-->Group=1] Max Hops=3, Hops Left=2
2009/03/28 15:04:01 : [   F AF E6 1]      DON   0
2009/03/28 15:04:01 : [   F AF E6 1]       ST 255
2009/03/28 15:04:01 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)
2009/03/28 15:04:02 : [standard-Group][0F.AF.E6-->Group=1] Max Hops=3, Hops Left=2
2009/03/28 15:04:02 : [   F AF E6 1]      DON   0
2009/03/28 15:04:02 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)
2009/03/28 15:04:02 : [standard-Group][0F.AF.E6-->Group=1] Max Hops=3, Hops Left=2
2009/03/28 15:04:02 : [   F AF E6 1]      DON   0
2009/03/28 15:04:02 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)
2009/03/28 15:04:02 : [standard-Group][0F.AF.E6-->Group=1] Max Hops=3, Hops Left=2
2009/03/28 15:04:02 : [   F AF E6 1]      DON   0
2009/03/28 15:04:03 : [iNST-SRX    ] 02 50 0F.AF.E6 0F.44.FC 41 11 01    LTONRR (01)
2009/03/28 15:04:03 : [standard-Cleanup][0F.AF.E6-->ISY/PLM Group=1] Max Hops=1, Hops Left=0

It sure looks to me like the SWL sends 4 DON messages for the single paddle press. DOF, DFON, and DFOF also show multiple entries.
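For anyone else trying to read these lines, here is a minimal Python sketch of how the raw bytes in an [iNST-SRX] entry can be split up. It assumes the flags-byte layout described in the public Insteon documentation (top three bits are the message type, then the extended bit, then hops left and max hops). The function names `decode_flags` and `decode_srx` are made up for illustration and are not part of the ISY.

```python
# Message-type names keyed by the top three bits of the flags byte
# (broadcast/NAK, group, ACK). Assumed from the public Insteon documentation.
MESSAGE_TYPES = {
    0b000: "direct",
    0b001: "direct ACK",
    0b010: "group cleanup",
    0b011: "group cleanup ACK",
    0b100: "broadcast",
    0b101: "direct NAK",
    0b110: "group broadcast",
    0b111: "group cleanup NAK",
}


def decode_flags(flags: int) -> dict:
    """Split an Insteon message flags byte into its fields."""
    return {
        "type": MESSAGE_TYPES[(flags >> 5) & 0b111],
        "extended": bool(flags & 0x10),
        "hops_left": (flags >> 2) & 0b11,
        "max_hops": flags & 0b11,
    }


def decode_srx(raw: str) -> dict:
    """Decode the byte string of an [iNST-SRX] log line (hypothetical helper).

    Expects the form '02 50 <sender> <target> <flags> <cmd1> <cmd2>'.
    """
    fields = raw.split()
    info = decode_flags(int(fields[4], 16))
    info.update(
        sender=fields[2],
        target=fields[3],
        cmd1=int(fields[5], 16),
        cmd2=int(fields[6], 16),
    )
    return info


if __name__ == "__main__":
    # The repeated group command and the final cleanup from the log above.
    print(decode_srx("02 50 0F.AF.E6 00.00.01 CB 11 00"))
    print(decode_srx("02 50 0F.AF.E6 0F.44.FC 41 11 01"))
```

Decoded this way, flags CB is a group broadcast with max hops 3 and hops left 2, and flags 41 is a group cleanup with max hops 1 and hops left 0, which agrees with the [standard-Group] and [standard-Cleanup] lines the ISY prints.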

 

For comparison, here's the event output for an "ON" press of another v35 SWL (0F.B3.59), which seems to send just the one expected message:

 

2009/03/28 15:05:56 : [iNST-SRX    ] 02 50 0F.B3.59 00.00.01 CB 11 00    LTONRR (00)
2009/03/28 15:05:56 : [standard-Group][0F.B3.59-->Group=1] Max Hops=3, Hops Left=2
2009/03/28 15:05:56 : [   F B3 59 1]      DON   0
2009/03/28 15:05:56 : [   F B3 59 1]       ST 255
2009/03/28 15:05:56 : [iNST-SRX    ] 02 50 0F.B3.59 0F.44.FC 41 11 01    LTONRR (01)
2009/03/28 15:05:56 : [standard-Cleanup][0F.B3.59-->ISY/PLM Group=1] Max Hops=1, Hops Left=0
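A rough way to compare the two switches without reading every line is to count the group commands each device sends. The sketch below assumes the Event Viewer output has been saved as text in the format shown above; the regular expression and the name `count_group_commands` are my own, not an ISY feature.

```python
import re
from collections import Counter

# Matches the byte string of an [iNST-SRX] line:
# sender, target, flags, cmd1, cmd2.
SRX = re.compile(
    r"\[iNST-SRX\s*\]\s+02 50 (\S+) (\S+) "
    r"([0-9A-F]{2}) ([0-9A-F]{2}) ([0-9A-F]{2})"
)


def count_group_commands(log_text: str) -> Counter:
    """Count group broadcasts (top three flag bits 110) per sending device."""
    counts = Counter()
    for match in SRX.finditer(log_text):
        sender, _target, flags, _cmd1, _cmd2 = match.groups()
        if int(flags, 16) >> 5 == 0b110:
            counts[sender] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "2009/03/28 15:04:01 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)",
        "2009/03/28 15:04:01 : [iNST-SRX    ] 02 50 0F.AF.E6 00.00.01 CB 11 00    LTONRR (00)",
        "2009/03/28 15:04:03 : [iNST-SRX    ] 02 50 0F.AF.E6 0F.44.FC 41 11 01    LTONRR (01)",
    ])
    for device, count in count_group_commands(sample).items():
        print(device, count)
```

Run against the two logs above, this should report four group commands for 0F.AF.E6 and one for 0F.B3.59. The cleanup lines are skipped because their flags byte (41) is not a group broadcast.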

The device links table for the "bad" device (0F.AF.E6), as compared to the ISY links table, looks like this:

 

[identical] 0FF8 : A2 00 0F.44.FC FE 1B 00
[identical] 0FF0 : E2 01 0F.44.FC 00 00 01
[identical] 0FE8 : A2 19 0F.44.FC FF 1C 01
[ignore] 0FE0 : 00 00 00.00.00 00 00 00

 

The device links check for the "good" SWL (0F.B3.59) yields output identical to the above, which surprises me. Note that both SWLs are only used for local operation, and in scenes triggered by the ISY.

 

Any insight? Do I have a bad SWL which sends multiple messages in error? Thanks!


Hi markens,

 

This is very interesting! I was aware of the fact that all RF devices send the same message twice (and we put a fix for it in ISY) but this I had never seen or heard. I am very much interested in hearing from others with this problem.

 

At the moment, I think it's your switch. Can you reboot (airgap) your switch and let me know whether or not you have the same issue. Also, would it be possible to move one of your AccessPoints and retry?

 

With kind regards,

Michel


Hi Michel,

 

At the moment, I think it's your switch. Can you reboot (airgap) your switch and let me know whether or not you have the same issue. Also, would it be possible to move one of your AccessPoints and retry?

I tried airgapping with no change. Then I removed the switch from the ISY, did a factory reset, and re-added it to the ISY. Also no change.

 

With a relatively small house, I'm not using access points in my network. I have a 2406H SignaLinc installed near my electrical panel.

 

I tried a few more things. The switch in question controls a light over my kitchen sink, and is the only other load on the garbage disposal circuit. So I tried temporarily connecting another SLD to that circuit (plugged in to the disposal outlet). The duplicate message problem reliably disappeared. In fact, airgapping either switch caused the remaining one to start sending duplicate messages.

 

So it's pretty clear there is a communications issue here. But I'm baffled by the log messages, and what they tell us about actual messages being sent. The hop count looks ok (received on the first hop), right?

 

The big question to me is why the switch is sending the duplicate messages, which appear to arrive several times per second. The ISY obviously receives each message ok because they're in the event log. And am I right that the message being sent is a scene (group) command which does not expect an ack until the cleanup is sent? If so, then that makes it even more baffling to me.

 

While I was at it, I plugged my test switch into another kitchen circuit which has no other insteon devices on it, and observed similar duplicate messages. All my other devices are on circuits which have multiple devices, so perhaps that's why I don't see problems with those switches.

 

Definitely appreciate more insight on this. Thanks much!

 

[edit to add:] Commands from the ISY all work reliably exactly as expected. Log entries for the problem switch match those to others for both individual device actions, and scene actions for which the switch is a member.


Thanks for the detailed response. Now that you mention it, SignaLincs are no longer supported and that might be the cause (I remember hearing that before).

 

Can you remove some or all of your SignaLincs and retry?

I can easily pop the breaker on the SignaLinc and try, except my ISY PLM is currently on the opposite phase from the problem switch.

 

Before I start moving things around: SH still has the 2406H listed for sale, and is described as Insteon compatible. So it seems strange that it would no longer be supported. Perhaps what you heard refers to a different SignaLinc model? I'm about to call SH anyway, so I'll ask them about it.


Yes, the 2406H should be fine - I believe Michel thought you were referring to the older RF-based SignaLincs.

 

I have a 2406H here for testing. I'm currently doing some other testing, however, and don't want to add too many variables to the mix - so the 2406H will probably have to wait a few weeks.


Yes, the 2406H should be fine - I believe Michel thought you were referring to the older RF-based SignaLincs.

I talked with SH, and they confirm the 2406H is ok.

 

The rep filled in some additional details for me, which I found interesting. In particular, the multiple messages from a SLD are not unexpected in this situation. It can easily send 5 messages in the first second (which is what I see) until it sees a response from any device, even one simply resending. So with no other devices on this circuit, the "closest" insteon device is relatively far away. All the messages get to the ISY ok (as seen in the log), but it takes long enough for an ack to be received that the SLD keeps sending.

 

When I put another SLD nearby, it provides the confirmation the originating switch needs to stop sending, even though it is not a participant in the requested scene.

 

Now I understand why someone on another thread suggested that plugging an unused LampLinc into a problem circuit may help communication issues. It certainly looks like it would help here.

 

Bottom line is that "normal" operation is fine, not affected by the duplicate messages. I only noticed it because my program attempts to detect a second ON (or OFF) action by checking status of the light in addition to the control press. I'll either add a spare device on that circuit, or change my program to take it into account.

 

As an aside, it seems very difficult to obtain technical details about the actual Insteon message protocol. Is this restricted to developers?

 

Thanks!

 

--Mark


Mark, you really had us scratching our heads. I even did the same experiment as you (kitchen appliance circuit) and saw the same results. I don't think I could have found another outlet in our house that would be that isolated.

 

Details about Insteon messaging can be found in the white papers at Insteon.net.

 

Thank you for the information,

Rand


Nowhere in the docs had I read that INSTEON switches keep sending till they get a response. I thought the whole purpose was to send, wait for an ACK, if not received, then blink for error.

This was what I thought, too, so I asked the SH rep for more clarification. There does seem to be a distinction between receiving a real ACK (which the switch must still get within a few seconds or it will blink), and simply seeing the original message repeated (which is sufficient to cause it to stop sending the message, assuming it will get through ok).

 

The log entry I originally posted had multiple commands followed by a single cleanup 2-3 seconds later. Presumably the ACK happens to the cleanup command, which then satisfies this requirement of the sending switch.

 

This actually all makes sense from a clustered reliability point of view. Normally, duplicates do no harm and so this protocol works fine under that assumption.

 

This is very interesting! I was aware of the fact that all RF devices send the same message twice (and we put a fix for it in ISY) but this I had never seen or heard.

Would it be useful to consider putting in some sort of duplicate detection for other devices as well? It looks like all duplicates are sent before the cleanup, so perhaps dups received before the cleanup could be logged and ignored? (Probably need a timeout in there as well in case the cleanup is not received.)
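To make the suggestion concrete, here is a minimal sketch of the kind of filter I have in mind. It is not ISY code: the class name `DuplicateFilter`, its methods, and the 3-second timeout are all assumptions for illustration (the timeout is only a guess based on the 2-3 second gap before the cleanup in my log).

```python
import time


class DuplicateFilter:
    """Ignore repeated group commands from a device until its cleanup
    arrives or a timeout expires. Hypothetical sketch, not ISY code."""

    def __init__(self, timeout: float = 3.0, clock=time.monotonic):
        self.timeout = timeout  # assumed value, in seconds
        self.clock = clock
        self._pending = {}  # (sender, group, cmd1) -> time first seen

    def accept_group_command(self, sender: str, group: int, cmd1: int) -> bool:
        """Return True if this group command should be acted on."""
        key = (sender, group, cmd1)
        now = self.clock()
        first_seen = self._pending.get(key)
        if first_seen is not None and now - first_seen < self.timeout:
            return False  # duplicate inside the window: log it and ignore
        self._pending[key] = now
        return True

    def cleanup_received(self, sender: str, group: int, cmd1: int) -> None:
        """A cleanup closes the window for this sender, group and command."""
        self._pending.pop((sender, group, cmd1), None)


if __name__ == "__main__":
    dup = DuplicateFilter()
    for _ in range(4):  # four identical DON messages, as in my log
        print(dup.accept_group_command("0F.AF.E6", 1, 0x11))
    dup.cleanup_received("0F.AF.E6", 1, 0x11)
    print(dup.accept_group_command("0F.AF.E6", 1, 0x11))  # a new press
```

The trade-off is that a genuine second press is also swallowed if it lands before the cleanup or the timeout, so the window would need to be kept short.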

 

Thanks,

--Mark


Hello Mark,

 

I'm having problems with the concept that the SLD can re-initiate a transmission within a current command window. The entire Insteon message hopping scheme is based on timed messages synced to the powerline zero crossing. Per the protocol, the sender is prohibited from retransmitting within the message hop and device response time frame.

 

If the SLD were to start a transmission, not hear a repeat, and then restart a new transmission it could create havoc. The SLD may not have heard a valid repeater due to line noise, etc. Restarting a new transmission would cause multiple transmitters to collide with differing information. That's not to say that SH couldn't have come up with a method to reset/retry within the transmission window. I just can't conceive of how they would implement this alongside legacy hardware.

 

What is possible, and within the protocol, is that your PLM is not hearing the group cleanup command. This command is being transmitted with one Hop and is not being registered by the PLM for the first 3 group commands. Since the SLD doesn't see the ACK from the PLM, it is implementing message retries per page 28 of the Insteon White Paper ("Insteon the Details").

 

The timing looks about correct for this scenario. Each Insteon standard message is sent using 5 packets of data. These packets are transmitted during 5 zero crossings of the AC carrier. One additional zero crossing of "quiet time" is included at the end of the transmission to allow RF devices to communicate. Total time for the 6 zero crossings is 0.05 sec (6 / 120 Hz).

 

Group Command - 0.05 sec

Hop 1 - 0.05

Hop 2 - 0.05

Hop 3 - 0.05

Group Cleanup - 0.05 (sent with max hop = 1, the PLM doesn't hear this initially)

Hop 1 - 0.05

Cleanup ACK - 0.05 (assume 1 hop)

Hop 1 - 0.05

 

Total time 0.4 seconds - assuming the plm didn't hear the Group cleanup, the SLD won't receive the cleanup ACK and will retry the above sequence up to 5 times.
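The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch of the estimate above: it assumes 60 Hz mains (120 zero crossings per second), and the slot names are just labels for the breakdown in this post.

```python
ZERO_CROSSINGS_PER_SECOND = 120  # 60 Hz mains crosses zero twice per cycle
CROSSINGS_PER_MESSAGE = 6        # 5 data packets plus 1 quiet crossing

# Time for one standard message slot.
slot = CROSSINGS_PER_MESSAGE / ZERO_CROSSINGS_PER_SECOND

# One slot per line of the breakdown above.
SLOTS = [
    "group command", "hop 1", "hop 2", "hop 3",
    "group cleanup", "cleanup hop 1",
    "cleanup ACK", "ACK hop 1",
]
total = slot * len(SLOTS)

print(f"one slot:          {slot:.2f} s")
print(f"one full sequence: {total:.2f} s")
print(f"five attempts:     {5 * total:.2f} s")
```

Five full attempts at 0.4 seconds each come to about 2 seconds, which is in the same range as the spread of timestamps in the original log.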

 

To summarize, the theory here is that your SLD is having problems communicating with the PLM when using 1 Hop transmissions. Have you tried a scene test with this device? I would think you would receive intermittent results.

 

IM


Hello IM,

 

Thanks for all the good info, which makes a lot of sense. I'm going to digest it (along with the white papers referred to above) and do some more tests.

 

Several immediate questions: Adding a second SLD at that location causes the duplicates to go away. In your scenario, why does this "fix" it?

 

Also, please help clarify some things in the log in my first posting above. If I interpret correctly, the ISY has received four identical group commands [LTONRR (00)] and then one cleanup [LTONRR (01)] directed to the ISY. It appears that each of the group commands arrived on the first hop (hops 3, hops left 2). But you're saying that, according to the protocol, there should be an additional cleanup command (that the ISY did not receive) for each of these group commands? Just want to be sure I understand your point.

 

To summarize, the theory here is that your SLD is having problems communicating with the PLM when using 1 Hop transmissions. Have you tried a scene test with this device? I would think you would receive intermittent results.

Yes, I've tried a scene test with this device with 100% success rate over 15 or so tests. The light also responds immediately 100% of the time when the scene is commanded via the ISY. Here is a sample scene test result:

 

2009/03/31 10:02:00 : [GRP-RX      ] 02 61 19 13 00 06 
2009/03/31 10:02:01 : [iNST-SRX    ] 02 50 10.94.F9 0F.44.FC 61 13 19    LTOFFRR(19)
2009/03/31 10:02:01 : [standard-Cleanup Ack][10.94.F9-->ISY/PLM Group=0] Max Hops=1, Hops Left=0
2009/03/31 10:02:01 : [iNST-SRX    ] 02 50 0F.BA.45 0F.44.FC 66 13 19    LTOFFRR(19)
2009/03/31 10:02:01 : [standard-Cleanup Ack][0F.BA.45-->ISY/PLM Group=0] Max Hops=2, Hops Left=1
2009/03/31 10:02:02 : [iNST-SRX    ] 02 50 0F.B3.59 0F.44.FC 61 13 19    LTOFFRR(19)
2009/03/31 10:02:02 : [standard-Cleanup Ack][0F.B3.59-->ISY/PLM Group=0] Max Hops=1, Hops Left=0
2009/03/31 10:02:02 : [CLEAN-UP-RPT] 02 58 06 
2009/03/31 10:02:02 : [iNST-SRX    ] 02 50 0F.AF.E6 0F.44.FC 61 13 19    LTOFFRR(19)
2009/03/31 10:02:02 : [standard-Cleanup Ack][0F.AF.E6-->ISY/PLM Group=0] Max Hops=1, Hops Left=0
----- kitchen all lights Test Results -----
[succeeded] kitchen sink light (F AF E6 1)
[succeeded] kitchen light bdoor switch (F BA 45 1)
[succeeded] kitchen light (10 94 F9 1)
[succeeded] kitchen passthru light (F B3 59 1)
----- kitchen all lights Test Results -----
2009/03/31 10:02:09 : [iNST-ACK    ] 02 62 00.00.19 CF 13 00 06          LTOFFRR(00)

 

Thanks again for the help,

--Mark


Mark,

 

I'm planning on trying to set up the same scenario to do some testing. I haven't seen this before.

 

Several immediate questions: Adding a second SLD at that location causes the duplicates to go away. In your scenario, why does this "fix" it?

 

Unfortunately, you blew me out of the water when you said that the scene test comes back 100%. I can't currently come up with a scenario where the PLM could hear the one hop scene test response and not hear the one hop group cleanup.

 

Also, please help clarify some things in the log in my first posting above. If I interpret correctly, the ISY has received four identical group commands [LTONRR (00)] and then one cleanup [LTONRR (01)] directed to the ISY. It appears that each of the group commands arrived on the first hop (hops 3, hops left 2). But you're saying that, according to the protocol, there should be an additional cleanup command (that the ISY did not receive) for each of these group commands? Just want to be sure I understand your point.

 

I think we're saying the same thing. I would expect to see a group cleanup after the first group command (as you posted in the "successful example"). My theory was that the SLD timed out waiting for an acknowledge to the first cleanup and began issuing retries.

 

Looking at the above, I can see a possible hole in my logic. Here's a variant (and probably where you were originally headed):

1) SLD issues the group command and does not see any message HOPs in response.

2) SLD reissues group command 3 times (still doesn't see message HOPs). This would be an "undocumented feature". Not a violation of the protocol since the group cleanup hasn't been issued yet.

3) SLD times out on retries and issues the group cleanup.

 

The above would fit your observations since adding another device to the line would allow the SLD to see a HOP. It would presumably follow immediately with the group cleanup. It would also explain why your scene test passes.
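The three steps above can be sketched as a toy simulation. This only models the theory in this post, not documented SLD firmware behavior; the function name `simulate_sld` and its parameters are invented for illustration.

```python
def simulate_sld(hops_heard, max_reissues=3):
    """Trace what the switch would send under the variant theory.

    hops_heard: one bool per transmission of the group command, True if
    the switch hears another device repeat (hop) that transmission.
    max_reissues: how many times the command is reissued (step 2 says 3).
    """
    sent = []
    for attempt in range(1 + max_reissues):
        sent.append("group command")
        heard = attempt < len(hops_heard) and hops_heard[attempt]
        if heard:
            break  # a hop was seen, so go straight to the cleanup
    sent.append("group cleanup")
    return sent


if __name__ == "__main__":
    # Isolated circuit: no hops heard, so four group commands then a cleanup.
    print(simulate_sld([False, False, False, False]))
    # Second device on the circuit: hop heard on the first try.
    print(simulate_sld([True]))
```

With no hops heard, the trace is four group commands followed by one cleanup, the same pattern as the "bad" switch log. With a hop heard on the first transmission it collapses to one group command and one cleanup, like the "good" switch.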

 

Maybe we'll call that a loophole in the protocol. Now I need to get my scope back so I can check this out...

 

Sorry if I led you down the wrong path,

IM


Mike, I think you hit the nail on the head.

 

If the device doesn't hear repeats it creates its own. Either way it's going to wait for the repeats to end before sending a cleanup. Happens even with a very old v24 SwitchLinc here.

 

Kind of a nice way to check if comm errors could occur at a particular location.

 

When I used the kitchen appliances circuit I saw 4-5 repeats, my laundry room circuit showed 2-3. I just added an ILL to a sump pump circuit and it sometimes has a slight delay coming on. Since it isn't a controller I can't check it that way, but I could have checked the outlet before I installed it. Not that it would have mattered, but I wouldn't have been surprised by spotty responses.

 

This could be very useful.

 

Rand


Very interesting discussion. I've now read the section on message hopping in the white paper, and that has filled in more knowledge gaps for me. The "creating its own repeats" scenario seems to be a good explanation for observed behavior. Although the white paper does explicitly say that the originator does not repeat its own messages. At least when others are repeating.

 

It'll be interesting to see what your scope analysis shows. Please keep us posted!

 

Thanks,

--Mark


Mark and Rand,

 

I plan to do some simulation testing on an isolated circuit this weekend to confirm. I should be able to plug my PLM and a SLD into an isolated circuit and observe the retry pattern (no repeaters).

 

I do have a question about the physical arrangement of your system. I have roughly 4500 sq feet on three levels and a modest number of devices (~45). My PLM is at the panel and I am currently running with only one accesspoint (passive coupler installed at the panel). I have some very long runs to bedrooms on the second floor with only one Insteon device on the circuit. Given this configuration, I can't find any devices that respond in the manner that you've documented.

 

Can you describe the physical arrangement (location of plm, accesspoints, phases) where you see this occurring?

 

Thanks,

IM


Mike, if you can't find a circuit like this I can only believe it is due to the Insteon coupler you have. I think your best test would be take out the coupler and add an AP. I wouldn't move anything.

 

We have 1780' two story + basement. When the house was built four circuits were allocated to lights and outlets, two for each floor with the basement sharing one and outside lights connected as convenient. There were eight dedicated circuits for appliances for a total of 200 Amps. The only 220 we have is the AC.

 

Being a rogue I have added four more breakers to the same box; basement, PCs, etc.

 

I installed an X10 bridge many years ago, a CP000, it didn't help X10 but it's still in place. It is piggybacked on two of the appliance breakers. I have two APs, one on the PLM. Each is on one of the first floor circuits. Most of my Insteon devices are on the first floor and the basement. The 2nd floor circuits have only a 4-way (multiple devices) and one InlineLinc which is intermittent. I can connect an Insteon switch to any of the appliance circuits and see multiple commands and the number of repeats varies depending on the circuit.

 

I have been reading and hearing good reports on the SignaLinc. I am very interested in your results.

 

Thank you,

Rand


Can you describe the physical arrangement (location of plm, accesspoints, phases) where you see this occurring?

My environment: 1500 sqft house, single level. 200A panel at front corner. SLD in kitchen, roughly diagonally opposite panel. Only loads on this circuit are this light and the garbage disposal. No access points in the house. 2406H passive phase coupler, about 6' from the panel on an old 220 circuit. PLM is on a circuit on the opposite phase, plugged in about 25' from the panel.

 

--Mark


Mark and Rand,

 

Thanks for the descriptions - these help. We have three rather varied setups (although all have passive couplers).

 

I spent some time playing last night, but was unable to simulate the problem. I placed my PLM and a KPL on a circuit filtered by a Filterlinc. What I found was that I was getting enough signal past the Filterlinc that "other" units were still repeating. I tried a number of tricks using 250' extension cords, incandescent loads, etc but was unable to isolate this circuit. Even with a Filterlinc between the PLM and my panel, I was able to operate the entire house reliably.

 

New action - take Filterlincs into work and test frequency response.

 

Given the fact that I'm able to drive through a Filterlinc, I was again wondering how it is that your devices can't communicate with "other" repeaters. This morning it dawned on me (no pun intended) that you were both working with kitchen circuits. If these are GFCI protected you may be getting signal loss through the breaker. Years ago I did a lot of X10 testing with GFCIs searching for a make that would not overly attenuate the signals. I wound up with Leviton (previous house - don't have any numbers handy).

 

My new house has a number of different makes of GFCI breakers. One in particular (bsmt) does a real number on X10/Insteon. I'll try to make some measurements and get back with you.

 

IM


Mike, the PLM is an Insteon repeater. I suggest you leave the PLM in its normal location when connecting the KPL to another circuit.

 

No GFCI breakers here. I don't believe my passive coupler does anything. The 2406H is an Insteon repeater so I'm not sure why Mark's SL repeats.

 

Rand


No GFCI breakers here, either. I do have another kitchen circuit with local GFI protection for it and several downstream outlets. I replaced one of them with an OutletLinc, which works fine.

 

The 2406H is an Insteon repeater so I'm not sure why Mark's SL repeats.

Are you sure it repeats? I thought it was passive as well. At any rate, there is no Insteon address sticker on the device.

 

But this is an interesting thought. It turns out that all three circuits I've noticed any "self repeats" from are on the opposite phase from the PLM. Perhaps this is a factor after all.

 

--Mark


I wouldn't have thought the 2406H was an Insteon repeater either, but according to the spec sheet it is:

 

http://www.smarthome.com/2406H/SignaLin ... red/p.aspx

 

I wouldn't be surprised if that was just a typo though.

 

That's what I was looking at. The Quick Start Guide does say it is passive. Without a neutral I suppose it could not be a repeater. Sorry for the mis-information.

 

Rand


Rand,

 

You are of course correct. Somewhere along the road, I developed the impression that "respondents" to direct commands did not repeat the incoming command. I can't find this anywhere in the white paper and don't remember where I developed this idea. Is this your understanding as well?

 

In any case, since the SLD is putting out a group command, I would expect the PLM to repeat the transmission. That presents a problem to our theory.

 

Looking at the above, I can see a possible hole in my logic. Here's a variant (and probably where you were originally headed):

1) SLD issues the group command and does not see any message HOPs in response.

2) SLD reissues group command 3 times (still doesn't see message HOPs). This would be an "undocumented feature". Not a violation of the protocol since the group cleanup hasn't been issued yet.

3) SLD times out on retries and issues the group cleanup.

 

Since the PLM is registering the group command, it should also be repeating it. The SLD should be able to hear these repeats (hops) since the device successfully passes the scene test.

 

The alternative is that the PLM is repeating the group command and not hearing the direct cleanup. In this case the SLD would presumably time out waiting for an acknowledge and then retry. I can't explain why the PLM would be unable to hear the group cleanup if the SLD passes scene tests.

 

I still can't replicate this anomaly. If I leave my PLM at the panel, many devices hear the group command and all of them sound off during the "hopping" time frame. This occurs with the SLD on either phase and behind a filter.

 

Haven't given up yet, just rethinking the problem statement. Either of you live near South Bend IN? This would be a lot easier if we could just put a scope on your systems.

 

Mike, the PLM is an Insteon repeater. I suggest you leave the PLM in its normal location when connecting the KPL to another circuit.

 

No GFCI breakers here. I don't believe my passive coupler does anything. The 2406H is an Insteon repeater so I'm not sure why Mark's SL repeats.

 

Rand


Archived

This topic is now archived and is closed to further replies.

