ELA Posted June 3, 2013

I have been working on a strange Insteon happening for some time now, revolving around a certain circuit where I use an APL for a fluorescent lamp load. The strange results happen even when the light is off (or unplugged), so let's ignore the load for now. I have had two previous APLs go bad over time in this circuit and returned them. The latest APL is an i2CS device, R4.2 firmware, R4A hardware.

When I tried to add the new device to the ISY it failed to add twice in a row. In the level 3 log I could see the ISY was not getting a response. I moved the APL closer to the PLM and it added properly. I deleted the APL from the ISY, then moved the APL back to the original problem location, and now it added from that location just fine. It looked like it already knew the engine type for the device now and that helped it along.

To take a closer look at what was happening I used Docklight to send the extended "2F" read data command to request the first link record. This was the command that seemed to be giving the exchange problems.

Here is a post of the Docklight record for one PLM->APL exchange:

Note how there are 6 responses from one request. I included the times so that you can see this was a really lengthy exchange.

Here is the scope trace of this exchange:

I figured the extra messages were the result of a marginal comm issue and confirmed that in an isolated network I got better results, yet there was still something strange.

Here is the response log from an ELAM (sent using zero hops):

I am guessing it is normal for the device to respond first in a standard message form followed by the extended response, but with 1 hop max?

Here is the scope trace for this exchange:

Anyone know why the two standard message hops appear at the end? I am consistently seeing these even in an isolated network with excellent comms. Seems very strange how Insteon uses extended messages & i2CS to improve things and then spews standard messages in amongst them to screw it up?

After I added the APL to the ISY it is working fine with the standard length messages required for scenes and direct console controls. I just find the extended message exchanges confusing and disconcerting.

I recognize there is some anomaly in the circuit this device operates on. Signal levels are all above 1Vp-p so it is not an amplitude issue, but there is some sort of resonance or phase shift timing issue going on there. I am taking a closer look at that.

I am most curious about how the exchange goes on an isolated network. Why send both standard and extended responses and then tack on a couple of rogue standard packets on the end?

This is a preliminary look and I will be doing more testing in an attempt to understand this better (I hope). In the meantime I wanted to put it out there for input from others who may know more about i2CS methods.

I can readily duplicate the marginal comms by sending from the PLM through a FilterLinc on an isolated network. This causes the APL to respond 3, 4, 5 or 6 times.

LeeG Posted June 4, 2013

"Anyone know why the two standard message hops appear at the end? I am consistently seeing these even in an isolated network with excellent comms. Seems very strange how Insteon uses extended messages & i2CS to improve things and then spews standard messages in amongst them to screw it up?"

Don't understand the query about standard messages. There are five Extended messages issued (cannot explain the last two), with the first three as expected per very old documentation. The first Extended message was sent with Max Hops=1 Hops Left=0. Since no ACK was sent in response, the second Extended message was issued with Max Hops=2 Hops Left=1. No ACK was sent for that, so the third Extended message was sent with Max Hops=3 Hops Left=2. Why the next two Extended messages were sent with Max Hops=3 Hops Left=2 is unknown. Since SmartLabs does not document this stuff, perhaps I2CS has expanded the number of attempts when an ACK is not received.

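The retry pattern described in this post can be sketched in a few lines of Python. This is only an illustration of the sequence seen in the trace, not SmartLabs' documented algorithm: the function name `retry_hop_schedule` and the cap of 3 hops are assumptions, and Hops Left is modelled as one less than Max Hops simply because that is how the received messages appear above.

```python
def retry_hop_schedule(attempts, hop_limit=3):
    """Return (max_hops, hops_left) as seen at the receiver for each attempt.

    Each unacknowledged attempt bumps Max Hops by 1 until hop_limit is
    reached; further attempts stay at the limit (assumed behaviour).
    """
    schedule = []
    for attempt in range(attempts):
        max_hops = min(attempt + 1, hop_limit)
        hops_left = max_hops - 1  # value observed in the trace, not at the sender
        schedule.append((max_hops, hops_left))
    return schedule


if __name__ == "__main__":
    for max_hops, hops_left in retry_hop_schedule(5):
        print(f"Max Hops={max_hops} Hops Left={hops_left}")
```

With five attempts this reproduces the five Extended messages in the trace: the first three escalate and the last two repeat at Max Hops=3.
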
ELA Posted June 4, 2013 Author

Thanks for your insights LeeG. I will try to keep the two different tests separate.

In the first full network test: I understand that you are saying that 3 retries are expected with increasing max hops. The last two are not understood. Please help me along with your comment about "no ACK was sent in response the second Extended message". These are response messages to an original request from the PLM. Are ACKs expected for a response message from the controlled device? I did not think they were.

I compare this to a direct command to turn on a device. The controller expects an ACK that the device acted on its control. But the controller does not then ACK the ACK? Why are there no ACKs to the response msg in the "isolated network" either? Or are those two at the end ACKs?

In the second test in an isolated network: the scope shows two groupings of 5 packets each, separated by a space. This is what I was referring to as two standard messages. Unless this was intended to be an extended message but they forgot the middle packet? It is very difficult to see in the scope traces with more hops and more return messages, but the same two "standard groupings of 5 packets" appear in all these tests.

LeeG Posted June 4, 2013

The 02 50 with the ACK flag byte is the ACK from the APL, ACKing the 2F command. The next 02 51 message is not an ACK (no ACK bit in flag byte). It is an Extended message from the APL containing the requested 2F information. I have always assumed this Extended message had to be ACKed by the PLM. With the DockLight trace being serial messages only, it would not show the ACK going out over the powerline from the PLM to the APL.

Another assumption: since the APL repeated the 02 51 Extended message with the Max Hops bumped by 1, the APL did not receive an ACK, so it is repeating the Extended message with a bump in Max Hops count.

I'm trying to think of some way to exchange the same message sequence but with an actual PLM to PLM. Something like using another application rather than an actual device, so that the ACK would appear as a received ACK message which would be traced.

As far as the scope data, an ACK would be a standard message. At least I've never seen an ACK that was an extended message that I can remember. That would leave the second extended message labeled (Ext response hop#1) an unknown.

ELA Posted June 4, 2013 Author

Thanks for your help in this LeeG. I greatly appreciate your knowledge of the details (learned the hard way).

What you are saying makes sense if I understand correctly. The PLM is acknowledging receipt of the extended response msg at a low level such that it is not passed up to the host.

Unfortunately I have tried several PLMs during my development and they did not support the 0x2F command. I know you have expressed this in the past, and it sure would be nice to have a true, independent data line monitor for Insteon.

I am going to try one other method of monitoring to see if I can capture the end message/s data. In the meantime I will assume they are an acknowledgement of the extended response msg (PLM->APL). The packet arrangement does make sense to me under this assumption. Everything is as shown in the scope trace (two extended response packets since it was sent with a 1:1). Then an assumption that the final ACK from PLM to APL was also sent with a 1:1?

On a side note: I am seeing some really strange behavior from a new test I have implemented in an attempt to analyze a circuit's impedance/phase shift/data corruption at 131 kHz. I will post on that later once I get some more data.

Thanks again for your help LeeG .... Please don't ever leave us to fend for ourselves.

LeeG Posted June 4, 2013

The same basic flow is seen with a simple paddle press using standard messages. The device sends a Group Broadcast followed by a Group Cleanup Direct to each Responder. Note that the Group Cleanup Direct is initially sent with Max Hops=1 because that results in the greatest powerline speed. If the ACK from the PLM as Responder did not make it back to the switch, the Group Cleanup Direct would have been resent with the Max Hops bumped by 1. Also note that the ACK from the PLM to the switch is not in the trace because it is an outbound ACK back to the Controller switch. The PLM passes only inbound ACKs, received in response to outbound messages, up to the application.

Tue 06/04/2013 09:25:26 AM : [iNST-SRX ] 02 50 15.B2.6A 00.00.01 CB 11 00 LTONRR (00)
Tue 06/04/2013 09:25:26 AM : [std-Group ] 15.B2.6A-->Group=1, Max Hops=3, Hops Left=2
Tue 06/04/2013 09:25:26 AM : [ 15 B2 6A 1] DON 0
Tue 06/04/2013 09:25:26 AM : [ 15 B2 6A 1] ST 255
Tue 06/04/2013 09:25:26 AM : [iNST-SRX ] 02 50 15.B2.6A 22.80.0B 41 11 01 LTONRR (01)
Tue 06/04/2013 09:25:26 AM : [std-Cleanup ] 15.B2.6A-->ISY/PLM Group=1, Max Hops=1, Hops Left=0
Tue 06/04/2013 09:25:26 AM : [iNST-DUP ] Previous message ignored.

I believe it is true that the ACK sent by the PLM to the switch would have the same Max Hops count as the received inbound message (Max Hops=1 in the inbound message and Max Hops=1 in the ACK).

In very rare cases the Group Cleanup Direct can be received by the PLM but the Insteon mesh network does not get the ACK back to the device. The Insteon network works better in one direction than the other. In these cases the additional Group Cleanup Direct messages with Max Hops bumped by 1 will be traced, because the messages are received but the ACKs are not being received by the Controller. There are cases like this posted on the UDI forum but I could not find an example without a more extensive search. When an image is posted the search does not find data within the image.

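The Max Hops and Hops Left values in the log lines above can be recovered from the flag byte (the byte just before the two command bytes). The sketch below assumes the flag layout from the older public Insteon developer material: bit 7 broadcast/NAK, bit 6 group, bit 5 ACK, bit 4 extended, bits 3-2 Hops Left, bits 1-0 Max Hops. The names `Flags` and `decode_flags` are illustrative only. The assumed layout is at least consistent with the two messages in the log: CB decodes to Max Hops=3, Hops Left=2 and 41 decodes to Max Hops=1, Hops Left=0.

```python
from typing import NamedTuple


class Flags(NamedTuple):
    broadcast_or_nak: bool
    group: bool
    ack: bool
    extended: bool
    hops_left: int
    max_hops: int


def decode_flags(flag_byte: int) -> Flags:
    """Split an Insteon flag byte into its fields (assumed bit layout)."""
    return Flags(
        broadcast_or_nak=bool(flag_byte & 0x80),
        group=bool(flag_byte & 0x40),
        ack=bool(flag_byte & 0x20),
        extended=bool(flag_byte & 0x10),
        hops_left=(flag_byte >> 2) & 0x03,
        max_hops=flag_byte & 0x03,
    )


if __name__ == "__main__":
    for raw in (0xCB, 0x41):  # flag bytes from the Group Broadcast and Cleanup lines
        print(f"{raw:02X}: {decode_flags(raw)}")
```
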
ELA Posted June 4, 2013 Author

Thanks LeeG,

That all makes sense. The ACK must be something special for certain message types, in this case extended messages. Take the example of the standard Get Engine message. The PLM requests the Engine used. The device responds with the information requested. No ACK is then issued from the PLM back to the device. At least none appears in the O'scope trace. Of course if the PLM does not receive the requested info, it will retry and ask 3 more times.

I totally understand what you are saying about the network being very directional when marginal. I see this all the time in my testing. In this particular APL circuit that is very much the case.

WILL THE FUN NEVER END!

I had set up a test with two PLM testers & the APL, one PLM as monitor only (in monitor mode, with all link records required), to attempt to capture the suspected ACK messages. As has been usual when I tried monitor mode, no joy.

Then as I was handling my newest ELAM#3 tester I heard a crackling sound, as the short extension cord to it had become loose. After I tightened the connection, that ELAM's transmit signal levels had dropped from 4-5Vp-p to 2Vp-p!!! I disconnected all loads except the O'scope (isolated network by FilterLinc). Where levels should have been greater than 5Vp-p at this point, they were still only 2Vp-p. A little while later they dropped to 100mVp-p!

Appears I have a new project to side track me. It's always somethin' .... Since I modify the PLM in building an ELAM, I have no choice but to diagnose and repair this one myself.

My poor sweet ELAM the 3rd.

LeeG Posted June 4, 2013

You are absolutely right about the Query Insteon Engine. The Engine identifier is returned to the application (ISY) in the ACK response, so an additional message from the device is not necessary. Note the ACK bit is on in the flag byte, as this is a basic ACK message of the Query Insteon Engine command. The actual Engine identifier is returned in the cmd2 field of the basic ACK message.

Tue 06/04/2013 11:11:01 AM : [iNST-SRX ] 02 50 1C.FD.D5 22.80.0B 2B 0D 02 (02)

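To make the ACK above easier to read, here is a minimal Python sketch that pulls the fields out of a 02 50 (standard message received) line. The function name `parse_standard_received` is hypothetical, and the `ENGINE_NAMES` mapping of cmd2 values to engine names (00 = i1, 01 = i2, 02 = i2CS) is an assumption that the thread does not confirm; only the position of cmd2 and the ACK bit come from the post.

```python
# Assumed mapping of the Engine identifier in cmd2; not confirmed in this thread.
ENGINE_NAMES = {0x00: "i1", 0x01: "i2", 0x02: "i2CS"}


def parse_standard_received(text: str) -> dict:
    """Parse '02 50 <from> <to> <flags> <cmd1> <cmd2>' as logged by the ISY."""
    tokens = text.replace(".", " ").split()
    if len(tokens) < 11:
        raise ValueError("expected at least 11 bytes")
    data = [int(tok, 16) for tok in tokens[:11]]  # ignore any trailing '(02)'
    if data[:2] != [0x02, 0x50]:
        raise ValueError("not a 02 50 standard message")
    flags, cmd1, cmd2 = data[8], data[9], data[10]
    return {
        "from": ".".join(f"{b:02X}" for b in data[2:5]),
        "to": ".".join(f"{b:02X}" for b in data[5:8]),
        "ack": bool(flags & 0x20),  # ACK bit in the flag byte
        "cmd1": cmd1,
        "cmd2": cmd2,
    }


if __name__ == "__main__":
    msg = parse_standard_received("02 50 1C.FD.D5 22.80.0B 2B 0D 02 (02)")
    engine = ENGINE_NAMES.get(msg["cmd2"], "unknown")
    print(msg["from"], "ACK" if msg["ack"] else "no ACK", "engine:", engine)
```
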
ELA Posted June 4, 2013 Author

LeeG,

That is exactly why I do not understand the "0x2F" extended command implementation. Why not turn on its ACK flag bit and call it good? Why send a standard 0x2F response first (with no data of any use), followed by an extended 0x2F response (with the data), and then top this all off with the need for the PLM to send a separate ACK message? Makes no sense to me, granted I have limited vision.

Set the ACK bit in the extended response (with data) and call it good. If the PLM does not receive it, it would retry 3 times. Seems like this is all slowing down the comms and not making anything any better, but possibly worse.

I know I am not going to change anything by complaining. Just have to live with it as it is.

ELA Posted June 4, 2013 Author

Just noticed that the "D5 of extended data" bytes of the two examples were different. I first noticed different checksums and then went looking. No clue what D5 is telling me. The old 2007 doc I have says unused. Sure would be nice to know what that might mean.

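A differing D5 byte would be enough on its own to explain the differing checksums. As a rough sketch, assuming the commonly described i2CS scheme (not confirmed anywhere in this thread) in which D14 holds the two's complement of the 8-bit sum of cmd1, cmd2 and D1 to D13, any change in a data byte changes the checksum. The function name `i2cs_checksum` and the D5 value used in the example are hypothetical.

```python
def i2cs_checksum(cmd1: int, cmd2: int, user_data: list) -> int:
    """Two's complement of the 8-bit sum of cmd1, cmd2 and D1..D13 (assumed scheme)."""
    if len(user_data) != 13:
        raise ValueError("expected D1..D13 (13 bytes)")
    total = cmd1 + cmd2 + sum(user_data)
    return (-total) & 0xFF


if __name__ == "__main__":
    blank = [0x00] * 13
    changed = list(blank)
    changed[4] = 0x01  # hypothetical non-zero D5, for illustration only
    print(f"D14 with D5=00: {i2cs_checksum(0x2F, 0x00, blank):02X}")
    print(f"D14 with D5=01: {i2cs_checksum(0x2F, 0x00, changed):02X}")
```
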
ELA Posted June 6, 2013 Author

Good news: ELAM #3 is alive and well!

CAUTION: Do not use extension cords with Insteon devices that do not fully engage the contacts of the AC plug.

Normally the blade contacts on an AC plug are about 0.65" long. The receptacles on the cord I was using were set back deep into the rubber of its housing, such that the first 0.5" of the blade was not engaged with the receptacle (metal to metal). This left only about 0.1-0.15" of metal to metal contact. Thus it was very easy for the plug to become intermittently connected. This is what happened to my poor ELAM. The PLM portion of the ELAM did not take kindly to the arcing produced when the plugs were poorly mated. A real shame, since I paid $8 for a 3ft extension cord that looked to be very sturdy and well built. I have noticed that a lot of extension cords nowadays have really cheaply designed receptacles.

I started this thread because I have had terrible luck with APLs in one location in my network. I am still investigating what the issue may be, but one thing is clear: APLs have an especially hard time in this location. An i2CS APL using extended commands has an even harder time. Luckily the extended commands are only required during linking.

Now that the ELAM is working again I wanted to highlight something I have noticed on every APL I have ever tested. None of them simulcast very well. They stand out above all others I have tested as poor simulcast performers.

Here is a trace of an isolated network with just the PLM talking to the APL with 1 hop. The trace only shows the original PLM send and the 1st hop. You can see how level the PLM send packets are vs. the 1st hop, where the PLM and APL attempt to simulcast together. A miserable effort!

This cannot be blamed on phase shift or loading since it is an isolated network with 5 ft between devices.

Brian H Posted June 7, 2013

ELA,

I am probably barking up the wrong tree here. In My Computer, try clicking on the APL and selecting Advanced. See what it reports about setting PLM communications.

IndyMike Posted June 7, 2013

ELA,

Once again it would appear that we are leading parallel lives. I'll agree that the APLs aren't the best at simulcasting I2. In fact, I can't call any of my V4.X or V5.X devices "good" at simulcasting a PLM transmission (see traces below). As you stated, I do not believe this has anything to do with line loading, or loading induced phase shift. I do believe it is due to the differences in the zero crossing detection circuit used in the various devices.

My reasoning:

1) PLM to PLM simulcasting has always been excellent (my experience). The older PLMs (2412S that I have schematics for) use a different zero crossing scheme than the older wire in/plug in modules (again, that I have schematics for).

2) PLM to V1.0 Accesspoint simulcasting is likewise excellent. The old V1.0 Accesspoint used the same zero crossing detector as the PLM.

3) PLM to V4.X or V5.X simulcasting is not the greatest. This may explain why adding an Accesspoint helps in boosting signals into problem circuits while "other devices" on the same circuit do not.

4) Simulcasting between V4.X and V5.X devices appears to be good. I'll be rash and ASSume that these devices use similar zero crossing detectors.

5) The bright spot: my V6.2 SWL relays (I2CS dual band) appear to simulcast I2 "very well". Not quite as good as PLM to PLM or PLM to ACP, but a definite improvement. This does not appear to be across the board, since I have a V6.0 KPL relay (dual band) that does not fare as well.

The only I2CS units that I have are the V6.2 SWL relays. I do not know if the "improved" simulcasting that I'm seeing on these devices is associated with the hardware revision or the I2CS implementation. I would be very interested if you have recent units that exhibit the improvement.

The following plots were generated using a 1 Hop I2 transmission (responder removed) on an isolated circuit.

ELA Posted June 7, 2013 Author

Hello Brian,

Would you please clarify for me your comments about the APL (ApplianceLinc) settings in My Computer? I did not follow.

Brian H Posted June 7, 2013

I was mostly just curious. The later versions of the ISY firmware have an Advanced choice when you right click a device in My Lighting, with a PLM Communications setting. All of my older devices say "Device does not support PLM communications settings". Was just wondering if there was a change with the latest APL and maybe other devices.

LeeG Posted June 7, 2013

The device has to present an i2CS engine for that option to be available. Would think a new APL is i2CS but do not have a new one to confirm.

ELA Posted June 7, 2013 Author

Hello IndyMike,

They may be parallel, but I am sure some reading these threads may question if it is really living.

Very nice to have the opportunity to compare notes with you once again. Yes, my PLM to PLM simulcasts have always been the favorite. Quite impressive really, as they attain 8Vp-p together and consistent levels from packet to packet. I have not attempted to associate better simulcasting with certain hardware or firmware versions. I appreciate your presence in these more technical threads, as I am afraid I may bore a few with too many details that I find interesting.

For the purpose of this thread about a problem circuit location with an APL on it: I felt that I had excluded all signal sucker and noise possibilities on this circuit. The received signal strength at the APL is greater than 1Vp-p, yet it is not "seeing" the message as valid and thus not responding all the time. The two previous older APLs were clearly defective, in that I measured that their transmitter levels had deteriorated. This latest one is working very well with standard length messages (for now). It is only the extended ones it is failing on thus far.

You and I discussed possible resonant conditions a little bit in the past, and I believe you were a bit skeptical that there might be resonant conditions in a home install? The calculations support the concept, and in practice I have seen more than one circuit where simply adding additional capacitance to the line drastically changes the response characteristics.

My newest addition to the ELAM adds the capability to send only 3 sine wave cycles at 131 kHz into the line. The purpose is to more accurately measure the phase angle between the voltage and current being transmitted, in an attempt to better understand possible resonant conditions. I am presently collecting data for comparisons to see if this feature is worthwhile or not.

In my first attempt at using this feature, on the circuit of this thread, I got a completely unexpected result: when sending only 3 cycles (less than the 10 required to represent just one bit of Insteon signal) I am seeing one or more devices responding with a full two hop message! Say what !!!!

I probably should not say full two hop message. I am seeing what appears to be two full 5 packet groupings. I intend to investigate this further and hope to track down which device/s are generating this response. Of course, since we do not have access to all the latest Insteon details, who can say just how strange this is? I cannot imagine why I would get this response from just 3 cycles of carrier at 131 kHz. Does that amaze anyone else, or am I missing something obvious?

Update: I have not experienced this strange response in subsequent testing as of yet. Thus I have to assume that it may have been due to how I set up the test. Even though the power line comm was only sending 3 cycles at 131 kHz, the RF section may have been left enabled (it was intended to be turned off) and thus a full message may have been RF transmitted. I cannot say for sure either way unless I experience this again down the road. The intent was to send only the first 3 cycles of a broadcast message with 0 hops on PLC only. I fear I may have had the configuration set up incorrectly on my first attempts, unless this should reappear.

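For scale, the arithmetic behind "3 cycles versus the 10 required for one bit" can be sketched as follows. The carrier value is the round 131 kHz figure used in this thread and the 10 cycles per bit is taken from the post above; the function name `burst_duration_us` is illustrative only.

```python
CARRIER_HZ = 131_000   # round figure used in the thread
CYCLES_PER_BIT = 10    # cycles of carrier per Insteon bit, per the post above


def burst_duration_us(cycles: int, carrier_hz: float = CARRIER_HZ) -> float:
    """Duration in microseconds of a burst of `cycles` carrier cycles."""
    return cycles / carrier_hz * 1e6


if __name__ == "__main__":
    ping = burst_duration_us(3)
    one_bit = burst_duration_us(CYCLES_PER_BIT)
    print(f"3-cycle ping: {ping:.1f} us")
    print(f"one bit:      {one_bit:.1f} us")
    print(f"ping is {ping / one_bit:.0%} of a single bit time")
```

So the ping lasts roughly 23 microseconds against roughly 76 microseconds for a single bit, which is why a full message in response to it looked so unexpected.
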
Brian H Posted June 7, 2013

Sorry if I may have gone off center of the thread here.

You are correct Lee. I found an I2CS I/OLinc in my I2CS Developers Hardware Kit, and the ability to set retries from none to nine was available for that module. Not sure if it makes a difference to what ELA and others are observing.

Xathros Posted June 7, 2013

ELA, IM, Brian, LeeG-

I for one have never been bored by anything you guys have posted here. I very much enjoy following the research you all have been doing. My approach to problem solving comm issues has changed for the better based on what I have learned from you all. I only wish I had the necessary tools and knowledge to assist with this research.

-Xathros

TJF1960 Posted June 7, 2013

I agree with Xathros and couldn't have said it better myself. Thanks for all of your troubleshooting and for sharing all of the results.

Tim

ELA Posted June 7, 2013 Author

Thank you for the positive feedback TJF1960 and Xathros...

Not off topic at all Brian. Your comments are always helpful. I was just not understanding since I am stuck at an older ISY version.

Can you please tell me if the newer versions of ISY allow you to set the number of retries on the APL itself? If so, then I am guessing the default might be 5, and that is why I see 5 extended msgs? Perhaps this is built into the new i2CS stuff, in the new device engines, for them to retry rather than have the host->PLM perform the retries?

LeeG Posted June 7, 2013

If the new APL is i2CS the retry count can be changed. The value being changed is what would be the On Level in a Responder link record. For i2CS devices the Controller link record On Level field (which had no use in a Controller link record before i2CS) indicates the number of Group Cleanup Direct retries. The ISY normally writes an FF in that field, which shuts off all Group Cleanup messages. With the advent of the new Advanced | PLM Communication options, the On Level field can be set to a number representing the number of Group Cleanup Direct retries. The No Retries selection writes an FF in the On Level field.

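As a minimal sketch of the encoding described in this post, the helpers below map a retry setting to the byte written in the Controller link record's On Level field and back. The function names are hypothetical, and the 1 to 9 range is an assumption based only on the "none to nine" choices reported for an I2CS I/OLinc in this thread; other devices may differ.

```python
from typing import Optional

NO_RETRIES = 0xFF  # value the ISY writes for the "No Retries" selection


def encode_cleanup_retries(retries: Optional[int]) -> int:
    """Return the On Level byte for a retry count (None means no retries)."""
    if retries is None:
        return NO_RETRIES
    if not 1 <= retries <= 9:  # assumed range, from the I/OLinc report
        raise ValueError("retry count expected in 1..9")
    return retries


def decode_cleanup_retries(on_level: int) -> Optional[int]:
    """Return the retry count, or None when Group Cleanup messages are off."""
    return None if on_level == NO_RETRIES else on_level


if __name__ == "__main__":
    for setting in (None, 1, 9):
        byte = encode_cleanup_retries(setting)
        print(f"setting={setting} -> On Level byte {byte:02X}")
```
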
Brian H Posted June 7, 2013

I don't have any I2CS APLs. Just a few assorted I2CS modules we were allowed to PURCHASE in a Developers Hardware Kit, and I had no idea exactly what was in the kit. My I2CS I/OLinc allowed none to nine retries.

IndyMike Posted June 8, 2013

"Hello IndyMike, Very nice to have the opportunity to compare notes with you once again. Yes my PLM to PLM simulcasts have always been the favorite. Quite impressive really as they attain 8V p-p together and consistent levels from packet to packet. I have not attempted to associate better simulcasting with certain hardware or firmware versions. I appreciate your presence in these more technical threads as I am afraid I may bore a few with too many details that I find interesting."

My interest was piqued when I found that my V6.2 SWL relay units (6 of them) simulcast extended messages very well. I have always been intrigued by the fact that PLM to PLM and PLM to Accesspoint communications were so solid. This can't be an accident. It goes back to the introduction of the Accesspoints back in 2007(?).

Would you mind posting the revision of your ApplianceLinc? All of mine are V4.x or older.

"For the purpose of this thread about a problem circuit location with an APL on it: I felt that I had excluded all signal sucker and noise possibilities on this circuit. The received signal strength at the APL is greater than 1Vp-p yet it is not "seeing" the message as valid and thus not responding all the time. The two previous older APLs were clearly defective in that I measured there transmitter levels had deteriorated. This latest one is working very well with standard length messages ( for now). It is only the extended ones it is failing on thus far. You and I discussed possible resonant conditions a little bit in the past and I believe you were a bit skeptical that there might be resonant conditions in a home install? The calculations support the concept and in practice I have seen more than one circuit where simply adding additional capacitance to the line drastically changes the response characteristics. My newest addition to the ELAM adds capability to send only 3 sine wave cycles at 131Khz into the line. The purpose being to more accurately measure the phase angle between the voltage and current being transmitted in an attempt to better understand possible resonant conditions. I am presently collecting data for comparisons to see if this feature is worthwhile or not."

Maybe a little out of context on the subject of resonances... What I should have said, and meant to say, is that resonant conditions should not "persist" in a home install. I'm sure there are transient resonances occurring all the time as devices are switched on/off.

If a resonant condition does occur, it will be between a given transmitter/receiver. With simulcasting (multiple transmitters), it would seem to be mathematically impossible for all to be at a resonant distance from the receiver. Additionally, the distributed series resistance/inductance of the powerline will greatly reduce the depth of any resonant notches (very low Q).

Over the years I have regularly used HouseLinc to ping the devices in my home using a 0 Hop transmission. I have never been able to identify a "resonance" in my install. My old I1 devices actually respond better with 0 Hops than they do with 3 (but we won't go into that, again).

I find it very curious that your ApplianceLinc will respond reliably to standard messages while ignoring (some) extended messages. If this were due to a resonant condition, I would expect the device to exhibit problems with both message lengths. Perhaps it is a timing issue where the incoming transmission falls outside of a predefined firmware window. I do see something like this when using my old I1 devices with 2413 PLMs. Not sure of your ApplianceLinc revision; suffice it to say that I don't have one and have not observed this on older units.

Even though I can see that some modules do not simulcast extended messages "particularly well" on an isolated circuit, I have not seen any problems when using extended messages in my actual install. I should have indicated this in my original post. I use extended messaging whenever possible (forced in the ISY options) because of its 16X speed advantage when programming/reading link tables. This, I believe, limits exposure to transient events that can affect the standard message communication over long time periods.

In re-reading what I have posted, I'm thinking that none of this is helping explain your observations...

Sorry,
IM

ELA Posted June 8, 2013 Author

Hey IndyMike,

The very first post of this thread lists the APL versions. Not to worry though, as I know how difficult it can be to read back and forth through a lot of posts with a lot of technical spew. I always appreciate your input and being able to bounce theories off of others in general.

Agreed that resonances can be transient based on how the senders/receivers are oriented to each other. In particular, both instances where I suspect resonances occur involve a single device at the end of a circuit, by itself on that circuit. This circuit then "looks" back into the service point (relatively constant low impedance) due to all the loads reflected back to that point via all the other circuits and their distributed loads. I need to collect more data and have not had time to get back to this yet.

I think it is very difficult to identify a resonance condition with just communications reliability data. The comms will show as unreliable (possibly intermittently as impedances change) but you still can't say why. People with O'scopes can measure the signal amplitude and confirm there is plenty of signal strength, yet messages are not being detected correctly. That is why I am attempting to take a closer look at voltage/current phase relationships with the 131 kHz 3 cycle ping option.

Of course, as always seems to be the case, I am side tracked by this totally strange result: what appears to be a message response to only a 3 cycle ping! Curious what your initial thoughts are on that one??? Again this is pretty preliminary and I need to test further to be sure I am not creating an anomaly in my test methodology. I do have to make this a priority now over the resonance investigations since it is so very unexpected.

Thanks for your input.

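The resonance idea discussed in these last posts can be put into numbers with the standard series LC formula, f = 1 / (2π√(LC)). The sketch below is only a back-of-envelope illustration: the 15 µH line inductance and the 100 nF of added capacitance are hypothetical values chosen for the example, not measurements from this circuit, and a real branch circuit has distributed resistance that lowers the Q considerably.

```python
import math


def resonant_frequency_hz(inductance_h: float, capacitance_f: float) -> float:
    """Ideal LC resonant frequency, ignoring resistance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def capacitance_for_resonance(freq_hz: float, inductance_h: float) -> float:
    """Capacitance that resonates with the given inductance at freq_hz."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * inductance_h)


if __name__ == "__main__":
    line_l = 15e-6    # hypothetical branch inductance, henries
    carrier = 131e3   # carrier figure used in the thread, Hz
    c_res = capacitance_for_resonance(carrier, line_l)
    print(f"C for resonance at 131 kHz: {c_res * 1e9:.0f} nF")
    # Adding a hypothetical 100 nF across the line moves the resonance down.
    shifted = resonant_frequency_hz(line_l, c_res + 100e-9)
    print(f"With +100 nF the resonance moves to {shifted / 1e3:.0f} kHz")
```

The point of the example is only that capacitances of this order, combined with plausible wiring inductance, land near the carrier frequency, which is consistent with the observation that adding capacitance to a line can change its response noticeably.
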