srm Posted February 14, 2021

I suspect that my overuse of statements setting initial values of integer variables (about 100,000 times in a year) is the cause of an apparent SD card failure in my ISY. I present the reasons for this conclusion below. This post is not a request for help, but simply an effort to inform; however, comments are welcome (especially ones that may offer a different explanation for my problem).

I had been running firmware v.5.0.16C for 11.5 months on a new SD card when this problem occurred. Here's what happened and my reasoning for attributing the problem to INIT TO statements.

(1) About two weeks ago, I found that all attempts to back up my ISY failed with the following error messages:
-- Could Not Retrieve File /CONF/VARINIT.1
-- Could Not Create the Zip File

(2) I tried deleting the offending file through telnet using RF /CONF/VARINIT.1, based on a response by @Michel Kohanim to the blog post below regarding the same type of problem; however, the delete was unsuccessful and attempts to back up still failed.

(3) Through telnet, I learned that the file /CONF/VARINIT.1 had zero length. I don't know whether the failure to delete was caused by the zero length of the file or whether both the zero length and the failure to delete were caused by SD card corruption. In any case, SD card corruption was suggested by @Michel Kohanim as a likely cause in both the blog thread above and in another thread about a different zero-length file that had prevented backups. My solution was to start with a new, high-endurance SD card, upgrade to v.5.3.0, and restore from an earlier backup. However, I wondered what had caused a 1-year-old brand-name (SanDisk Ultra) SD card to fail (if it was a failure).

(4) I'm assuming that the failed file, VARINIT.1, is used to store the initial (i.e., boot-up) values of type-1 (Integer) variables and that it is rewritten every time my ISY runs a statement of the form {Variable} INIT TO ...
I evaluated my ISY programs and concluded that I was executing such statements roughly 100,000 times per year for type-1 variables. That's a lot of rewrites to the same file.

(5) I have not found recent authoritative information on the number of write cycles that an SD card can handle, but the 2003 SanDisk Secure Digital Card Product Manual v1.9 says the following in a section on endurance: "SanDisk SD Cards have an endurance specification for each sector of 100,000 writes typical... For instance, it would take over 10 years to wear out an area on the SD Card on which a file of any size (from 512 bytes to maximum capacity) was rewritten 3 times per hour, 8 hours a day, 365 days per year." If the VARINIT.1 file was always being rewritten to the same sectors, then my failure was right in line with the expected failure point based on that old SD spec.

However, that out-of-date SanDisk manual, as well as more recent documents, also mentions "wear leveling", which is supposed to ensure that a given physical sector isn't written to excessively. That might mean that those 100,000 writes were distributed across lots of sectors and should not have caused a failure. I should note that I also do a lot of data logging with the ISY, which involves a lot of writes, though in the form of appending to existing files rather than rewriting the same file.

In any case, the facts that VARINIT.1 is the file that failed, and that my code wrote to it on the order of 100,000 times in the 11.5 months that the SD card was in service, suggest to me that those writes *might* have been responsible for the failure. I have now changed my code to reduce the use of INIT TO statements by about a factor of 10. Though this assessment is by no means conclusive, it is something to think about if you get carried away with using INIT TO statements like I did.
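For anyone who wants to check the arithmetic in (5), here is a rough Python sketch. It uses only the figures quoted above (100,000 writes per sector, SanDisk's 3-per-hour example, and roughly 100,000 INIT TO writes in 11.5 months). The function name and the 1,000-sector case are made up for illustration, and the whole thing assumes that each INIT TO rewrites the same sector unless wear leveling spreads the load evenly, which has not been confirmed for the ISY or for this card.

```python
# Back-of-the-envelope endurance arithmetic using figures quoted in the post.
ENDURANCE_WRITES_PER_SECTOR = 100_000  # "typical" figure from the 2003 SanDisk manual


def years_to_wear_out(writes_per_year: float, sectors_sharing_load: int = 1) -> float:
    """Years until each sector reaches the endurance figure.

    Assumes writes are spread evenly over `sectors_sharing_load` sectors
    (1 means no wear leveling at all). This is a simplification.
    """
    return ENDURANCE_WRITES_PER_SECTOR * sectors_sharing_load / writes_per_year


# SanDisk's own example: 3 rewrites/hour, 8 hours/day, 365 days/year.
sandisk_example = 3 * 8 * 365

# The estimate from the post: about 100,000 writes in 11.5 months.
init_to_rate = 100_000 / (11.5 / 12)

print(sandisk_example)                                # 8760 rewrites per year
print(round(years_to_wear_out(sandisk_example), 1))   # 11.4 years ("over 10 years")
print(round(years_to_wear_out(init_to_rate), 2))      # 0.96 years, i.e. 11.5 months

# Hypothetical: the same load spread evenly over 1,000 sectors by wear leveling.
print(round(years_to_wear_out(init_to_rate, 1000)))   # 958 years
```

So with no wear leveling the numbers line up with a failure at about one year, while even modest wear leveling would push the expected wear-out far beyond the life of the card. That is why the conclusion hinges on whether the card actually relocates those rewrites.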
larryllix Posted February 14, 2021 (edited)

If you are writing the same value with Init To each time, the value stored on the SD card has likely never changed, so no wear 'n tear should have been incurred. EAROM memories have improved a lot over the years, but I believe only the SSD drives incorporate the randomization wear 'n tear prevention techniques. Opinions on that have varied a lot over the years. It may be age and brand dependent.

Edited February 14, 2021 by larryllix
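To illustrate the "same value, no write" idea, here is a minimal Python sketch of a write-if-changed guard. It is a toy model only: the class and method names are hypothetical, and nothing here is known about whether the ISY firmware or the SD card controller actually skips unchanged writes.

```python
class InitValueStore:
    """Toy stand-in for persistent storage that counts physical writes."""

    def __init__(self):
        self.values = {}
        self.write_count = 0

    def set_init(self, name, value):
        """Store `value` only if it differs from what is already stored."""
        if self.values.get(name) == value:
            return False  # unchanged, so no write is issued
        self.values[name] = value
        self.write_count += 1
        return True


store = InitValueStore()
for reading in [5, 5, 5, 6, 6, 5]:
    store.set_init("counter", reading)

print(store.write_count)  # 3 writes instead of 6
```

If the storage layer does not do this itself, the same effect can be had by only running the init statement when the value has actually changed.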
KeviNH Posted February 15, 2021

SanDisk cards have reserved spare blocks, and I believe some of their products have in-firmware wear leveling? I know the "WD Purple" cards do. If you can obtain an "Industrial" spec card like ATP, they provide tools which can tell you when a card has exhausted the spares.
srm (Author) Posted February 15, 2021

I didn't even know that WD made SD cards. But I now see that https://shop.westerndigital.com/ includes WD Purple Ultra Endurance microSD cards with "Card Health Status Monitoring Capability", as well as several SanDisk models, including "High Endurance" and "Max Endurance" models. When I restored my ISY last week, I used a SanDisk High Endurance model. I think that next time I'll try to find one with a spec sheet that specifically mentions Wear Leveling. WD and SanDisk both have "Industrial" models that use that terminology.