
Outage - 2025-01-01



Posted

There was an unplanned portal outage on 2025-01-01 from about 04:30 UTC to about 13:35 UTC.

Starting at about 13:35 UTC, my.isy.io began responding again, and ISY/Polisy/eisy units could start reconnecting. By 13:50 UTC, the vast majority of units had reconnected and the portal was fully functional.

Root cause

The root cause was insufficient free disk space on the database server. There was still 5% of the disk free, which should have been sufficient, but the database server stopped responding to requests once free space fell below a certain threshold. Our monitoring did not alert us because 5% free was considered sufficient, given that disk usage grows very slowly. The fix was to allocate more disk space.

Impacts

Due to the length of the outage, services that use OAuth, such as Alexa and Google Home, may have become unlinked. To fix this, simply relink them.

Alexa relinking:

  1. Open the Alexa app
  2. Click the 3 bars at the bottom, then Skills & Games
  3. Search for "isy" and choose "ISY Optimized for Smart Home V3"
  4. Click "Disable Skill" (if you see "Enable to use" instead, skip to step 5)
  5. Click "Enable to use", and when prompted, enter your portal user and password

Google Home relinking:

  1. Open the Google Home app
  2. Click on Settings
  3. Click on Works with Google
  4. If you see "Universal Devices" under Linked
    • Click on "Universal Devices", then "Unlink account"
  5. Find "Universal Devices" and click on it
    • Tip: type "Universal" in the search box at the top
    • When prompted, enter your portal user and password

 

Posted

I would like to add some context on the disk space, as I have seen comments questioning the low free space.

I agree that 5% is low, and it was due to be increased very soon.

But keep in mind that this is not a file server or a Windows workstation, where a single added video can consume gigabytes of storage. In those cases, even 20% free disk space would be low.

In the case of this server, storage grows very slowly, less than 1% in a year, and it does not need much temporary space either. It grows linearly with new users registering for an account. We are talking about a few KB at most, and less than 1 KB if you don't add spokens and such.

For those more technical, the database on this server is MongoDB. We learned the hard way that MongoDB will refuse write operations if the free disk space is smaller than the database itself (or perhaps the size of the collections - I read conflicting information on the subject). That was completely unexpected. But regardless, it won't happen again.
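For anyone who wants to guard against the same condition, here is a minimal sketch of a free-space check that looks at both the percentage free and the free space relative to the database size. It is only an illustration and not the portal's actual monitoring: the function names, the 5% default, and the idea of passing the database size in as a parameter are all assumptions, and the "free space smaller than the database" rule simply mirrors the behavior described above rather than a documented MongoDB limit.

```python
import shutil


def evaluate(total_bytes, free_bytes, db_size_bytes, min_free_fraction=0.05):
    """Return a list of alert messages for the given disk figures.

    Hypothetical helper: two independent rules are checked so that a
    slowly growing disk can still trigger an alert when the database
    is large relative to the remaining free space.
    """
    alerts = []
    # Rule 1: classic percentage threshold.
    if free_bytes < total_bytes * min_free_fraction:
        alerts.append("free space below percentage threshold")
    # Rule 2 (assumption based on the post): free space must not be
    # smaller than the database itself.
    if free_bytes < db_size_bytes:
        alerts.append("free space smaller than database size")
    return alerts


def free_space_alerts(path, db_size_bytes, min_free_fraction=0.05):
    """Run the same checks against the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return evaluate(usage.total, usage.free, db_size_bytes, min_free_fraction)


if __name__ == "__main__":
    GIB = 1024 ** 3
    # 100 GiB disk, 5 GiB free, 40 GiB database: the percentage rule
    # stays quiet, but the size-based rule fires.
    print(evaluate(100 * GIB, 5 * GIB, 40 * GIB))
```

With a rule like the second one, a disk sitting at exactly 5% free would still raise an alert whenever the database is larger than the remaining space, which a percentage-only threshold misses.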

 

This topic is now closed to further replies.

×
×
  • Create New...