Postmortem
On Tuesday the 16th of August, between 1.24 PM UTC and 2.36 PM UTC, Uptick's EU hosting resources suffered a partial outage, resulting in 1 hour of degraded service and 20 minutes of full downtime for all of our UK customers. The outage was triggered by health check changes we'd implemented off the back of the previous day's outage.
A misconfiguration in these changes caused a failure that eventually rippled through to a service that our customers' production servers were relying on. We've taken measures to better isolate our health check services from our production servers, introduced additional monitoring so that our health check services are themselves health-checked, and are reviewing our processes for releasing changes in this area. We apologise for the disruption, especially as it came back-to-back with yesterday's outage.
Please rest assured that this isn't the start of a bigger pattern of outages; the issues behind each were quite different and have both been carefully addressed. We take uptime very seriously and will be working hard to regain your trust in our otherwise rock-solid uptime record.
Resolved
On Tuesday the 16th of August, between 1.24 PM UTC and 2.36 PM UTC, Uptick's EU hosting resources suffered a partial outage, resulting in 1 hour of degraded service and 20 minutes of full downtime for all of our UK customers.