An ongoing heatwave in the United Kingdom has led to Google Cloud and Oracle Cloud outages after cooling systems failed at the companies’ data centers.
For the past week, the United Kingdom has endured a record-breaking heatwave, with stifling temperatures throughout the region.
Today, with temperatures reaching a record-breaking 40.2 degrees Celsius (104.4 Fahrenheit), cooling systems at data centers used by Google and Oracle to host their cloud infrastructure have begun to fail.
To prevent permanent damage to hardware components, which would lead to a prolonged outage, both Google and Oracle have shut down equipment, causing disruptions to their cloud services.
Oracle was the first to be affected, with the company reporting a cooling failure at approximately 11:30 AM EST today, causing “non-critical hardware” to be powered down.
“As a result of unseasonal temperatures in the region, a subset of cooling infrastructure within the UK South (London) Data Centre experienced an issue. This led to a subset of our service infrastructure needed to be powered down to prevent uncontrolled hardware failures,” reads an Oracle Cloud status message that appears to have been first spotted by The Register.
“This step has been taken with the intention of limiting the potential for any long term impact to our customers.”
However, even with only non-critical hardware powered off, Oracle states that customers in this zone may be unable to access their Oracle Cloud Infrastructure resources.
Almost two hours later, Google also reported a cooling failure in one of its buildings hosting the europe-west2-a zone for region europe-west2.
“There has been a cooling related failure in one of our buildings that hosts zone europe-west2-a for region europe-west2. This caused a partial failure of capacity in that zone, leading to VM terminations and a loss of machines for a small set of our customers,” reads the Google Cloud incident report.
“We’re working hard to get the cooling back online and create capacity in that zone. We do not anticipate further impact in zone europe-west2-a and currently running VMs should not be impacted. A small percentage of replicated Persistent Disk devices are running in single redundant mode.”
“In order to prevent damage to machines and an extended outage, we have powered down part of the zone and are limiting GCE preemptible launches. We are working to restore redundancy for any remaining impacted replicated Persistent Disk devices.”
As with Oracle, the cooling failure is disrupting Google Cloud customers, with virtual machines terminated, machines unreachable, and some replicated Persistent Disk devices running in single redundant mode.
Both companies report that they do not expect any further impact as they work to bring cooling systems back online.