Google Cloud has been hit by another networking-related outage.
The brief incident struck at 14:23 on 4 June, US Pacific Time (about 7:23 AM Sydney time), and according to Google’s incident report meant that “VM creation, live instance migration and any changes to network programming might be delayed.”
“Google Cloud Load Balancer and Internal Load Balancer backends may fail health checking in us-east4-a. Newly created VMs in the affected regions may experience connectivity issues,” the report added.
By 14:45 the Google team had identified the problem, and the company declared it fixed at 15:31.
Which is a nicely swift response.
But it is also the second networking-related outage Google has experienced in a week – the company suffered a longer outage on Monday, Australian time, and has promised credits to impacted customers.
It is unclear if this new incident is related to Monday’s problems.
But with Google generally rated the third-most-used cloud behind AWS and Azure, the company can ill afford outages of any sort. Never mind two in a week.
The company is clearly aware that Monday's crash was not well received, as it has taken the unusual step of publishing a preliminary investigation report that blamed "a configuration change that was intended for a small number of servers in a single region."
But the change was instead "incorrectly applied to a larger number of servers across several neighbouring regions, and it caused those regions to stop using more than half of their available network capacity."
You can guess the rest – demand exceeded supply, with the result that service degraded.
"YouTube measured a 2.5 percent drop of views for one hour, while Google Cloud Storage measured a 30 percent reduction in traffic," Google's post says. "Approximately 1 percent of active Gmail users had problems with their account; while that is a small fraction of users, it still represents millions of users who couldn’t receive or send email."