Australian channel partners of Amazon Web Services have given the cloud computing and storage service provider a surprisingly good score following the major outage at its Sydney facility Sunday, which caused embarrassment for several major brands.
While most partners CRN spoke to agreed the outage had affected at least some of their clients, they offered only the mildest criticism of the cloud provider, if any.
Most used the outage as an opportunity to urge clients to review their hosting architecture and make sure it matched their service availability expectations.
Neil Hitz, managing director of AWS partner CloudTrek, captured the overriding mood in the AWS ecosystem, saying the incident should motivate IT decision makers to run a ruler over their operations.
“Amazon provides the infrastructure building blocks that by themselves may not be fault-tolerant, so it’s important to plan, architect and test for failure in the cloud. Incidences like the one we witnessed over the week illustrate the importance of these immutable principles,” Hitz said.
Far from criticising AWS, Hitz had room for faint praise. “From a partner perspective, AWS were responsive in assessing the impact of the outage to our customers. They also asked the question ‘how could we improve?’ – so we don't have any criticisms for AWS at this time.”
Zack Levy, chief executive of Sydney-based partner Strut Digital, agreed that openness was more important than infallibility.
“If AWS will not be transparent, I will have some criticism. I still prefer to have our solutions and our customers’ solutions on AWS for a very simple reason: if there is an interruption in their environment, they are extremely incentivised to resolve it as quickly as possible simply because of the type and size of customers they have. That makes their smallest customer as important as their largest one, which is a good position to be in,” Levy said.
Other AWS partners, including Melbourne-headquartered Versent, said the outage had no impact on their customers. Richard Steven, chief executive of Brisbane AWS partner ITOC, said the company “looked forward to working with” AWS.
Overview of the outage
As reported by CRN, Amazon began reporting problems with its Elastic Compute Cloud services in one of its Australian availability zones around 3.45pm on Sunday. The problems emerged at the height of lacerating storms that swept down the NSW east coast, driven by an unusually powerful tropical low that caused widespread destruction and flooding across Sydney.
Shortly before 5pm that afternoon, AWS reported power issues at one of its Sydney data centres.
By 4.50am Monday, AWS had published a detailed account of the outage, confirming that its Asia Pacific (Sydney) region, chiefly EC2 instances and EBS volumes, was essentially crippled for the better part of 10 hours, with early warning signs appearing on Sunday afternoon. In the same post it gave the issue the status of “resolved”.
At 8am Monday, AWS reported that it was still working to recover a handful of computing instances and storage clusters.
By then, however, the outage had dragged multiple high-profile brands through the wringer, including Carsales, Domain, Domino's Pizza, REA Group, Channel Nine, Foxtel’s Foxtel Play streaming video service, its rivals Presto and Stan, and home food delivery brand Menulog. All of them were affected to some extent, but REA Group was the least disrupted by virtue of having multi-zone and multi-region failover, according to CRN’s sister publication, iTnews.
While criticism of AWS has been scant from its partners, it’s clear that the outage has heated up the debate over the merits of private and public cloud infrastructure. The clearest message to emerge from the incident was that organisations need to assess their business exposure to cloud outages and plan accordingly.
Kris Kayyal, chief marketing officer for Australian hosting operation Servers Australia, which provides an alternative platform to the hyperscale provider, was emphatic: “It highlights the fact that not all clouds are built equally. Even AWS can go down. There is a risk of downtime with any hosted service and customers need to work out how much they are able to accept. If they can't afford any downtime then they need to have multi-site availability for true network and datacentre redundancy.”
Servers Australia, which last week acquired the assets of a dedicated hosting rival in a deal worth "millions", has continued to grow despite warnings that the likes of AWS, Azure and Google would sound the death knell for independent infrastructure-as-a-service players.
"People are saying the little boys are being pushed out by Amazon and Google but it is not true. We are fighting back and we are healthy and we are killing it. They are not doing that great of a job," said Servers Australia chief executive Jared Hirst.
Architect for availability
Strut Digital's Levy said: “We always recommend architecting your solutions on AWS assuming that there will be failures, at the server level or availability zone level. I haven’t seen the final root cause analysis from AWS but I would assume that doing that would have saved some AWS customers from the interruption.”
ITOC's Steven added: “All of our customers who had architected and planned for such an [outage] avoided [end user] service disruption.”
And the message appears to be getting across. Domain’s chief technology officer yesterday told iTnews that the outage was very likely to prompt a change in the company’s use of AWS.
However, Steven said that technology decision makers still needed to make sober and balanced judgements about the level of service they required.
“For some customers, a highly available architecture is a valid choice but for other customers it is not merited. We work with all of our customers to ensure they make the right choice and implement an appropriate level of availability,” he said.