Microsoft today experienced a major Azure outage.
The company's status page reported that as of 19:43 UTC - 5:43AM in Sydney - "customers may experience intermittent connectivity issues with Azure and other Microsoft services (including M365, Dynamics, DevOps, etc)."
Microsoft staff "have identified the underlying root cause as a name server delegation issue with DNS resolution, affecting network connectivity and downstream impact to Compute, Storage, App Service, AAD, and SQL Database services."
A fix has been found: as of 22:10 UTC - 8:10AM in Sydney - Microsoft advised that "Mitigation has been applied, and engineering teams are clearing resolver cache to fully mitigate the issue. Most services are showing recovery."
But the status page still reports outages and there's no word on when full service will be restored.
The DNS outage means that while Azure services are running, they can't be reached. As a result, the tenor of social media commentary is mixed.
The good news is that all of our Azure Resources (ie. DB, Web, Storage, Container, etc) are running just fine! The not so good news is that we can't access them because the Network Infrastructure is having intermittent issues #azure #azureoutage #outage pic.twitter.com/LG2MGh1JLQ
- ☁ əuoʎpnoןɔ ☁ (@CloudyOne) May 2, 2019
The DNS problem was fixed at around 9:30 AM Sydney time but full recovery took another couple of hours.
Microsoft has also offered a few more hints about the nature of the fault, writing "To mitigate, engineers corrected the nameserver delegation issue. Applications and services that accessed the incorrectly configured domains may have cached the incorrect information, leading to a longer restoration time until their cached information expired."
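For readers wondering why a corrected DNS record does not restore service immediately, the caching behaviour Microsoft describes can be illustrated with a toy example. The sketch below is not Microsoft's resolver code; the `TtlCache` class, the 300-second lifetime, the `example.invalid` name and the placeholder answers are all assumptions made for illustration. It shows only the general idea that a cached answer keeps being served until its time-to-live runs out, even after the upstream data has been fixed.

```python
class TtlCache:
    """Toy DNS-style cache (hypothetical): an answer is reused until its TTL expires."""

    def __init__(self, lookup, ttl_seconds):
        self.lookup = lookup            # function: name -> answer (stands in for upstream DNS)
        self.ttl_seconds = ttl_seconds  # assumed lifetime of a cached answer
        self.entries = {}               # name -> (answer, expiry_time)

    def resolve(self, name, now):
        cached = self.entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]            # still fresh: the upstream source is not consulted
        answer = self.lookup(name)      # expired or missing: ask upstream again
        self.entries[name] = (answer, now + self.ttl_seconds)
        return answer


# Hypothetical upstream data: starts out wrong, then gets corrected.
upstream = {"example.invalid": "bad-answer"}
cache = TtlCache(upstream.get, ttl_seconds=300)

first = cache.resolve("example.invalid", now=0)          # caches the bad answer
upstream["example.invalid"] = "good-answer"              # the fix is applied upstream
during_ttl = cache.resolve("example.invalid", now=120)   # still served from cache
after_ttl = cache.resolve("example.invalid", now=300)    # entry expired, fresh lookup

print(first, during_ttl, after_ttl)
```

In this sketch the client keeps receiving the stale answer for the whole 300 seconds after the fix, which is consistent with engineers clearing resolver caches rather than waiting for every entry to expire on its own.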
The company has promised a proper explanation in 72 hours. CRN will cover that as it emerges.