IBM has experienced a severity one issue in its cloud for the second time this week.
At around 11:00 am Sydney time today, IBM told customers “our technicians discovered that transactions were erroring out with a connection error to networking devices.”
Labelled a "global provisioning issue", the incident was "being investigat[ed] by internal teams" at the time of writing.
The investigation bore fruit: about two hours after its first notification, IBM sent an update advising that "Cloud engineers identified a misconfiguration on a subset of network devices that caused transactions to error out and stall.
"The errant setting was corrected, and transactions began processing normally. Engineers are processing through any currently stalled transactions to move them along."
It was IBM’s second severity one incident this week, after a Tuesday mess that meant “customers attempting to order VSIs [virtual server instances] were receiving an error stating there is insufficient capacity to order.”
IBM again got on top of that issue in just over two hours, although it persisted for some users for another four hours before the company sounded the All Clear siren.
Neither incident has led to significant disruptions. But IBM can ill afford even minor outages as it seeks to ensure its cloud is seen as equal to hyperscale leaders like AWS and Azure, by users and the channel alike.