LONDON (AP) – The global outage that took Facebook and its other platforms offline for hours was caused by an error during routine maintenance, the company said.
Facebook’s vice president of infrastructure, Santosh Janardhan, said in a blog post that the outage affecting Facebook, Instagram and WhatsApp was “not due to malicious activity, but due to an error of our own making.”
The problem occurred while engineers were doing day-to-day work on Facebook’s global backbone network: the computers, routers and software in its data centers around the world, along with the fiber-optic cables connecting them.
“During one of these routine maintenance jobs, a command was issued with the intention of assessing the availability of global backbone capacity,” Janardhan said on Tuesday. The command inadvertently took down all the connections in the backbone network, effectively disconnecting Facebook’s data centers.
Janardhan said Facebook’s systems are designed to catch such mistakes, but in this case a bug in the audit tool prevented it from stopping the command.
The disconnection also caused a second problem that made it impossible to reach Facebook’s servers, even though they were operational.
Janardhan said engineers scrambled to fix the problem on site, but it took time because of the extra layers of protection. Data centers are “hard to get into, and once you’re in, the hardware and routers are designed in such a way that they are difficult to modify even if you have physical access.”
Once connectivity was restored, services were brought back gradually to avoid a surge in traffic that could cause further failures.
It was an “unexpected anomaly” caused by a poorly managed update to Facebook’s backbone network, but the company probably could have avoided a scenario in which its servers went completely offline, making it impossible to access the tools needed to fix the problem, said Angelique Medina of Cisco Systems’ ThousandEyes, a firm that monitors internet outages.
“The big question is why so many internal devices and systems can have the same source of failure,” Medina said. “Facebook may still have been down due to a network outage, but if they had internal access they could have resolved the outage sooner.”