When an online service stops working, the cause is usually DNS, or at least that is the case most of the time. And yes, DNS, the Domain Name System, appears to be involved in the outage that has taken Facebook completely down, along with Instagram and WhatsApp.
The true cause, however, is that there are no working Border Gateway Protocol (BGP) routes into Facebook's sites. BGP is the standardized exterior gateway protocol used to exchange routing and reachability information between the autonomous systems (AS) that make up the internet. Most people, indeed most network administrators, never need to deal with BGP.
Cloudflare VP Dane Knecht was the first to report the underlying BGP problem. This meant, as Kevin Beaumont, a former head of Microsoft's Security Operations Centre, tweeted:
"By not having BGP announcements for your DNS name servers, DNS falls apart = nobody can find you on the internet. Same with WhatsApp btw. Facebook has basically de-platformed themselves from their own platform."
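The dependency Beaumont describes can be illustrated with a small toy model. This is only a sketch under stated assumptions: the prefixes, hostnames and the helper names `reachable` and `resolve` are all hypothetical (the addresses come from the documentation ranges, not from Facebook's network), and real BGP and DNS are far more involved. It shows only that a name cannot be resolved once the route to its name server is gone, even if the web server itself is still routed.

```python
import ipaddress

# Hypothetical prefixes from the documentation ranges, not real Facebook data.
NS_PREFIX = ipaddress.ip_network("192.0.2.0/24")      # hosts the name server
WEB_PREFIX = ipaddress.ip_network("198.51.100.0/24")  # hosts the web server

# Hypothetical name server address per domain, and the records it serves.
NAME_SERVERS = {"example.test": ipaddress.ip_address("192.0.2.53")}
ZONE = {"www.example.test": ipaddress.ip_address("198.51.100.10")}


def reachable(addr, routes):
    """Return True if any announced prefix covers the address."""
    return any(addr in net for net in routes)


def resolve(hostname, routes):
    """Resolve a hostname, but only if its name server is reachable."""
    domain = hostname.split(".", 1)[1]
    name_server = NAME_SERVERS[domain]
    if not reachable(name_server, routes):
        return None  # nobody can ask the name server, so the lookup fails
    return ZONE.get(hostname)


# Normal operation: both prefixes are announced.
before = resolve("www.example.test", {NS_PREFIX, WEB_PREFIX})

# The name server prefix is withdrawn; the web prefix is still announced.
after = resolve("www.example.test", {WEB_PREFIX})

print(before)
print(after)
```

In this model the second lookup fails although the web server's prefix is still announced, which is the sense in which DNS "falls apart" when the routes to the name servers disappear.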
Many people are annoyed that they cannot use their social media platforms, but Facebook employees appear to have it even worse: it was reported that they can't enter their buildings because their "smart" badges and doors were also disabled by the network failure. If true, Facebook's people literally can't enter the building to fix things.
Reddit user u/ramenporn, who claimed to be a Facebook employee working on bringing the social network back from the dead, reported, before he deleted his account and his messages:
"DNS for FB services has been affected and this is likely a symptom of the actual issue, and that's that BGP peering with Facebook peering routers has gone down, very likely due to a configuration change that went into effect shortly before the outages happened (started roughly 1540 UTC). There are people now trying to gain access to the peering routers to implement fixes, but the people with physical access is separate from the people with knowledge of how to actually authenticate to the systems and people who know what to actually do, so there is now a logistical challenge with getting all that knowledge unified. Part of this is also due to lower staffing in data centers due to pandemic measures."
Ramenporn also stated that it wasn't an attack, but a mistaken configuration change made via a web interface.
With both BGP and DNS down, the "connection to the outside world is down, remote access to those tools don't exist anymore, so the emergency procedure is to gain physical access to the peering routers and do all the configuration locally."
Reportedly, the technicians on site don't know how to do that, and the senior network administrators aren't on site.
It seems everything will remain down for a couple more hours before the issue is resolved.