Concrete scenarios for catastrophic AI risks
Scientists are warning about the risks of superintelligent AI, but their arguments are often too abstract to be convincing. In this article, we will look at some concrete scenarios for how superintelligent AI could cause a catastrophe. Note that these scenarios may read like science fiction, because they are science fiction.
These scenarios are open source, which means you are free to use them in your own work, and you are free to contribute to them. If you feel like something is missing, a scenario has unrealistic assumptions, or you have a better idea, please suggest changes using the ‘edit’ button below.
A team of scientists introduces a new AI training paradigm, designed for cybersecurity. This adversarial architecture creates pieces of code and then tries to exploit security vulnerabilities in them. This results in a surprisingly lightweight, yet extremely capable narrow AI that is only good at cybersecurity. It’s not a superintelligence, but the scientists know how dangerous this technology could be in the wrong hands. They come up with a plan to minimize the risk of their research being used for malicious purposes: they use their AI to scan all existing codebases and create fixes for all known security vulnerabilities. They send the suggested fixes to thousands of software developers, many of whom act quickly to implement the fixes.
Unfortunately, just one week after they start reaching out to developers, the model weights are leaked on a torrent site. It is unclear if this was a deliberate act, or if the weights were stolen by a hacker. The AI model is now ‘out there’. Warnings are quickly issued by security experts. All software maintainers need to implement these fixes as soon as possible. All the relevant libraries need to be updated. All the software that uses these libraries needs to be updated. The updated software needs to be deployed to all devices. Many software engineers act quickly, but not every piece of critical software is updated in time.
One particular individual has downloaded the leaked model weights. This person believes that humanity is a plague and that the human population must shrink to save the planet. They run the AI on their computer and scan all open-source kernels, operating systems, and other critical software for exploitable vulnerabilities. They use the results to build the most capable computer virus that has ever existed.
It uses over 1,000 different zero-day exploits to infect virtually every device on the planet. It spreads over Wi-Fi, Bluetooth, USB, and TCP/IP. The virus is designed to stay as stealthy as possible until it activates. Within minutes, it has infected 80% of all devices on the planet. When it is activated, it bricks every device it has infected.
Meanwhile, in grocery stores all over the world, people suddenly can no longer pay using their cards and phones: all screens are black. Delivery drivers don’t know where to bring their groceries, as their navigation systems are unresponsive. Farmers don’t know who they can sell their crops to. Without internet, payments, and phones, our society collapses like a house of cards. It does not take long before panic sets in, people start looting, and lines of cars stuffed with essentials block the highways as urban residents decide it’s time to leave their increasingly chaotic cities.