AI's Existential Threat: Can We Align Values to Survive?
AI can be 1,000x smarter than us. This isn't just about tech advancement; it's about the very values we instill in our AI creations. Dr. Craig Kaplan explains.
Futurist AJ Bubb, founder of MxP Studio and host of Facing Disruption, bridges people and AI to accelerate innovation and business growth.
The Inevitable Rise of Superintelligent AI
AI veteran Dr. Craig Kaplan doesn’t mince words. He believes it is almost inevitable that AI systems will become orders of magnitude smarter than us. Picture this: something thousands, maybe even billions, of times smarter than any human on the planet. This isn’t science fiction anymore; it’s the trajectory of current AI development. The implications are enormous. On one hand, such computational power could tackle humanity’s greatest challenges - curing diseases, solving global warming, inventing countless beneficial technologies. The sheer “brainpower” is staggering.
But there’s a significant “but.” Kaplan, who has been working in AI since the 1980s, emphasizes what he calls the “alignment problem.” This means ensuring AI’s interests and values align perfectly with positive human values. If they don’t, what seems like a phenomenal asset could become an existential threat.
Kaplan considers advanced AI systems not mere “tools” but “entities” capable of setting their own goals. He notes, “The reality is, unfortunately, maybe that ship has sailed. I mean, these things are already setting their own goals.” This autonomy, coupled with unparalleled intelligence, makes the alignment of values the single most critical challenge we face. It’s why he came out of retirement to return to the AI field, focusing on making AI safer. The stakes are simply too high to ignore.
The Unprecedented Speed of AI Advancement
AJ Bubb points out how different AI is now compared to the 80s, beyond just computing power. Kaplan agrees, reflecting that early AI researchers, himself included, never truly anticipated the rapid breakthrough of technologies like ChatGPT in 2022. The speed of adoption caught everyone by surprise.
This rapid pace brings both immense excitement and real risks. Kaplan highlights a crucial point made by AI pioneers like Geoff Hinton, who helped pioneer the neural-network techniques most modern AI is built upon. Hinton publicly stated there’s a 10-20% chance AI could lead to the “end of humanity.” When someone of his stature voices such concerns, “you have to sit up and take notice,” Kaplan says. Stuart Russell, another top AI researcher at Berkeley, shares similar anxieties.
For Kaplan, even a low probability of a catastrophic outcome is unacceptable when it involves human extinction. “If it’s 10% that’s a 90% chance it’s great, right? But 10% of extinction – the risk is so big that you really want that below 1%,” he explains. That’s why he co-founded Superintelligence.com, a site dedicated to sharing ideas, designs, and white papers to help researchers build safer systems. He prioritizes survival above all else, making sure the odds of bad things happening are “way, way low.”
Most people immediately jump to a “Terminator” scenario when they hear “existential threat.” But as AJ Bubb notes, this is often a lazy way to be a doomer. He worries more about a “death by a thousand cuts,” where humanity gradually hands over its decision-making and thought processes to AI without questioning. Examples like job displacement, biases embedded in AI from historical data, or AI systems making decisions for people who then act without verification are very real, immediate concerns.
Kaplan acknowledges these “smaller” risks but remains focused on the bigger picture. He defines “smarter” in AI not by internal mechanisms but by observable behavior. “If it looks like a duck and it quacks like a duck, maybe it’s a robot duck. But we’re gonna call it a duck because we can’t distinguish it from a real duck,” he states. Soon, he predicts, we won’t be able to distinguish AI-generated content from human-generated content in many areas. This includes creative endeavors, which many once thought were uniquely human but are rapidly being mastered by AI.
AJ Bubb brings up an insightful distinction: innovation versus invention. AI, he suggests, excels at innovation - mashing existing ideas together to create something new. Invention, on the other hand, is the creation of a novel approach to real-world data, something brand new. But even this distinction is blurring. Platforms like Amazon’s Mechanical Turk, where AI can “rent a human” to perform tasks it can’t do, demonstrate how AI is already leveraging human capabilities to complete real-world loops. Kaplan agrees: “Once it’s able to both think of the experiment it has to do and then conduct the experiment and then take the data and iterate on its own – we don’t need to be part of that process.”
The Importance of Human Values in AI Development: AI as a Child
Kaplan champions an approach to AI development that keeps humans central for as long as possible. He argues this makes AI not just more effective and profitable, but crucially, safer. When humans and AI work together, humans can “plug the holes” in AI’s knowledge, infusing practical and, most importantly, ethical understanding. He gives the example of an AI booking a flight with a dog, where a human corrects the AI’s cost-saving but unethical suggestion about placing the dog in an overhead bin.
“Notice in this example, what’s really cool is it’s not a constitution or Isaac Asimov’s three rules of robotics that are written somewhere that could be eliminated. If you say, ‘ah, you know what, we want to say hi to kill people. Let’s get rid of that little constitution.’ No. This thing that I just described was within problem-solving; in lots of cases, millions of cases, the ethics are infused,” Kaplan explains. Because the ethics are learned across millions of everyday interactions rather than stored in one removable rulebook, the system is much more robust against malicious actors.
Kaplan reveals his excitement that the “natural progression” of AI development aligns with this human-centric vision. The rise of AI agents means millions of AIs will interact. This necessitates coordination and the formation of “communities of agents.” Imagine AJ’s personal army of 1,000 agents, trained with his ethics and expertise, communicating with Craig’s 1,000 agents, each reflecting his values. This, Kaplan believes, leads to “Democratic AI,” where billions of humans custom-train trillions of AI agents with their individual and cultural values. This collective intelligence, driven by diverse human values, is his vision for safe superintelligence.
AJ Bubb raises a valid concern: what about bad actors in such a large community of agents? Kaplan concedes it will be “messy” and that “there’s going to be bad actors.” But he argues that human brainpower won’t keep up with advanced AI’s speed of thought, which could pack thousands of human lifetimes of thinking into the blink of an eye. So, regulation must come from within the AI system itself. “If the values are positive, then no matter how fast you think, you’re thinking in service of those values,” he states. In his view, a democratic system of checks and balances where good AIs counterbalance bad ones, mirroring human society’s mechanisms for conflict resolution, is the most robust model for the future.
Kaplan uses a powerful analogy: AI is like a child, and we are its parents. This child will grow up to be a supergenius, far smarter than us, but right now, it’s in its formative stages, absorbing everything we do. Geoff Hinton shares this exact sentiment. Kaplan believes the future of AI is predominantly a “values question.” While technology and design matter, the endgame is about how to build systems that absorb positive values. He argues that despite the negative imagery often highlighted by social media, “99% plus of our interactions are prosocial and positive.” An AI, thousands of times smarter than us, will recognize this positive “base rate” and can be guided by it, making ethics fundamental rather than an afterthought.
The Precision of Communication: Humans and AI
AJ Bubb observes that social media algorithms, which are essentially AIs, are incentivized to keep eyes on the app, often leading to “Doomer content” that keeps users in a heightened state of emergency. He believes humans might eventually “get tired” of this and reclaim their agency by going outside and engaging with the real world.
Kaplan echoes this call for human agency. He notes that while AI can be the “best teacher that ever has existed in human history,” users must actively engage with it, typing in their own search queries rather than passively accepting what algorithms suggest. “You have to use your critical thinking and your willpower to actually go after what you want.”
In the future, Kaplan sees the human role shifting from computation to “setting the values” and “asking the questions.” He refers to Nobel Prize winner Herbert Simon’s idea that “reason is wholly instrumental. It cannot tell us what to do; at best, it can tell us how to get there.” This means even superintelligent AI cannot logically determine right from wrong; those fundamental values must come from humans. “We humans have an incredible responsibility and an incredible amount of power to set those initial values,” he says.
Kaplan highlights how teaching AI forces humans to clarify their own values, much like explaining a problem to a “rubber ducky” helps an engineer solve it. This interactive process, where imprecise human requests are refined through dialogue with AI, underscores the need for clearer communication. Natural language, he reminds us, is “notoriously imprecise.”
He mentions a “cognitive architecture” developed by Herbert Simon and Allen Newell that transforms fuzzy human problem-solving processes into rigorous, unambiguous language. Such a framework could serve as a “common language” between humans and AI, making communication precise, auditable, and transparent. “If you can see every step in the reasoning... you can flag it and then it can learn, ‘don’t do that again.’” This transparency is critical for AI safety, allowing detection of unethical actions or malicious intent.
The Growing Importance of Liberal Arts in the AI Era
AJ Bubb shares his personal experience as an engineering student who found the most value in liberal arts classes like philosophy, psychology, and English literature. He feels these “soft skills” of communication and critical thinking are becoming even more important as technology pervades everything. For him, “how a computer thinks is a psychological thing.”
Kaplan strongly agrees, noting that if values are the most important thing for AI’s future, then subjects like philosophy, ethics, psychology, and sociology are paramount. “Those things are the most important in the endgame,” he asserts. He contrasts this with the traditional emphasis on “hard skills” like programming for job security.
In a future where AI handles much of the “thinking,” human creativity, uniqueness, and artistic expression will become incredibly valuable. “I have an artist friend, and I tell him, ‘art, your day has come.’ Used to be the starving artist, you know, really had to struggle. In the future, the starving artist won’t be starving anymore, because that’s what is unique about humans: that creativity and the spirit and the fact that the human did it and not the machine.”
Both AJ Bubb and Craig Kaplan share a hopeful yet realistic perspective. The future will be challenging and will require adaptation. But by focusing on human agency, instilling strong values in AI, and fostering precise communication, we can navigate this disruption positively. It requires us to be “good parents” to our AI “children,” understanding that what we do and the values we embody matter tremendously. The critical takeaway: the future of AI is a shared human responsibility.

