It has long been a published principle of OpenAI that, in order to avoid dangerous races, they would stop what they were doing and shift to helping any other lab they believed was close to creating Artificial General Intelligence (AGI), i.e., dangerous human-level intelligence. Now, as technical leaders leave OpenAI in droves and the company prepares to adopt a fully for-profit structure, it’s safe to assume this principle is dead. Watch for this text to disappear from the OpenAI website soon:
“We are concerned about late-stage AGI development becoming a competitive race without time for adequate safety precautions. Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project. We will work out specifics in case-by-case agreements, but a typical triggering condition might be ‘a better-than-even chance of success in the next two years.’”
Anthropic was formed three years ago by safety-minded researchers who left OpenAI because they felt the company was not being sufficiently safety-conscious. However, Anthropic seems to be heading down the same route, discovering that building frontier AI requires gobs of cash and aligning itself with a powerful for-profit partner, in this case Amazon. Not to be left out, Google has combined its Google Brain and DeepMind teams under the leadership of Demis Hassabis. Being deep in the heart of the Google profit machine seems to be turning Hassabis’s focus more toward the business side of things.
I believe these three are the top AI labs positioned to reach AGI, and some say it could happen this decade. Clearly, we are already well into a dangerous race, and the best way to stop it would be for these three organizations to merge and work together. Now. I realize this is incredibly unlikely, and that such a merger could be the most complicated business transaction in history, given all the funders with skin in the game. But talking about it is the only way something like this could happen.
AI pioneer Geoffrey Hinton recently left Google so he could speak freely about the existential risks AI poses to humanity. Last year, other AI luminaries signed an open letter asking for a six-month pause on training systems more powerful than GPT-4. Clearly, people in a position to know believe that time is running out for a cooperative project to build beneficial AGI and Artificial Superintelligence (ASI). The alternative to cooperation is a capital- and energy-intensive race to see who can get there first and dominate the market via network effects. This is the path we’re on, and the winner will be the company willing to cut the most corners on safety in order to get capabilities to market faster.
The arguments against pausing are plentiful, but should we trust such arguments coming from the people who stand to personally benefit from winning a race? In the US, the most common objection is that if we don’t develop AGI first, China will. I do believe it could be catastrophic if the Chinese Communist Party developed AGI with no countervailing force from the West. Permanent human disempowerment via an inescapable AI-led autocracy is potentially worse than extinction. However, this is not a valid justification for an uncontrolled race in the West. The united forces of OpenAI, DeepMind, and Anthropic would stand a much better chance against China than any of them alone. It is also irresponsible not to even try for international cooperation. There are many areas where treaties or social norms have suppressed dangerous technology across the globe. The Montreal Protocol’s ban on CFCs to protect the ozone layer stands out as a successful treaty, and the stigma against human cloning is an example of suppression through social pressure alone.
The three labs I’m suggesting should merge were chosen not just because they are leaders, but because each has a specialty. OpenAI is the best at building transformer-based Large Language Models (LLMs) trained with self-supervised learning, supervised fine-tuning, and Reinforcement Learning from Human Feedback (RLHF). Many competitors are catching up to ChatGPT, but OpenAI remains ahead with its series of GPT-4 upgrades and GPT-5 on the horizon (possibly this year!). DeepMind is the leader in reinforcement learning, as demonstrated by the achievements of AlphaZero, and in scientific deep learning with AlphaFold. Anthropic is the leader in safety research with its pioneering work in mechanistic interpretability, the effort to understand what is actually going on inside an LLM as it produces its output.
As we enter the year of agents in 2025, the world will be shocked by AI systems that prompt themselves to accomplish complex goals over time. These systems will benefit from radical increases in their ability to remember what happened as they run. The era of huge training runs followed by a period of consumer usage will give way to the era of continuous learning. This is the perfect time to marry the transformer LLM architecture with reinforcement learning that can keep the system aligned to its goals as it learns. Of course, we’d like AI system evolution to happen safely and transparently. The marriage of OpenAI, DeepMind, and Anthropic would bring the world’s best researchers together to steward these systems to safe human-level capabilities and beyond.
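To make “systems that prompt themselves” concrete, here is a minimal sketch of the loop such an agent runs: propose an action, record the result in a persistent memory, and feed that memory back into the next prompt. Everything here is illustrative, not any lab’s actual design; `call_model` is a stub standing in for whatever LLM backend a real agent would use, and the fixed step count is a placeholder for a real stopping condition.

```python
from dataclasses import dataclass, field


def call_model(prompt: str) -> str:
    """Stub for an LLM call; a real agent would query a model here."""
    return f"(model output for: {prompt[:40]}...)"


@dataclass
class Agent:
    goal: str
    memory: list[str] = field(default_factory=list)  # persistent record of past steps

    def step(self) -> str:
        # The agent prompts itself: the goal plus everything it remembers so far.
        prompt = f"Goal: {self.goal}\nHistory:\n" + "\n".join(self.memory) + "\nNext action:"
        action = call_model(prompt)
        self.memory.append(action)  # the growing memory carries context across steps
        return action

    def run(self, max_steps: int = 3) -> list[str]:
        for _ in range(max_steps):  # a real agent would stop when the goal is judged complete
            self.step()
        return self.memory


if __name__ == "__main__":
    for entry in Agent(goal="summarize this week's AI safety papers").run():
        print(entry)
```

The point of the sketch is that the “learning” in such systems lives in that growing memory fed back into every prompt, which is exactly the place where reinforcement-learning-style feedback and safety oversight would need to be applied.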
Since it’s unlikely the money people involved with these companies will unite voluntarily, we need to look for other options. One possibility is a shotgun wedding: a merger forced by the government. Unfortunately, the corrupt US government, were it actually aware of what’s going on in AI, would probably not want to interfere with the moneyed interests.
Probably the best hope is for the AI researchers themselves, who hold all of the actual power in this drama, to free themselves from the lure of money and unite under a truly safety-aligned banner. What good is a million-dollar salary if you end up contributing to human extinction? Tech companies are increasingly sidestepping the complexity of mergers and acquisitions with the “acquihire” approach of hiring away all of the important people and leaving a husk of a company behind. A variant of this, perhaps called quitfluencing (combining quitting with both influencing and confluence), could emerge as the solution. Perhaps the key players could crystallize around the nucleus of Ilya Sutskever’s Safe Superintelligence. Sutskever is the latest principal to leave OpenAI over disagreements with Sam Altman about the company’s direction.
What does all this have to do with you? Since nobody in power will support the idea of a cooperative effort, it’s up to us. Learn about AI alignment and spread awareness. Discuss the idea of an urgent confluence of key talent, one that drains the for-profit monsters of the people they need. This may be our last chance, in this final era where human intelligence actually matters.