In March 2023, we launched the Open Philanthropy AI Worldviews Contest. The goal of the contest was to surface novel considerations that could affect our views on the timeline to transformative AI and the level of catastrophic risk that transformative AI systems could pose. We received 135 submissions. Today we are excited to share the winners of the contest.
But first: We continue to be interested in challenges to the worldview that informs our AI-related grantmaking. To that end, we are awarding a separate $75,000 prize to the Forecasting Research Institute (FRI) for their recently published writeup of the 2022 Existential Risk Persuasion Tournament (XPT).[1] This award falls outside the confines of the AI Worldviews Contest, but the recognition is motivated by the same principles that motivated the contest. We believe that the results from the XPT constitute the best recent challenge to our AI worldview.
FRI Prize ($75k)
Existential Risk Persuasion Tournament by the Forecasting Research Institute
AI Worldviews Contest Winners
First Prizes ($50k)
- AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years by Basil Halperin, Zachary Mazlish, and Trevor Chow
- Evolution provides no evidence for the sharp left turn by Quintin Pope (see the LessWrong version to view comments)
Second Prizes ($37.5k)
- Deceptive Alignment is <1% Likely by Default by David Wheaton (see the LessWrong version to view comments)
- AGI Catastrophe and Takeover: Some Reference Class-Based Priors by Zach Freitas-Groff
Third Prizes ($25k)
- Imitation Learning is Probably Existentially Safe by Michael Cohen[2]
- ‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting by Alex Bates
Caveats on the Winning Entries
The judges do not endorse every argument and conclusion in the winning entries. Most of the winning entries argue for multiple claims, and in many instances the judges found some of the arguments much more compelling than others. In some cases, the judges liked that an entry crisply argued for a conclusion the judges did not agree with—the clear articulation of an argument makes it easier for others to engage. One does not need to find a piece wholly persuasive to believe that it usefully contributes to the collective debate about AI timelines or the threat that advanced AI systems might pose.
Submissions were many and varied. We can easily imagine a different panel of judges reasonably selecting a different set of winners. There are many different types of valuable research, and the winning entries should not be interpreted as representing Open Philanthropy’s settled institutional views on which research directions are most promising (i.e., we don’t want other researchers to overanchor on these pieces as the best topics to explore further).
Footnotes
1. We did not provide any funding specifically for the XPT, which ran from June 2022 through October 2022. In December 2022, we recommended two grants totaling $6.3M over three years to support FRI’s future research.
2. The link above goes to the version Michael submitted; he’s also written an updated version with coauthor Marcus Hutter.