Open Philanthropy recommended a grant of $1,700,000 over two years to the Center for Open Science to support the development of a systematic benchmark assessing how effectively large language models (LLMs) can evaluate, replicate, and conduct scientific research.
This grant was funded via a request for proposals for projects benchmarking LLM agents on consequential real-world tasks, and falls within our focus area of potential risks from advanced artificial intelligence.