Open Philanthropy recommended a grant of $130,000 to ETH Zürich to support the development of a benchmark to measure whether LLM agents can design and implement adversarial attacks that overcome defenses described in the academic literature. The project will be led by Professor Florian Tramèr.
This grant was funded via a request for proposals for projects benchmarking LLM agents on consequential real-world tasks. This falls within our focus area of potential risks from advanced artificial intelligence.