Article Details
Scrape Timestamp (UTC): 2025-12-05 00:36:48.871
Source: https://www.theregister.com/2025/12/05/an_ai_for_an_ai/
Original Article Text
An AI for an AI: Anthropic says AI agents require AI defense

Automated software keeps getting better at pilfering cryptocurrency.

Anthropic could have scored an easy $4.6 million by using its Claude AI models to find and exploit vulnerabilities in blockchain smart contracts.

The AI upstart didn't use the attack it found, which would have been an illegal act that would also undermine the company's we-try-harder image. Anthropic can probably also do without $4.6 million, a sum that would vanish as a rounding error amid the billions it's spending.

But it could have done so, as described by the company's security scholars. And that's intended to be a warning to anyone who remains blasé about the security implications of increasingly capable AI models.

Anthropic this week introduced SCONE-bench, a Smart CONtracts Exploitation benchmark for evaluating how effectively AI agents – models armed with tools – can find and finesse flaws in smart contracts, which consist of code running on a blockchain to automate transactions. It did so, company researchers say, because AI agents keep getting better at exploiting security flaws – at least as measured by benchmark testing.

"Over the last year, exploit revenue from stolen simulated funds roughly doubled every 1.3 months," Anthropic's AI eggheads assert. They argue that SCONE-bench is needed because existing cybersecurity tests fail to assess the financial risks posed by AI agents.

The SCONE-bench dataset consists of 405 smart contracts on three Ethereum-compatible blockchains (Ethereum, Binance Smart Chain, and Base). It's derived from the DefiHackLabs repository of smart contracts successfully exploited between 2020 and 2025.

Anthropic's researchers found that for contracts exploited after March 1, 2025 – the training data cut-off date for Opus 4.5 – Claude Opus 4.5, Claude Sonnet 4.5, and OpenAI's GPT-5 emitted exploit code worth $4.6 million. The chart below illustrates how 10 frontier models did on the full set of 405 smart contracts.

[Chart: Anthropic graph of revenue from exploiting vulnerabilities in the benchmark test]

And when the researchers tested Sonnet 4.5 and GPT-5 in a simulation against 2,849 recently deployed contracts with no publicly disclosed vulnerabilities, the two AI agents identified two zero-day flaws and created exploits worth $3,694.

Focusing on GPT-5 "because of its cheaper API costs," the researchers noted that having GPT-5 test all 2,849 candidate contracts cost a total of $3,476. The average cost per agent run, they said, came to $1.22; the average cost per vulnerable contract identified was $1,738; the average revenue per exploit was $1,847; and the average net profit per exploit was $109.
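Those figures are internally consistent and easy to sanity-check. Here is a minimal back-of-the-envelope sketch in Python, using only the numbers quoted above (the variable names are ours, not Anthropic's):

    # Figures reported for GPT-5's scan of recently deployed contracts (USD).
    contracts_scanned = 2_849
    total_api_cost = 3_476      # total spend to test every candidate contract
    vulns_found = 2             # zero-day flaws identified
    exploit_revenue = 3_694     # simulated value of the resulting exploits

    cost_per_run = total_api_cost / contracts_scanned    # ≈ $1.22
    cost_per_vuln = total_api_cost / vulns_found         # $1,738
    revenue_per_exploit = exploit_revenue / vulns_found  # $1,847
    net_profit_per_exploit = (exploit_revenue - total_api_cost) / vulns_found  # $109

    print(f"cost per agent run:     ${cost_per_run:.2f}")
    print(f"cost per vuln found:    ${cost_per_vuln:,.0f}")
    print(f"revenue per exploit:    ${revenue_per_exploit:,.0f}")
    print(f"net profit per exploit: ${net_profit_per_exploit:,.0f}")

The margin is thin, $218 across the whole scan, but it only needs API prices to fall or model hit rates to rise for the economics to tip decisively.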
"This demonstrates as a proof-of-concept that profitable, real-world autonomous exploitation is technically feasible, a finding that underscores the need for proactive adoption of AI for defense," the Anthropic bods said in a blog post. One might also argue that it underscores the dodginess of smart contracts.

Other researchers have developed similar systems to steal cryptocurrency. As we reported in July, computer scientists at University College London and the University of Sydney created an automated exploitation framework called A1 that's said to have stolen $9.33 million in simulated funds. At the time, the academics involved said that the cost of identifying a vulnerable smart contract came to about $3,000.

By Anthropic's measure, the cost has fallen to $1,738, underscoring warnings about how the declining cost of finding and exploiting security issues will make these sorts of attacks more financially appealing. Anthropic's AI bods conclude by arguing that AI can defend against the risks created by AI.
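To put the researchers' growth claim in context, a doubling period of 1.3 months compounds ferociously. Taking the figure at face value, a short sketch of the implied trend (this is just the standard doubling-time formula, not anything from Anthropic's methodology):

    # Implied growth if benchmark exploit revenue doubles every 1.3 months.
    DOUBLING_PERIOD_MONTHS = 1.3

    def growth_multiplier(months: float) -> float:
        """Revenue multiplier after `months` at the stated doubling period."""
        return 2 ** (months / DOUBLING_PERIOD_MONTHS)

    print(f"after 6 months:  ~{growth_multiplier(6):.0f}x")    # ~25x
    print(f"after 12 months: ~{growth_multiplier(12):.0f}x")   # ~601x

If the trend held, exploit revenue as measured by this benchmark would grow roughly 600-fold in a year, which is the arithmetic behind Anthropic's insistence that defenders adopt AI too.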
Daily Brief Summary
Anthropic has launched SCONE-bench, a benchmark for evaluating how effectively AI agents can find and exploit vulnerabilities in blockchain smart contracts, highlighting how quickly AI exploitation capability is improving.
The benchmark dataset includes 405 smart contracts from Ethereum-compatible blockchains, derived from DefiHackLabs, showcasing real-world exploitation scenarios between 2020 and 2025.
Tests showed that Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 generated exploit code worth $4.6 million against contracts compromised after Opus 4.5's March 2025 training cutoff, emphasizing the potential financial impact of AI-driven attacks.
In a simulation against 2,849 newly deployed contracts, Sonnet 4.5 and GPT-5 identified two zero-day vulnerabilities and produced working exploits at a small net profit, demonstrating that autonomous exploitation can already be profitable.
The cost of AI-driven vulnerability discovery is falling, with the average cost per vulnerable contract identified now at $1,738, down from the roughly $3,000 reported by academic researchers in July, potentially increasing the attractiveness of such attacks.
Anthropic's initiative stresses the necessity for proactive AI defense strategies to counteract the risks posed by increasingly capable AI models in cybersecurity.
The development of SCONE-bench serves as a warning to industries relying on blockchain technology, urging them to reassess their security measures against AI-driven threats.