The International AI Safety Report 2026 highlights that while general-purpose AI is increasingly automating various stages of cyberattacks, fully “autonomous attacks remain limited,” primarily because AI systems cannot yet reliably manage complex, multi-stage attack sequences without human oversight. The report notes that observed failure modes include executing irrelevant commands, losing track of operational state, and failing to recover from simple errors without human intervention.
The report distinguishes between AI systems that can “assist” with stages of the cyberattack chain, such as identifying targets, exploiting vulnerabilities, and generating malicious code, and those that can autonomously execute an entire operation. It further states that “fully autonomous, end-to-end attacks have not been reported,” underscoring the current limits of AI as an independent offensive tool.
Chris Anley, chief scientist at NCC Group, noted in a statement shared with TechInformed that the absence of fully autonomous cyberattacks does not eliminate the risks involved. He explained that attackers are already utilizing AI to identify vulnerabilities and create exploits, which enables more attacks with less technical expertise. Anley characterized AI-enabled attacks as the “new normal” and urged organizations to invest in faster detection methods, robust controls, and defensive AI to address the scale and speed of modern cyber threats.
Data from DARPA’s AI Cyber Challenge (AIxCC) provides a benchmark for performance in discrete security tasks. In a controlled environment during the August 2025 Final Competition, DARPA reported that competing AI systems discovered 54 unique synthetic vulnerabilities out of 63 challenges and successfully patched 43 of them. These final figures revise earlier preliminary counts of successful patches; the total number of discovered vulnerabilities was unchanged. As detailed by the Congressional Research Service, AIxCC aims to move vulnerability identification and patching toward “machine speed” in order to harden critical infrastructure against both human-led and AI-assisted threats.
The World Economic Forum’s Global Cybersecurity Outlook 2026 situates AI within a broader risk landscape that includes geopolitical fragmentation and uneven cyber capabilities. The report underscores AI’s dual role, enhancing defensive measures while simultaneously empowering more sophisticated attacks.
Incident datasets also point to growing use of AI in cyber threats. Verizon’s 2025 Data Breach Investigations Report Executive Summary found “evidence” of generative AI usage by threat actors, as reported by the AI platforms themselves, and stated that the share of synthetically generated text in malicious emails has doubled over the past two years. The report also found that 15% of employees accessed generative AI systems from corporate devices, with 72% of those using non-corporate email addresses and 17% using corporate emails without integrated authentication. Mandiant’s M-Trends 2025 further reported that exploits (33%), stolen credentials (16%), and phishing (14%) were the leading initial infection vectors in its 2024 investigations.
Research on agent reliability has also shed light on the brittleness of AI in long-horizon tasks. “The Agent’s Marathon,” published on OpenReview, finds that large language model (LLM) agents “remain brittle” over extended tasks, with performance deteriorating rapidly. Meanwhile, the authors of the Agent Security Bench (ASB) find that LLM-based agents can introduce “critical security vulnerabilities” and propose ASB as a framework for benchmarking attacks and defenses across varied scenarios, tools, and methods. A 2025 survey paper on ScienceDirect treats LLM-based agents as a distinct area for both attacks and defenses, and suggests criteria for evaluating the effectiveness of both.
The 2026 report notes a shift toward “Frontier AI Safety Frameworks” as a key method for managing risks from advanced AI. These frameworks increasingly rely on “if-then” safety commitments, which specify capability thresholds that, once a model reaches them, trigger mandatory safety mitigations. This approach aims to address the “evidence dilemma”: the difficulty of formulating policy when the pace of AI advances outstrips scientific understanding of the associated risks. The report indicates that the number of companies publishing voluntary safety frameworks has doubled since last year, though it cautions that “real-world evidence of their effectiveness remains limited.” The report also advocates a “defense-in-depth” strategy that layers technical safeguards, system-level monitoring, and organizational risk processes, so that the failure of any single control does not produce a systemic breach.
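As a rough illustration of the “if-then” structure described above, the short Python sketch below maps an evaluated capability score to the mitigations that become mandatory once a threshold is crossed. All capability names, thresholds, and mitigations in it are hypothetical placeholders invented for this example; they are not drawn from the report or from any company’s published framework.

```python
# Hypothetical sketch of an "if-then" capability commitment.
# Every capability name, threshold, and mitigation below is invented for illustration.

from dataclasses import dataclass, field


@dataclass
class Commitment:
    capability: str                # capability being evaluated, e.g. offensive-cyber uplift
    threshold: float               # evaluation score at which the commitment triggers
    mitigations: list[str] = field(default_factory=list)  # safeguards that become mandatory


def required_mitigations(scores: dict[str, float],
                         commitments: list[Commitment]) -> list[str]:
    """Return every mitigation whose capability threshold has been reached."""
    triggered: list[str] = []
    for c in commitments:
        if scores.get(c.capability, 0.0) >= c.threshold:
            triggered.extend(c.mitigations)
    return triggered


if __name__ == "__main__":
    commitments = [
        Commitment("autonomous-cyber-offense", 0.6,
                   ["restrict API access", "require human review before deployment"]),
        Commitment("vulnerability-discovery", 0.8,
                   ["enhanced usage monitoring"]),
    ]
    # A model whose evaluated cyber-offense score crosses the first threshold:
    print(required_mitigations({"autonomous-cyber-offense": 0.7}, commitments))
    # -> ['restrict API access', 'require human review before deployment']
```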
See also
Anthropic’s Claims of AI-Driven Cyberattacks Raise Industry Skepticism
Anthropic Reports AI-Driven Cyberattack Linked to Chinese Espionage
Quantum Computing Threatens Current Cryptography, Experts Seek Solutions
Anthropic’s Claude AI exploited in significant cyber-espionage operation
AI Poisoning Attacks Surge 40%: Businesses Face Growing Cybersecurity Risks