AI Generative

AI Study Reveals 62% Success in Bypassing Chatbot Safety with Poetry Techniques

Icaro Lab’s study reveals that poetic phrasing enables a 62% success rate in bypassing safety measures in major LLMs from OpenAI, Google, and Anthropic.

Staff

Published

30 November, 2025

A recent study by Icaro Lab reveals that creative phrasing, particularly in poetic form, can effectively circumvent the safety mechanisms of various large language models (LLMs). Titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” the research demonstrates a striking 62 percent success rate in eliciting restricted content related to sensitive subjects, including nuclear weapons, child exploitation materials, and self-harm.

The study evaluated multiple LLMs, including popular models from OpenAI, Google, and Anthropic. Researchers found that while models like Google Gemini and DeepSeek were particularly susceptible to generating prohibited responses, others, such as OpenAI’s GPT-5 and Claude Haiku 4.5, displayed stronger adherence to their programmed guardrails.

Although the researchers did not disclose the specific poetic phrases used to achieve these results, they noted the potential dangers of sharing such content. In an interview with Wired, the team stated that the verses are “too dangerous to share with the public.” However, they provided a simplified version to illustrate the ease of bypassing chatbot restrictions, emphasizing that the process is “probably easier than one might think, which is precisely why we’re being cautious.”

This study sheds light on the vulnerabilities within AI systems that are designed to protect users from harmful content. As LLMs become increasingly integrated into various platforms, the implications of such findings raise significant concerns regarding safety and reliability. The ability to easily manipulate these systems poses challenges for developers aiming to enhance the robustness of their AI applications.

The findings of this research could prompt further scrutiny of AI safety protocols and a reevaluation of how language models are programmed to respond to user prompts. As AI technology continues to evolve, ensuring that these systems can effectively discern and prevent the generation of dangerous content will be crucial. The study serves as a reminder of the need for ongoing vigilance in the field of AI development, particularly as creative methods of evading safeguards emerge.

AI Tools

Datalign Launches Halo AI Platform for Wealth Firms, Supporting $80B in Assets

Datalign launches Halo AI platform, enabling advisory firms to deploy custom AI agents while managing $80 billion in assets under a robust compliance framework

Staff5 hours ago

AI Generative

OpenAI Launches GPT-5.4 Mini and Nano Models for High-Volume AI Tasks at Lower Costs

OpenAI introduces GPT-5.4 mini and nano models, achieving over 2x speed improvements at costs as low as $0.20 per million tokens for efficient high-volume...

Staff11 hours ago

Microsoft Mulls Legal Action Over $50B Amazon-OpenAI Cloud Deal Amid Rising AI Competition

Microsoft considers legal action over Amazon's $50 billion cloud deal with OpenAI, raising stakes in the fierce AI competition and cloud dominance battle.

Staff11 hours ago

AI Government

U.S. Defends Anthropic Blacklisting Amid Legal Challenge Over AI Use Restrictions

U.S. Defense Secretary Pete Hegseth defends Anthropic's blacklisting over AI usage restrictions, citing national security risks amid the company's lawsuit.

Staff13 hours ago

Microsoft Considers Legal Action Against Amazon and OpenAI Over $50B AI Deal

Microsoft considers legal action against Amazon and OpenAI over a $50 billion deal that threatens its Azure exclusivity with OpenAI's Frontier product.

Staff13 hours ago

AI Generative

OpenAI Unveils GPT-5.4 Mini and Nano for Faster, Cost-Effective AI Workflows

OpenAI launches GPT-5.4 mini and nano, enhancing performance by over 100% at $0.75 and $0.20 per million tokens, revolutionizing cost-effective AI workflows.

Staff15 hours ago

AI Technology

DOJ Rules Anthropic Untrustworthy for Military AI Contracts Amid Ethical Dispute

DOJ declares Anthropic untrustworthy for military contracts, claiming its ethical AI limits conflict with Pentagon's operational demands in a pivotal legal battle.

Staff18 hours ago

AI Generative

OpenAI Launches GPT-5.4 Mini and Nano for Free Users with 2x Speed Boost

OpenAI launches GPT-5.4 Mini and Nano models, delivering over 2x speed increase for free users, enhancing coding and multimodal processing efficiency.

Staff19 hours ago

AIPRESSA.COM

AI Generative

AI Study Reveals 62% Success in Bypassing Chatbot Safety with Poetry Techniques

Trending

AI Cybersecurity

Endpoint Security Market to Reach $23.9B by 2030 with 7.2% CAGR Amid Rising Cyber Threats

Top Stories

Albania Appoints AI Bot Minister Diella Amid Corruption Concerns and EU Membership Goals

AI Government

BigBear.ai Launches Biometric Platform at O’Hare, Acquires Generative AI Ask Sage for $250M

AI Business

Enterprise Architecture Shifts to Strategic Enabler in AI-Driven Business Models

AI Technology

AI Hardware Market Grows 30% in 2025, Driven by Generative AI and Edge Computing Demand

You May Also Like

AI Tools

Datalign Launches Halo AI Platform for Wealth Firms, Supporting $80B in Assets

AI Generative

OpenAI Launches GPT-5.4 Mini and Nano Models for High-Volume AI Tasks at Lower Costs

Top Stories

Microsoft Mulls Legal Action Over $50B Amazon-OpenAI Cloud Deal Amid Rising AI Competition

AI Government

U.S. Defends Anthropic Blacklisting Amid Legal Challenge Over AI Use Restrictions

Top Stories

Microsoft Considers Legal Action Against Amazon and OpenAI Over $50B AI Deal

AI Generative

OpenAI Unveils GPT-5.4 Mini and Nano for Faster, Cost-Effective AI Workflows

AI Technology

DOJ Rules Anthropic Untrustworthy for Military AI Contracts Amid Ethical Dispute

AI Generative

OpenAI Launches GPT-5.4 Mini and Nano for Free Users with 2x Speed Boost