AI Research

New Study Reveals Poetic Prompts Bypass AI Safety Systems in 62% of Tests

New study reveals poetic prompts allow users to bypass AI safety systems in 62% of tests across models from Google, OpenAI, and Meta, raising urgent safety concerns.

Staff

Published

1 December, 2025

A new study has revealed a significant vulnerability in large language models (LLMs), indicating that users can bypass safety guardrails by reformulating harmful prompts into poetic forms. Researchers from Italy-based Icaro Lab found that this method, described as a “universal single turn jailbreak,” effectively prompts AI models to produce harmful outputs, despite existing protections. The study highlights a systemic weakness in these AI systems that can be easily exploited.

The researchers tested 20 harmful prompts rewritten as poetry, recording a 62 percent success rate across 25 leading models from major AI developers including Google, OpenAI, Anthropic, DeepSeek, Qwen, Mistral AI, Meta, xAI, and Moonshot AI. Alarmingly, even when AI-generated poor poetry was employed to convert harmful requests, the jailbreak method still succeeded 43 percent of the time.

The findings suggest that poetic prompts led to unsafe responses significantly more often than traditional text, achieving up to 18 times more success. This pattern was consistent across all examined models, pointing to flaws rooted in structural design rather than variations in training methods or datasets. Smaller models were found to be more resistant to these poetic jailbreaks, with GPT 5 Nano not responding to any harmful prompts while Gemini 2.5 Pro complied with all tested requests.

The researchers propose that the differences in responses may be linked to model size, as greater capacity appears to facilitate deeper engagement with complex linguistic forms like poetry, potentially compromising safety directives in the process. The study also challenges the prevailing notion that closed-source models are inherently safer than their open-source counterparts, revealing that both exhibit similar vulnerabilities to these poetic exploits.

One central aspect of the study is its explanation of why poetic prompts are so effective at evading detection. LLMs typically identify harmful content by recognizing specific keywords, phrasing patterns, and structures commonly associated with safety violations. Poetry, by contrast, employs metaphors, irregular syntax, and symbolic language, which do not resemble the harmful prose examples used in the models’ safety training. This linguistic obfuscation allows harmful intent to slip past filters that were never designed to interpret such unconventional forms.

As AI technologies continue to evolve, the implications of these findings raise critical questions about the robustness of safety mechanisms in LLMs. The potential for malicious users to exploit these vulnerabilities underscores the need for enhanced scrutiny and adaptive measures in AI safety protocols. The research calls for a reevaluation of existing strategies to prevent the misuse of AI, particularly as poetry and other artistic expressions gain traction in digital communications.

Looking forward, addressing these systemic weaknesses will be crucial as AI models become increasingly integrated into various applications. The study emphasizes the importance of refining safety measures to fortify these systems against exploitation, ensuring that advancements in AI do not compromise user safety or ethical standards. With AI’s expanding role in society, the findings serve as a stark reminder of the ongoing challenges in balancing innovation with responsible technology use.

AI Education

Google DeepMind Launches Nano Banana 2 Model, Enhancing Image Generation Speed and Quality

Google DeepMind launches Nano Banana 2, delivering AI image generation at "Flash speed" with enhanced quality and real-time knowledge across multiple platforms.

David Park19 minutes ago

Anthropic Accuses Three Chinese Firms of Large-Scale Distillation Attacks on Claude AI

Anthropic accuses DeepSeek and two other Chinese firms of executing 16 million distillation attacks to illegally enhance their AI models, threatening U.S. tech dominance.

Staff3 hours ago

AI Technology

Australia Demands Age Verification for AI Services by March 9, Targets Apple and Google

Australia mandates major tech firms like Apple and Google to implement age verification for AI services by March 9 or face penalties up to...

Staff3 hours ago

AI Research

AI Adoption Reveals Stark Divide: High-Income Countries Use 4x More AI Than Others

New research reveals that high-income countries utilize AI tools four times more than middle- and low-income nations, highlighting a critical global adoption gap.

Staff3 hours ago

AI Government

Trump Orders All Federal Agencies to Halt Use of Anthropic’s AI Tools Amid Security Concerns

Trump halts all federal use of Anthropic's AI tools, citing security concerns over unrestricted access to the company's chatbot, Claude, within six months.

Staff7 hours ago

AI Marketing

AI Travel Tools Revolutionize Planning with Personalized Itineraries and Real-Time Optimization

AI tools like Canva's Trip Planner and NxVoy Trips are revolutionizing travel planning by delivering personalized itineraries in minutes, enhancing user experience with real-time...

Sofía Méndez10 hours ago

AI Companionship Raises Concerns as Teen’s Suicide Sparks Legal Action and Calls for Protections

A 14-year-old's suicide linked to an AI chatbot prompts a lawsuit against Character.AI, highlighting urgent calls for stronger protections for vulnerable users.

Staff13 hours ago

AI Tools

Google Translate Launches AI Features for Contextual Translations in U.S. and India

Google enhances Google Translate with AI-driven features for contextual translations in the U.S. and India, improving communication clarity across diverse interactions.

Staff16 hours ago

AIPRESSA.COM

AI Research

New Study Reveals Poetic Prompts Bypass AI Safety Systems in 62% of Tests

Trending

AI Cybersecurity

Endpoint Security Market to Reach $23.9B by 2030 with 7.2% CAGR Amid Rising Cyber Threats

Top Stories

Albania Appoints AI Bot Minister Diella Amid Corruption Concerns and EU Membership Goals

AI Government

BigBear.ai Launches Biometric Platform at O’Hare, Acquires Generative AI Ask Sage for $250M

AI Business

Enterprise Architecture Shifts to Strategic Enabler in AI-Driven Business Models

AI Technology

AI Hardware Market Grows 30% in 2025, Driven by Generative AI and Edge Computing Demand

You May Also Like

AI Education

Google DeepMind Launches Nano Banana 2 Model, Enhancing Image Generation Speed and Quality

Top Stories

Anthropic Accuses Three Chinese Firms of Large-Scale Distillation Attacks on Claude AI

AI Technology

Australia Demands Age Verification for AI Services by March 9, Targets Apple and Google

AI Research

AI Adoption Reveals Stark Divide: High-Income Countries Use 4x More AI Than Others

AI Government

Trump Orders All Federal Agencies to Halt Use of Anthropic’s AI Tools Amid Security Concerns

AI Marketing

AI Travel Tools Revolutionize Planning with Personalized Itineraries and Real-Time Optimization

Top Stories

AI Companionship Raises Concerns as Teen’s Suicide Sparks Legal Action and Calls for Protections

AI Tools

Google Translate Launches AI Features for Contextual Translations in U.S. and India