AI Generative

Perplexity Launches DRACO Benchmark for Evaluating AI Research Accuracy and Completeness

Perplexity unveils the DRACO Benchmark, an open standard for evaluating AI research accuracy, informed by millions of real user queries across ten domains.

Staff

Published

4 February, 2026

Perplexity has launched the Deep Research Accuracy, Completeness, and Objectivity (DRACO) Benchmark, which it positions as an open standard for assessing how well AI agents carry out complex research tasks. The benchmark is publicly available, so AI developers, researchers, and organizations worldwide can evaluate their own systems against it. Its tasks are drawn from millions of production queries submitted to Perplexity Deep Research and span ten domains, including Law, Medicine, Finance, and Academic research. The benchmark also includes detailed evaluation rubrics refined through expert review.

In a recent social media announcement, Perplexity stated, “We’ve upgraded Deep Research in Perplexity. Perplexity Deep Research achieves state-of-the-art performance on leading external benchmarks, outperforming other deep research tools on accuracy and reliability.” The upgraded features are available to Max users now and will roll out to Pro users in the coming days.

The DRACO Benchmark evaluates AI agents along four dimensions: factual accuracy, analytical breadth and depth, presentation quality, and citation of sources. Grading uses an LLM-as-judge protocol in which responses are fact-checked against real data, an approach intended to reduce subjectivity. Unlike earlier benchmarks, which often relied on synthetic or academic tasks, DRACO focuses on genuine user needs and is model-agnostic, so any AI system with research capabilities can be assessed. Early results suggest that Perplexity Deep Research excels in both accuracy and speed, particularly in challenging domains such as legal inquiries and personalized queries.
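To make the rubric-and-judge idea concrete, here is a minimal sketch of how per-dimension scores could be aggregated from rubric criteria. Only the four dimension names come from the description above. The Criterion structure, the weights, the score_response function, and the keyword_judge stand-in (which replaces a real LLM judge call) are all assumptions made for illustration, not DRACO's actual schema or scoring method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The four dimensions named in the article. Everything else is hypothetical.
DIMENSIONS = (
    "factual_accuracy",
    "breadth_and_depth",
    "presentation_quality",
    "citation_of_sources",
)


@dataclass(frozen=True)
class Criterion:
    """One rubric item (hypothetical structure, not DRACO's real schema)."""
    dimension: str
    description: str
    weight: float = 1.0


# A judge decides whether a response satisfies one criterion.
# In a real LLM-as-judge setup this would wrap a model call.
Judge = Callable[[str, Criterion], bool]


def score_response(response: str, rubric: List[Criterion], judge: Judge) -> Dict[str, float]:
    """Return the weighted fraction of satisfied criteria per dimension."""
    earned = {d: 0.0 for d in DIMENSIONS}
    possible = {d: 0.0 for d in DIMENSIONS}
    for criterion in rubric:
        if criterion.dimension not in possible:
            raise ValueError(f"unknown dimension: {criterion.dimension}")
        possible[criterion.dimension] += criterion.weight
        if judge(response, criterion):
            earned[criterion.dimension] += criterion.weight
    # Dimensions with no criteria in this rubric are omitted from the result.
    return {d: earned[d] / possible[d] for d in DIMENSIONS if possible[d] > 0}


def keyword_judge(response: str, criterion: Criterion) -> bool:
    """Stand-in judge: treats the criterion description as a phrase to find."""
    return criterion.description.lower() in response.lower()


# Illustrative rubric and response (invented for this example).
rubric = [
    Criterion("factual_accuracy", "statute of limitations", weight=2.0),
    Criterion("factual_accuracy", "tolling", weight=1.0),
    Criterion("citation_of_sources", "[1]", weight=1.0),
]
response = "The statute of limitations is three years [1]."
scores = score_response(response, rubric, keyword_judge)

for dimension, value in scores.items():
    print(f"{dimension}: {value:.2f}")
```

Passing the judge in as a plain callable keeps the aggregation logic independent of any particular model, which is one way a model-agnostic benchmark could be structured. Swapping keyword_judge for a function that calls an LLM would not change score_response.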

Perplexity, the company behind DRACO, is known for its AI-driven search and research tools. By open-sourcing the benchmark, it aims to raise the standard for deep research agents and encourage wider adoption of rigorous, production-grounded evaluation across the AI industry. The move reflects a broader trend among AI developers and researchers toward more robust metrics for evaluating AI capabilities as these technologies become integrated into more fields.

As AI systems take on more complex research tasks, the need for standardized evaluation metrics grows. Because DRACO is built on real-world queries, it is meant to show how effectively AI agents meet actual user demands. That approach could change how AI performance is assessed and push AI tools to be reliable in practical use, not just innovative.

The launch of the DRACO Benchmark is a step toward greater accountability and transparency in AI systems. By opening the benchmark to developers and researchers worldwide, Perplexity is encouraging a collaborative setting in which best practices can be shared and improved. As the AI landscape evolves, initiatives like DRACO could help shape how AI research tools are built and evaluated, to the benefit of users across sectors.


AIPRESSA.COM
