Since OpenAI launched ChatGPT, powered by GPT-3.5, in November 2022, generative AI has rapidly entered the mainstream, propelling AI chatbots into the public consciousness. After surpassing 100 million monthly active users within months of launch, ChatGPT became synonymous with AI chat technology. However, a recent study by the British firm Prolific has stirred debate by ranking ChatGPT as only the eighth best AI chatbot, trailing competitors including Google’s Gemini, DeepSeek, and Mistral.
Study Context and Methodology
Prolific’s study, which introduces a new benchmark called Humaine, aims to evaluate AI chatbots based on metrics that matter most to users. Unlike previous evaluations that heavily relied on technical benchmarks, Humaine focuses on aspects such as conversational understanding, clarity of answers, adaptability in discussions, and overall trustworthiness. The study involved around 25,000 participants who compared chatbots in head-to-head matchups, assessing their performance across four main metrics:
- Core Task Performance & Reasoning
- Interaction Fluidity & Adaptiveness
- Communication Style & Presentation
- Trust, Ethics & Safety
Results of the Humaine Study
The results place Gemini 2.5 Pro from Google at the top, followed closely by DeepSeek v3 and Mistral AI’s Magistral Medium. The leaderboard reveals that ChatGPT-4.1, despite its popularity, managed only an eighth-place finish. The complete rankings are:
1. Gemini 2.5 Pro (Google)
2. DeepSeek v3 (DeepSeek)
3. Magistral Medium (Mistral AI)
4. Grok 4 (xAI)
5. Grok 3 (xAI)
6. Gemini 2.5 Flash (Google)
7. DeepSeek R1 (DeepSeek)
8. ChatGPT-4.1 (OpenAI)
9. Gemma (Google)
10. Gemini 2.0 Flash (Google)
Understanding ChatGPT’s Lower Ranking
This outcome raises the question of why ChatGPT, which boasts around 800 million weekly active users and accounts for nearly 48% of AI chatbot usage, did not perform better. The discrepancy stems from the study’s methodology, which focused on user experience rather than raw performance metrics. Prolific aims to surface insights into user preferences that prior evaluations had overlooked.
Implications for AI Chatbots
The results of the Humaine study indicate a shift in user expectations. Participants valued chatbots that provide human-like conversational experiences, showing adaptability in discussions and ethical responses. Gemini 2.5 Pro not only topped the leaderboard but also demonstrated superior adaptability and communication style, highlighting the need for chatbots to engage users meaningfully.
The study matters because it probes the human-facing dimensions of AI, prompting developers to rethink their designs around user feedback. While ChatGPT remains a formidable player in the market, the findings suggest that competition is intensifying and that user experience should be at the forefront of AI development.
In conclusion, while OpenAI continues to lead in usage and brand recognition, the rankings from the Humaine study reveal that the landscape of AI chatbots is evolving rapidly. Companies aiming to innovate must focus on developing chatbots that resonate with users, fostering trust and engagement.