In a groundbreaking study, researchers have explored how advanced artificial intelligence (AI) models make decisions in strategic contexts traditionally reserved for human judgment. Conducted by Dmitry Dagaev, head of the Laboratory of Sports Studies at HSE University, along with colleagues from HSE University–Perm and the University of Lausanne, the research delved into the behavior of prominent AI models during the “Guess the Number” game, a modern adaptation of the Keynesian beauty contest. This experiment poses a critical question: when tasked with strategic reasoning, do AI systems think like humans?
The “Guess the Number” game requires participants to select a number between 0 and 100, with the winner being the one whose choice is closest to a designated fraction of the group’s average. Historical studies indicate that human players rarely land on the mathematically optimal choice, settling instead on intermediate values shaped by cognitive limits and emotional factors. The researchers aimed to determine whether AI would follow a similar pattern.
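To make the mechanics concrete, here is a minimal sketch of one round in Python; the function name and the example numbers are illustrative, not taken from the study.

```python
# Minimal sketch of one round of "Guess the Number" (illustrative).
# Each player picks a number in [0, 100]; the winner is whoever lands
# closest to p times the group's average (p = 2/3 in Nagel's classic version).

def winning_player(choices, p=2/3):
    """Return the index of the choice closest to p * mean(choices)."""
    target = p * sum(choices) / len(choices)
    return min(range(len(choices)), key=lambda i: abs(choices[i] - target))

choices = [50, 33, 22, 0]        # four hypothetical players
print(winning_player(choices))   # target = (2/3) * 26.25 = 17.5, so player 2 (22) wins
```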
The team assessed five leading language models (GPT-4o, GPT-4o Mini, Gemini-2.5-flash, Claude-Sonnet-4, and Llama-4-Maverick) across 16 scenarios inspired by classic economic experiments. Scenarios varied in parameters such as the fraction used to determine the winning number and how participants’ choices were aggregated: through averages, medians, or maximums. Each model acted as a single player, receiving identical instructions and repeating each scenario 50 times without learning from previous rounds, mirroring the one-shot experiments typically run with human subjects.
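The article does not reproduce the study’s actual harness, but the protocol can be pictured as a simple loop; `query_model` below is a hypothetical stand-in for a stateless API call, and the arithmetic (5 models × 16 scenarios × 50 repetitions) accounts for the 4,000 responses reported next.

```python
# Hypothetical sketch of the protocol: each response is a fresh, stateless
# call, so no model carries memory from one repetition to the next.

MODELS = ["GPT-4o", "GPT-4o Mini", "Gemini-2.5-flash",
          "Claude-Sonnet-4", "Llama-4-Maverick"]

def query_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a single stateless chat-completion call."""
    raise NotImplementedError

def run_experiment(scenarios: list[str], reps: int = 50) -> list[dict]:
    results = []
    for model in MODELS:
        for prompt in scenarios:       # 16 scenario prompts, identical for all models
            for rep in range(reps):    # 50 independent repetitions, no shared history
                answer = query_model(model, prompt)
                results.append({"model": model, "scenario": prompt,
                                "rep": rep, "answer": answer})
    return results                     # 5 * 16 * 50 = 4,000 responses
```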
Initial results were promising, with all 4,000 responses adhering to game rules and remaining within the 0 to 100 range. Most explanations demonstrated some level of strategic reasoning, with only 23 instances lacking it. However, when comparing AI choices to human results from previous studies by economist Rosemarie Nagel, significant differences emerged. In situations where the target was half the group average, human participants averaged about 27, while AI models consistently opted for lower values, often approaching zero, which is typically the Nash equilibrium in these scenarios.
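The pull toward zero reflects iterated reasoning: if nobody picks above 100, half the average cannot exceed 50, so there is no reason to pick above 50; if nobody picks above 50, the target cannot exceed 25; and so on down to zero. A small sketch of that argument, using the half-the-average case:

```python
# Iterated elimination of dominated choices for target = p * average (p = 1/2):
# if no one picks above a bound b, the average is at most b, so the target is
# at most p*b, and the rational upper bound shrinks by a factor of p each step.

p = 0.5
bound = 100.0
for level in range(8):
    print(f"level {level}: no reason to pick above {bound:.2f}")
    bound *= p
# The bound converges to 0, the Nash equilibrium of the game.
```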
Notably, AI behavior diverged based on game structure. For instance, in variations using the maximum of the group’s numbers, both humans and AI tended to choose higher values, yet specific models still displayed marked differences: Claude Sonnet averaged around 35, while Llama opted for significantly lower numbers. “These results show that AI responds to changes in game structure much like people do,” Dagaev noted. However, a crucial gap was identified: in two-player formats, none of the models recognized that choosing zero is a weakly dominant strategy. Instead, they reasoned at length about the other player’s likely choices, whereas human players with formal economic training typically spot the dominance argument outright.
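That two-player dominance is easy to verify by brute force: with the target at p(x + y)/2 and p < 1, the lower of the two picks is always at least as close to the target, so 0 never loses, while any positive pick loses to an opponent who goes lower. A quick check (illustrative, with p = 1/2):

```python
# Brute-force check that 0 weakly dominates in the two-player game:
# whoever is closer to t = p*(x + y)/2 wins, and for p < 1 the lower pick
# is always at least as close, so 0 can never lose.

def outcome(x, y, p=0.5):
    """1 if x wins, 0 for a tie, -1 if x loses."""
    t = p * (x + y) / 2
    dx, dy = abs(x - t), abs(y - t)
    return (dx < dy) - (dx > dy)

grid = range(101)
assert all(outcome(0, y) >= outcome(x, y) for x in grid for y in grid)
assert all(outcome(0, y) >= 0 for y in grid)           # zero never loses
print(any(outcome(x, 0) < 0 for x in grid if x > 0))   # True: any positive pick can lose
```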
Further analysis revealed distinct behavioral patterns among the models. In pairwise comparisons, algorithms such as GPT-4o and Claude Sonnet generally produced mid-range results, while Gemini Flash fluctuated between cautious and aggressive choices. The team also examined Llama models of various sizes, from 1 billion to 405 billion parameters, discovering that smaller models tended to choose numbers closer to typical human guesses, while larger models gravitated toward theoretical predictions, selecting lower values as their complexity increased.
The researchers also investigated context sensitivity, altering prompts by changing wording or framing the game as a televised contest with emotionally charged opponents. Results indicated that both AI and human players responded similarly to emotional framing; when opponents were described as angry, both groups tended to select higher numbers. However, the overall response structure remained stable across models.
The study’s findings highlight critical insights into AI decision-making in economic contexts. While modern AI demonstrates the ability to recognize strategic settings and adjust behavior accordingly, it often behaves more “rationally” than human participants, typically opting for lower numbers. Yet, its failure to identify simple dominant strategies and its tendency to overestimate the sophistication of others mark notable limitations.
As Dagaev emphasized, “We are now at a stage where AI models are beginning to replace humans in many operations, enabling greater economic efficiency in business processes.” This research underscores the importance of understanding where AI aligns with human behavior and where it diverges, which will ultimately influence the application of these systems in markets, policy-making, and everyday life.
These insights also suggest that because AI models tend to assume highly strategic opponents, they may misjudge markets driven by emotion and irrational decision-making. Conversely, the fact that their behavior shifts with game structure in broadly human-like directions points to potential utility in forecasting and analysis. For researchers, the study identifies areas where AI requires enhancement, particularly in recognizing straightforward strategic dominance; for society, it provides guidance on when to place trust in AI decisions and when to rely on human judgment.