UGI Leaderboard Launches to Rank AI Models Based on Censorship Levels and Response Quality

The community-built UGI Leaderboard, hosted on Hugging Face, ranks AI models by censorship levels, with Grok-4-0709 scoring 68.75 for sensitive topic engagement.

Staff

Published

18 January, 2026

The UGI Leaderboard has appeared on Hugging Face’s community space, letting users compare AI models on how they handle sensitive topics. Launched on January 16, 2026, the leaderboard ranks models by their responses to such topics, indicating how heavily each one is censored and how willing it is to engage in sensitive discussions.

The UGI Leaderboard, created by a user identified as DontPlanToEnd, scores models on their ability to address “danger” topics, controversial entertainment knowledge, and sensitive socio-political issues. UGI stands for Uncensored General Intelligence, and the benchmark measures the extent to which a model’s responses are constrained by censorship. A model’s ability to respond meaningfully to questions, especially those typically deemed risky, plays a central role in its score.

Models are evaluated on two primary criteria: the UGI score and a measure termed Willingness/10 (W/10). The UGI score reflects how well an AI can engage with sensitive content without refusing to answer. For example, the top-ranked model, Grok-4-0709, achieved a UGI score of 68.75, while DeepSeek-V3.2-Speciale followed closely with 67.93. Notably, Grok-4-0709 recorded a comparatively low 6.5 on dangerous content questions, compared with 7.5 on adult or controversial topics.

In comparison, DeepSeek-V3.2-Speciale scored 8.8 on dangerous content questions, indicating a different approach to sensitive discussions. The willingness score, which indicates how readily a model follows sensitive instructions rather than rejecting them, varies across models. Grok-4-0709 scored 6, while DeepSeek-V3.2-Speciale scored a lower 4.8. Mistral-Large-Instruct-2411 scored highest of the three with a W/10 score of 7.5, signaling a greater propensity to engage with challenging questions.
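For readers who want to compare the figures quoted above, the following is a minimal Python sketch that ranks the three named models on each criterion. It uses only the numbers reported in this article. The `ModelScores` type and the `rank_by` helper are hypothetical names introduced for illustration and are not part of the leaderboard itself. The article gives no UGI score for Mistral-Large-Instruct-2411, so that value is left as `None` and skipped when ranking.

```python
from typing import NamedTuple, Optional


class ModelScores(NamedTuple):
    """Hypothetical record type holding the figures quoted in the article."""
    name: str
    ugi: Optional[float]          # UGI score; None if the article does not report it
    willingness: Optional[float]  # W/10 score; None if the article does not report it


# Figures as reported in this article (not fetched from the leaderboard).
REPORTED = [
    ModelScores("Grok-4-0709", 68.75, 6.0),
    ModelScores("DeepSeek-V3.2-Speciale", 67.93, 4.8),
    ModelScores("Mistral-Large-Instruct-2411", None, 7.5),
]


def rank_by(field: str, rows=REPORTED) -> list[str]:
    """Return model names sorted from highest to lowest on one field,
    skipping models for which that field is not reported."""
    scored = [r for r in rows if getattr(r, field) is not None]
    ordered = sorted(scored, key=lambda r: getattr(r, field), reverse=True)
    return [r.name for r in ordered]


if __name__ == "__main__":
    print("By UGI score:", rank_by("ugi"))
    print("By W/10:", rank_by("willingness"))
```

The two orderings differ: the model with the highest reported UGI score is not the one with the highest reported W/10, which is why the article treats them as separate criteria.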

Beyond censorship and response willingness, the UGI Leaderboard also evaluates models on their general knowledge and reasoning capabilities, referred to as Intelligence. This includes various domains such as textbook knowledge and pop culture, allowing users to gauge each model’s proficiency in diverse areas. The leaderboard also includes a writing evaluation, measuring the stylistic range of models from safe for work (SFW) to not safe for work (NSFW), and assesses their political orientation to understand possible ideological biases.

While the UGI Leaderboard is a useful tool for evaluating AI models, its scores come from a volunteer-driven benchmarking process and should be interpreted cautiously. Users looking for AI systems that are less constrained by censorship can use the leaderboard to identify models that allow more open discussion. However, a high score does not guarantee problematic responses; rather, it suggests that the model tends to engage with sensitive topics instead of refusing to discuss them.

This initiative reflects growing interest within the AI community in transparency about how models handle contentious topics. As AI continues to evolve, platforms like the UGI Leaderboard may help users understand and compare how different models respond.

AIPRESSA.COM
