Connect with us

Hi, what are you looking for?

AI Generative

Meituan Launches 6B LongCat-Image Model, Redefining Bilingual Image Generation

Meituan unveils the 6 billion parameter LongCat-Image model, setting a new standard for bilingual image generation with photorealistic outputs and exceptional text rendering.

In late 2025, Meituan, China’s leading super-app for food delivery, travel, and local services, made headlines in the AI community by open-sourcing LongCat-Image, a groundbreaking bilingual (Chinese-English) foundation model designed for text-to-image generation and editing. With just 6 billion parameters, the lightweight diffusion-based model delivers photorealistic outputs, exceptional text rendering, and production-ready performance that rivals or even exceeds that of much larger open-source and proprietary systems.

Developed by Meituan’s dedicated LongCat AI team, LongCat-Image emerges amidst fierce competition in multimodal AI from Chinese tech giants such as Alibaba, Tencent, and ByteDance. It distinguishes itself by prioritizing efficiency, real-world usability, and multilingual capability over sheer parameter count — a pivot away from the prevailing notion that “bigger is better” in the realm of AI image generation.

This model, built on a hybrid Multimodal Diffusion Transformer (MM-DiT) architecture, requires approximately 17-18 GB of VRAM for inference, enabling faster generation times and lower deployment costs than models boasting over 20 billion parameters. Benchmarks confirm its prowess: as of December 2025, LongCat-Image ranks second among all open-source models on T2I-CoreBench, trailing only the larger 32 billion parameter Flux2.dev. It achieves state-of-the-art results on various image editing benchmarks and scores an impressive 90.7 on the ChineseWord benchmark, demonstrating high accuracy in rendering all 8,105 standard Chinese characters.

The success of LongCat-Image can be attributed to meticulous data curation rather than brute-force scaling. The LongCat AI team employed rigorous filtering to exclude low-quality and AI-generated images while implementing a multi-stage refinement process that includes pre-training, mid-training, supervised fine-tuning, and reinforcement learning with reward models. A specialized character-level encoding strategy further enhances legibility and fidelity in both languages, particularly when text appears in quotation marks.

LongCat-Image excels in several key areas. It demonstrates superior bilingual text rendering, embedding accurate and stable Chinese and English text into images, which has historically posed challenges for diffusion models. This capability is especially valuable for applications in e-commerce, advertising, and cross-border design workflows. The model also produces studio-grade visuals characterized by believable lighting, accurate textures, and physics-aware object placement, earning high scores in subjective mean opinion tests for realism and compositional quality.

The LongCat family includes several variants: LongCat-Image, the core text-to-image model; LongCat-Image-Edit, designed for instruction-based editing; and LongCat-Image-Dev, which serves mid-training checkpoints for fine-tuning and research. All variants support Diffusers integration, LoRA adapters, ComfyUI workflows, and a full training code release under the Apache 2.0 license, significantly lowering barriers for developers to customize styles and build production pipelines.

Meituan’s focus on real-world applications is evident in the model’s design, which aims for fast, high-resolution output, batch generation for marketing assets, and reliable instruction following without layout drift. Early user feedback from platforms like Reddit highlights LongCat-Image’s editing consistency, which stands in contrast to competitors that often distort characters during refinements.

LongCat-Image is already powering features in Meituan’s own applications and web platforms, underscoring its immediate commercial viability. The release, accompanied by a technical report and resources on platforms like Hugging Face and GitHub, signals Meituan’s ambition to cultivate an open, collaborative AI ecosystem in China. In an industry where many models are closed-source and resource-intensive, LongCat-Image showcases how intelligent architecture and clean data can yield high-performance results at a fraction of the cost. For developers and businesses requiring reliable bilingual image tools, particularly in handling Chinese content, this 6 billion parameter model may well become a new standard in open-source offerings.

Explore it today at: Meituan LongCat-Image or on Hugging Face. With all weights, pipelines, and documentation readily available, the era of accessible, high-fidelity bilingual image AI has received a significant upgrade.

See also
Staff
Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

You May Also Like

AI Cybersecurity

CyberStrikeAI's emergence has led to 21 active IP addresses exploiting AI-driven cyberattacks, raising urgent concerns for cybersecurity defenses globally.

AI Tools

Discover 39 innovative AI tools like Copy.ai and Jasper that boost productivity and creativity, transforming workflows for professionals across industries.

AI Technology

NVIDIA's stock dips to $179.68 ahead of GTC 2026, sparking investor interest amid projections of a 44.42% price increase following potential chip innovations.

AI Regulation

Jen Gennai of T3 unveils critical strategies for compliance officers to effectively deploy AI tools, ensuring ethical governance and real pain point resolution.

AI Cybersecurity

U.S.-Israel's cyber operation disrupts Iran's defenses, leading to Supreme Leader Khamenei's assassination and reshaping future military strategies.

Top Stories

Amazon's ProServe is transforming the consulting landscape, leveraging AI to drive over $10 billion in annual revenue while reshaping client engagement strategies.

AI Technology

A recent Count on Mothers survey reveals 70% of U.S. moms oppose using AI for student data collection, highlighting deep concerns over children's safety...

AI Regulation

Anthropic's Claude chatbot ascends to No. 1 on Apple’s U.S. App Store, overtaking ChatGPT amid rising consumer demand for ethical AI practices and governance.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.