Connect with us

Hi, what are you looking for?

AI Generative

Meituan Launches 6B LongCat-Image Model, Redefining Bilingual Image Generation

Meituan unveils the 6 billion parameter LongCat-Image model, setting a new standard for bilingual image generation with photorealistic outputs and exceptional text rendering.

In late 2025, Meituan, China’s leading super-app for food delivery, travel, and local services, made headlines in the AI community by open-sourcing LongCat-Image, a groundbreaking bilingual (Chinese-English) foundation model designed for text-to-image generation and editing. With just 6 billion parameters, the lightweight diffusion-based model delivers photorealistic outputs, exceptional text rendering, and production-ready performance that rivals or even exceeds that of much larger open-source and proprietary systems.

Developed by Meituan’s dedicated LongCat AI team, LongCat-Image emerges amidst fierce competition in multimodal AI from Chinese tech giants such as Alibaba, Tencent, and ByteDance. It distinguishes itself by prioritizing efficiency, real-world usability, and multilingual capability over sheer parameter count — a pivot away from the prevailing notion that “bigger is better” in the realm of AI image generation.

This model, built on a hybrid Multimodal Diffusion Transformer (MM-DiT) architecture, requires approximately 17-18 GB of VRAM for inference, enabling faster generation times and lower deployment costs than models boasting over 20 billion parameters. Benchmarks confirm its prowess: as of December 2025, LongCat-Image ranks second among all open-source models on T2I-CoreBench, trailing only the larger 32 billion parameter Flux2.dev. It achieves state-of-the-art results on various image editing benchmarks and scores an impressive 90.7 on the ChineseWord benchmark, demonstrating high accuracy in rendering all 8,105 standard Chinese characters.

The success of LongCat-Image can be attributed to meticulous data curation rather than brute-force scaling. The LongCat AI team employed rigorous filtering to exclude low-quality and AI-generated images while implementing a multi-stage refinement process that includes pre-training, mid-training, supervised fine-tuning, and reinforcement learning with reward models. A specialized character-level encoding strategy further enhances legibility and fidelity in both languages, particularly when text appears in quotation marks.

LongCat-Image excels in several key areas. It demonstrates superior bilingual text rendering, embedding accurate and stable Chinese and English text into images, which has historically posed challenges for diffusion models. This capability is especially valuable for applications in e-commerce, advertising, and cross-border design workflows. The model also produces studio-grade visuals characterized by believable lighting, accurate textures, and physics-aware object placement, earning high scores in subjective mean opinion tests for realism and compositional quality.

The LongCat family includes several variants: LongCat-Image, the core text-to-image model; LongCat-Image-Edit, designed for instruction-based editing; and LongCat-Image-Dev, which serves mid-training checkpoints for fine-tuning and research. All variants support Diffusers integration, LoRA adapters, ComfyUI workflows, and a full training code release under the Apache 2.0 license, significantly lowering barriers for developers to customize styles and build production pipelines.

Meituan’s focus on real-world applications is evident in the model’s design, which aims for fast, high-resolution output, batch generation for marketing assets, and reliable instruction following without layout drift. Early user feedback from platforms like Reddit highlights LongCat-Image’s editing consistency, which stands in contrast to competitors that often distort characters during refinements.

LongCat-Image is already powering features in Meituan’s own applications and web platforms, underscoring its immediate commercial viability. The release, accompanied by a technical report and resources on platforms like Hugging Face and GitHub, signals Meituan’s ambition to cultivate an open, collaborative AI ecosystem in China. In an industry where many models are closed-source and resource-intensive, LongCat-Image showcases how intelligent architecture and clean data can yield high-performance results at a fraction of the cost. For developers and businesses requiring reliable bilingual image tools, particularly in handling Chinese content, this 6 billion parameter model may well become a new standard in open-source offerings.

Explore it today at: Meituan LongCat-Image or on Hugging Face. With all weights, pipelines, and documentation readily available, the era of accessible, high-fidelity bilingual image AI has received a significant upgrade.

See also
Staff
Written By

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.

You May Also Like

AI Technology

Japan and ASEAN partner to develop localized AI solutions, reducing dependence on Chinese technology and enhancing regional digital autonomy.

Top Stories

China limits Nvidia's H200 AI chip purchases to 50% of U.S. sales amid U.S. export policy shift, reshaping global tech competition and supply chains.

Top Stories

DigitalOcean's Inference Cloud Platform, in partnership with AMD, doubles Character.ai's inference throughput and cuts costs per token by 50%, supporting over a billion AI...

Top Stories

xAI tightens Grok's image editing features to block explicit content and protect minors, addressing rising regulatory pressures as AI laws loom.

AI Technology

China plans to regulate imports of Nvidia's H200 AI chips to support domestic semiconductor growth, potentially limiting foreign acquisitions by local firms.

Top Stories

Critical security flaws in Nvidia, Salesforce, and Apple’s AI libraries expose Hugging Face models to remote code execution risks, threatening open-source integrity.

Top Stories

AI-related cheating scandals at South Korean universities threaten reputations and global rankings, with Yonsei University reporting 34 students involved in altered clinical photos.

AI Regulation

UK regulators, led by the CMA and ICO, prioritize fostering AI innovation through regulatory sandboxes while addressing competition concerns and public safety.

© 2025 AIPressa · Part of Buzzora Media · All rights reserved. This website provides general news and educational content for informational purposes only. While we strive for accuracy, we do not guarantee the completeness or reliability of the information presented. The content should not be considered professional advice of any kind. Readers are encouraged to verify facts and consult appropriate experts when needed. We are not responsible for any loss or inconvenience resulting from the use of information on this site. Some images used on this website are generated with artificial intelligence and are illustrative in nature. They may not accurately represent the products, people, or events described in the articles.