AI Generative

Researchers Unveil Stable Diffusion-Based Cantonese Embroidery Image Generator with 2048×2048 Resolution in About 50 Seconds

Researchers unveil a Stable Diffusion-based workflow that generates Cantonese embroidery images at 2048×2048 resolution in about 50 seconds, with the aim of supporting cultural preservation.

Staff

Published 1 February, 2026

A team of researchers has developed an image generation process that reproduces the intricate artistry of **Cantonese embroidery** using **Stable Diffusion**. The approach uses a specially trained adapter model to replicate the textures and features of traditional embroidery, and it runs as a workflow inside **ComfyUI**.

The project began with a dataset of 494 high-definition images of Cantonese embroidery, sourced from **Guangzhou Embroidery Craft Factory Co., Ltd.** and from municipal-level representative **Wang Xinyuan**. Because the works were photographed mainly in indoor exhibition halls with variable lighting, image quality was uneven. The images were therefore post-processed to address issues such as uneven exposure and semantic redundancy.

To make the dataset more reliable, the images were annotated with the **WD14-tagger framework**: labels were generated automatically and then corrected by hand. This ensured that the labels captured the embroidery’s themes accurately, for example by distinguishing between different bird species or floral elements. The final dataset covers eight primary categories, including flowers, animals, and landscapes, which are further divided into 45 subcategories.
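The annotation step described above, automatic tags followed by manual correction, can be sketched roughly as follows. This is only an illustration: the function name `correct_tags`, the correction table, and the example tags ("peacock", "peony") are hypothetical and do not come from the researchers' dataset or from the WD14-tagger itself.

```python
from typing import Dict, List


def correct_tags(auto_tags: List[str], corrections: Dict[str, str]) -> List[str]:
    """Apply manual corrections to automatically generated tags.

    A correction maps a wrong or too-generic tag to its replacement.
    Mapping a tag to an empty string drops it. Duplicates are removed
    while the original tag order is preserved.
    """
    result: List[str] = []
    for tag in auto_tags:
        fixed = corrections.get(tag, tag)
        if fixed and fixed not in result:
            result.append(fixed)
    return result


# Hypothetical example: the tagger emits generic labels and an
# annotator replaces them with the specific species and flower.
auto = ["embroidery", "bird", "flower", "simple background"]
manual = {"bird": "peacock", "flower": "peony", "simple background": ""}

caption = ", ".join(correct_tags(auto, manual))
print(caption)  # embroidery, peacock, peony
```

Keeping the corrections in a separate table, rather than editing captions by hand, makes the manual pass repeatable if the automatic tagging is rerun.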

Using low-rank adaptation (**LoRA**), the researchers trained an adapter model, **gx_lora3.safetensors**, that captures the texture features of Cantonese embroidery. The adapter was integrated with the **Stable Diffusion** framework so that generated images keep the fine detail characteristic of traditional embroidery. On a hardware setup with a **32-core AMD EPYC 7542 processor** and an **NVIDIA GeForce RTX 4090D** graphics card, generating a 2048×2048 image takes approximately 50 seconds.
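For readers unfamiliar with low-rank adaptation, the general idea is that a frozen weight matrix W is adjusted by the product of two small matrices, W' = W + (alpha / r) · B·A, where r is the rank. The sketch below shows that update in plain Python on a toy matrix. The matrix sizes, the `alpha` value, and the 768×768 layer with rank 8 are illustrative assumptions; the article does not state the rank or layer sizes the researchers used.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    rank = len(a)  # A has shape (r, d_in), so its row count is the rank
    scale = alpha / rank
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in one adapter: B is (d_out, r), A is (r, d_in)."""
    return rank * (d_out + d_in)


# Toy example: a frozen 2x3 weight and a rank-1 adapter.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]        # shape (2, 1)
a = [[0.5, 0.0, 1.0]]     # shape (1, 3)

merged = apply_lora(w, b, a, alpha=1.0)
print(merged)  # [[1.5, 0.0, 1.0], [1.0, 1.0, 2.0]]

# Illustrative sizes only: a 768x768 layer has 589824 weights,
# while a rank-8 adapter for it trains 12288.
print(lora_param_count(768, 768, 8))  # 12288
```

The parameter count is the reason adapters of this kind can be trained on a few hundred images and shipped as a small separate file alongside the base model.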

Technical Insights

The image generation pipeline applies two constraints to guide the denoising process. The first uses the **Segment Anything Model (SAM)** for semantic segmentation, which allows complex regions of the input image to be delineated precisely. The segmentation produces binary masks that focus denoising on the target regions and limit noise interference there. In the early diffusion stages, the masks mainly confine modifications to the masked regions; in later stages, they are used together with additional guidance to refine details.
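One common way to confine denoising to a masked region is to blend each denoised result with the original values, so that only pixels where the mask is 1 are allowed to change. The toy sketch below illustrates that blending on a small grid. It is not the researchers' implementation: the article does not describe how their masks are applied, the `halve` function stands in for a real denoiser, and real pipelines operate on latent tensors rather than nested lists.

```python
from typing import Callable, List

Grid = List[List[float]]


def masked_denoise(latent: Grid, mask: Grid,
                   denoise_step: Callable[[Grid, int], Grid],
                   steps: int) -> Grid:
    """Run `steps` denoising steps, changing only cells where mask == 1.

    After every step the proposal is blended with the original values:
    masked cells take the denoised value, unmasked cells are restored.
    """
    original = [row[:] for row in latent]
    current = [row[:] for row in latent]
    for t in range(steps):
        proposal = denoise_step(current, t)
        current = [
            [m * p + (1.0 - m) * o for m, p, o in zip(m_row, p_row, o_row)]
            for m_row, p_row, o_row in zip(mask, proposal, original)
        ]
    return current


def halve(grid: Grid, t: int) -> Grid:
    """Stand-in denoiser (assumption): shrink every value by half."""
    return [[v / 2.0 for v in row] for row in grid]


latent = [[8.0, 8.0], [8.0, 8.0]]
mask = [[1.0, 0.0], [0.0, 1.0]]   # binary mask, e.g. from a segmentation model

print(masked_denoise(latent, mask, halve, steps=3))
# [[1.0, 8.0], [8.0, 1.0]]
```

Cells under the mask are denoised three times (8 → 4 → 2 → 1), while cells outside it keep their original value.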

The second constraint comes from **ControlNet**, which preserves geometric structure and color fidelity by encoding depth, line art, and color information from the input image. As a result, the generated images align closely with the original in both spatial composition and color distribution. The researchers found that adjusting the strength of these controls to the complexity of the subject matter improves the model’s performance and allows for nuanced interpretations of traditional designs.
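In ControlNet-style conditioning, each control signal typically contributes a residual that is scaled by a strength value and added to the base model's features. The sketch below shows how per-control strengths might be combined for the three signals named above. The feature vectors and strength values are placeholders chosen for illustration; the article does not report the strengths the researchers used or how they relate to subject complexity.

```python
from typing import Dict, List


def combine_controls(features: List[float],
                     residuals: Dict[str, List[float]],
                     strengths: Dict[str, float]) -> List[float]:
    """Add each control residual to the features, scaled by its strength.

    Controls without an entry in `strengths` default to 1.0.
    A strength of 0.0 disables that control entirely.
    """
    out = list(features)
    for name, residual in residuals.items():
        s = strengths.get(name, 1.0)
        out = [f + s * r for f, r in zip(out, residual)]
    return out


# Placeholder values (assumptions), one residual per control type.
features = [1.0, 2.0]
residuals = {
    "depth": [0.5, 0.5],
    "lineart": [1.0, 0.0],
    "color": [0.0, 2.0],
}
strengths = {"depth": 1.0, "lineart": 0.5, "color": 0.25}

print(combine_controls(features, residuals, strengths))  # [2.0, 3.0]
```

Exposing one strength per control is what makes it possible to tune the balance between structure and color separately for simple and complex subjects.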

In a series of experiments, the team tested different LoRA configurations and found that the best balance came from the model saved at the 8th training epoch, applied with a weight of 0.9. This configuration produced images that preserved the artistic integrity of Cantonese embroidery while showing natural color transitions and detailed textures. Images generated with this setting showed clear floral morphology and avoided the excessive stylization that had previously distorted the embroidery’s intricate details.
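In a ComfyUI workflow, a LoRA weight like this is usually set on a `LoraLoader` node, whose `strength_model` and `strength_clip` inputs scale the adapter's effect. The sketch below builds a minimal API-format fragment as a Python dictionary. Several details are assumptions: the node ids, the checkpoint file name `base_model.safetensors`, and applying 0.9 to both the model and the text encoder are placeholders, and the article does not say whether **gx_lora3.safetensors** is the epoch-8 file.

```python
import json
from typing import Any, Dict


def lora_workflow_fragment(lora_name: str, strength: float,
                           checkpoint: str) -> Dict[str, Any]:
    """Build two linked nodes: a checkpoint loader and a LoRA loader.

    Links use the API-format convention [source_node_id, output_index].
    """
    return {
        "1": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": checkpoint},
        },
        "2": {
            "class_type": "LoraLoader",
            "inputs": {
                "lora_name": lora_name,
                "strength_model": strength,  # weight applied to the diffusion model
                "strength_clip": strength,   # weight applied to the text encoder (assumed equal)
                "model": ["1", 0],           # MODEL output of node 1
                "clip": ["1", 1],            # CLIP output of node 1
            },
        },
    }


fragment = lora_workflow_fragment(
    lora_name="gx_lora3.safetensors",
    strength=0.9,
    checkpoint="base_model.safetensors",  # hypothetical file name
)
print(json.dumps(fragment, indent=2))
```

A complete workflow would also need prompt encoding, sampling, and decoding nodes; this fragment only shows where the 0.9 weight would be set.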

The research indicates that AI methods can support the preservation and reproduction of traditional art forms. By combining historical craftsmanship with current image generation technology, the team has opened new avenues for digital representation of cultural heritage. As these techniques evolve, they may foster greater appreciation for Cantonese embroidery, and potentially for other forms of traditional artistry, among future generations.


AIPRESSA.COM
