
ByteDance Unveils Helios Model Achieving 19.5 FPS for Real-Time AI Video Generation

ByteDance's Helios model advances AI video generation, producing one-minute videos at 19.5 FPS on a single GPU and significantly outpacing competitors.

Helios, ByteDance's new video generation model, produces minute-long videos at 19.5 frames per second (FPS) on a single GPU. This stands in stark contrast to most current models, which typically generate only 5- to 10-second clips and require considerable time to render them. The code and model weights for Helios are publicly accessible, a significant step toward democratizing video generation technology.

Existing real-time models for longer videos rely on smaller architectures, such as 1.3-billion-parameter models that often compromise on quality. Helios instead builds on the larger Wan-2.1-14B, which on its own takes around 50 minutes to produce just five seconds of video on an A100 GPU. Training proceeds in three stages: Helios-Base establishes the architecture and mitigates drifting effects, Helios-Mid introduces token compression but runs at only 1.05 FPS, and Helios-Distilled maximizes speed by cutting the number of computation steps to just three.

In the developers' benchmarks, the distilled version of Helios reached an impressive 19.53 FPS, outperforming several smaller distilled models. For instance, SANA Video Long, despite having only 2 billion parameters, achieved just 13.24 FPS. This performance places Helios among the fastest in its category, especially for a model of this size.

Quality metrics for Helios are equally notable. The model scored 6.00 overall for short videos comprising 81 frames, outperforming all distilled models and competing well against larger base models. In longer video assessments, Helios secured a score of 6.94, narrowly surpassing the previous leader, Reward Forcing, which scored 6.88. A user study involving 200 participants corroborated these findings, confirming Helios’s advanced capabilities.

Longer videos have traditionally suffered from issues like quality degradation and drifting artifacts. Helios addresses these challenges through innovative methods such as relative position coding, which prevents repetitive movements, and a first-frame anchor that maintains color consistency. Additionally, a targeted perturbation simulation enhances the model’s resilience to potential errors, thus ensuring that video coherence is preserved over extended durations.

Technical Details

In a notable shift, Helios employs a unified architecture that accommodates text-to-video, image-to-video, and video-to-video generation. The model adeptly transitions between tasks based on available context. When the context is empty, it generates content from text; with just one frame input, it acts as an image animator; and when multiple frames are present, it continues existing video sequences. Even mid-generation, users can swap text prompts, allowing for a gradual transition to prevent visual disruptions.
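The context-based task switching described above can be sketched as a simple dispatch. The function and mode names here are illustrative, not Helios's API; they only mirror the rule the article describes: the amount of context determines the task.

```python
def select_mode(context_frames: list) -> str:
    """Hypothetical dispatch mirroring Helios's unified interface:
    the generation task is inferred from how much context is provided."""
    if not context_frames:
        return "text-to-video"   # empty context: generate from the prompt alone
    if len(context_frames) == 1:
        return "image-to-video"  # a single frame: animate a still image
    return "video-to-video"      # multiple frames: continue an existing clip
```

Because the same model handles all three branches, switching tasks (or swapping prompts mid-generation) requires no change of architecture, only a change of context.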

Helios was trained on 800,000 short video clips, each under ten seconds, producing videos at resolutions of up to 384 x 640 pixels. Despite some flickering artifacts at segment transitions, the model’s overall efficiency is commendable. The research team also created a custom benchmark, termed HeliosBench, to assess real-time long videos, comprising 240 distinct prompts.

Significantly, Helios achieves its impressive performance without relying on conventional acceleration techniques such as KV cache or sparse attention. Instead, it employs aggressive data compression across two levels. A hierarchical memory structure maintains three time scales for video history, enabling recent frames to be processed with lighter compression, while older frames undergo heavier compression. This methodology effectively reduces the number of tokens for processing by a factor of eight.
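A minimal sketch of such a three-scale memory follows. Only the overall eight-fold token reduction comes from the article; the window sizes and the per-scale compression ratios are assumptions chosen for illustration, and real systems would pool learned features rather than subsample token lists.

```python
def compress_history(frames: list[list[int]], recent: int = 4, mid: int = 12) -> list[int]:
    """Illustrative hierarchical memory over per-frame token lists (oldest
    first): recent frames keep full detail, mid-range frames are moderately
    compressed, and the oldest history is compressed most aggressively."""
    old = frames[:-(recent + mid)]
    middle = frames[-(recent + mid):-recent]
    new = frames[-recent:]
    tokens: list[int] = []
    for f in old:
        tokens.extend(f[::8])  # oldest frames: heavy compression (1 token in 8)
    for f in middle:
        tokens.extend(f[::4])  # mid-range frames: moderate compression
    for f in new:
        tokens.extend(f)       # recent frames: kept at full detail
    return tokens
```

The attention cost then grows with the compressed token count rather than the raw video length, which is what makes minute-long context affordable.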

Moreover, a multi-stage sampling process further streamlines token processing for each video segment. Initial steps operate at lower resolutions, reserving higher resolutions for final detailing, which brings compute costs in line with generating a single image. As a result, Helios minimizes computation steps for each video segment from 50 to just 3, leveraging real video data for context and utilizing an adversarial training objective akin to a GAN to enhance quality beyond the previous model’s limitations.
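The coarse-to-fine idea can be sketched as a resolution schedule. The three-step count and the 384 × 640 output resolution come from the article; the per-step scale factors below are assumptions, not Helios's published schedule.

```python
def sampling_schedule(steps: int = 3, full_res: tuple = (384, 640)) -> list:
    """Hypothetical multi-stage sampling schedule: early denoising steps run
    at reduced resolution, and only the final step operates at full
    resolution, concentrating compute on the last refinement pass."""
    scales = [0.25, 0.5, 1.0][-steps:]  # assumed per-step scale factors
    h, w = full_res
    return [(int(h * s), int(w * s)) for s in scales]
```

Since per-step cost scales with pixel count, most of the work happens on frames a quarter or half the final size, which is how total compute can approach that of generating a single image.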

Helios is available as an open-weight model on GitHub and Hugging Face, the latter of which also hosts a live demonstration. Generated video examples can be found on the project page, although the project is strictly for research purposes and is not intended for integration into any ByteDance products. ByteDance has also recently introduced Seedance 2.0, a multimodal video generation model that processes images, videos, audio, and text simultaneously. Although Seedance delivers superior visual quality, it requires significantly more computational resources and is limited to 15-second clips.

Written by: AiPressa Staff

The AiPressa Staff team brings you comprehensive coverage of the artificial intelligence industry, including breaking news, research developments, business trends, and policy updates. Our mission is to keep you informed about the rapidly evolving world of AI technology.


© 2025 AIPressa · Part of Buzzora Media · All rights reserved.