
HELIOS Achieves 85%+ Compilability for LLM Binary Decompilation Using Control Flow Graphs

UC Irvine’s HELIOS framework raises the compilability of LLM-decompiled binaries from 45% to over 85% by giving the model a text summary of each binary’s control flow graph.

Researchers at the University of California, Irvine, have introduced HELIOS, a framework that improves the decompilation of binary code with large language models (LLMs). Existing LLM-based methods often treat code as plain text and ignore the binary’s control flow graph. The research team, comprising Yonatan Gizachew Achamyeleh, Harsh Thomare, and Mohammad Abdullah Al Faruque, instead frames binary decompilation as a structured reasoning task, which helps the model handle complex, optimized binaries.

HELIOS works by summarizing a binary’s control flow into a hierarchical text representation. This representation lists basic blocks, their connections, and high-level constructs such as loops and conditionals, giving the model structural context that text-only LLM approaches lack. On the HumanEval-Decompile benchmark, object-file compilability rose from 45.0% to 85.2% with Gemini 2.0 and from 71.4% to 89.6% with GPT-4.1 Mini.
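To make the idea of a hierarchical text representation concrete, here is a minimal sketch in Python. It is not the HELIOS format, which the article does not specify. It assumes a control flow graph given as a dictionary of block labels to successor labels, marks loop headers by finding back edges with a depth-first search, and treats any block with two successors as a conditional branch. The names `find_back_edges`, `summarize_cfg`, and the block labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed input shape: block label -> list of successor block labels.
CFG = Dict[str, List[str]]


def find_back_edges(cfg: CFG, entry: str) -> Set[Tuple[str, str]]:
    """Return edges (u, v) where v is still on the DFS stack when u is explored."""
    back_edges: Set[Tuple[str, str]] = set()
    visited: Set[str] = set()
    on_stack: Set[str] = set()

    def dfs(node: str) -> None:
        visited.add(node)
        on_stack.add(node)
        for succ in cfg.get(node, []):
            if succ in on_stack:
                back_edges.add((node, succ))  # edge returns to an ancestor: a loop
            elif succ not in visited:
                dfs(succ)
        on_stack.discard(node)

    dfs(entry)
    return back_edges


def summarize_cfg(name: str, cfg: CFG, entry: str) -> str:
    """Render the graph as indented text: one line per block, one per edge."""
    back_edges = find_back_edges(cfg, entry)
    loop_headers = {target for _, target in back_edges}
    lines = [f"function {name} (entry: {entry})"]
    for block, succs in cfg.items():
        tags = []
        if block in loop_headers:
            tags.append("loop header")
        if len(succs) == 2:
            tags.append("conditional branch")
        if not succs:
            tags.append("exit")
        suffix = f" [{', '.join(tags)}]" if tags else ""
        lines.append(f"  block {block}{suffix}")
        for succ in succs:
            kind = "back edge" if (block, succ) in back_edges else "edge"
            lines.append(f"    {kind} -> {succ}")
    return "\n".join(lines)


# Hypothetical four-block function containing one loop.
example_cfg: CFG = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
text = summarize_cfg("sum_to_n", example_cfg, "B0")
print(text)
```

For the example graph, the summary tags `B1` as both a loop header and a conditional branch, and lists the edge from `B2` to `B1` as a back edge. A text block like this can be pasted into a prompt next to the decompiler output.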

Adding compiler feedback pushed compilability above 94% and improved functional correctness by up to 5.6 percentage points over text-only methods. These gains make LLM-driven decompilation more reliable for security analysts who work with complex binaries. The framework also adapted across six architectures, including x86, ARM, and MIPS, reducing variation in functional correctness while maintaining high syntactic accuracy.
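A compiler feedback loop of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the article does not describe how many rounds HELIOS runs or how errors are passed back. The function `refine_with_feedback` and its parameters are hypothetical, the LLM call is abstracted as a `fix_fn` callable, and `compile_with_gcc` assumes a `gcc` executable is available on the PATH.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def compile_with_gcc(code: str) -> Optional[str]:
    """Compile C source to an object file; return None on success, else stderr.

    Assumes `gcc` is installed and on the PATH.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "candidate.c"
        obj = Path(tmp) / "candidate.o"
        src.write_text(code)
        result = subprocess.run(
            ["gcc", "-c", str(src), "-o", str(obj)],
            capture_output=True,
            text=True,
        )
        return None if result.returncode == 0 else result.stderr


def refine_with_feedback(
    initial_code: str,
    compile_fn: Callable[[str], Optional[str]],
    fix_fn: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Repeatedly compile and ask fix_fn to repair errors, up to max_rounds times.

    Returns the final code and whether it compiled.
    """
    code = initial_code
    for _ in range(max_rounds):
        errors = compile_fn(code)
        if errors is None:
            return code, True
        code = fix_fn(code, errors)  # in practice: an LLM call that sees the errors
    return code, compile_fn(code) is None


# Toy stand-ins so the loop can be exercised without a compiler or a model.
def toy_compile(code: str) -> Optional[str]:
    return None if code.rstrip().endswith("}") else "error: expected '}' at end of input"


def toy_fix(code: str, errors: str) -> str:
    return code + "\n}"


refined, ok = refine_with_feedback("int f(void) { return 1;", toy_compile, toy_fix)
print(ok)
```

Passing the compiler and the repair step as callables keeps the loop testable. Swapping `toy_compile` for `compile_with_gcc` and `toy_fix` for a real model call would give the end-to-end version.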

The method uses a static analysis backend to derive both control flow graphs and call graphs, which form the basis of the hierarchical text representation. That representation is passed to the LLM together with the raw decompiler output and, optionally, compiler feedback. The prompts summarize each function’s role and describe its control flow, mirroring how human experts approach the task. The resulting pipeline requires no fine-tuning and is architecture-agnostic.
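The prompt assembly described above might look roughly like the sketch below. The section labels, the wording of the task line, and the names `DecompileContext` and `build_prompt` are all assumptions made for illustration; the article does not give the actual HELIOS prompt template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecompileContext:
    """Inputs for one function; all field names are hypothetical."""
    function_name: str
    role_summary: str          # short description of what the function does
    cfg_summary: str           # hierarchical text rendering of the control flow
    decompiler_output: str     # raw pseudo-C from a conventional decompiler
    compiler_feedback: Optional[str] = None  # errors from a previous attempt


def build_prompt(ctx: DecompileContext) -> str:
    """Join the available sections into one prompt, skipping absent feedback."""
    sections = [
        "Task: rewrite the decompiler output as clean, recompilable C.",
        f"Function: {ctx.function_name}",
        f"Role: {ctx.role_summary}",
        "Control flow:\n" + ctx.cfg_summary,
        "Raw decompiler output:\n" + ctx.decompiler_output,
    ]
    if ctx.compiler_feedback:
        sections.append(
            "Compiler errors from the previous attempt:\n" + ctx.compiler_feedback
        )
    return "\n\n".join(sections)


ctx = DecompileContext(
    function_name="sum_to_n",
    role_summary="Sums the integers from 1 to n.",
    cfg_summary="block B1 [loop header]\n  back edge from B2",
    decompiler_output="int sum_to_n(int a1) { /* ... */ }",
)
prompt = build_prompt(ctx)
print(prompt)
```

Because the model is only prompted and never retrained, the same function can serve any architecture for which a static analysis backend can produce the control flow summary.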

In their experiments, the team used the HumanEval-Decompile benchmark on the x86_64 architecture. Translating control flow data into a form that LLMs can process proved essential for improving the accuracy and consistency of the decompiled code. According to the results, HELIOS raises the rate at which decompiled code compiles and also improves the logical consistency of the output, which makes it useful for reverse engineering and security analysis.

As software security evolves and the demand for skilled reverse engineers grows, HELIOS addresses an industry need by automating parts of reverse engineering, malware analysis, and vulnerability assessment. The framework gives analysts recompilable, semantically faithful code across hardware platforms, making it practical in security settings. The researchers say HELIOS could make binary analysis and security research more efficient.

Written by AiPressa Staff



© 2025 AIPressa · Part of Buzzora Media · All rights reserved.