Researchers at the University of California, Irvine, have unveiled a pioneering framework named HELIOS that significantly enhances the decompilation of binary code through large language models (LLMs). This innovative approach addresses the limitations of existing methods, which often treat code simply as text and fail to account for essential control flow graphs. The research team, comprising Yonatan Gizachew Achamyeleh, Harsh Thomare, and Mohammad Abdullah Al Faruque, has reframed binary decompilation as a structured reasoning process, enabling a more accurate understanding of complex, optimized binaries.
The core advantage of HELIOS lies in its ability to summarize a binary’s control flow into a hierarchical text representation. This representation describes basic blocks, their connections, and high-level constructs like loops and conditionals, providing structural context that text-only LLM approaches overlook. On the HumanEval-Decompile benchmark, object-file compilability rose from 45.0% to 85.2% with Gemini 2.0 and from 71.4% to 89.6% with GPT-4.1 Mini.
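To make the idea of a hierarchical text representation concrete, here is a minimal Python sketch. It is not the HELIOS implementation: the function names, the output wording, and the toy graph are all hypothetical, and loop detection is approximated with depth-first-search back edges as a simple stand-in for whatever analysis the authors use.

```python
def find_back_edges(cfg, entry):
    """Return (src, dst) edges whose target is still on the DFS stack.

    In a reducible control flow graph these edges mark loops, with dst
    acting as the loop header.
    """
    back_edges = []
    state = {}  # block name -> "active" (on the DFS stack) or "done"

    def visit(block):
        state[block] = "active"
        for succ in cfg.get(block, []):
            if state.get(succ) == "active":
                back_edges.append((block, succ))
            elif succ not in state:
                visit(succ)
        state[block] = "done"

    visit(entry)
    return back_edges


def summarize_cfg(name, cfg, entry):
    """Render a CFG as indented text: one header line, one line per block,
    then one line per detected loop. The format is illustrative only."""
    lines = [f"Function {name}: {len(cfg)} basic blocks, entry {entry}"]
    for block, succs in cfg.items():
        if not succs:
            kind = "return"
        elif len(succs) == 1:
            kind = "jump"
        else:
            kind = "conditional branch"
        targets = ", ".join(succs) if succs else "none"
        lines.append(f"  {block}: {kind} -> {targets}")
    for src, dst in find_back_edges(cfg, entry):
        lines.append(f"  loop: header {dst}, back edge from {src}")
    return "\n".join(lines)


# Hypothetical CFG of a simple counting loop: B1 tests, B2 is the body.
example_cfg = {
    "B0": ["B1"],
    "B1": ["B2", "B3"],
    "B2": ["B1"],
    "B3": [],
}
summary = summarize_cfg("sum_array", example_cfg, "B0")
print(summary)
```

A summary like this gives the model explicit block connectivity and loop structure as ordinary text, which is the kind of structural context the article describes.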
Incorporating compiler feedback pushed compilability above 94% and improved functional correctness by up to 5.6 percentage points compared to text-only methods. This is a substantial gain in the reliability and usability of LLM-driven decompilation, particularly for security analysts who often grapple with complex binary analysis. The framework was evaluated across six architectures, including x86, ARM, and MIPS, where it reduced the variation in functional correctness between architectures while maintaining high syntactic accuracy.
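A compiler feedback loop of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration, not the authors' code: `refine_with_feedback`, `generate`, and `compile_check` are hypothetical names, the retry prompt wording is invented, and the model and compiler are replaced by stubs so the example runs on its own. A real setup would call an LLM API and a compiler (for example `gcc -c`) in their place.

```python
def refine_with_feedback(prompt, generate, compile_check, max_attempts=3):
    """Ask the model for code, compile it, and retry with the diagnostics.

    generate(prompt) -> source text
    compile_check(source) -> (ok, diagnostics)
    Returns (source, attempts_used, compiled_ok).
    """
    current_prompt = prompt
    source = ""
    for attempt in range(1, max_attempts + 1):
        source = generate(current_prompt)
        ok, diagnostics = compile_check(source)
        if ok:
            return source, attempt, True
        # Feed the failed attempt and the compiler's errors back to the model.
        current_prompt = (
            f"{prompt}\n\nPrevious attempt:\n{source}\n\n"
            f"Compiler errors:\n{diagnostics}\n\n"
            "Fix these errors and return the complete function."
        )
    return source, max_attempts, False


# Stub model: produces a missing semicolon until it sees compiler errors.
def fake_generate(prompt):
    if "Compiler errors:" in prompt:
        return "int add(int a, int b) { return a + b; }"
    return "int add(int a, int b) { return a + b }"


# Stub compiler: accepts the source only if the statement is terminated.
def fake_compile(source):
    if "a + b;" in source:
        return True, ""
    return False, "error: expected ';' before '}' token"


result = refine_with_feedback("Decompile add().", fake_generate, fake_compile)
print(result)
```

Capping the number of attempts keeps cost bounded; the cap of three here is arbitrary and not taken from the research.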
The method uses a static analysis backend to derive both control flow and call graphs, which form the basis for the hierarchical textual representation. This representation is then combined with raw decompiler output and optional compiler feedback to guide the LLM’s interpretation. By crafting prompts that summarize each function’s role and detail its control flow, the researchers have created a system that mirrors the analytical approach of human experts, yielding a pipeline that requires no fine-tuning and is architecture-agnostic.
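The prompt assembly described here might look roughly like the following Python sketch. The section titles, the `build_prompt` name, the closing instruction, and the sample inputs are all hypothetical; the article names the ingredients (function role, control flow, raw decompiler output, optional compiler feedback) but not the actual prompt template.

```python
def build_prompt(function_role, control_flow, decompiler_output,
                 compiler_feedback=None):
    """Combine the structural summary and raw decompiler output into one prompt.

    compiler_feedback is optional and only included when provided.
    """
    sections = [
        ("Function role", function_role),
        ("Control flow", control_flow),
        ("Raw decompiler output", decompiler_output),
    ]
    if compiler_feedback:
        sections.append(("Compiler feedback", compiler_feedback))
    body = "\n\n".join(
        f"## {title}\n{text.strip()}" for title, text in sections
    )
    return body + "\n\nRewrite the function as clean, compilable C."


# Hypothetical inputs for a small summing loop.
prompt = build_prompt(
    function_role="Sums the first n elements of an integer array.",
    control_flow="B0 -> B1; B1 -> B2, B3; B2 -> B1 (loop); B3 returns",
    decompiler_output="int f(int *a1, int a2) { /* ... */ }",
)
print(prompt)
```

Because the prompt is built from text alone, the same assembly step works for any architecture and any model, which fits the fine-tuning-free design the article describes.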
In their experiments, the research team used the HumanEval-Decompile benchmark on the x86_64 architecture. The framework’s ability to translate intricate control flow data into a format that LLMs can effectively process has proven essential for improving the accuracy and consistency of decompiled code. The results show that HELIOS raises both the rate at which code compiles successfully and the logical consistency of the output, positioning it as a transformative tool for reverse engineering and security analysis.
As software security evolves and the demand for skilled reverse engineers increases, HELIOS addresses a pressing need within the industry by automating challenging tasks like reverse engineering, malware analysis, and vulnerability assessment. With this framework, analysts can obtain recompilable, semantically faithful code across various hardware platforms, making it a practical asset for security settings. The researchers highlight the potential of HELIOS to reshape the landscape of binary analysis and software security, paving the way for more efficient and effective security research methodologies.