DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios
ACL 2025
Zeyu Gao, Yuxin Cui, Hao Wang, Siliang Qin, Yuanda Wang, Bolun Zhang, Chao Zhang.
Abstract
Decompilers are fundamental tools for critical security tasks, from vulnerability discovery to malware analysis, yet their evaluation remains fragmented. Existing approaches primarily assess syntactic correctness through synthetic micro-benchmarks or subjective human ratings, failing to address real-world requirements for semantic fidelity and analyst usability. We present DecompileBench, the first comprehensive framework for evaluating decompilers in realistic reverse engineering workflows, built on three key components: real-world function extraction (23,400 functions from 130 real-world programs), runtime-aware validation, and automated human-centric assessment using LLM-as-Judge to quantify how effectively each decompiler supports analysts. Through a systematic comparison of six industrial-strength decompilers and six recent LLM-powered approaches, we demonstrate that LLM-based methods surpass commercial tools in code understandability despite 52.2% lower functional correctness. These findings highlight the potential of LLM-based approaches to transform human-centric reverse engineering. We open-source DecompileBench to advance research on decompilers and to assist security experts in making informed tool selections based on their specific requirements.
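The LLM-as-Judge component can be pictured as a pairwise comparison harness over decompiler outputs. Below is a minimal sketch of that idea; the prompt wording, the 1-5 scoring scale, and the query_llm callable are illustrative assumptions, not the paper's actual protocol.

# A minimal sketch (not DecompileBench's actual implementation) of an
# LLM-as-Judge comparison between two decompiler outputs for the same
# binary function. `query_llm`, the prompt, and the scale are hypothetical.

JUDGE_PROMPT = """You are an expert reverse engineer. Two decompilers
produced the C code below for the same binary function. Rate each
output's readability for a human analyst on a 1-5 scale and reply
with exactly two integers, e.g. "4 2".

Output A:
{code_a}

Output B:
{code_b}
"""

def judge_readability(code_a: str, code_b: str, query_llm) -> tuple[int, int]:
    """Ask an LLM judge to score two decompiled versions of one function.

    `query_llm` is any callable mapping a prompt string to the model's
    text reply; plug in whichever model client you use.
    """
    reply = query_llm(JUDGE_PROMPT.format(code_a=code_a, code_b=code_b))
    score_a, score_b = reply.split()[:2]
    return int(score_a), int(score_b)

if __name__ == "__main__":
    # Stub judge for a dry run; replace with a real model call.
    fake_llm = lambda prompt: "4 3"
    print(judge_readability("int f(int x) { return x + 1; }",
                            "undefined4 f(undefined4 p) { return p + 1; }",
                            fake_llm))

In practice such a harness would also randomize the A/B presentation order and average over repeated queries to reduce positional and sampling bias in the judge.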
