Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Fuzzing has emerged as the most broadly used testing technique to discover bugs. Effective fuzzers rely on coverage to prioritize inputs that exercise new program areas. Edge-based code coverage of the Program Under Test (PUT) is the most commonly used coverage today. It is cheap to collect-a simple counter per basic block edge suffices. Unfortunately, edge coverage lacks context information: it exclusively records how many times each edge was executed but lacks the information necessary to trace actual paths of execution. Our new fuzzer Arvin gathers probabilistic full traces of PUT executions to construct Dynamic Control Flow Graphs (DCFGs). These DCFGs observe a richer set of program behaviors, such as the "depth" of execution, different paths to reach the same basic block, and targeting specific functions and paths. Prioritizing the most promising inputs based on these behaviors improves fuzzing effectiveness by increasing the diversity of explored basic blocks. Designing a DCFG-aware fuzzer raises a key challenge: collecting the required information needs complex instrumentation which results in performance overheads. Our prototype approximates DCFG and enables lightweight, asynchronous coordination between fuzzing processes, making DCFG-based fuzzing practical. By approximating DCFGs, Arvin is fast, resulting in at least an eight-fold increase in fuzzing speed. Because it effectively prioritizes inputs using methods like depth comparison and directed exclusion, which are unavailable to other fuzzers, it finds bugs missed by others. We compare its ability to find bugs using various Linux programs and discover 50 bugs, 23 of which are uniquely found by Arvin.