Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Program synthesis was first proposed a few decades ago, but in the last decade it has gained increased momentum in the research community. The increasing complexity of software has dictated the urgent need for improved supporting tools that verify the software's correctness, and that automatically generate code from a formal contract provided by the programmer, along with a proof of the generated code's correctness. In addition, recent technological developments have provided tools that have enabled researchers to revisit the synthesis problem. The recent rise of SMT solvers has given synthesis tools a reliable and automated way to verify synthesized programs against contracts. The introduction of counter-example guided inductive synthesis has provided researchers with a flexible synthesis algorithm that they can adapt according to their specific domain. In this dissertation, we develop new algorithms to synthesize recursive functional programs with algebraic data types from formal specifications and/or input-output examples. We manage to scale beyond the reach of other similar tools to synthesize nontrivial functional programs, with a focus on data structure transformations. First, we address the problem of precisely specifying the desired space of candidate programs, described by context free grammars (CFGs). We implement and evaluate a method for reducing the program space by describing axioms of the target language and other domain-specific restrictions on the level of the CFG, without explicitly generating and rejecting undesirable programs. We provide a method that extracts a program model from a corpus of code and that builds a probabilistic CFG from it. We showcase the usefulness, both individually and in tandem, of these methods. Second, we develop an algorithm to efficiently traverse a possibly unbounded space of candidate programs generated from a probabilistic CFG. This algorithm is an implementation of the A* best-first search algorithm on the derivation graph generated from the CFG, with a number of domain-specific optimizations. We evaluate the efficiency of the algorithm as well as the effectiveness of the optimizations. Finally, we describe a program repair framework that locates and fixes bugs in erroneous functional programs. Our novel fault localization technique detects erroneous snippets with concrete execution and eliminates false positives by analyzing dependencies between execution traces. After the erroneous code snippet is discovered, a modified version of our synthesis algorithm generates fixes for it by introducing modifications to the original erroneous code.
Felix Schürmann, Pramod Shivaji Kumbhar, Omar Awile, Ioannis Magkanaris