Data-Driven Program Completion

Originally published on


Yanxin Lu, Swarat Chaudhuri, Chris Jermaine and David Melski


We introduce program splicing, a programming methodology that aims to automate the commonly used workflow of copying, pasting, and modifying code available online. Here, the programmer starts by writing a “draft” that mixes unfinished code, natural language comments, and correctness requirements in the form of test cases or API call sequence constraints. A program synthesizer that interacts with a large, searchable database of program snippets is used to automatically complete the draft into a program that meets the requirements. The synthesis process happens in two stages. First, the synthesizer identifies a small number of programs in the database that are relevant to the synthesis task. Next it uses an enumerative search to systematically fill the draft with expressions and statements from these relevant programs. The resulting program is returned to the programmer, who can modify it and possibly invoke additional rounds of synthesis. We present an implementation of program splicing for the Java programming language. The implementation uses a corpus of over 3.5 million procedures from an open-source software repository. Our evaluation uses the system in a suite of everyday programming tasks, and includes a comparison with a state-of-the-art competing approach as well as a user study. The results point to the broad scope and scalability of program splicing and indicate that the approach can significantly boost programmer productivity.

Contact Us

Get a personally guided tour of our solution offerings. 

Contact US