Funded PhD (3 years) in Program Analysis
Static Analysis and Refinement of Dynamic Data Structures
Most programming languages offer Application Programming Interfaces (API) providing ready-to-use abstract data structures (e.g. sets, maps, stacks, queues, etc.).These API’s, are known under the name of containers or collections, and provide off-the-shelf libraries of high level operations, such as insertion, deletion and membership checks. Container libraries are indeed available in almost all languages. These libraries give software developpers a way of abstracting from low-level implementation details related to memory management, such as dynamic allocation, deletion and pointer handling. However, the implementations of these libraries use optimized low-level datastructures and algorithms for heap memory management (e.g. skip lists, red-back trees, or overlaid linked lists). To ensure correctness of software systems manipulating complexdata structures, two essential issues must be considered: (1) the correctness of programs implementing applications, assuming high level specifications of the methods provided by the external container libraries they use, and (2) the correctness of the implementations of such libraries with respect to their abstract specifications, i.e. the fact that a specific implementation of containers ensures the expected abstract behaviour. The main aim of the PhD topics is to use static analysis techniques to show the point (2) above.
Indeed, recent works 1 in static analysis and more specifically abstract interpretation techniques provide means to build abstract interpreters which are able to soundly infer properties in a large class of logics (Presburger 1, first order logic over words of integers 2, separation logic 2, tree-valued logic 1, etc.). A crucial point in this approach is the definition of suitable logic-based specification formalisms which has also an efficient symbolic representation. Indeed, such a logic is used both to specify and to reason about sets of program states (i.e. configurations of its memory) and relations between these states. The logics studied must be investigated from the point of view ofexpressive power, decidability, and complexity, by considering several key problems that constitute the basic ingredients of any automated verification approach, such as encoding of the program statements, satisfiability, and entailment checking. In general, specifying complex memory configurations requires the use of expressive formalisms (for describing shapeand data constraints), based for instance on Separation Logic with Recursive Definitions (SL+RD) or First-Order Logic with Transitive Closure (FOL+TC), for which the decidabilityof the problems mentioned above does not hold in general. The challenge is therefore to identify logics offering good trade-offs between expressiveness, decidability and complexity, and allowing to deal with significant and practically relevant classes of programs and specifications.
for any kind of specification logic The approach to be studied is
1 T. Reps and A. Thakur, Automating Abstract Interpretation, VMCAI'16
2 A. Bouajjani, C. Dragoi, C. Enea, M. Sighireanu, On Inter-Procedural Analysis of Programs with Lists and Data, PLDI'11.
Students interested in this topic and in the position should send a CV and a letter of motivation with detailed list of competencies to Mihaela Sighireanu before June 15th, 2016.
The requested background is in theoretical computer science (logics, automata, algorithmic and complexity). Students with basic knowledge in verification, static analysis and compilation are strongly encouraged to apply. Good knowledge of at least one programming language is an asset.