University of Leicester

computer science

PhD Position supported by Microsoft Research Fellowship

A PhD studentship is available to work on:

Usage of Static Analysis for Model Checking Liveness Properties of Heap Manipulating Programs

Supervisor: Dr Nir Piterman
Microsoft Supervisor: Dr Josh Berdine

One of the major challenges facing the successful application of model checking in software is the ability to analyze programs that use dynamic memory allocation and sophisticated data structures.
This project aims to investigate the advantages of considering more expressive versions of Kripke structures (such as modal transition systems and disjunctive modal transition systems) as an abstraction domain for shape analysis.
The project has two facets:

  • In the context of static analysis, study the principles for extending the functionality of static analysis tools (such as TVLA) with this ability and the implications regarding preservation of branching-time temporal-logic properties and their automatic extraction.
  • In the context of model checking, study the additional opportunities that are afforded by the augmentation of abstract heap information with fairness information in order to enable model checking of general temporal logic specifications that rely on boundedness and finiteness of heaps.

Package

Stipend of 13,590₤ per annum and Home/EU fee waiver is available for three years.
Additional funding may be available for six more months.
Please note, if you are a national of non-EU country you will be responsible for any monetary difference between International and Home/EU fees.

For the duration of the PhD the student will receive from MSR a laptop, which will be awarded to the student upon successful completion of the PhD.

Students under this scheme are often chosen for an internship of three months in Microsoft Research. During an internship students work with Microsoft researchers on Microsoft projects. Interns in Microsoft are paid separately according to Microsoft standards. There is no guarantee that the student can get an internship in Microsoft.

Criteria

Applicants require a first-class Honours degree or equivalent in Computer Science or Mathematics, experience in programming and aptitude for mathematical subjects. Knowledge in Formal methods, Logic, Static Analysis, or Model Checking, is an advantage. A Masters degree is highly desirable.

Application & Further Details

Further questions can be directed to Nir Piterman.
To apply, send a CV and a cover letter explaining your interest in the post to Nir Piterman.
Screening of applicants is ongoing and will continue until the post is filled.

Further Details about Project

Background

Recent years have seen significant advances in application of software verification. Both model checking [1] and static analysis [2] have seen a huge increase in scope and capacity. This led to the release and marketing of tools such as Microsoft’s SDV for model checking [3] and Coverity’s static analysis [4]. The main challenge in application of model checking to software is in obtaining a finite state abstraction of the program to be checked. The prevalent approach calls for the usage of lazy predicate abstraction [5] using counter-example-guided abstraction refinement [6]. In the context of software verification, static analysis attempts to extract or prove invariants for software programs. Given a program annotated with pre- and post-conditions, static-analysis will try to validate these annotations. In many cases, the goal is to automatically produce annotations when these do not exist. Especially when applied to annotated programs, static analysis scales to huge programs and works extremely fast. In recent years many attempts at combining these two approaches have been made. Invariants extracted by static analyzers are used by software model checkers to improve scalability.

One of the major challenges facing both model checking and static analysis is the analysis of programs that use dynamic memory allocation and complex data structures stored on the heap. Recent years have seen a surge of interest in separation logic [7] as an abstract domain to represent facts about the heap. Here, we concentrate on an alternative approach that uses three-valued graphs to represent structures on the heap, as used by, for example, in the tool TVLA [8]. To capture structures on the heap, nodes in the graph relate to allocated memory locations, labels of nodes relate to pointers and additional instrumentation predicates, and edges relate to pointer fields of structures. As one node in the graph may represent multiple memory locations, edges and labels have three possible values: no, may, and must. That is, if all memory locations represented by a node do not have a certain property then its value in that node is no. If some but not all memory locations represented by a node have a property then its value is may. If all memory locations represented by a node have a property then its value is must. Nodes that cannot be distinguished by their labels and edges are united, giving rise to finite-state abstraction. There is a standard collection of “important” instrumentation predicates; however, coming up with the appropriate instrumentation predicates for a given program is somewhat of an art.

The goal of this project is to consider combinations of shape analysis and model checking. After many years of attempting to solve program verification separately, in recent years the fields have been converging to take advantage of the relative merits of each approach. Software verification tools now combine model checking, static analysis, and theorem proving in an attempt to give a unified solution to as many programs as possible. We mention two interesting previous work in this context. First, the work in [9] considers combination of TVLA and the software model checker Blast. The main contribution is in combining lazy predicate abstraction with lazy shape analysis. How to increase the precision of the shape analysis based on counter examples and problems encountered in the application of model checking. This work was extended in [10-12] to consider various ways of tuning the combination of precision of each of the components and improving the feedback between them. This approach has been restricted to reachability analysis, which is effectively equivalent to safety checking. Another instance of combination of shape analysis and model checking is the work on proving termination reported in [13]. In this work, symbolic bounds on the size of structures in the heap and changes to their size are used as ranking functions that show program termination. The advantage is that for the termination proof all that is needed is actually the bounds and there is no need to retain the information about the structures on the heap themselves.

Research Aims

Our project is motivated by studies done in the model checking community related to completeness of abstraction. Consider an infinite state system M and a property φ in some temporal logic. Soundness of an abstraction system ≼ means that the existence of an abstraction A≼M such that A⊢φ implies M⊢φ. Completeness means that if M⊢φ then there exists a finite abstraction A≼M such that A ⊢φ. Obviously, when trying to find an abstraction A that will prove that M⊢φ it is best to search in a complete abstraction framework. When considering linear-time temporal logic, Kripke structures augmented with fairness are a complete abstraction framework [14]. For branching-time temporal logic the problem is more complex. Researchers suggested more expressive notions of Kripke structures that include may and must transitions as well as fairness and hyper transitions [15]. They show that these more expressive structures are complete for model-checking branching-time temporal logic. That is, for every infinite state structure (and in particular, an infinite family of finite state structures) and properties expressed in the μ- calculus there exists a finite state abstraction that preservers these properties. The μ-calculus is an expressive branching time logic (subsuming CTL and CTL*) that can express reachability, least and greatest fixpoints, and regular properties

This project aims to investigate the advantages of considering the more expressive versions of Kripke structures as an abstraction domain for shape analysis. The project has two facets:

  1. In the context of static analysis; study the principles for extending the functionality of static analysis tools such as TVLA with this ability and the implications regarding preservation of branching-time temporal-logic properties and their automatic extraction.
  2. In the context of model checking, study the additional opportunities afforded by the augmentation of abstract heap information with fairness information in order to enable model checking of general temporal logic specifications that rely on boundedness and finiteness of heaps.

Research Directions

In the context of abstract interpretation and shape analysis, we will study the extra expressive power that is afforded by using a more expressive abstraction domain. The clear advantages of the more expressive abstraction domain is that some of the existing need for specifying instrumentation predicates is removed. Consider, for example, the representation of a linked list in Figure 1a. Even if we agree on the convention that it only represents finite structures, it may still represent a list that ends in a cyclic section. The current solution is to add instrumentation predicates that explicitly say that the list is acyclic and thus the path starting at pointer x is finite. Consider the representation in Figure 1b that includes a must hyper-edge. That is, every location on the heap represented by the second node has an n field that either points to a similar node or to null. If we, in addition, mark the self loop as unfair, the representation itself includes the fact that it cannot represent a cyclic list and that the path starting at pointer x is finite. If, as in Figure 1c, we also include an unfair inverse of n, we get the information that all the heap locations represented by that node are reachable from x. When this idea is applied to a cyclic list, in Figure 1d, the representation forces the unique interpretation of a cyclic linked list. The main advantage, however, is made clear in Figure 1e, where we have a short-cut edge nn going past some part of the list. Here, the extra expressive power includes the information that the two different sets of edges reach the same locations on the heap (see [16] for a recent attempt to capture the same information by including in the abstraction equivalence and subset relations between abstract edges). The research questions that we intend to solve as part of this project include the following. Obviously, in order to create a usable static analysis framework, we will study how to automatically construct such abstract representations from code and what are the right abstract transformers that match this more expressive formalism. We will research how to automatically extract the additional information from code and, in particular, the fairness information. Furthermore, we will study how to automatically refine heap abstraction by adding μ-calculus instrumentation predicates that arise from automatic analysis of spurious heap structures (similar to counter-example-guided abstraction refinement). This will provide a very flexible abstraction framework that increases accuracy according to the program. Finally, on a more theoretical level, we expect backward modalities and nominals in the logic to become important to express features of structures on the heap. We will research notions of completeness and preservation of properties for logics that include these features and notions of bisimulation that are tailored to the usage of expressive 3-valued structures in static analysis.

In the context of model checking the inclusion of fairness in the abstraction of the heap puts the information on finiteness of the heap in terms that are very convenient to model checking. Model checking algorithms that relate naturally to fairness are the bread and butter of the model checking community. Furthermore, a large body of previous research exists on applications of fair model checking and how to apply fair model checking to infinite state systems in general. In contrast, current approaches in shape analysis include the information about finiteness and reachability in instrumentation predicates. It is not at all clear how to make model checkers aware of instrumentation predicates and how to combine them in model checking algorithms. The research questions that we intend to solve as part of this project include the following. We will study how to connect fairness appearing in the shape abstraction and fairness of transitions in the abstract representation of the program. For example, an assignment x=x->next() in a linked list should relate to traversal of a fair/unfair edge in the shape abstraction. Further questions we intend to tackle relate to the combination of information from multiple sources (predicate and shape abstraction) within one model checking framework. Although this question is related to the research into lazy shape analysis [9] we believe that it will need adjustments for the more expressive framework of shapes. In addition, the most successful approach towards model-checking liveness properties uses transition invariants [17, 18] and we will study how integrate our new results within this approach. Finally, we intend to study hybrid temporal logics that combine branching-time logic predicates as atomic parts of a second level temporal logic that characterizes the evolvement of the program state.

Methods and Outcomes

The outcomes of this project will be new methods and insights that will improve future tools for verifying programs with dynamic heaps. In order to empirically evaluate these new techniques, we intend to rely on the extant infrastructure produced for the combination of the model checker Blast and the shape analysis tool TVLA. We aim to extend their capabilities both for the purposes of more accurate shape analysis and combination of shape analysis and model checking for liveness properties. As test cases, we will use C programs or programs written in other programming languages (such as device drivers).

This project offers much scope for choosing the precise focus according to the PhD student’s background, interest, and expertise. We will deliver theoretical research, such as rigorous formulation of transfer functions, abstraction and concretization functions that match our abstraction domain, refinement of the 3-valued transition systems used for this kind of shape analysis, and completeness results for novel logics and model variants. Our practical software implementation will be distributed according to the licensing restrictions of the software used (e.g., Blast is distributed under Apache license). Our work will be published in conferences related to static analysis and model checking such as CAV, TACAS, VMCAI, POPL, or PLDI.

References

  1. Clarke, E.M., O. Grumberg, and D. Peled, Model Checking. 1999: MIT press.
  2. Cousot, P. and R. Cousot, Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, in POPL. 1977, ACM.
  3. Ball, T., et al., Thorough Static Analysis of Device Drivers, in EuroSys2006 2006.
  4. Bessey, A., et al., A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM, 2010. 53(2): p. 66-75.
  5. Henzinger, T.A., et al., Lazy abstraction. SIGPLAN Not., 2002. 37(1): p. 58-70.
  6. Clarke, E., et al., Counterexample-guided abstraction refinement for symbolic model checking. J. ACM, 2003. 50(5): p. 752-794.
  7. Reynolds, J.C., Separation Logic: A Logic for Shared Mutable Data Structures, in LICS. 2002, IEEE Computer Society.
  8. Lev-Ami, T. and S. Sagiv, TVLA: A System for Implementing Static Analyses, in SAS. 2000, Springer-Verlag.
  9. Beyer, D., T.A. Henzinger, and G. Theoduloz, Lazy Shape Analysis, in CAV. 2006, Springer-Verlag. p. 532-546.
  10. Beyer, D., T.A. Henzinger, and G. Theoduloz, Configurable software verification: concretizing the convergence of model checking and program analysis, in CAV. 2007, Springer-Verlag.
  11. Beyer, D., et al., Path invariants, in PLDI. 2007, ACM.
  12. Beyer, D., et al., Shape Refinement through Explicit Heap Analysis, in ASE. 2010, Springer-Verlag. p. 263-277.
  13. Berdine, J., et al., Automatic Termination Proofs for Programs with Shape-Shifting Heaps, in CAV. 2006, Springer-Verlag. p. 386-400.
  14. Kesten, Y. and A. Pnueli, Verification by augmented finitary abstraction. Inf. Comput., 2000. 163(1): p. 203-243.
  15. Dams, D. and K.S. Namjoshi, The Existence of Finite Abstractions for Branching Time Model Checking, in LICS. 2004, IEEE Computer Society.
  16. Marron, M., et al., Shape Analysis with Reference Set Relations, in VMCAI. 2010, Springer-Verlag. p. 247-262.
  17. Cook, B., A. Podelski, and A. Rybalchenko, Termination proofs for systems code, in PLDI. 2006, ACM: Ottawa, Ontario, Canada.
  18. Podelski, A. and A. Rybalchenko, Transition predicate abstraction and fair termination. ACM Trans. Program. Lang. Syst., 2007. 29(3): p. 15.




















































































































| [University Home]|[Faculty of Science]|[MCS Home]|[CS Home]||[University Index A-Z]|[University Search]|[University Help]|

Author: Nir Piterman .
© University of Leicester June 2011 . Last modified: 14th December 2011, 15:11:31.
CS Web Maintainer. This document has been approved by the Head of Department.