Programming Analysis & Logic Programming: Projects

Program Analysis & Logic Programming

Current Projects

A Library of Polyhedra

In collaboration with Roberto Bagnara and Enea Zaffanella, this project concerns the ongoing development of the Parma Polyhedra Library (PPL). This is a modern C++ library for the manipulation of convex polyhedra and other numerical abstractions. This library is: user friendly; fully dynamic; portable; exception-safe; efficient; thoroughly documented; and free software distributed under the terms of the GNU General Public License.

Previous Projects

Product Domains for Software Analysis

This is project concerned the design and implementation in the PPL of a generic product domain that is parameterised not only on the component domains but also on the reduction that is performed between the domains. One interesting combination is when one of the component domains is a polyhedron domain (such as the closed convex polyhedron and octagon domains) and the other is a Grid domain (see below). We now support several light-weight reduction methods such as that of sharing equalities as these are not only very efficient but also are safe to use with standard polyhedral widenings. Further research is needed to find more precise reduction methods but with good performance cost and for which suitable widenings can be constructed.

Grids for Software Analysis

This project, partly funded by EPSRC, concerned the design and implementation in the PPL of the grid domain; this is a domain for use in capturing information about the possible distributions of the numerical values in a program.

Analyzing the Natural Semantics of Programs

The research focused on a generic imperative language for significant fragments of imperative languages like C and C++. After formally defining the language and its natural semantics, we have specified a generic abstract semantics for the system that can be combined with a wide range of abstract domains. We are now using this specification to build the ECLAIR prototype which incorporates a highly generic static analyzer with several instantiations for very different kinds of applications.

Analyzing Object-Oriented Languages

The research which was in collaboration with Fausto Spoto focused on analyzing interesting properties of object-oriented languages, particularly Java.

Numerical Stability of software

This project targetted improved methods, based on abstract interpretation, for detecting numerical instability in software caused by the approximate binary or floating point representations of numbers.

Finding and Verifying the Absence of Bugs in Imperative Programs

was funded as a Royal Society International Outgoing Short Visit 2007/R4 and supported a short visit by Dr Hill to Parma. The project aims were to develop improved methods, based on abstract interpretation, for detecting numerical instability in software caused by the approximate binary or floating point representations of numbers.

Geometric Abstractions for Scalable Program Analyzers

was a recent 1 year project (2008/9, EP/G025177/1) funded by EPSRC (UK).

Summary

It is widely acknowledged that relatively small defects in software can have a substantial cost both for producers and consumers. For example, system vulnerabilities are frequently introduced by programming mistakes such as allowing out of bounds accesses to buffers, overflows in operations on native integers and other errors related to memory management. Of course, there can be other causes, such as system design flaws, but finding and certifying the absence of the low-level bugs is an important prerequisite to building secure and reliable software. The approach we use to detect and locate programming errors or certify the absence of such bugs is that of static analysis; that is, the determination of correct though approximate information about the program's values at each program step. Static analysis has its roots in compiler optimization where the analysis time has to be kept very low while the properties of interest are fixed with respect to the compiler. More recently program analyzers have been developed for program verification; however these also consider a fixed set of possible run-time errors and aim for a scalability and performance that enables them to tackle very large programs.

Static analysis uses abstract domains for representing information that needs to be collected. Thus these domains have to provide a convenient but approximate representation of the accumulated information during the abstract evaluation of a program. Observe that the abstract domain component of a static analyzer has to include, not only a computer representation of the logical properties of interest, but also the operations needed to extract this information from the program's components, primitives for propagating this information forward and/or backward within the program, and operators for accelerating the analysis process and ensuring loop iterations actually terminate.

Since, many program properties of interest are intrinsically numeric, there has been a considerable amount of research on how this kind of information can be represented efficiently and precisely by means of geometric domains. The problem being to get the "right" efficiency/precision trade-off, which is difficult since this is clearly dependent on the application. Thus many geometric domains have been proposed and researched, the majority being defined by linear (i.e., planar) bounds such as polyhedra; octagons; boxes, also known as intervals; and grids, simple forms of which are also called lattices. Such a range is needed since domains such as polyhedra, although very precise, have high complexity and exponential space requirements (relative to the number of dimensions) while simpler domains such as octagons and grids are polynomial and the non-relational domain of boxes has linear complexity.

Solving this scalability problem is the main motivation for this project; here we will research new techniques for building compound geometric domains that can be constructed from several atomic ones such as those discussed above. In order to allow for varying the efficiency/precision trade-off, not only will it be parametrized on the component domains but it will also have a highly adjustable strategy for varying the kind and amount of communication between them. Thus a successful project will provide bespoke domains that are tailored for the application, allowing for both the type of property being verified and the size and complexity of the software being analyzed.