Straight ROBDDS are not the Best for Pos
Available: PDF, 300 DPI, 600 DPI, and 1200 DPI PostScript, DVI, BibTeX entry.
The subject of groundness analysis for (constraint) logic programs has been widely studied, and interesting domains have been proposed. Pos has been recognized has the most suitable domain for capturing the kind of dependencies arising in groundness analysis. Its (by now standard) implementation is directly based on reduced ordered binary-decision diagrams (ROBDDs), a well-known symbolic representation for Boolean functions . Even though several authors have reported positive experiences using ROBDDs for groundness analysis, in the literature there is no reference to the problem of the efficient detection of those variable which are deemed to be ground (or true, at the abstract level) in the context of a ROBDD. This is not surprising, since most currently implemented analyzers need to derive this information only at the end of the analysis and only for presentation purposes. Things are much different when this information is required during the analysis. This need arises when dealing with languages which employ some sort of delay mechanism, which are typically based on groundness conditions. In these cases, the naïf approaches are too inefficient, since the abstract interpreter must quickly (and often) decide whether a constraint is delayed or not. Fast access to ground variables is also necessary when aliasing analysis is performed using a domain not keeping track of ground dependencies.
We have shown that, by using an hybrid representation instead of the standard (purely ROBDD-based) one, it is possible to have constant-time, fast access to ground variables. Moreover, the new representation improves (not make worse, as it can be expected) the overall efficiency (on both time and space) of groundness analysis. Due to extreme space limitations we just give a rough idea of how the hybrid representation and its operations look like. We also report on the experimental results we obtained with the CHINA analyzer. The interested reader is referred to  for a complete account of this work.
The observation of many constraint logic programs shows that the percentage of variables which are found to be ground during the analysis, for typical invocations, is as high as 80%. This suggests that representing Pos elements simply by means of ROBDDs, as in [1,8], is not the best thing we can do (whence the title).
Here we propose an hybrid implementation where each Pos element is represented by a pair: the first component is the set of true variables (just as in the domain used in early groundness analyzers [7,6]); the second component is a ROBDD. In each element of this new representation there is no redundancy: the ROBDD component does not contain any information about true variables. This solution uses the more efficient representation for each kind of information: ``surely ground variables'' are best represented by means of sets (bit-vectors, at the implementation level), whereas ROBDDs are used only for ``conditional'' and ``disjunctive'' information.
The hybrid representation has two major advantages: (a) it gives fast, constant-time access to true variables; and, (b) allows for keeping the ROBDDs small, during the analysis, when many variables come out to be true, as it is often the case. Notice that, while having many true variables, in a straight ROBDD implementation, means that the final ROBDDs will be very similar to linear chain of nodes, the intermediate steps still require the creation (and disposal) of complex (and costly) ROBDDs. This phenomenon is avoided as much as possible in the hybrid representation. In fact, the operations on the hybrid representation make use of ROBDD's operations only when strictly necessary. When this happens, expensive operations like logical conjunction and disjunction are invoked with operands of the smallest possible size. In particular, we exploit the fact that the restriction operation (also called valuation or co-factoring) is relatively cheap. However, we cannot avoid searching for true variables. In programs where many variables are ground the ROBDDs generated will be kept small, and so also the cost of searching will be diminished.
The ideas presented in this paper have been experimentally validated in the context of the development of the CHINA analyzer . CHINA is a data-flow analyzer for CLP(H, N) languages (i.e. Prolog, CLP(R), clp(FD) and so forth) written in Prolog and C++. It performs bottom-up analysis deriving information on both call and success patterns by means of program transformations and optimized fixpoint computation techniques. The assessment of the hybrid domain has been done in a quite radical way. In fact, we have compared the standard, pure BDD-based implementation of Pos against the hybrid domain on the following problem: deriving, once for each clause's evaluation, a boolean vector indicating which variables are ground and which are not. This is a very minimal demand for each analysis requiring the knowledge about definitely ground variable during the analysis. We have thus performed the analysis of a number of programs on a domain similar to Pat(Pos) , switching off all the other domains currently supported by CHINA1. Pat(R) is a generic structural domain which is parametric with respect to any abstract domain R. Roughly speaking, Pat(R) associates to each variable the following informaton: (1) a pattern, that is to say, the principal functor and subterms which are bound to the variable; and (2) the ``properties'' of the variable, which are delegated to the R domain (the two implementations of Pos, in our case). As reported in , Pat(Pos) is a very precise domain for groundness analysis.
The experimental results are reported in Table 1. The table gives, for each program, the analysis times and the number of ROBDD nodes allocated for the standard implementation (STD) and the hybrid one (HYB), respectively. It also shows the ratio STD/HYB for the above mentioned figures (S/H). The computation times have been taken on a 80486DX4 machine running Linux 1.3.64. The tested programs have become standard for the evaluation of data-flow analyzers. The results indicate that the hybrid implementation outperforms the standard one in both time and space efficiency. The systematic speed-up obtained was not expected. Indeed, we were prepared to content ourselves with a moderate slow-down which would have been recovered thanks to the fast access to ground variables. The space figures show that we have achieved significant (and sometimes huge) savings in the number of allocated ROBDD nodes. With the hybrid domain we are thus able to keep the ROBDDs which are created and traversed during the analysis as small as possible. This phenomenon is responsible for the speed-up. It seems that, even for programs characterized by not-so-many ground variables, there are always enough ground variables to make the hybrid implementation competitive. This can be observed, for instance, in the case of the Peep program, which was analyzed with a non-ground, most-general input pattern. The following observations are important for a full understanding of Table 1:
In conclusion, the experimental results indicate that the hybrid domain outperforms, from any point of view, the standard implementation based on ROBDDS only. Surprisingly enough, we have thus been able to assess the superiority of the hybrid domain even for those cases where fast access to ground variables is not important.
Available: PDF, 300 DPI, 600 DPI, and 1200 DPI PostScript, DVI, BibTeX entry.
[Page last updated on February 19, 2001, 03:43:06.]
Home | Personal | Papers | Teaching | Links