Why are you listing the results for both goal-independent
and goal-dependent analysis?
Isn't goal-dependent/goal-independent analysis
The Right Thing ®?
No.
Neither goal-dependent analysis nor goal-independent analysis
is The Right Thing To Do.
Goal-independent analyses are unavoidable: if the analysis encounters
a goal like call(G)
such that the principal functor
and/or the arity of G
are unknown,
a goal-dependent analysis is either unsound
or it effectively becomes goal-independent,
since it must assume a ``don't know'' call pattern
for every procedure.
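To make the point concrete, here is a minimal sketch
(main/0 and its use of read/1 are made up for illustration;
they are not taken from any benchmark):

    % Goal comes from the outside world, so its principal functor
    % and arity are unknown at analysis time: upon reaching
    % call(Goal), the analysis must assume a ``don't know'' call
    % pattern for every procedure of the program.
    main :-
        read(Goal),
        call(Goal).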
This is one of the reasons why focusing only on goal-dependent
analyses is, in our opinion, a mistake.
The other reason is that the ability to analyze libraries once
and for all is desirable and, more generally, so is the separate analysis
of different program modules, especially in very large projects.
On the other hand,
focusing only on goal-independent analyses is the opposite mistake:
goal-dependent analyses, when possible, are usually more precise than
goal-independent ones.
For these reasons, we insist on presenting experimental results for both.
Why do fewer benchmarks appear in the tables about
goal-dependent analyses?
For several programs in the benchmark suite,
goal-dependent analysis is pointless
(see the answer to the previous question).
Why do you give analysis results also for programs where CHINA
signals a possible incorrectness?
When CHINA detects the possibility of delivering unsound results
it can be instructed to do one of two things:
-
emit a ``don't know'' description for the entire program,
thus avoiding the incorrectness;
-
issue a warning and continue as if nothing happened.
This situation may arise because the program is self-modifying
in an unpredictable way (as far as the analyzer is concerned).
This happens, for example,
when the program calls goals like assert(C)
where the principal functor of C
is unknown.
This problem can be mitigated when we can assume the program
was written for a system that traps the assertion of static
predicates (i.e., those not explicitly declared as dynamic).
However, this is often unclear and, moreover, several Prolog systems
have changed their behavior in this respect when passing from one release
to another: which release were the authors referring to when
they wrote the program we use as a benchmark?
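To illustrate the point (a made-up sketch, not taken from any
benchmark: counter/1 and bump/0 are invented names), ISO semantics
prescribes that asserting a clause for a procedure that is defined
in the program but not declared dynamic raises a permission error,
whereas several systems silently accept it:

    :- dynamic counter/1.   % Without this declaration, an
                            % ISO-conforming system raises
                            % permission_error(modify,
                            % static_procedure, counter/1)
                            % on the assert below.

    counter(0).

    bump :-
        retract(counter(N)),
        N1 is N + 1,
        assert(counter(N1)).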
The other reason for possible incorrectness arises in programs
using setarg/3, which is currently not supported by CHINA.
We will certainly implement support for setarg/3,
but this work has very low priority in our task list, especially
because only a couple of benchmarks use it.
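For the record, setarg/3 (available, e.g., in SICStus Prolog and
SWI-Prolog) destructively replaces an argument of a compound term,
and it is precisely this destructive update that a sound analysis
must model:

    ?- T = f(a, b), setarg(1, T, c).
    T = f(c, b).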
For the experimental part of our work we instruct CHINA to ignore
these causes of potential incorrectness.
We do this because our supply of benchmark programs is scarce,
so we are not in a position to throw away any.
Even if the final analysis results may be incorrect with respect
to the intended concrete semantics
(something that, as explained above, we often cannot even pin down),
these programs allow us to exercise the domains and analysis techniques
anyway.
The analysis results always take the form of an implication like
``if predicate p/n succeeds, then [...]''.
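For instance, a groundness analysis might infer something like the
following for the usual append/3 (a made-up illustration, not
actual CHINA output):

    if append(A, B, C) succeeds,
    then C is ground if and only if both A and B are ground.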
You may think of the analysis results of ``nasty'' programs
as implications where further conditions have been added
in the antecedent.
Things like
`if the system traps any attempt
to assert a non-dynamic predicate'
or
`if setarg/3
is only used
to replace ground terms with ground terms'
and so forth.
Several programs could also be modified so as to avoid the problem,
e.g., by using several procedures like
    my_assert_p(X) :-
        [...]
        assert(p(X)),
        [...]
instead of a single procedure
    my_assert(C) :-
        [...]
        assert(C),
        [...]
Another possibility, which is the one we are pursuing, is to increase
the ability of CHINA to discover, say, which procedures may be
dynamically modified by assert/1 and its relatives.
The bottom line is: don't worry.
We are not shipping a production tool yet.
When we do, we will make sure CHINA issues a plain
``don't know'' for those cases it cannot handle correctly.
Why do the numbers differ from the ones
I have seen in your paper XYZ?
There are several reasons:
The analysis times we report in order to assess the (relative)
efficiency of analysis methods are subject to even more
factors, in addition to the ones mentioned above:
-
The machines, operating systems, and compilers we use
for the development and gathering of experimental
data change.
We usually indicate the CPU employed, the frequency at which
it is clocked, the amount of physical memory installed,
and the version of the operating system.
A handful of other important parameters are omitted, however:
memory speed, amount of cache installed, bus frequency,
BIOS and OS parameters, compiler version, optimization
options used in the compilation, libraries used and so forth.
-
The compiler has bugs.
Sometimes we have to change the usual compilation switches
in order to get around a particular bug in the optimizer.
That's life.
Why do the numbers you give for the XYZ analysis of a program
differ from the ones given by A. N. Author
in ``His Paper''?
First of all, see if the papers in question report on the methods
used to obtain the numbers.
A discrepancy in this respect is certainly one possible explanation.
We may then use different analysis techniques,
different widenings and/or widening strategies.
That stated, bugs may and do bite us as well as A. N. Author.
If the differences in the numbers concerning the precision of the
analysis are substantial, then I am most interested in knowing about it.
Should you spot one of these cases,
please let me know,
especially if our numbers are worse ;)