On the quality of available Prolog implementations

Roberto Bagnara
Dipartimento di Matematica e Informatica
Università di Parma
Parco Area delle Scienze 53/A
I-43124 Parma
Italy

I am grateful to the editor, Pat Hill, who gave me the opportunity to read a preliminary version of Paulo Moura's interesting article. Reading it had the effect of reviving my feelings of dissatisfaction with all the Prolog implementations I have tried. But before continuing, and to avoid misunderstandings, I feel compelled to make my reasons for contributing this article clear with the following disclaimer:

I am a definite supporter of logic programming in general and of Prolog in particular. It is precisely because of this that I am extremely distressed by what are, I believe, serious deficiencies in current Prolog implementations. By exposing some of these deficiencies, I hope this note might actually help improve things.
I would like to see our community be able to stand up to other language communities. However, I feel that we would probably succumb in any serious discussion about the relative advantages of our programming environments (which include languages, implementations, and tools) with respect to others. And this even in those application fields that make Prolog an excellent candidate. How sad. We must do something about it.

One of the discoveries I made thanks to Paulo's article is that there are at least 22 Prolog compilers around. This is an amazing number, even though some of these systems are test beds for academic research. To me this sounds like an indication that the Prolog community suffers from an excessive degree of fragmentation, which, in my opinion, is not a good thing. Given the differences in the languages actually implemented by those systems, as pointed out by Paulo, one of the results is that we have many different communities that cannot share code easily. I also suspect that a significant number of these communities are really small, if not outright marginal.

What are the reasons for this fragmentation? Lack of standardization has certainly played an important role. Weakness of standardization is also an issue: when a standard leaves too many things unspecified or "implementation-defined", it opens the door to the birth of several incompatible systems. Another phenomenon I have observed quite frequently in our field is the rush with which implementations go commercial. No matter how immature and buggy a new Prolog implementation might be, chances are that the developers will try to sell it. Or they will not distribute the source code. Or both. Usually this results in yet another incompatible Prolog system with its own bells and whistles, with only a few users, and with a lifetime of one to two years. Whether the developers actually make any significant money out of it is something I do not know, but I doubt it. I also doubt that going commercial makes any positive contribution to quality. Forbidding access to the sources certainly does not help. Maybe the lack of standardization and the rush to go commercial constitute a vicious circle: in the absence of a strong standard, I could always come up with my own Prolog system offering a peculiar feature that nobody else offers (without caring much about being compatible on the standard features). To profit from my commercial implementation, I could still use the old, short-sighted, and wrong strategy of placing obstacles in the way of an emerging standard, since the standard could make me lose part of my little user base. The technique of tying users to a product by means of incompatibility is a really old one. Other communities learned how to avoid this trap a long time ago.

I don't even know the names of all the 22 Prolog implementations mentioned by Paulo. I started using Prolog in 1984 with C-Prolog, and since then I have probably tried a dozen or so implementations. I must say that I am dismayed at how little things have changed since 1984. While the operating systems and, say, the C development environments I was using at that time were pathetically limited compared to what software technology is able to offer today, the C-Prolog of 1984 could hold its own among the Prolog systems available in 1999. It came with the usual interactive debugger and a couple of standard tools, like XREF, that came in handy from time to time. A very poor development environment indeed, but not that far from the kind of environment you get with most of today's Prolog systems. Let us consider a trivial example: command-line editing and history. It is only recently that some Prolog systems (like SWI-Prolog and B-Prolog) have started to offer command-line editing and history. Most systems (including SICStus Prolog, which is considered by many to be one of the best systems available) do not offer this facility: if you make a typing mistake, you have to retype everything; if you want to issue a similar query again, you have to retype everything. Isn't that striking in 1999?

In my opinion, the most important kinds of deficiencies in today's Prolog implementations are:

  1. bad documentation;
  2. bad diagnostics;
  3. very old and inadequate implementation technology.
Let us start with the problem of documentation. I am particularly sensitive to this problem because I work on program analysis, and in order to develop a correct analyzer I need a correct and precise description of the language being analyzed. This turned out to be very difficult, if not impossible, to obtain. A few months ago I posted a request for clarifications about the SICStus module system (reading the relevant chapter of the manual should suffice for anyone to understand why I was in need of explanations). I received several answers (including, several weeks later, the one from the SICStus officials). All the answers dealt with the problem of how to use or take advantage of the module system. None was an answer to my problem, that is, understanding the exact semantics of the system. A colleague who is in the field of Prolog programming and Prolog implementation tried to answer my questions and said: "I see your problem - I have always avoided trying to know the exact semantics, because I suspected it to be changing every release. Very unfortunate of course."
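
To give the flavor of the questions I could not get answered, here is a deliberately simplified example of my own (it does not reproduce the actual posting):

:- module(m, [apply_twice/1]).

% When apply_twice/1 is called from another module that also defines
% step/0, is the goal resolved in m or in the caller?  Does the answer
% change in the presence of a meta_predicate declaration?  These are
% exactly the kinds of questions a precise semantics must settle.
apply_twice(Goal) :- call(Goal), call(Goal).

step :- write(step_in_m), nl.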

With all the work we have done in the field of semantics for logic programming (several hundred papers) it is very sad to find ourselves in such a state. I mean, we are able to give a formal specification for a module system or for a built-in, aren't we?

What we have seen above is just an example of ambiguous documentation. However, it is quite often the case that seemingly unambiguous documentation does not match the system at hand. Take your favorite Prolog implementation and, after reading the documentation, try the following queries:

  1. ":- [a|b]."
  2. ":- [[[a|b],c]|d]."
  3. Use the above lists as the list of operator names in op/3.
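For calibration, here is what I believe a strictly conforming system should do with the third test (the operator names and the error-reporting directive below are mine; my reading of the ISO standard may of course be checked against your system's documentation):

% op/3 takes, as its third argument, an atom or a proper list of atoms.
% A legitimate use:
:- op(700, xfx, [resembles, differs_from]).

% An improper list such as [a|b] is neither an atom, nor a list, nor a
% partial list, so, as I read the ISO standard, a conforming system
% should raise type_error(list, [a|b]) rather than silently doing
% something else:
:- catch(op(700, xfx, [a|b]),
         error(E, _),
         (write(expected(E)), nl)).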
You may be surprised by the results. Very recently I encountered yet another case of a system not adhering to its own documentation. I was semi-automatically testing the (partial) correctness of my descriptions of the SICStus built-ins when the rudimentary tool I had built for that purpose told me that the description I used for portray_clause/1 was wrong. I quickly rushed to page 83 of the SICStus Prolog User's Manual (version 3.7.1), where I found:
  portray_clause(+Clause)
  portray_clause(+Stream,+Clause)

  Writes the clause Clause onto Stream exactly as listing/(0-1) would
  have written it. Same as write_term([Stream,] Term,
  [quoted(true),numbervars(true),indented(true)]) followed by a period
  and a newline, removing redundant module prefixes and binding variables
  to terms of the form '$VAR'(N) yielding friendlier variable names.
Then I went to page 5 of the manual where I verified that +Clause means that Clause "should be instantiated to a non-variable term." But this was in perfect agreement with my description of that predicate. So, as usual, I resorted to manual experimentation. Here is what I got.
SICStus 3.7.1 (Linux-2.0.34-i686): Mon Oct 05 20:02:29 CEST 1998
Licensed to prmat.math.unipr.it
| ?- portray_clause(X).
_.

true ?

yes
| ?- write_term(X,[quoted(true),numbervars(true),indented(true)]).
{INSTANTIATION ERROR: functor(_29,_30,1) - arg 2}
In words: it is not true that, for portray_clause/1 to succeed, the argument must be a non-variable term; and it is not true that portray_clause/1 is equivalent to write_term/2 with the indicated list of options. Moreover, that call of write_term/2 dies badly with an overly obscure error message. This brings us to the topic of bad or non-existent diagnostics. What we have just seen is an error in the implementation of write_term/2, which uses functor/3 without first checking the obvious applicability conditions. But this case is easy: we are evaluating a built-in with respect to an empty program. If you want another one, try
| ?- [user].
| p --> 3.14.
{TYPE ERROR: functor(_233,3.14,2) - arg 2: expected atom, found 3.14}
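Both messages leak an internal call of functor/3 to the user. Checking the applicability conditions first costs little; here is a sketch (the helper name is mine) of the checks that the standard prescribes when the first argument is unbound:

% A sketch of the checks that callers of functor/3 (or functor/3
% itself) could make before committing to construction mode.
checked_functor(Term, Name, Arity) :-
    (   nonvar(Term) ->
        functor(Term, Name, Arity)          % decomposition mode: safe
    ;   ( var(Name) ; var(Arity) ) ->
        throw(error(instantiation_error, checked_functor/3))
    ;   \+ integer(Arity) ->
        throw(error(type_error(integer, Arity), checked_functor/3))
    ;   Arity > 0, \+ atom(Name) ->
        throw(error(type_error(atom, Name), checked_functor/3))
    ;   Arity =:= 0, \+ atomic(Name) ->
        throw(error(type_error(atomic, Name), checked_functor/3))
    ;   functor(Term, Name, Arity)
    ).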
Now imagine testing a big application: dozens of source files, hundreds of predicates defined and maybe fifty calls to functor/3. One of these calls, unfortunately, goes wrong. Well, you will get a message like
{INSTANTIATION ERROR: functor(_29,_30,1) - arg 2}
and nothing else. You won't get any help in locating which call of functor/3 went wrong: neither the file nor the module, nor the predicate, nor the clause, let alone the position within the clause. In other words, run-time errors come without any indication of which clause was being executed when the erroneous situation occurred. And this even if you are executing interpreted code. This is astonishing, especially if you consider that people programming in C (or C++, or Fortran, or Java, or Pascal, or even assembly, for that matter) can download for free a programming environment with which they can know exactly which line of code in which file caused, say, a segmentation fault. And they can obtain a complete stack trace that makes it clear who called whom and with which values.
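
Nothing deep prevents better reports: the ISO error term error(Error, Context) has a second slot precisely for this kind of information. Here is a sketch of a source-level workaround (the helper name is mine) that implementations could just as well apply automatically, clause by clause:

% Run Goal; if it raises an ISO error, rethrow it with Context (for
% example, a module-qualified predicate indicator) in the second slot.
with_context(Goal, Context) :-
    catch(Goal,
          error(Error, _),
          throw(error(Error, Context))).

% E.g., with_context(functor(T, N, 1), mymodule:p/2) turns the bare
% "INSTANTIATION ERROR" into one that at least names the caller.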

I must agree with Paulo when he says that commercial providers often try to sell bugs as features. And they react badly when they are told that their documentation is full of incongruities and does not faithfully describe the actual behavior of the system. As a result, I no longer report bugs to them. I remember with pleasure when I reported a bug in the debugger of a major Prolog system: I produced a test case whereby the program behaved differently depending on whether it was executed with tracing on or off. After two or three messages in which the providers tried to divert attention from the real problem, the final answer came. It consisted of the sentence

"The debugger does not have the property you are requesting."
Well, I was simply requesting that the debugger not be intrusive, which should be number 1 in the list of features any debugger must have. Moreover, that feature was provided by free debuggers for C (where achieving this is a bit more difficult, just because anything can get access to anything) 10 years ago. Isn't that gross?

The last problem I want to address in this note is the lack of high-quality tools for Prolog programming. Let us focus on the most basic component of a programming environment: the compiler. I don't know why, but we are basically using compiler technology that is 15 years old. Not only does it seem that data-flow analysis is never used, but even classical optimizations, like common subexpression elimination and inlining, appear to be disregarded. A couple of years ago I made some tests whose outcome was: either I had to give up first-argument indexing or perform unnecessary unifications. The test case was something like

p(f(Y, Z), ...) :-
  ...,
  q(f(Y, Z), ...),   % Lack of CSE means a repeated unif. with f(Y, Z)
  ...
which could alternatively be written
p(X, ...) :-         % Lack of propagation means giving up 1st-arg ind.
  X = f(Y, Z),
  ...,
  q(X, ...),
  ...
I don't know the situation with this today (*). What I do know is that, for C, it has been ages since it was advantageous to write '++i' (translated into an INC instruction) instead of 'i = i+1' (Jurassic compilers translated the latter into a generally slower ADD instruction). In contrast, Richard O'Keefe's adage "put output unifications after the cut" still makes much sense today. So, as far as I can tell, the code quality of most Prolog compilers is not very good (which is not surprising, since they are mostly macro-based code generators rather than optimizing compilers). The other thing I tend to expect from a compiler is good diagnostics, but the detection of singleton variables is all that the Prolog compilers I know of have to offer. In other words, I will know only at run time whether I have misspelled the name of a non-dynamic predicate. It is lucky that I no longer develop critical applications.
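
To illustrate the adage with a minimal example of my own (not O'Keefe's): binding the output argument only after the cut keeps the predicate steadfast, that is, correct even when it is called with the output argument already instantiated.

max(X, Y, Z) :- X >= Y, !, Z = X.   % output unification after the cut
max(_, Y, Y).

% Compare with binding the output in the head, before the cut: this
% version is not steadfast, since bad_max(5, 3, 3) succeeds (the first
% head fails to unify, so the second clause applies).
bad_max(X, Y, X) :- X >= Y, !.
bad_max(_, Y, Y).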

I could write a book on this subject, but I will stop here. The intended audience of these notes knows very well what I am talking about and can guess with good precision what could be added. I am also worried that what I have written may hurt somebody. I am upset by the absurd situation in which we have put ourselves. I have observed the evolution of Prolog and Prolog environments for 15 years now. In silence. And this is my fault. Now I feel it is time to look at the situation lucidly and without pity. And the situation is that advocacy of Prolog programming is no longer possible, at least for me. Arguments that used to work are no longer valid. Try saying to a C++ programmer "with Prolog you don't have to worry about pointers, memory allocation, and garbage collection." He will answer that he is not worried about those either, because he can design his classes to achieve the same results. He has access to a variety of generic garbage collectors that do not interfere with his program, and he still has all the tools he can dream of. If he then discovered that we don't have the slightest idea which call of functor/3 went wrong, he would laugh at you, and rightly so. And what about the "simple semantics" argument? I wouldn't dare use it today. I am too scared of the possibility that the other guy will say:

"Yeah, right. And what do you do with that simple semantics?"

In summary, there is more than enough here for us all to wear a brown paper bag for a year or so. However, this is not a very productive thing to do. It would be much better to finally profit from the strong points of Prolog without continuing to damage ourselves with fragmentation, poor software engineering practices, and so forth. I am sure users, researchers, and providers know exactly what I am talking about. The "simple semantics" can really be profited from, and "Programming in Logic" should no longer be an excuse for having fewer tools at our disposal than those "Programming in Assembly".

In conclusion, I would like to add something to the discussion on whether the ISO Prolog standard has been taken seriously so far. The following is taken (without removing any relevant context) from the official pages of EGCS (the Experimental GNU Compiler System, http://gcc.gnu.org):

Now that there is an agreed ISO/ANSI standard for C++, the compiler has a definitive document to adhere to. Earlier versions might have accepted source code that is no longer C++. This means that code which might have `worked' in a previous version, is now rejected. You should update your code to be C++.
Straight, timely, and unambiguous: that is what I call taking a standard seriously.

P.S.

A few hours before the final version of this article was handed in, a message from Mats Carlsson (answering a message from Robert P. Goldman) clarified that the example I gave of the choice between first-argument indexing and avoiding useless unifications does not apply to today's SICStus. According to Mats, "If all arguments are distinct variables A, B, C, ..., and the first goal is "A = Term", indexing will be done on (the principal functor of) Term." As far as I can tell, this means that if at least one argument is not a variable, or if two arguments are bound to the same variable, or if the first two goals are, e.g., "true" and "A = Term", then indexing will not occur. In other words, this is still non-optimized, macro-based code generation.
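
To make the criterion concrete, here is my reading of it (the predicate names are mine and the classification is my interpretation of Mats' description, not something I have verified):

% Indexed, on my reading: all head arguments are distinct variables
% and the FIRST goal unifies the first of them with a term.
p(A, B) :- A = f(X), q(X, B).
p(A, B) :- A = g(X), r(X, B).

% Apparently not indexed: the unification is not the first goal.
p2(A, B) :- true, A = f(X), q(X, B).

% Apparently not indexed: the head arguments are not all distinct
% variables.
p3(A, A) :- A = f(X), q(X, A).

% Stubs so that the sketch loads.
q(_, _).
r(_, _).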
