[PPL-devel] pointer "ownership" in the C API?
Basile STARYNKEVITCH
basile at starynkevitch.net
Fri Mar 27 13:52:39 CET 2009
Hello PPL maintainers, and Xavier Leroy & Damien Doligez
(first.last at inria.fr), who are in BCC.
To Xavier & Damien: PPL is a marvelous GPL-ed library, see
http://www.cs.unipr.it/ppl/
Enea Zaffanella wrote:
> Basile Starynkevitch wrote:
> [...]
>> In particular, detailed explanation of Ocaml binding to PPL is welcome,
>> since I happen to know quite well Ocaml (and the C coding rules
>> required by
>> its runtime).
>
> Hello Basile.
>
> I am trying to improve our OCaml language interface and it really
> looks like you are the right person I can ask a few simple questions.
I'll try. The best people to ask are of course Xavier Leroy & Damien
Doligez (Xavier being the father/guru of Ocaml, and Damien being the
father/guru of its runtime), both from INRIA. I am BCC-ing them just in
case!
>
> First of all, I believe we are currently doing something bad even when
> doing simple things such as interfacing PPL variable indices.
>
> In the C++ world, a variable index has type `dimension_type', which is
> just a typedef for the standard unsigned integer type `size_t'.
I never understood why you don't use int or int32_t for dimension_type.
IMHO, PPL dimension_type values are reasonably small integers: several
PPL data structures have a memory size proportional to the value of some
dimension_type. So if my program has a dimension_type whose value is one
billion = 1e9, I would expect the memory requirements to be some
multiple of a billion bytes (so dozens of gigabytes), and the CPU time
to be probably a billion squared times some small CPU step time. At one
nanosecond per step that is 1e18 nanoseconds = 1e9 sec, and I think that
1e9 sec = 31 years: more than my remaining lifetime, and certainly more
than the lifetime of any project able to use PPL. Therefore, I don't
think PPL would realistically handle a dimension_type bigger than a
billion. Hence my int32_t suggestion for dimension_type.
On AMD64/Linux, size_t is unsigned long, i.e. 64 bits, while int is
32 bits (like int32_t).
And changing dimension_type only means changing one typedef in ppl.h! Of
course, the ABI would be incompatible: people would have to recompile PPL!
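The back-of-the-envelope estimate above can be checked with a few lines
of C. This is only a sketch under stated assumptions: the cost is taken
to be exactly quadratic and one step is taken to cost one nanosecond,
neither of which is a measured PPL figure, and both function names are
hypothetical.

```c
/* Seconds needed for n * n steps at step_ns nanoseconds per step.
   (Assumption: purely quadratic cost, not a measured PPL figure.) */
static double sketch_quadratic_seconds(double n, double step_ns) {
  return n * n * step_ns * 1e-9;
}

/* Converts seconds to years of 365.25 days. */
static double sketch_seconds_to_years(double seconds) {
  return seconds / (365.25 * 24.0 * 3600.0);
}
```

With n = 1e9 and step_ns = 1.0 the result lands between 31 and 32 years.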
> As far as I have understood, OCaml uses tagged signed integers (`int').
Yes. In practice, their least significant bit is 1 and they have the
size of an intptr_t (i.e. a machine word, so 64 bits on AMD64/Linux,
hence 63-bit integers since one bit is used for the tag).
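To make the tagging concrete, here is a minimal standalone sketch of the
encoding. It deliberately does not include the OCaml headers:
sketch_value, sketch_val_long, sketch_long_val and the two SKETCH_*
limits are hypothetical stand-ins for value, Val_long, Long_val,
Max_long and Min_long. Real glue code should use the macros from
caml/mlvalues.h instead.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the OCaml runtime's `value` type:
   a signed integer as wide as a pointer. */
typedef intptr_t sketch_value;

/* Encode: shift left by one bit and set the tag bit (LSB = 1). */
static sketch_value sketch_val_long(intptr_t n) {
  return (sketch_value)(((uintptr_t)n << 1) | 1u);
}

/* Decode: shifting right drops the tag bit. Right-shifting a negative
   value is implementation-defined in C; this sketch assumes the usual
   arithmetic shift. */
static intptr_t sketch_long_val(sketch_value v) {
  return v >> 1;
}

/* Range that survives the round trip: one bit fewer than a word. */
#define SKETCH_MAX_LONG \
  ((intptr_t)(((uintptr_t)1 << (CHAR_BIT * sizeof(intptr_t) - 2)) - 1))
#define SKETCH_MIN_LONG (-SKETCH_MAX_LONG - 1)
```

On a 64-bit machine this leaves the range -2^62 .. 2^62 - 1, and on a
32-bit machine -2^30 .. 2^30 - 1.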
>
> What is the right way to translate one into the other?
>
> We are currently using
> Int_val / Val_int
> but we just discovered that this casts the value to the C native "int"
> datatype. Hence, we should be probably using Long_val / Val_long.
> However, there also is Unsigned_long_val ... but it is unclear to me
> if/when I can use it and feel confident that I am not misreading the
> input of the OCaml user.
You might use Long_val / Val_long and explicitly cast dimension_type to
(long) in the Ocaml glue code in C or C++.
>
> So, what is the right way to code helper functions such as
>
> dimension_type caml_value_to_ppl_dimension(value);
I have a recent git checkout of PPL, and cannot find these functions.
You could code a quick variant, i.e.
  inline dimension_type caml_value_to_ppl_dimension(value v)
  {
    return (dimension_type)Long_val(v);
  }
However, this is an optimisation which usually happens to work, because
you know that the Ocaml GC [or malloc or new] is never called in
that function. To be very safe, you had better code the safe variant
  dimension_type caml_value_to_ppl_dimension(value v)
  {
    dimension_type r=0;
    CAMLparam1(v);
    r = (dimension_type)Long_val(v);
    CAMLreturnT(dimension_type, r);
  }
Such a function would probably be safe even in the improbable event that
Xavier Leroy wanted Ocaml integers to be more polymorphic...
(Imagine that Xavier wanted Ocaml 5.43 integers to be tagged ints, or
boxed int64_t, or boxed GMP mpz_t! But knowing Xavier, I don't believe
that could happen.)
> value ppl_dimension_to_caml_value(dimension_type);
I leave this one as an exercise. (both quick & safe variants).
>
> taking into account that:
> a) we would like to react properly (throwing an exception) when an
> OCaml user wrongly passes in a negative value (or, if possible, a
> value that is too big to fit into a dimension_type);
Where do you want to react? If it is inside caml_value_to_ppl_dimension
you have to use my safer variant! I was assuming you would react in the
caller of caml_value_to_ppl_dimension.
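As an illustration of reacting in the caller, the range check itself can
be kept separate from the OCaml runtime. The following is only a sketch:
sketch_checked_dimension is a hypothetical helper, it takes the integer
already untagged (what Long_val would return), and it merely reports
failure. The glue code would then raise the OCaml exception itself, for
instance with caml_invalid_argument from caml/fail.h.

```c
#include <stddef.h>
#include <stdint.h>

typedef size_t dimension_type;  /* as in PPL: a typedef for size_t */

/* Hypothetical helper. Returns 1 and stores the converted index in
   *out on success; returns 0 and leaves *out untouched if n is
   negative or too large for dimension_type. */
static int sketch_checked_dimension(intptr_t n, dimension_type *out) {
  if (n < 0)
    return 0;                 /* negative index from the OCaml side */
  if ((uintmax_t)n > (uintmax_t)SIZE_MAX)
    return 0;                 /* would not fit into dimension_type */
  *out = (dimension_type)n;
  return 1;
}
```

On platforms where size_t is as wide as a machine word, the second test
can never fail, since an Ocaml integer has one bit less than the word.
It is kept only for the sake of portability.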
> b) we would like to place assertions warning us whenever we try to
> convert a dimension_type that does not fit into an OCaml integer.
I'm not sure I understand that one. Ocaml integers always have one bit
less than machine integers. How and where do you handle the case
where the integer is (on a 32-bit machine) 2^30 or more, i.e. fits
as a positive C int but not in an Ocaml int?
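One possible shape for such an assertion is sketched below, again
without the OCaml headers. The names sketch_fits_ocaml_int,
sketch_dimension_to_long and SKETCH_OCAML_INT_MAX are hypothetical; the
last one mirrors what Max_long provides in caml/mlvalues.h. The result
of sketch_dimension_to_long is what the glue code would hand to
Val_long.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef size_t dimension_type;  /* as in PPL: a typedef for size_t */

/* Largest non-negative integer left once one bit of a machine word is
   given up for the tag: 2^30 - 1 on 32-bit machines, 2^62 - 1 on
   64-bit ones. */
#define SKETCH_OCAML_INT_MAX \
  (((uintptr_t)1 << (CHAR_BIT * sizeof(intptr_t) - 2)) - 1)

/* Returns 1 if d can be represented as a tagged Ocaml integer. */
static int sketch_fits_ocaml_int(dimension_type d) {
  return (uintmax_t)d <= (uintmax_t)SKETCH_OCAML_INT_MAX;
}

/* Hypothetical conversion step: the assertion fires (unless NDEBUG is
   defined) when d is too large to be tagged. */
static intptr_t sketch_dimension_to_long(dimension_type d) {
  assert(sketch_fits_ocaml_int(d));
  return (intptr_t)d;
}
```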
> c) we want to strive for maximum portability (i.e., support both
> different computing platforms and, if possible, different OCaml
> releases).
>
> Second question:
> recently we corrected several GC-related issues reported by Kenneth
> MacKenzie. There were several places where our code was not GC safe.
> However, there are other places where, IMHO, we are too conservative.
> For instance:
>
> extern "C"
> CAMLprim value
> ppl_version_major(value unit) try {
> CAMLparam1(unit);
> CAMLreturn(Val_long(version_major()));
> }
> CATCH_ALL
>
> Is there any need to wrap unit and the return value (an unboxed CAML
> int, afaict) with the calls to CAMLparam1 and CAMLreturn ?
There is no need to wrap it in that precise case (because we know that
version_major() is a constant), but I strongly recommend keeping the
wrapping and staying conservative. If you don't keep the wrapping, leave
a big fat comment in your code (really a big fat comment) to warn the
future PPL maintainer! And I don't think any sensible Ocaml application
using PPL would call that ppl_version_major function a million times
(rather once at most).
>
> Similarly, in the following example,
>
> extern "C"
> CAMLprim value
> ppl_set_rounding_for_PPL(value unit) try {
> CAMLparam1(unit);
> set_rounding_for_PPL();
> CAMLreturn(Val_unit);
> }
> CATCH_ALL
>
> is there any need to wrap Val_unit ?
If set_rounding_for_PPL calls some complex code which might apply an
Ocaml closure or raise an Ocaml exception or allocate an Ocaml boxed
value, you need to keep it.
My advice is to always keep the CAML* macros everywhere. In the rare
functions which you believe are called billions of times and which you
are sure never call the Ocaml GC or apply an Ocaml closure or raise an
Ocaml exception, even indirectly, you might remove them. IIRC, there is
a nightmare scenario where your code doesn't call the Ocaml GC but does
call new or malloc; later the malloc-ed zone is released, still later it
is used by the Ocaml GC (or is it the reverse?), and days later the
Ocaml GC deallocates it, etc. I forgot the details, ask Damien, but they
are really ugly!
>
>
> Thanks in advance,
> Enea.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***