The type-lower-bound branch
Nikodemus was intrigued by the type-lower-bound branch I pushed on repo.or.cz
a
few weeks ago. It’s a convenience hack that wound up being slightly more intricate
and interesting than planned: a user was complaining that there was no nice way to
muffle some optimisation notes. Take
(lambda (x) (declare (type integer x) (optimize speed)) (1+ x)) ; in: LAMBDA (X) ; (1+ X) ; ==> ; (+ X 1) ; ; note: forced to do GENERIC-+ (cost 10) ; unable to do inline fixnum arithmetic (cost 1) because: ; The first argument is a INTEGER, not a FIXNUM. ; The result is a (VALUES INTEGER &OPTIONAL), not a (VALUES FIXNUM &REST T). ; unable to do inline fixnum arithmetic (cost 2) because: ; The first argument is a INTEGER, not a FIXNUM. ; The result is a (VALUES INTEGER &OPTIONAL), not a (VALUES FIXNUM &REST T). ; etc.
If the user already knows that the best they can do is declare that x
is an
integer
, there is no way to muffle only the notes that amount to wishing that the
type of x
was more precisely known.
My first (failed) attempt tagged variables as hopeless (any note mentioning these
variables’ types would then be muffled). I forgot why I couldn’t make it
work, but I believe it’s because invisible copies are very common, so that
transforms manipulated lvar
s that were mere untagged copies of the tagged
variable.
Luckily, we have a well-established mechanism to make properties flow across copies: the type (propagation) system. Again, my first attempt was to create a type for hopelessly vague types. However, types must implement all three set operations (negation, intersection and union), and making that work with such an ad hoc type wasn’t an attractive prospect.
This is where type lower bounds come in. In CL, types, as exposed through
declarations, are always upper bounds: they are treated as conservative
approximations of the set of values the annotated form, variable, etc. can take. The
conservativeness is necessary because the exact static type is generally undecidable.
In other words, the meaning of (the integer x)
is that x
can evaluate to
any integer; it could actually only ever evaluate to a fixnum
(or any other
subset of integer
s), which is what the note in the original example asks
for.
If we had a way to also denote lower bounds (e.g. that x
can take (not fixnum)
values) on the exact static type of forms, compiler notes could be tested
against these lower bounds to determine when the programmer knows that
the declared or derived type cannot be improved enough for a transform
to fire. In the original example, this would amount to declaring that x
’s
exact static type is between fixnum
(exclusively) and integer
(inclusively),
or, equivalently, that x
will always be an integer
, and will sometimes not
be a fixnum
. The note is then obviously moot, and can be muffled by the
compiler.
Unlike ad hoc “hopeless” types, intersection and union of lower bound types (really, range of types, since lower bound types are always accompanied by an upper bound) are straightforward and theoretically sound. The only issue is with type negation. Since this is all for convenience, I decided to just punt and drop the lower bound before negating.
The branch, as pushed on repo.or.cz
, seems to be working. In order to keep
changes to a minimum, regular types are treated as having an implicit lower bound of
nil
, and range types (with a non-trivial lower bound) are aggressively converted to
regular types. This gives the muffling effect for some interesting simple cases, and
reverts to the old behaviour very quickly. There are probably hidden bugs (both in
the code and in the design), but since they could only be triggered by using the
extension, I’m not too worried.
N.B. This isn’t actually useful for compilation speed
I originally thought type lower bounds could be useful to improve compilation speed: by keeping around both lower and upper bounds, we are able to overapproximate types even in the presence of type negation. Once I implemented a quick prototype, ir1-widening, I realised we don’t need lower bounds at all: we only have to make sure we only ever approximate types once we’re sure they’ll never be negated. Actually, what would be even more useful is a way to compute approximate unions and intersections quickly, instead of widening types after the fact.