-
A type variable is an identifier
beginning with a quote
'
. Following the quote, there may be:
-
an optional quote, signalling that this is an equality type
variable, that can
be instantiated only with equality types. Intuitively, equality
types are the types of those objects that may be safely tested for equality. This excludes essentially
functions, promises, dynamics, continuations and abstract
types with no specified
equality1, and
all types built from such types, except for
ref
and
array
types.
- next, an optional integer, greater than or equal to 1. The
variable is then a weak type variable (in Standard ML of New
Jersey parlance, where they were
introduced), or an imperative type variable (in Standard
ML parlance). The integer is called the strength of the type
variable. Non weak type variables may be thought as having strength
infinity. The strength of a type expression is the minimum of the strengths of its type
variables (infinity if it has none). A weak type variable may only
be instantiated by a type of equal or less strength. The strength
of a weak type variable in the type of an expression f may be
thought as the minimum number n of arguments e1, …,
en such that evaluating
f
(
e1)
…(
en)
creates
the corresponding mutable object (ref or array in general).
For compatibility with Standard ML, instead of
an integer, we may write the underscore (_
) character. This
will be interpreted as though we had written the integer 1
.
- next, an optional sharp (
#
) sign. The variable is then a
numerical type variable, which may be instantiated only with numerical
types. A numerical type is either a
numerical type variable, the special type num
, the name of a
dimension (as defined by a dimension
declaration), a numerical product τ.τ' of numerical
types, or a numerical power of a numerical type τ^
x,
where x is a number (note that if the number has a negative real
part, there should be a blank between the ^
and the ~
,
because ^~
is a valid ML symbol; however, in this context the
parser is smart enough to know that such a symbol would be useless,
and correctly recognizes ^~
as though it were ^ ~
).
Any numerical type already admits equality, so the extra quote indicating an equality type
variable is superfluous in the case of numerical type variables.
- finally, a sequence of letters, digits, quotes or underscores
beginning with a letter.
Contrarily to Standard ML, attributes of type
variables matter in type declarations. For example:
type ''a pair = ''a * ''a
declares a polymorphic type of pairs, but this
type is restricted to pairs of values admitting equality. The same
declaration in Standard ML would incur no such
restriction. This feature is necessary in HimML because type
functions are partial, whereas they are total
(they apply to all types) in Standard
ML. Indeed, the numerical product and the
numerical power of types apply only to numerical types, and the map
type constructor may only be applied to two types, the first of which
having to admit equality.
Therefore, type declarations such as:
type 'a product = 'a ` 'a
type 'a 'b map = 'a -m> 'b
would have no meaning in HimML. We should have written:
type '#a product = '#a ` '#a
type ''a 'b map = ''a -m> 'b
The imperative nature of type variables is not checked, however,
in conformance with the Definition of Standard ML.
Therefore, declaring:
type '1a fakearray = '1a ref list
does not preclude the use of the type expression '2a fakearray
,
though it seems to break the rule of not replacing a type variable by
a type of higher strength, and
is in fact mostly equivalent to:
type 'a fakearray = 'a ref list
The only difference is with datatypes, where:
datatype 'a fakearray = fake of 'a ref list
builds a constructor fake : 'a ref list -> 'a fakearray
,
whereas declaring
datatype '1a fakearray = fake of '1a ref list
would declare the constructor with the more restrictive type fake : '1a ref list -> 'a fakearray
.
string
is the type of all character
strings. There is no
character type in HimML. However, all
one-character strings are pre-allocated, and play the role of
characters. There is no limit on the length of strings except the
length of the largest free block in memory.
int
is the type of integers. Integers are machine
integers only, for now. They will be replaced by arbitrary
precision integers (bignums) in a future version, but for
now all overflows merely give rise to undefined behavior,
contrarily to what the definition of Standard ML prescribes (such
overflows should raise exceptions). Integers are considered as
considered as counters or as bit-fields, but not as proper
numbers, which are represented by objects of numerical types.
Machine integers are in the range [min_int
,max_int
],
and are represented with nbits
bits.
num
is the type of all numbers, that is, numerical values
with no dimension. Numerical values are stored as complex
numbers with real and imaginary parts in
floating point format, double
precision (64 bits).
There is no real
type as in Standard ML.
All Standard ML functions using
real
arguments, or producing
real
arguments use or produce numerical values in HimML,
whether of type num
or of a more general numerical
type.
bool
is the datatype of boolean values:
datatype bool = false | true
Note that false
and true
are datatype constructors, and thus may be used in patterns
to match themselves. Contrarily to Standard
ML, while
loops, if
conditionals are not defined as syntactic sugar, and so
don't change their behavior under a redefinition of
true
and false
. This is also valid of all new
comprehension, quantification and iteration constructs
provided in HimML.
list
is the predefined datatype of lists. It is
declared by:
datatype 'a list = nil | :: of 'a * 'a list
::
is furthermore declared as having infix status,
being right associative, and having precedence 5. This is as
though we had typed:
infixr 5 ::
option
is the predefined datatype of optional
values. It is declared by:
datatype 'a option = NONE | SOME of 'a
NONE
represents the absence of value, and
SOME x
represents the presence of the value x
.
void
is a predefined empty datatype. It
is declared by:
datatype void = VOID of void
and has as sole constructor VOID : void -> void
.
As you may check, it is impossible to construct a ML value
of type void
; this is why it is a void type.
Stupid as it may seem, it is useful at least in the following case.
Assume you want to provide a function quit
that never
returns. Because quit
never returns, it is likely that its
type will be of the form τ -> 'a
, where 'a
is
a new type variable. Now, you may want to declare the type of
such functions that never return, as follows:
type never_returns :
τ -> 'a
where τ is the argument type, and 'a
is a new type variable,
making it clear that such a function indeed cannot return. Unfortunately, you
cannot write this in ML, because 'a
should be given as a parameter to
the type constructor never_returns
, i.e., it forces you to write
type 'a never_returns =
..., which is probably not what you want.
The void
is a workaround:
type never_returns :
τ -> void
indeed states that a function of type never_returns
can only return an
object of type void
, namely something that does not exist. On the other
hand, if f
is a function of this type, expressions like
if
... then f() else g()
won't type-check if g()
is an expression that returns a result; the
solution is to change f()
into, say, the sequence (f(); raise NonSense)
, which now has any type, and where exception
NonSense
cannot be raised.
ref
is the pseudo-datatype of polymorphic mutable references. You may
think of it as declared with:
datatype '1a ref = ref of '1a
which also declares a constructor ref : '1a -> '1a ref
.
array
is the type of polymorphic mutable arrays.
They were introduced in Standard ML of New Jersey, and are somehow a generalization of
ref
s, with multiple indexed mutable entries.
intarray
is the type of (non-polymorphic) arrays of
integers. Integer arrays take less space than objects of type
int array
and put less pressure on the garbage collector, but are less
flexible in that they can only contain objects of type int
.
exn
is the extensible pseudo-datatype
of exceptions.
Exceptions are declared with the exception
declaration,
are raised with the raise
construct and caught with
the handle
construct.
promise
is the abstract type of
promises. Promises are a poor man's
implementation of lazy data
structures. When e is an HimML expression of type τ,
the expression delay
e builds a promise, that is
a structure representing the future evaluation of e, of type
τpromise
. Evaluation of e if forced
with the force : 'a promise -> 'a
function.
An expression inside a promise is only evaluated once: forcing a
promise stores the result in place of e inside the promise, and
sets a flag indicating that the promise has been forced.
Considering that HimML has memoizing functions, delay
e is roughly equivalent to
memofn () =>
e; then force
is fn p => p()
, with
'a promise = unit -> 'a
. However, promises are more space-efficient
than memoizing functions, and delay
and force
are more
explicit.
Promises come from the algorithmic language Scheme
[12]. Promises do not admit
equality.
cont
is the abstract type of
continuations, inspired from
Scheme [12], and typed as in
[2]. A continuation is
reified with the callcc
function (or the catch
function) and
thrown with the
throw
function.
dynamic
is the special type of
dynamic values. Dynamic values may be
thought as couples (v,σ), where e is a value and σ
is a type, such that e if of type σ. Dynamics are built
with the
pack
pseudo-datatype constructor.
Dynamics are the basis for functions that must operate on data of
different types, but do something depending on the actual type of
the data. A different sort of dynamics was introduced in
[1], ours is more in the spirit of Mycroft (cited
in the paper). It corresponds rather precisely to the dynamic
construct (with type dyn
) of CAML
[7]. Uses are type-safe communications of data
between independent processes, checkpointing
processes (saving data in a file to retrieve them later),
polymorphic printing functions, and
so on.
dynamic
does not admit equality, because this would not make sense in the general case.
unit
is the type of the 0-ary tuple ()
.
Contrarily to Standard ML, this is not the
same as the empty record |[]|
(noted {}
in Standard
ML; this notation was abandoned in HimML
because it was incompatible with the map notation).
*
is the infix n-ary type product constructor
(n≥ 2): τ1 *
… *
τn is
the type of all tuples
(
e1,
…,
en)
with
e1 :
τ1, …, en :
τn.
As an extension to the Standard ML type
system, extensible tuple types are introduced in HimML:
τ1 *
… *
τn * ... : 'rest
is the type of all tuples
(
e1,
…,
em)
with m≥ n
and e1 :
τ1, …, en :
τn
(an exception is the case n=1, where it is not the type of any 1-ary
tuple, only 2-ary and higher tuples, because there are just no such
things as 1-ary tuples in HimML).
This allows the HimML type system to lift some restrictions on the
typing of tuple
patterns with an
ellipsis: the pattern
(
p1,
…,
pn,...)
has
type τ1 *
… *
τn * ... : 'a
,
if pi :
τi for all i. As a
special case, the selector
function #
n (n≥ 1) is fully
polymorphic in HimML (not in Standard
ML), and has type:
#
n : 'a
1 *
… * 'a
n * ... : 'rest -> 'a
n
The type qualifying the ellipsis
must be a type variable. This type variable
can only be subtituted for by a tuple type. If it admits
equality (say, ''rest
),
then all undescribed components are to be of equality types;
imperative and weak type
variables (say,
'
nrest
, with n≥ 1), constrains all undescribed
components to be of imperative types with
strengths at least n;
Combinations are possible, too.
A tuple type admits equality if
and only if all its component types admit equality (including its
possible ellipsis). The strength of a tuple type is the minimum strength of its
component types (and of its ellipsis, if present).
Extension of extensible tuple types is
done by instantiating the extension variable. However, extensions are
mostly built by the type inferencing engine. Single
inheritance is automatic. To
take an example, the selector function #1
has
type 'a * ... : 'rest -> 'a
. When applied to, say,
(1,"abc",false) : num * string * bool
the type of #1
is automatically specialized to become num * string * bool -> num
.
- For any n types τ1, …, τn, and any
distinct label identifiers lab1, …,
labn (n≥ 0),
|[
lab1 :
τ1,
…,
labn :
τn]|
is the type of all records having exactly the labi
as field labels, the field labi having type
τi. Labels are ordinary ML identifiers, that is
sequences of letters, digits, quotes or underscores beginning with a letter;
contrarily to Standard ML, positive integers (so-called
numerical labelsfields;numerical;) are not considered labels. In
particular, tuples are not special cases of records. Moreover, the
delimiting characters for records and record
types in HimML are |[
and ]|
instead of {
and }
;
this choice was made so as not to conflict with the map
constructions of HimML.
As for tuples, extensible record types are provided:
|[
lab1 :
τ1,
…,
labn :
τn,... : 'rest]|
is the type of all records having at
least the labi as labels, the field labi having type
τi. The extension variable
'rest
may be constrained to admit equality, or to be imperative or weak, as for
tuple extensions. This type variable can only be subtituted for by a record
type.
This enables full typing of
record patterns with
ellipsis, which Standard
ML restricts because of its less powerful type system.
This extension of Standard ML typing is a weaker
extension than the one proposed in [13].
As in Standard ML, the field selector
#
lab is the function that picks the field lab out of the record
passed as an argument. Contrarily to Standard ML, it
is fully typable:
#
lab : |[
lab : 'a, ... : 'rest]| -> 'a
Contrarily to Standard ML again,
records are entirely different from tuples.
In Standard ML, tuples are just records with only
numerical labels.
In HimML, record labels cannot be numerical. This was
dictated by a choice of implementation of records entirely different from
the implementation of tuples. Witness the semantic difference between
ellipsis in
tuple and record types: were tuples to be special cases of records, an
extensible tuple type like a * b * ... : 'r
would actually stand for
|[1 : 'a, 2 : 'b,... : 'r]|
, which would then contain the type
|[1 : 'a, 2 : 'b,extra : 'c]|
, which is not the type of any tuple,
because extra
is not an integer. We want to consider extensible
tuple types as containing only tuple types.
A record type admits equality if
and only if all its field types admit equality (including those
subsumed by its ellipsis, if any). The strength of a record type is the minimum strength of its
field types (and of its ellipsis, if present).
Extension of extensible record types is
done by instantiating the extension variable. However, extensions are
mostly built by the type inferencing engine. Multiple
inheritance is automatic:
if (
α11,…,αm11,ρ1)
τ1,
…, (
α1n,…,αm1n,ρn)
τn are all extensible record types (where by convention, the
extension variable is ρj, the last variable), and τ' is a
record type (extensible or not), the construct:
(
α11,…,αm11,(
…(
α1n,…,αm1n,τ')
τn)
…τ1)
is a new record type, having all the fields of the types τ1,
…, τn, τ', and which is extensible if and only
if τ' is extensible. The resulting labels are associated with the
types they have in the τi and τ'. If two types provide
the same label, they must associate it with the same type. Notice also
that the types involved may not be records, but more complicated types
involving extensible record types.
We get automatic multiple inheritance on records with this scheme, thanks to type inference.
For example, if lab is an identifier, the selector
#
lab is the function that takes as argument any record with a
label lab, and returns the corresponding field. Thus, #
lab
has type |[
lab : 'a,... : 'r]| -> 'a
in HimML (it is
not typable in Standard ML). Then, a function
such as:
fn data => #multiplicand(data) * #multiplier(data)
has the effect that the variable data
inherits the types
|[multiplicand : '#a,... : 'r1]|
and |[multiplier : '#a,... : 'r2]|
.
Thus, this function has type:
|[multiplicand : '#a,multiplier : '#b,... : 'r]| -> '#a ` '#b
->
is the infix function type
constructor. All functions in ML are basically
unary. Zero-ary functions may be simulated as functions taking
()
as argument. n-ary functions (n≥ 2) are
functions taking n-tuples as arguments. Variable
arity functions may be coded as functions taking list
arguments. If arguments to a function must be
referred to with keywords, it is possible to use record
arguments, where the record labels serve as
keywords. If a n-ary function does not depend on the order or
multiplicity of their arguments, it is natural to encode the
arguments as a set (for the logical 'or' or 'and' on
formulas, for an equality function, etc.).
`
is the numerical multiplication type constructor. If τ and τ'
are numerical types, then
τ`
τ' is the numerical type of the product of any two
values of types τ and τ'. For example, if we have
dimension intensity(A) and time(s)
then 3`A * 4`s
is of type intensity`time
.
The `
symbol is also used as the scale multiplication operator: the value of 3`A * 4`s
is 12`A`s
. There can be
no confusion between this and the type multiplication
operator.
^
is the numerical power type constructor. If n is any number (that is, any
complex number, without dimension), and τ is a numerical
type, then τ^
n is a numerical type;
e.g., we may write dimension distance(m)
and type area = distance^2
.
The power type operator may be used also to specify inverse
types, for instance in dimension time(s)
and
type frequency = time^ ~1
(notice the space between ^
and
~
, which informs the ML parser that we didn't intend to write the
symbol ^~
).
The ^
symbol is also used as the power operator for scales; e.g., the value of 1/100 s
is 0.01`s^ ~1
. There can be no confusion between the two uses of
^
, as with the ^
infix string
concatenation operator.
-m>
is the infix map type
constructor. If τ and τ'
are types, and on the condition that τ admits
equality, then τ -m>
τ'
is the type of maps mapping objects of type τ to
objects of type τ'. The resulting map type admits
equality whenever τ'
admits equality.
When τ' is unit
, τ -m> unit
is written τ set
, and represents the types
of finite sets of objects of type τ.
- polymorphic datatypes and abstract
datatypes are provided exactly as
in Standard ML. An abstract datatype
declaration has the form:
abstype
implementation in
interface end
where implementation is a standard datatype binding list (what
usually follows a datatype
keyword), and interface is a
list of declarations.
In Standard ML, no declared abstract datatype
admits equality, to preserve its
abstract character. In HimML however, equality
declarations may be used inside the interface part, to define the equality predicate on an abstract
datatype. Equality declarations take the form:
eqtype
τ1 [and
τi]*
where the τi are amongst the declared abstypes. This
declaration must be the first in the interface. It is an error to
declare a τi as admitting equality with this declaration, if
it did not admit equality
according to the rules of equality of datatype bindings (see
[6] and [9]).