Chapter 2
Differences with Standard ML

2.1 Sets

The type constructor for maps is noted -m>, and is infixed just like the -> function type constructor. When declaring a type t1 -m> t2, the type t1 must admit equality (that is, the equality operation = must be applicable to elements of type t1); this is because maps rely on comparison of elements of their domain.

The type of sets of objects of type t (which admits equality) is t set, which is an abbreviation for t -m> unit (unit is the type of the empty tuple ()).

Any map type t1 -m> t2 admits equality as soon as t2 admits equality (t1 always admits equality).

Special notations are provided to build sets and maps:

{x => y, x' => y', x'' => y''} is an example of a map described by enumeration. This maps x to y, x' to y', and x'' to y''. If some of x, x' or x'' are the same, for instance x=x', then the resulting map may associate x with either y or y' nondeterministically.

As a shortcut, we can write x instead of x => (), so that {x,y,z} is the set containing exactly x, y and z.

{f(x) => g(x) | x in set s} is an example of a map described by comprehension. The x in set s is the domain description. In this example, all maplets f(x) => g(x) are formed for x ranging over the set s (actually s may be a map, but only its domain matters), and are collected to form a map. The x in set s domain description is a particular case of the in map domain description: the domain description x => y in map m makes x range over the domain of m, and simultaneously binds y to the image of x by m, so x in set s is syntactic sugar for x => _ in map s. In x => y in map m, the => y part may be omitted when y is the constant ().

Other domain descriptions are of the form x in list l, which makes x sweep through the list l, and x sub map m, which makes x sweep through the set of all maps that are included in the map m. The order in which sets are swept through (with in set, in map or sub map) is not specified, though sweeping through the same set twice in the same HimML session will be done in the same order. This is important since if there are two values of x that make f(x) equal, then only the last is retained in the result. To reverse this behavior (taking the first of the two values), we use the notation <{, as in <{f(x) => g(x) | x in set s}.

Domain descriptions may be combined with the and connector. For example, the expression

{f(x,y) => g(x) | x in set s1 and y sub map S}

maps f(x,y) to g(x) for all x in the domain of the map s1, and for all submaps y of S. This is used to build cross products, like in {(x,y) => () | x in set s1 and y in set s2}. The order in which s1 and s2 are swept through is nondeterministic, except that the same domain description will always sweep through the same data in the same order during one HimML session, and that if we look at all the values of y produced for one given value of x, they are all produced in the same order as if there were only the descriptor y in set s2 (same thing if we fix y and look at the sequence of values for x).

Domain descriptions may be filtered with a such that phrase. For example, the expression

{(x,y) => x+y | x in set s1 and y in set s2 such that x⋆y=3}

maps only pairs (x,y) such that x⋆y=3 to their sum.

As for definition by enumeration, writing f(x) instead of f(x) => () is allowed. For instance,

{x | x sub map f}

is the set of all submaps of the map f (the powerset of f when f is a set).

2.1.1 Set and Map Expressions, Comprehensions

In general, we have the following form of set and map comprehension:

{expression [ => expression] ||||
pattern1 [ => pattern1] ( in map|in set|in list|sub map ) expression1
( and patterni [ => patterni] ( in map|in set|in list|sub map ) expressioni))*
[ such that expression′′]}

where signs and words in typewriter style are meant literally, where parentheses serve as grouping marks, brackets enclose an optional construct, a vertical bar means an alternative and a star denotes zero or more occurrences of a construct. The [ => patterni] forms are only allowed if followed by in map. The expressions expressions, expression, expressioni and expression′′ are ordinary HimML expressions. The patterni and patterni are HimML patterns, that are essentially Standard ML patterns, augmented with special set and map patterns (see section 2.1.2).

This expression executes as follows:

The order in which the domains are swept through has the following properties:

An alternate form of set or map enumeration uses a <{ instead of the opening brace:

<{expression1 [ => expression1]
(expressioni [ => expressioni])*}

The same holds for comprehension:

{expression [ => expression] ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[such that expression′′]}

The semantics are the same, except that overwriting is replaced by “underwriting”: in case of a collision, the first maplet is retained instead of the last.

The comprehension notation has been extended to list comprehensions:

[expression ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[such that expression′′]]

collects values of expression in a list, in the order they are encountered. This subsumes the map functional of Standard ML:

fun map f l = [f(x) | x in list l]  
        

Comprehensions have also been extended to handle logical structures. In this case, they are called quantifications. We have universal quantification:

all expression ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[ such that expression′′] end

which sweeps through the domains as usual, and returns true if expression always evaluates to true, and false otherwise. Evaluation of the universal quantification stops as soon as expression returns false (all pending elements of the domains are ignored), or when all domains have been traversed. The all quantifier can be seen as a generalization of the andalso logical connective.

The existential quantification is dual (by negation) of the universal one:

exists expression ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[ such that expression′′] end

This returns true if for some value in the domains expression evaluates to true, and false otherwise. Evaluation of the existential quantification stops as soon as expression returns true (all pending elements of the domains are ignored), or when all domains have been traversed. The exists quantifier can be seen as a generalization of the orelse logical connective.

Close to existential quantification, we have the choice:

some expression ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[ such that expression′′] end

This looks for the value e of expression in the first environment where all patterni match elements of Di while making expression′′true. All other elements of the domains are ignored. If e is found, then it returns SOME e. If there is no such expression, NONE is returned. (NONE and SOME are the constructors of the option datatype; see Section 3.)

So, existential quantification is a special case of choice: exists  end could be written

case (some /$\ldots$\verb/ end) of NONE => false | SOME _ => true

A last form of comprehension generalizes the sequence connector ;. It is iteration:

iterate expression ||||
pattern1 [ => pattern1] (in map|in set|in list|sub map) expression1
( and patterni [ => patterni] (in map|in set|in list|sub map) expressioni))*
[ such that expression′′] end

It sweeps though the domains in the usual order, and evaluates expression in all matching environments. It is interesting essentially only if expression produces or depends on side-effects. It returns (). This subsumes the app functional of Standard ML:

fun app f l = iterate f(x) | x in list l end  
                      

Corresponding to each form of comprehension, we finally have a form of imperative comprehension, mostly useful with programs having or depending on side-effects:

{expression [ => expression]
 | while expression′′′
[ such that expression′′]}

This evaluates expression′′′, and returns the empty set {} if it is false. Otherwise, it evaluates expression′′, and if it is true, then expression and expression. All maplets that we get this way (such that expression′′ evaluates to true), are collected in a map, which is returned as soon as expression′′′ evaluates to false. Collisions are resolved by an overwriting strategy (last maplet wins).

It is syntactic sugar for (expression′′ being assumed true when not present, and expressionbeing assumed () if absent):

let fun collect s=

if expression′′′

then if expression′′

then collect (s ++ {expression => expression})

else collect s

else s

in

collect {}

end

assuming s is a new variable (++ is the infix map overwrite function).

Similarly, the underwriting imperative map comprehension notation is:

<{expression [ => expression]
 | while  expression′′′
[ such that expression′′]}

which is syntactic sugar for:

let fun collect s=

if expression′′′

then if expression′′

then collect ({expression => expression} ++ s)

else collect s

else s

in

collect {}

end

The imperative list comprehension notation is:

[expression | while expression′′′
[ such that expression′′]]

which is syntactic sugar for:

let fun collect l=

if expression′′′

then if expression′′

then expression :: collect l

else collect l

else l

in

collect []

end

where l is a new variable, and :: is the infix list constructor (cons is Lisp; note that in HimML as in Standard ML, a::l evaluates a before l).

The imperative universal quantification notation is:

all expression | while expression′′′
[ such that expression′′]
end

which is syntactic sugar for:

let fun quantify () =

if expression′′′

then if expression′′

then expression andalso quantify ()

else quantify ()

else true

in

quantify ()

end

The imperative existential quantification notation is:

exists expression | while expression′′′
[ such that expression′′]
end

which is syntactic sugar for:

let fun quantify () =

if expression′′′

then if expression′′

then expression orelse quantify ()

else quantify ()

else false

in

quantify ()

end

The imperative choice notation is:

some expression  | while expression′′′
[ such that expression′′]
end

which is syntactic sugar for:

let fun quantify () =

if expression′′′

then if expression′′

then SOME expression

else quantify ()

else NONE

in

quantify ()

end

The imperative iteration notation is:

iterate expression | while expression′′′
[ such that expression′′]
end

which is syntactic sugar for:

let fun quantify () =

if expression′′′

then if expression′′

then (expression; quantify ())

else quantify ()

else ()

in

quantify ()

end

Notice that when expression′′ is left out (it does really serve no special purpose, but is there for orthogonality), this is exactly the same as the classical while construct:

while expression′′′ do expression

2.1.2 Set and Map Patterns

To help in writing programs with sets, some standard patterns decomposing sets are provided in HimML.

First, the empty set notation may serve as a pattern, which matches only the empty set. Thus:

case s of  
     {} => f()  
   | _ => g(s)

executes f() when s is the empty set, and g(s) if s is not empty.

A pattern of the form {p [ => q]} matches exactly those maps (or sets) of the form {x [ => y]}, where p matches x and q matches y (q and y are assumed to be () if not written). So:

case s of  
     {} => f()  
   | {x => y} => h(x,y)  
   | _ => g(s)

executes f() when s is the empty set, h(x,y) if s is the map {x => y}, and g(s) if s contains at least two elements.

The U infix function computes the union of two sets. To mimic this behavior, the U infix pattern operator splits a map (and not only a set) in two disjoint parts, the first one being matched by the pattern on the left of the U, the second one being matched by the pattern on the right. For instance:

fn {x => y} U rest => g(x,y,rest)

is a function that decomposes its argument by extracting x from its domain, letting y be the value that x is mapped to, and rest being the argument with x removed. It then applies g on the tuple (x,y,rest).

Note that the U may be computationally non trivial. It may be required from U to backtrack to find a match. For instance:

fn {ref x => y} U rest => g(x,y,rest)

is equivalent to:

fn s => case (some (x,y) | ref x => y in map s end) of  
            SOME (x,y) =>  
              let val rest = {x} <-| s  
              in  
                 g(x,y,rest)  
              end  
          | NONE => raise Match

(? is the map application function, <-| is the infix “domain restrict by” function).

Other map patterns are expressible as syntactic sugar for patterns using U:

{p1 [ => q1] (,pi [ => qi])*} [ tt U r]

is equivalent to:

{p1 [ => q1]} ( U {pi [ => qi]})*[ tt U r]

which is not ambiguous, as U associates to the right. Notice that, because of this translation, the pattern {x,y} matches sets of cardinal exactly 2 (x may never be equal to y), for example.

The pattern on the left of a U is actually constrained to be of the form {p1 [ => q1] (,pi [ => qi])*}. More general patterns could have been considered, but would have led to a much more complicated evaluator, and were not deemed worth the trouble.

Another map pattern construction generalizes the ellipsis construct already found in Standard ML record and tuple patterns:

{p1 [ => q1] (,pi [ => qi])*,...}

is equivalent to:

{p1 [ => q1] (,pi [ => qi])*} U _

A case not covered by the above definition, but which is a natural extension, is the pattern {...} which matches all maps, but no other object.

In general, a pattern of the form {x [ => y]}, where x [ => y] does not match all maplets inside the value to be matched, may be slow. Indeed, in this case, the evaluator may have to backtrack to find the correct match. A particularly simple case is the case when x is a constant pattern, like in {"abc" => y} U rest. In this situation, the evaluator is optimized so as not to sweep through the map in argument, but to find directly the element mapped to the constant "abc", match it with the pattern y, and match the rest of the map with the pattern rest.

Note that map patterns never get called repetitively, they only return the first correct match. Hence, for example:

{x | {x,...} in set s}

where s is a set of sets, does not compute the distributed union of elements of s, but a set containing one element taken in each non empty element of s.

Finally, the system order may be defined with map patterns like this:

fun system_less (x,y) = let val {x',...} = {x,y} in x'<>y end

which works for any pair of objects of the same equality type (an equality type is a type that admits equality). Beware that x may be less than y in the system order during one session and greater than y in another session. This may be a problem when transmitting data from a HimML process to another, or when reading back data that was saved from a file. However, it is a convenient way to get a total order on objects of a given equality type, when the real nature of this order does not matter.

2.1.3 Typing Sets and Maps

In the following, the expressions ei are assumed of type τ, and ei are of type τ:

2.2 Numbers and Physical Units

Apart from integers, there is only one numerical object in HimML: the complex number with real and imaginary parts represented as floating-point numbers (as C doubles, i.e. on 64 bits usually). 1.0, ~2.5, 1.0E5 and 1.0: ~2.0 are all numbers in HimML, their values in standard mathematical notation are 1, -2.5, 105 and 1 + 2i. From version 1.0α11 on, there are also integers, but we shall not call them numbers to avoid confusing matters, and we shall consider the latter rather as counters or sets of bit-fields. 1,  3 are integers, not floating-point numbers.

To help programmers write scientific code, typically to model computations on physical dimensions, a system of typing with physical units has been created in HimML (see [5]).

We conceive a numerical type as any type that represents a physical quantity, whatever it may be. The simplest numerical type is num, which is the type of numbers without a unit (all numbers above have type num). We shall see that numerical types are monomials over a set of basic dimensions (basic numerical types, like mass, speed, etc.), that can be declared with the dimension keyword. Units or scales are scale factors inside a dimension (kilogram, ton or ounce are units, or scales of the dimension called mass).

A numerical type variable can be instantiated only by a numerical type. It is written with a sharp (#) sign in the name, following all quotes, underscores and digits. For instance, '#a is a numerical type variable, ''#a is a numerical type variable with the equality attribute (this is futile: all numerical types admit equality), '1#a is an imperative numerical type variable of strength 1.

The declaration: dimension mass(kg) declares a new base dimension, called mass, which is incomparable with all other base dimensions. As it is a declaration, it obeys all the usual lexical scoping rules of declarations like val, fun, type, It also declares a scale by default, called kg (this scale is optional; if it is not provided, as in dimension person, then the default scale has the same name as the dimension). Scale names can be appended to numbers to indicate their type: typing 1‘kg; at the toplevel after the above declaration will result in the display:

it : mass

it = 1‘kg

Numerical types can be multiplied or raised to a power (actually, the definition of a numerical type is any product of powers of dimensions and numerical type variables). For any numerical types t1 and t2, t1‘t2 is the product of both types, and t1^n is the power of t1 to the number n. For instance, we can write:

dimension distance(m) and time(s);

type speed = distance‘time^ ~1;

type acceleration = distance‘time^ ~2;

Scales too can be multiplied and raised to a power. We may declare new scales by multiplying and raising old ones to a power, with the scale declaration (obeying the lexical scoping rules of declarations):

scale 1‘km = 1000‘m; (⋆ 1 kilometer is 1000 meters ⋆)

scale 1‘h = 3600‘s; (⋆ 1 hour is 2600 seconds ⋆)

scale 1‘kph = 1‘km‘h^ ~1; (⋆ kilometer per hours ⋆)

Scale declarations interfere with syntax analysis, so now 130‘km‘h^ ~1 is understood by the HimML toplevel, which answers:

it : distance‘time^ ~1

it = 36.111111111111111‘m‘s^ ~1

Thus HimML manipulates not only numbers but numerical values, that is pairs of numbers with a scale. Though, scaling is done during elaboration (the typing phase), so the evaluator never bothers with different scales and is therefore as fast as if there were no scaling facility.

Numerical typing is also fully polymorphic, thanks to numerical type variables, and the typing algorithm infers most general types, even with numerical types (the notion of most general must however be relativized). For instance, addition is + : '#a ⋆ '#a -> '#a, that is, it takes two numerical values representing the same physical dimension, and returns a value of the same type. This effectively prohibits adding distances with energies for instance, while allowing programmers not to declare what dimensions they want to add. More complex types can be used; for instance, division is / : '#a ⋆ '#b -> '#a ‘ '#b^ ~1, which means that it takes two numerical values having independent types, and returns a value having the quotient type.

Transcendental functions like log, or special functions like floor cannot having fully polymorphic numerical types, because it would make no sense. This is why the type of log is num -> num, as the type of floor. Note that type conversions are possible: for example, if x is of type mass, we can convert it to a distance by writing x⋆1‘m‘kg ^ ~1.

A special case for which we have to be careful is the power operation ⋆⋆. Indeed, ⋆⋆ is basically a transcendental function, calling exp and log, so is of type num ⋆ num -> num. But this type is too restrictive: writing x⋆⋆2 would imply that x ought to be of type num. To alleviate this, the typing algorithm knows the special case when the second argument to ⋆⋆ is an explicit numerical constant n (with no scale): the operation fn x => x⋆⋆n is then of type '#a -> '#a^n.

Some other subtleties worth mentioning are the following. First, all dimensionless constants (like 1.0, ~3.5) have type num, except 0.0, whose type is '#a; in other words, 0.0 has all numerical types. This is consistent with the semantics of units. However, 0‘kg still has only the mass type, for example. Notice, by the way, that we write dimensionless constants with a dot and a zero figure after the dot (like 0.0 instead of 0). This is to disambiguate between the number 0.0 (of type num) and the integer 0 (of type int); if you put scales after the constant, this is not needed since there is no ambiguity.

Then, it is possible to declare scales with negative or even complex scale factors. For example, the declaration scale 1‘a = ~2‘b is legal, provided b is an already declared scale. This poses a problem, in that comparison functions like < then seem to be ill-defined: indeed, assume 1‘b>0 holds, then as ~0.5‘a=1‘b, we should have ~0.5‘a>0, which is odd. The solution is to say that all comparisons (as well as all other functions on numerical values) are done with respect to the default scale. So, in the example, if b is the default scale, then 1‘b>0 and ~0.5‘a>0.

For a list of dimensions and scales, look at the file ‘units.ml’. Note that units can be used not only to represent physical dimensions, but also multiplier prefixes (scale 1‘k = 1000.0, so that 1‘km=1‘k‘m), or more abstract dimensions (dimension memory(byte), dimension apples and oranges, etc.).

2.3 Other Differences

Here we list other differences between HimML and Standard ML.