The HimML Reference Manual
version 1.0 alpha 17

Jean Goubault-Larrecq

Abstract: This is the HimML reference manual. HimML is a close variant of the Standard ML language, with set-theoretic types, data structures and operators, and an extendible physical unit system.

Keywords: ML, sets, physical units.

Résumé


Ceci est le manuel de référence de HimML. HimML est une proche variante du langage Standard ML, avec des types, des structures de données et des opérateurs ensemblistes, et un système d'unités physiques extensible.

Mots-clés : ML, ensembles, unités physiques.

Copyright (c) 1992–1999 Jean Goubault-Larrecq and Bull S.A., 2000–2002 Jean Goubault-Larrecq et LSV/CNRS UMR 8643 & ENS Cachan
Bison parser generator Copyright (c) 1986 Free Software Foundation, Inc.
Regular expression matcher Copyright (c) 1986 by University of Toronto

Contents

Chapter 1  Overview

The HimML language is a close cousin to the Standard ML language [6, 9]. Actually, it is Standard ML without its module system (as of October, 1992), but with some extensions and with a specially designed run-time system. This document is a reference for HimML version 1.0 alpha 17. It does assume some knowledge of the syntax, static and dynamic semantics of the Standard ML language, and builds on this knowledge.

The plan of the document is as follows. In the rest of chapter 1, we quickly describe what makes the originality of HimML. Chapter 2 is devoted to the differences between HimML and Standard ML; they are mainly extensions, but some are mere differences. Chapter 3 lists the types and values provided upon booting the HimML executable; others are defined in source form, with their own documentation. There is no guide to the syntax of HimML, for which the reader is referred to [6], and to the appropriate sections of this report when HimML syntax differs from the Standard ML syntax. Chapter 4 discusses running the system, and also how you can contribute to improve it, either by reporting bugs or suggesting changes.

The run-time system is architectured around the concept of maximal sharing, or systematic hash-consing [4], in which two identical objects in the heap lie exactly at the same address. This makes comparison fast, though memory management may suffer a bit occasionally. However, this approach has a lot of advantages in terms of memory usage (reuse of objects) and of speed (via memoization and computation sharing); these points are addressed in [4].

The main feature added to Standard ML in HimML is the collection of set and map operations, types and syntactic constructs. A set in HimML means a finite collection of objects of the same type, where order and multiplicity are irrelevant. A map looks like a set: it is a finite collection of associations, called maplets, from objects of a type t1 to objects of a type t2, forming a many-to-one relation. Because maps subsume sets, sets of objects of type t are just defined in HimML as maps from objects of type t to the special object (), the empty tuple. The need for sets and maps as basic type constructors in programming languages is advocated in [3]. In HimML, sets and maps form the core of the language, and, thanks to maximal sharing, operations on sets and maps are very fast. It is encouraged to write HimML programs as executable specifications in a set-theoretic framework, for it is easy and in general more efficient than we think it could be in advance.

Another feature added in HimML is its typing of numerical values with a notion of measure units. This scheme allows to detect inconsistencies in scientific programs, where the problem lies not in the structure of data (numbers) but in the nature of what they describe (units).

We shall often refer to the Standard ML language, as defined in [6] and [9]. Sometimes, it will be necessary to distinguish particular features of a prominent implementation of Standard ML, called Standard ML of New Jersey or SML-NJ (unpublished documentation, provided with the 1991 release of SML-NJ).

Note that there is a difference between the sign …, that is used to describe a repetition of objects in the syntax, and the sign ..., which is a lexeme, used in tuple and record patterns, and also in extensible types (an extension to Standard ML's type system).

Chapter 2  Differences with Standard ML



2.1  Sets



The type constructor for maps is noted -m>, and is infixed just like the -> function type constructor. When declaring a type t1 -m> t2, the type t1 must admit equality (that is, the equality operation = must be applicable to elements of type t1); this is because maps rely on comparison of elements of their domain.

The type of sets of objects of type t (which admits equality) is t set, which is an abbreviation for t -m> unit (unit is the type of the empty tuple ()).

Any map type t1 -m> t2 admits equality as soon as t2 admits equality (t1 always admits equality).

Special notations are provided to build sets and maps:

{x => y, x' => y', x'' => y''} is an example of a map described by enumeration. This maps x to y, x' to y', and x'' to y''. If some of x, x' or x'' are the same, for instance x=x', then the resulting map may associate x with either y or y' nondeterministically.

As a shortcut, we can write x instead of x => (), so that {x,y,z} is the set containing exactly x, y and z.

{f(x) => g(x) | x in set s} is an example of a map described by comprehension. The x in set s is the domain description. In this example, all maplets f(x) => g(x) are formed for x ranging over the set s (actually s may be a map, but only its domain matters), and are collected to form a map. The x in set s domain description is a particular case of the in map domain description: the domain description x => y in map m makes x range over the domain of m, and simultaneously binds y to the image of x by m, so x in set s is syntactic sugar for x => _ in map s. In x => y in map m, the => y part may be omitted when y is the constant ().

Other domain descriptions are of the form x in list l, which makes x sweep through the list l, and x sub map m, which makes x sweep through the set of all maps that are included in the map m. The order in which sets are swept through (with in set, in map or sub map) is not specified, though sweeping through the same set twice in the same HimML session will be done in the same order. This is important since if there are two values of x that make f(x) equal, then only the last is retained in the result. To reverse this behavior (taking the first of the two values), we use the notation <{, as in <{f(x) => g(x) | x in set s}.

Domain descriptions may be combined with the and connector. For example, the expression
{f(x,y) => g(x) | x in set s1 and y sub map S}
maps f(x,y) to g(x) for all x in the domain of the map s1, and for all submaps y of S. This is used to build cross products, like in {(x,y) => () | x in set s1 and y in set s2}. The order in which s1 and s2 are swept through is nondeterministic, except that the same domain description will always sweep through the same data in the same order during one HimML session, and that if we look at all the values of y produced for one given value of x, they are all produced in the same order as if there were only the descriptor y in set s2 (same thing if we fix y and look at the sequence of values for x).

Domain descriptions may be filtered with a such that phrase. For example, the expression
{(x,y) => x+y | x in set s1 and y in set s2 such that x*y=3}
maps only couples (x,y) such that x*y=3 to their sum.

As for definition by enumeration, writing f(x) instead of f(x) => () is allowed. For instance,
{x | x sub map f}
is the set of all submaps of the map f (the powerset of f when f is a set).

2.1.1  Set and Map Expressions, Comprehensions



In general, we have the following form of set and map comprehension:

{expression [ => expression'] ||||
pattern1 [ => pattern'1] ( in map|in set|in list|sub map ) expression1
( and patterni [ => pattern'i] ( in map|in set|in list|sub map ) expressioni))*
[ such that expression'']}


where signs and words in typewriter style are meant literally, where parentheses serve as grouping marks, brackets enclose an optional construct, a vertical bar means an alternative and a star denotes zero or more occurrences of a construct. The [ => pattern'i] forms are only allowed if followed by in map. The expressions expressions, expression', expressioni and expression'' are ordinary HimML expressions. The patterni and pattern'i are HimML patterns, that are essentially Standard ML patterns, augmented with special set and map patterns (see section 2.1.2).

This expression executes as follows: The order in which the domains are swept through has the following properties: An alternate form of set or map enumerationsets;defined by enumeration; uses a <{ instead of the opening brace:

<{expression1 [ => expression'1]
(expressioni [ => expression'i])*}


The same holds for comprehension:

{expression [ => expression'] ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[such that expression'']}


The semantics are the same, except that overwriting is replaced by “underwriting”: in case of a collision, the first maplet is retained instead of the last.

The comprehension notation has been extended to list comprehensions:

[expression ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[such that expression'']]
collects values of expression in a list, in the order they are encountered. This subsumes the map functional of Standard ML:

fun map f l = [f(x) | x in list l]
        


Comprehensions have also been extended to handle logical structures. In this case, they are called quantifications. We have universal quantification:

all expression ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[ such that expression''] end
which sweeps through the domains as usual, and returns true if expression always evaluates to true, and false otherwise. Evaluation of the universal quantification stops as soon as expression returns false (all pending elements of the domains are ignored), or when all domains have been traversed. The all quantifier can be seen as a generalization of the andalso logical connective.

The existential quantification is dual (by negation) of the universal one:

exists expression ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[ such that expression''] end
This returns true if for some value in the domains expression evaluates to true, and false otherwise. Evaluation of the existential quantification stops as soon as expression returns true (all pending elements of the domains are ignored), or when all domains have been traversed. The exists quantifier can be seen as a generalization of the orelse logical connective.

Close to existential quantification, we have the choice:

some expression ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[ such that expression''] end
This looks for the value e of expression in the first environment where all patterni match elements of Di while making expression'' true. All other elements of the domains are ignored. If e is found, then it returns SOME e. If there is no such expression, NONE is returned. (NONE and SOME are the constructors of the option datatype; see Section 3.)

So, existential quantification is a special case of choice: exists end could be written
case (some /$\ldots$\verb/ end) of NONE => false | SOME _ => true
A last form of comprehension generalizes the sequence connector ;. It is iteration:

iterate expression ||||
pattern1 [ => pattern'1] (in map|in set|in list|sub map) expression1
( and patterni [ => pattern'i] (in map|in set|in list|sub map) expressioni))*
[ such that expression''] end
It sweeps though the domains in the usual order, and evaluates expression in all matching environments. It is interesting essentially only if expression produces or depends on side-effects. It returns (). This subsumes the app functional of Standard ML:

fun app f l = iterate f(x) | x in list l end
                      


Corresponding to each form of comprehension, we finally have a form of imperative comprehension, mostly useful with programs having or depending on side-effects:

{expression [ => expression']
| while expression'''
[ such that expression'']}


This evaluates expression''', and returns the empty set {} if it is false. Otherwise, it evaluates expression'', and if it is true, then expression and expression'. All maplets that we get this way (such that expression'' evaluates to true), are collected in a map, which is returned as soon as expression''' evaluates to false. Collisions are resolved by an overwriting strategy (last maplet wins).

It is syntactic sugar for (expression'' being assumed true when not present, and expression' being assumed () if absent):
let fun collect s=
if expression'''
then if expression''
then collect (s ++ {expression => expression'})
else collect s
else s
in
collect {}
end

assuming s is a new variable (++ is the infix map overwrite function).

Similarly, the underwriting imperative map comprehension notation is:

<{expression [ => expression']
| while expression'''
[ such that expression'']}


which is syntactic sugar for:
let fun collect s=
if expression'''
then if expression''
then collect ({expression => expression'} ++ s)
else collect s
else s
in
collect {}
end

The imperative list comprehension notation is:

[expression | while expression'''
[ such that expression'']]


which is syntactic sugar for:
let fun collect l=
if expression'''
then if expression''
then expression :: collect l
else collect l
else l
in
collect []
end

where l is a new variable, and :: is the infix list constructor (cons is Lisp; note that in HimML as in Standard ML, a::l evaluates a before l).

The imperative universal quantification notation is:

all expression | while expression'''
[ such that expression'']
end


which is syntactic sugar for:
let fun quantify () =
if expression'''
then if expression''
then expression andalso quantify ()
else quantify ()
else true
in
quantify ()
end

The imperative existential quantification notation is:

exists expression | while expression'''
[ such that expression'']
end


which is syntactic sugar for:
let fun quantify () =
if expression'''
then if expression''
then expression orelse quantify ()
else quantify ()
else false
in
quantify ()
end

The imperative choice notation is:

some expression | while expression'''
[ such that expression'']
end


which is syntactic sugar for:
let fun quantify () =
if expression'''
then if expression''
then SOME expression
else quantify ()
else NONE
in
quantify ()
end

The imperative iteration notation is:

iterate expression | while expression'''
[ such that expression'']
end


which is syntactic sugar for:
let fun quantify () =
if expression'''
then if expression''
then (expression; quantify ())
else quantify ()
else ()
in
quantify ()
end

Notice that when expression'' is left out (it does really serve no special purpose, but is there for orthogonality), this is exactly the same as the classical while construct:

while expression''' do expression


2.1.2  Set and Map Patterns



To help in writing programs with sets, some standard patterns decomposing sets are provided in HimML.

First, the empty set notation may serve as a pattern, which matches only the empty set. Thus:
case s of
     {} => f()
   | _ => g(s)
executes f() when s is the empty set, and g(s) if s is not empty.

A pattern of the form {p [ => q]} matches exactly those maps (or sets) of the form {x [ => y]}, where p matches x and q matches y (q and y are assumed to be () if not written). So:
case s of
     {} => f()
   | {x => y} => h(x,y)
   | _ => g(s)
executes f() when s is the empty set, h(x,y) if s is the map {x => y}, and g(s) if s contains at least two elements.

The U infix function computes the union of two sets. To mimic this behavior, the U infix pattern operator splits a map (and not only a set) in two disjoint parts, the first one being matched by the pattern on the left of the U, the second one being matched by the pattern on the right. For instance:

fn {x => y} U rest => g(x,y,rest)


is a function that decomposes its argument by extracting x from its domain, letting y be the value that x is mapped to, and rest being the argument with x removed. It then applies g on the tuple (x,y,rest).

Note that the U may be computationally non trivial. It may be required from U to backtrack to find a match. For instance:
fn {ref x => y} U rest => g(x,y,rest)
is equivalent to:
fn s => case (some (x,y) | ref x => y in map s end) of
            SOME (x,y) =>
              let val rest = {x} <-| s
              in
                 g(x,y,rest)
              end
          | NONE => raise Match
(? is the map application function, <-| is the infix “domain restrict by” function).

Other map patterns are expressible as syntactic sugar for patterns using U:

{p1 [ => q1] (,pi [ => qi])*} [ tt U r]


is equivalent to:

{p1 [ => q1]} ( U {pi [ => qi]})*[ tt U r]


which is not ambiguous, as U associates to the right. Notice that, because of this translation, the pattern {x,y} matches sets of cardinal exactly 2 (x may never be equal to y), for example.

The pattern on the left of a U is actually constrained to be of the form {p1 [ => q1] (,pi [ => qi])*}. More general patterns could have been considered, but would have led to a much more complicated evaluator, and were not deemed worth the trouble.

Another map pattern construction generalizes the ellipsis construct already found in Standard ML record and tuple patterns:

{p1 [ => q1] (,pi [ => qi])*,...}


is equivalent to:

{p1 [ => q1] (,pi [ => qi])*} U _


A case not covered by the above definition, but which is a natural extension, is the pattern {...} which matches all maps, but no other object.

In general, a pattern of the form {x [ => y]}, where x [ => y] does not match all maplets inside the value to be matched, may be slow. Indeed, in this case, the evaluator may have to backtrack to find the correct match. A particularly simple case is the case when x is a constant pattern, like in {"abc" => y} U rest. In this situation, the evaluator is optimized so as not to sweep through the map in argument, but to find directly the element mapped to the constant "abc", match it with the pattern y, and match the rest of the map with the pattern rest.

Note that map patterns never get called repetitively, they only return the first correct match. Hence, for example:

{x | {x,...} in set s}


where s is a set of sets, does not compute the distributed union of elements of s, but a set containing one element taken in each non empty element of s.

Finally, the system order may be defined with map patterns like this:

fun system_less (x,y) = let val {x',...} = {x,y} in x'<>y end


which works for any couple of objects of the same equality type (an equality type is a type that admits equality). Beware that x may be less than y in the system order during one session and greater than y in another session. This may be a problem when transmitting data from a HimML process to another, or when reading back data that was saved from a file. However, it is a convenient way to get a total order on objects of a given equality type, when the real nature of this order does not matter.

2.1.3  Typing Sets and Maps



In the following, the expressions ei are assumed of type τ, and e'i are of type τ':

2.2  Numbers and Physical Units



Apart from integers, there is only one numerical object in HimML: the complex number with real and imaginary parts represented as floating-point numbers (as C doubles, i.e. on 64 bits usually). 1.0, ~2.5, 1.0E5 and 1.0: ~2.0 are all numbers in HimML, their values in standard mathematical notation are 1, −2.5, 105 and 1+2i. From version 1.0α11 on, there are also integers, but we shall not call them numbers to avoid confusing matters, and we shall consider the latter rather as counters or sets of bit-fields. 1,  3 are integers, not floating-point numbers.

To help programmers write scientific code, typically to model computations on physical dimensions, a system of typing with physical units has been created in HimML (see [5]).

We conceive a numerical type as any type that represents a physical quantity, whatever it may be. The simplest numerical type is num, which is the type of numbers without a unit (all numbers above have type num). We shall see that numerical types are monomials over a set of basic dimensions (basic numerical types, like mass, speed, etc.), that can be declared with the dimension keyword. Units or scales are scale factors inside a dimension (kilogram, ton or ounce are units, or scales of the dimension called mass).

A numerical type variable can be instantiated only by a numerical type. It is written with a sharp (#) sign in the name, following all quotes, underscores and digits. For instance, '#a is a numerical type variable, ''#a is a numerical type variable with the equality attribute (this is futile: all numerical types admit equality), '1#a is an imperative numerical type variable of strength 1.

The declaration: dimension mass(kg) declares a new base dimension, called mass, which is incomparable with all other base dimensions. As it is a declaration, it obeys all the usual lexical scoping rules of declarations like val, fun, type, … It also declares a scale by default, called kg (this scale is optional; if it is not provided, as in dimension person, then the default scale has the same name as the dimension). Scale names can be appended to numbers to indicate their type: typing 1`kg; at the toplevel after the above declaration will result in the display:

it : mass

it = 1`kg

Numerical types can be multiplied or raised to a power (actually, the definition of a numerical type is any product of powers of dimensions and numerical type variables). For any numerical types t1 and t2, t1`t2 is the product of both types, and t1^n is the power of t1 to the number n. For instance, we can write:

dimension distance(m) and time(s);

type speed = distance`time^ ~1;

type acceleration = distance`time^ ~2;

Scales too can be multiplied and raised to a power. We may declare new scales by multiplying and raising old ones to a power, with the scale declaration (obeying the lexical scoping rules of declarations):

scale 1`km = 1000`m; (* 1 kilometer is 1000 meters *)

scale 1`h = 3600`s; (* 1 hour is 2600 seconds *)

scale 1`kph = 1`km`h^ ~1; (* kilometer per hours *)

Scale declarations interfere with syntax analysis, so now 130`km`h^ ~1 is understood by the HimML toplevel, which answers:

it : distance`time^ ~1

it = 36.111111111111111`m`s^ ~1

Thus HimML manipulates not only numbers but numerical values, that is couples of numbers with a scale. Though, scaling is done during elaboration (the typing phase), so the evaluator never bothers with different scales and is therefore as fast as if there were no scaling facility.

Numerical typing is also fully polymorphic, thanks to numerical type variables, and the typing algorithm infers most general types, even with numerical types (the notion of most general must however be relativized). For instance, addition is + : '#a * '#a -> '#a, that is, it takes two numerical values representing the same physical dimension, and returns a value of the same type. This effectively prohibits adding distances with energies for instance, while allowing programmers not to declare what dimensions they want to add. More complex types can be used; for instance, division is / : '#a * '#b -> '#a ` '#b^ ~1, which means that it takes two numerical values having independent types, and returns a value having the quotient type.

Transcendental functions like log, or special functions like floor cannot having fully polymorphic numerical types, because it would make no sense. This is why the type of log is num -> num, as the type of floor. Note that type conversions are possible: for example, if x is of type mass, we can convert it to a distance by writing x*1`m`kg ^ ~1.

A special case for which we have to be careful is the power operation **. Indeed, ** is basically a transcendental function, calling exp and log, so is of type num * num -> num. But this type is too restrictive: writing x**2 would imply that x ought to be of type num. To alleviate this, the typing algorithm knows the special case when the second argument to ** is an explicit numerical constant n (with no scale): the operation fn x => x**n is then of type '#a -> '#a^n.

Some other subtleties worth mentioning are the following. First, all dimensionless constants (like 1.0, ~3.5) have type num, except 0.0, whose type is '#a; in other words, 0.0 has all numerical types. This is consistent with the semantics of units. However, 0`kg still has only the mass type, for example. Notice, by the way, that we write dimensionless constants with a dot and a zero figure after the dot (like 0.0 instead of 0). This is to disambiguate between the number 0.0 (of type num) and the integer 0 (of type int); if you put scales after the constant, this is not needed since there is no ambiguity.

Then, it is possible to declare scales with negative or even complex scale factors. For example, the declaration scale 1`a = ~2`b is legal, provided b is an already declared scale. This poses a problem, in that comparison functions like < then seem to be ill-defined: indeed, assume 1`b>0 holds, then as ~0.5`a=1`b, we should have ~0.5`a>0, which is odd. The solution is to say that all comparisons (as well as all other functions on numerical values) are done with respect to the default scale. So, in the example, if b is the default scale, then 1`b>0 and ~0.5`a>0.

For a list of dimensions and scales, look at the file `units.ml'. Note that units can be used not only to represent physical dimensions, but also multiplier prefixes (scale 1`k = 1000.0, so that 1`km=1`k`m), or more abstract dimensions (dimension memory(byte), dimension apples and oranges, etc.).

2.3  Other Differences

Here we list other differences between HimML and Standard ML.

Chapter 3  Core Types


1
In Standard ML, no abstract type admits equality; this restriction is lifted in HimML.

Chapter 4  The Standard Library

4.1  General Purpose Exceptions and Functions

The following exceptions are defined: General functions are:

4.2  Lists



The exception:
exception Nth


is defined. It is raised whenever it is attempted to access an element outside of a list. Recall that lists are defined by the datatype declaration:

datatype 'a list = nil | :: of 'a * 'a list


:: : 'a * 'a list -> 'a list is the list constructor, and is declared infix, right associative of precedence 5.

4.3  Sets and Maps



Exceptions related to sets and maps are: Functions on sets and maps are: There is now an alternative data type for maps in HimML, the table type. This implements imperative maps: unlike the map type, objects of type table are updated in a destructive way, just like hash-tables. Objects of type table, which we shall just call tables, are not really faster than using applicative references, but they put less stress on the sharing mechanism and the garbage collector of HimML, which may result in space and even time savings.

The type of tables mapping objects of type ''a to 'b is
type (''a, 'b) table
You can think as a kind of equivalent of the type (''a -m> 'b) ref. Associated functions are:

4.4  Refs and Arrays



Refs and arrays are the fundamental mutable data structures of HimML, as in Standard ML of New Jersey (only refs exist in the definition of Standard ML). Ref and array types always admit equality, even if their argument types do not. Equality is decided according to the following rules: In short, refs and arrays are not shared, and are compared by their addresses1.

Some syntax extensions are defined to allow you to write a more readable code to handle arrays. In particular: There is however no similar syntactic sugar for non-polymorphic integer arrays (of type intarray), only for polymorphic arrays (of type 'a array, for any 'a).

The following exception is defined:
exception Subscript


It is raised when accessing an element outside of an array, by sub or update, or by isub or iupdate.

The following functions are provided:

4.5  Strings



Strings are ordered sequences of characters (we call size of the string the length of the sequence). ML does not provide a type of characters, though one-character strings may serve this purpose. Strings are assumed coded according to the 7-bit ASCII standard. However, characters are usually 8 bits wide: the characters of code greater than 127 are interpreted in a system-dependent fashion. The string length is encoded separately from the sequence of characters that composes it: no special character is reserved to represent the end of a string, in particular, any string may contain the NUL character (\000).

The following exceptions are defined: The functions are:

4.6  Integer Arithmetic

HimML integers are machine integers, that is, integers in the range [min_int,max_int], and are represented with nbits bits.

Exceptions on integer operations can be: Non-negative integers are 0, 1, 2, ..., 42, ...Negative integers are written with the ~ negation operator in front.

The following operations are defined. Basic operations like +, -, etc. are not overloaded as in Standard ML. For example, + is always an integer function. For analogous numerical (floating-point) functions, see Section 4.7. For example, floating-point addition is #+, not +.

4.7  Numerical Operations

The real and imaginary parts of numbers are floating-point numbers, which are classified according to the IEEE754 standard. A floating-point number may be: The fact that an infinity or a Nan is produced (in IEEE754 systems), or an Arith exception is raised (in non-IEEE754 systems) by a computation means usually that the computation is wrong. In this case, the exception system is better suited to find numerical bugs, provided we have a means of locating the place where the exception is raised.

However, in production code, these error conditions may still crop up for limit situations. It is then unacceptable for code to stop functioning because of this: so infinities and NaNs should be used. In general, users had better use a system obeying the IEEE754 standard (or its successors). The IEEE754 standard defines trapping mechanisms to be used for debugging numerical codes; they are not used in HimML now, but may be in a future version of the debugger.

Classification of numbers is accomplished with the auxiliary types:
datatype numClass = MNaN of int | MInf | MNormal | MDenormal
   | Zero | PDenormal | PNormal | PInf | PNaN of int


numClass encodes both the class and the sign of a real number. In the case of a NaN, the NaN code is also provided as an integer. NaN codes are system-dependent; only NaN codes from 1 to 7 are recognized in HimML; unrecognized codes are represented by 0. NaN codes and the constructors MNaN, MInf, MDenormal, PDenormal, PInf, PNan are never used in a non-IEE754 system. The exception:
exception Arith


is raised for all arithmetical errors in non-IEEE754 systems, and is never used in IEEE754 systems.

On a different subject, remember that testing complex numbers for equality is hazardous. Therefore, = is not to be used lightly on numbers (for example, 0.2 * 5 may print as 1 but differ from 1 internally). Normally, though, computations on sufficiently small integers should be exact (at least on real IEEE754 systems). This warning applies not only to the explicit = equality function, but also to all operations that use internally the equality test, in particular the test for being inside a set inset or the application of a map to an object ?, when this object contains numbers.

Moreover, following the principle, valid in HimML, that an object is always equal to itself, +inf = +inf, ~inf = ~inf, and all NaNs are equal to themselves, though they are incomparable with any number (for example, +NAN(0) <= +NAN(0) is false, but +NAN(0) = +NAN(0) is true).

Numerical constants are entered and printed in HimML as x or x:y, where x and y represent real numbers, possibly followed by a scale. The notation x:y denotes the complex number x+i.y. There can be no confusion with the : character used in type constraints, because no type may begin with a character lying at the start of a number. A real number is defined by the regular expression (in lex syntax):
{INT}E{INT}
{INT}\.{DIG}+
{INT}\.{DIG}+E{INT}
[+~]inf
[+~]NAN\([0-7]\)

where DIG = [0-9] and INT = (\~?{DIG}+), with the obvious interpretations. Infinities and NaNs cannot be entered in a non-IEEE754 system.

The exception:

exception Complex


is raised whenever an operation defined only on real values is applied on a complex with a non-zero imaginary part (even if it is denormalized).

The numerical functions are:

4.8  Large Integer Arithmetic

HimML includes a port of Arjen Lenstra's large integer package, which provides arbitrary precision arithmetic, i.e., arithmetic over a type Int that represents actual integers, without any size limitation as with the int type.

The exception:
exception Lip of int
is raised whenever an error occurs in any of the functions of this section. The numbers as arguments to the Lip exception are as follows: The functions are classified into several subgroups.

Conversions

Large Integer Arithmetic

Bit Manipulation



Large integers also serve as bit vectors, of varying size. Every non-negative integer represents a finite set of bits, those that are set in the binary representation of the integer; while every negative integer represents a cofinite set of bits (i.e., a set of bits whose complementary is finite), those that are set in the two's complement binary representation of the integer. For example, 23 is … 00010111 in binary, and represents the set {0, 1, 2, 4}, since 23 = 24 + 22 + 21 + 20. And −23 is … 11101001, and represents the set {0, 3, 5, 6, 7, …}.

Euclidean Algorithms

Standard Modular Arithmetic

Montgomery Modular Arithmetic



Modular multiplications can be done division free and therefore somewhat faster (about 20%), if the Montgomery representation is used [10]. Converting to and from Montgomery representation takes one Montgomery multiplication each, so it only pays to use Montgomery representation if many multiplications have to be carried out modulo a fixed odd modulus.

To use Montgomery arithmetic, first initialize the modulus N by using zmontstart, and convert all operands to their Montgomery representation by ztom, but do not convert exponents. Use the addition, subtraction, multiplication, squaring, division, inversion, and exponentiation functions, which all start with zmont, below on the converted operands, just as you would use the ordinary modular functions (starting with zm). The results can be converted back from Montgomery representation to ordinary numbers modulo N using zmonttoz.

Primes, Factoring

4.9  Input/Output



Two types are defined, first that of output streams outstream:

type 'rest outstream = |[put:string -> unit,... : 'rest]|

means that output streams are extensible records (classes) with at least a put method, to output a string to the stream. For example, to make a stream that prints to a string stored inside a ref sr:string ref:

|[put = fn text => (sr := !sr ^ text)]|

The type of input streams instream is:

type 'rest instream : |[get:int -> string,... : 'rest]|

providing at least a get method, that reads up to n characters from the stream, n being the number in argument. If less than n characters have been read, it usually means that an end of stream has been reached. (After the end of stream is reached, repeated calls to get will return the empty string.) For example, an input stream reading from a string s:string, with current position stored in a ref posr:num ref is:
|[get = fn n => let val goal = !posr+n
                    val newpos = (if goal<size s
                                     then goal
                                  else size s)
                in
                   substr(s,!posr,newpos)
                   before (posr := newpos)
                end]|
Note that if you want to read from a string, the instring primitive does this faster.

The following exception is defined:

exception IO of int

is raised when the system was unable to open a file or any other I/O-related operation; it returns as argument the error code returned by the operating system. (Note that this is OS-dependent.)

The other methods that may be present in predefined streams are: The input/output values and functions are:

4.10  Miscellaneous

Interesting values are:
1
Actually, all objects in HimML are compared by their addresses, because any two equal objects are located at the same address.

Chapter 5  Running the System

5.1  Starting up



HimML runs on Unix systems, and Amigas. Previous versions also worked on Apple Macintoshes, but this one lacks some functions. On a Mac, the only way to launch a HimML session is to double-click on the HimML icon; a text window opens, asking to enter Unix command-style arguments: enter the arguments to the HimML command, except the command's name itself. From then on, all work happens in this console, at toplevel, as on Amiga and Unix systems.

On Amigas and Unix boxes, type himml followed by a list of arguments. The legal arguments are obtained by typing himml ?, to which HimML should answer:
Usage: himml [-replay replay-file] [-mem memory-size]
  [-cmd ML-command-string] [-init ML-init-string] [-path path]
  [-col number-of-columns]
  [-grow memory-grow-factor] [-maxgrow max-memory-grow-factor]
  [-nthreads max-cached-threads] [-threadsize thread-cache-size]
  [-maxcells max-cells] [?] [-gctrace file-name]
  [-pair-hash-size #entries] [-int-hash-size #entries]
  [-real-hash-size #entries]
  [-string-hash-size #entries] [-array-hash-size #entries]
  [-pwd-prompt format-string] [-core-trace] [-data-hash-size #entries]
  [-c source-file-name] [-inline-limit max-inlined-size] [-nodebug]
  [-- arguments...]
and exit. Launching HimML without any arguments is fine. There are other HimML tools, used to compile, link and execute bytecode compiled files; they are listed at the end of this section.

To load a file, the use keyword may be used; it begins a declaration, just like val or type, that asks HimML to load a file and interpret it as if it were input at the keyboard (except it does not use stdin. The path that use uses can be extended on the command line by the -path switch, or inside HimML by changing the contents of usepath : string list ref, which is a reference to the list of volumes or directories in which to search for files, from left to right.

The explanation of the various options are:

5.2  Compiling, Linking, Finding Dependencies



As said earlier, the HimML distribution includes other tools to compile, link and run bytecode compiled files:

5.3  Debugger



HimML contains a debugger, as shown by consulting the set #debugging features, which should be non-empty. It can be called by the break function: Another way of entering the debugger is when an exception is raised but not caught by any handler.

There are two ways of entering the debugger. These are shown on entry by a message, stop on break (we entered the debugger through break, or by typing control-C or DEL when evaluating an expression), or stop at...(we entered the debugger at a breakpoint located just before the execution of an expression).

In any case, the debugger enters a command loop, under which you can examine the values of expressions, see the call stack, step through code, set breakpoints, resume or abort execution. The debugger presents a prompt, normally (debug). It then waits for a line to be typed, followed by a carriage return, and executes the corresponding command. These commands are: The way that the interpreter gives control to the debugger is by means of code points, which are points in the code where the compiler adds extra instructions. These instructions usually do nothing. When you set a breakpoint, they are patched to become the equivalent of break. Alternatively, these instructions also enter the debugger when we are single-stepping through some code.

These instructions are added by default by the compiler, but they tend to slow the interpreter. If you wish to dispense with debugging information, you may issue the directive:
(*$D-*)
which turns off generation of debugging information (of code points). If you wish to reinclude debugging information, type:
(*$D+*)


These directives are seen as declarations by the compiler, just like val or type declarations. As such, they obey the same scope rules. It is recommended to use them in a properly scoped fashion, either inside a let or local expression, or confined in a module.

5.4  Profiler



The way that the interpreter records profiling information is by means of special instructions that do the tallying.

These instructions are not added by default by the compiler, since they tend to slow the interpreter by roughly a factor of 2, and you may not wish to gather profiling information of every piece of code you write. To use the profiler, you first have to issue the directive:
(*$P+*)
which turns generation of profiling instructions on. The functions that will be profiled are exactly those that were declared with the fun or the memofun keyword.

If you wish to turn it off again, type:
(*$P-*)


These directives are seen as declarations by the compiler, just like val or type declarations. As such, they obey the same scope rules. It is recommended to use them in a properly scoped fashion, either inside a let or local expression, or confined in a module. Usually, you will want to profile a collection of modules. It is then advised to add (*P+*) at the beginning of each. Time spent in non-profiled functions will be taken into account as though it had been spent in their profiled callers.

Then, the HimML system provides the following functions to help manage profiling data: What can you do with profile information? The main goal is to detect what takes up too much time in your code, so as to focus your efforts of optimization on what really needs it. A good strategy to do this is the following:

5.5  Conditional Compilation

HimML offers a feature known as conditional compilation. A language like C, through the use of its preprocessor, provides directives named #if...[#else]...#endif, which may be used to compile one chunk of code or another, depending on the condition after the #if keyword being true or false.

Because HimML cannot have exactly the same features on each platform, this is a desirable feature to have to ensure portability of HimML applications across different OSes. This is already the main use of these directives in C: we can use the fork() call on Unix systems, but it is usually impossible even to emulate on other systems, like the Amiga or the Macintosh.

This portability concern also extends to different versions of HimML, even on the same platform. Some versions of HimML use a type system slightly different from other versions, or some may include a module system, or some may be interfaced with a graphical user interface library, ...And we want the HimML programmer to be able to write code that will correctly use the available facilities on each version of HimML. Detecting these differences, whether related to the processor type, operating system, or HimML version, can be done by examining the value of the special variable features.

Finally, having conditional compilation directives allows one to parameterize one's applications with respect to a file of global declarations. For example, if you want one version of your code with and one without debugging code interspersed, you might define a variable to be true in one case, false in the other, and then test it at all points where conditional behaviour should occur.

There is no such feature in Standard ML, probably because having conditional compilation would pose too many problems in general. Using if...then...else won't work in general. Consider:
if "callcc" inset #continuations features
then callcc (fn k => ...)
else ...
which is intended to test whether we have callcc in the implementation, and if so, to use it. This presents two problems. First, the type checker will be run on the whole expression, not just the part that will indeed be executed: if callcc is not provided in the implementation, then the the type-checker will just fail on the second line of the example. Then, even if we could overcome this problem, we would need an optimizer to recognize that the test expression above is actually a constant, and that only one branch of the test has to be compiled; and there are versions of HimML with no such optimizer (actually, none yet has one).

Instead, you should use an alternative conditional, built with #if...[#elsif...][#else]...#endif, namely the same preprocessor directives as in C. The #else and #elsif parts are optional, but don't forget the #endif: HimML needs to see it to know that the #if clause is over. Each keyword must lie at the beginning of the line. If this keyword is #if or #elsif, then the rest of the line is taken to be a test expression, in HimML syntax. Otherwise, the rest of the line is ignored. (So, don't write HimML code on the same line as an #else or an #endif!)

This conditional works mostly as in C, with the following important difference: the expression tested by #if or #elsif, which is the one after the keyword and extending to the end of the line, can be any HimML expression, which is evaluated to determine its truth value. (In C, we can only test for what the preprocessor knows, namely #defines and a few arithmetic comparisons.) But note well that these expressions are always evaluated in the toplevel environment, not the current environment.

For example:
fun f x =
#if x=3
    frozzle ()
#else
    foo ()
#endif
;
does not evaluate x as the value of the argument to f, which cannot be guessed at compile-time. Instead, the value is taken in the toplevel environment. Most likely, x won't be defined in the toplevel environment. This won't cause an error, though: If the test expression is ill-typed or gives rise to an uncaught exception, then it is assumed that its value if false. So, in this case, it is likely that f x will be compiled as just calling foo (), although the possibility that it will be compiled as frozzle () is not zero. It is doubtful that this is the intended code.

The same problem happens in the following less clear situation:
let val x=3
in
#if x=3
    frozzle ()
#else
    foo ()
#endif
end;
because the local binding introduced by let is not a toplevel binding.

The final case where an unexpected behaviour can occur is when writing toplevel declarations not ended by a semicolon (;). For example:
val x=3 (* and not 'val x=3;' *)
#if x=3
    frozzle ()
#else
    foo ()
#endif
;
will also have the same problem. The reason is that the HimML parser only processes declarations when it sees a semicolon or an end of file. Then, it parses, type-checks, compiles and evaluates all declarations before this semicolon or this end of file, as a whole. This is for efficiency reasons. Then, when it tries to evaluates the #if test above, the previous declaration val x=3 has not yet been processed, since it did not end with a semicolon.

The keywords #if, #else, #elsif, #endif must lie at the beginning of the line to be taken into account as conditional compilation directives. This is because #elsif and #endif are otherwise perfectly valid HimML expressions, namely the functions that select the field named elsif, resp. endif, from a record argument. If you wish to use the #elsif field selection function, you can do it by invoking (#elsif), its parenthesized version. It is strongly recommended not to use these keywords as fields, in fact.

For indenting reasons, any number of spaces or tabulations are allowed before the sharp (#) sign, and between the sharp sign and the if, else, elsif, or endif part.

You may test the conditional compilation directives in an interactive toplevel session, but it is generally advised not to use the conditional compilation directives at the toplevel prompt. If you have no prompt after typing return, most probably there is an unclosed #if expression (type #endif on a line by itself), or you haven't typed a semi-colon at the end of the line, so that the parser does not know that you have completed your input.

Finally, the test expression e may have side-effects, or may loop, or may raise an exception, in which case the whole when clause has side-effects, loops or raise the exception, but it is advised to avoid these behaviours.

Examples of uses of #if are as follows:

5.6  Separate Compilation and Modules

5.6.1  Overview

The main goal of the HimML module system is to implement separate compilation, where you can build your program as a collection of modules that you can compile independently from each other, and then link them together.

The HimML module system was designed so that it integrated well with the rest of the core language, while remaining simple and intuitive. At the time being, the HimML module system does not provide the other feature that modules are useful for, namely management of name spaces. The module system of Standard ML seems best for this purpose, although it is much more complex than the HimML module system.

Consider the following example. Assume that your program consists naturally of three files, a.ml, b.ml and c.ml. The most natural way of compiling it would be to type:
use "a.ml";
use "b.ml";
use "c.ml";
But, b.ml will probably use some types and values that were defined in a.ml, and similarly c.ml will probably use some types and values defined in a.ml or b.ml. In particular, if you want to modify a definition in a.ml, you will have to reload b.ml and c.ml to be sure that everything has been updated.

This is not dramatic when you have a few files, and provided they are not too long. But if they are long or many, this will take a lot of time. Separate compilation is the cure: with it, you can compile a.ml, b.ml, and c.ml separately, without having to reload other files first.

The paradigm that has been implemented in HimML is close to that used in CaML, and even closer in spirit to the C language. In particular, modules are just source files, as in C. Two new keywords are added to HimML: extern and open. Note that the Standard ML module system also has an open keyword, but there is no ambiguity as it is followed by a structure identifier like Foo in Standard ML, and by a module name like "foo" in HimML.

The extern keyword specifies some type or some value that we need to compile the current file, telling the type-checker and compiler that it is defined in some other file. Otherwise, if you say, for example, val y=x+1 in b.ml, but that x is defined in a.ml, the type-checker would complain that x is undefined when compiling b.ml. To alleviate this, just precede the declaration for y by:
extern val x : int
This tells the compiler that x has to be defined in some other file, and that it will know its values only when linking all files together. This is called importing the value of x from another module.

Not only values, but datatypes can be imported:
extern datatype foo
imports a datatype foo. The compiler will then know that some other module defines a datatype (or an abstype) of this name. However, it won't know whether this datatype admits equality, i.e. whether you can compare objects of this datatype by =. If you wish to import foo as an equality-admitting datatype, then you should write:
extern eqtype foo
Of course, if foo is a parameterized datatype, you have to declare it with its arity, for example:
extern datatype 'a foo
for a unary (not necessarily equality-preserving) datatype, or
extern eqtype ('a, 'b) foo
for an equality-preserving datatype with two type parameters.

Finally, dimensions can be imported as well:
extern dimension foo
imports foo as a dimension (type of a physical quantity, typically).

Given this, what does the following mean? We write a file "foo.ml", containing:
extern val x:int;
val y = x+1;
Then this defines a module that expects to import a value named x, of type int (alternatively, to take x as input), and will then define a new value y as x+1 and export it.

Try the following at the toplevel (be sure to place file "foo.ml" above somewhere on the load path, as referenced by the variable usepath):
val x = 4;
open "foo";
You should then see something like:
x : int
y : int
x = 4
y = 5
Opening "foo" by the open declaration above proceeded along the following steps: A variant on open is open*, which does just the same, except it does not try to recompile the source file "foo.ml": it just assumes that "foo.mlx" is up to date, or fails. This is useful when shipping compiled bytecode modules, and is used internally in the himmlpack and himmllnk tools.

Assume now that we didn't have any value x handy; then open would still have precompiled and opened the resulting object module "foo.mlx". Only, it would have failed to link it to the rest of the system. If you wish to just compile "foo.ml" without loading it and linking it, issue the directive:
#compile "foo.ml"
at the toplevel. (The # sign must be at the start of the line.) This compiles, or re-compiles, "foo.ml" and writes the result to "foo.mlx".

5.6.2  Header Files

Another problem pertaining to separate compilation is how to share information between separate modules. For example, you might want to define again three modules a.ml, b.ml and c.ml, where a.ml would define some value f (say, a function from string to int), and b.ml and c.ml would use it.

A first way to do this would be to write: but this approach suffers from several defects. First, no check is done that the type of f is the same in all three files; in fact, the check will eventually be performed at link time, that is, when doing:
open "a";
open "b";
open "c";
but we had rather be warned when first precompiling the modules.

Then, whenever the type of f changes in a.ml, we would have to change the extern declarations in all other files, which can be tedious and error-prone.

The idea is then to do as in the C language, namely to use one header file common to all three modules. (This approach still has one defect, and we shall see later one how we should really do.) That is, we would define an auxiliary file "a_h.ml" (although the name is not meaningful, the convention in HimML is to add _h to a module name to get the name of a corresponding header file), which would contain only extern declarations. This file, which contains in our case:
extern val f : string -> int;
is then called a header file.

We then write the files above as: This way, there is only one place where we have to change the type of f in case we wish to do it: the header file a_h.ml.

What is the meaning of using a_h.ml in a.ml, then? Well, this is the way that type checks are effected across modules. The meaning of extern then changes: in a.ml, f is defined after having been declared extern in a_h.ml, so that f is understood by HimML not as being imported, rather as being exported to other modules. This allows HimML to type-check the definition of f against its extern declaration, and at the same time to resolve the imported symbol f as the definition in a.ml. This is more or less the way it is done in C.

On thing that still does not work with this scheme, however, is how we can share datatypes. This is because datatype declarations are generative. Try the following. In a_h.ml, declare a new datatype:
datatype foo = FOO of int;
extern val x:foo;
In a.ml, define the datatype and the value x:
use "a_h.ml";

val x = FOO 3;
Now in b.ml, write:
use "a_h.ml";

val y = x : foo;
Then, open "a", then "b". This does not work: why? The reason is that the definition of the datatype foo in a_h.ml is read twice, once when compiling a.ml, then when compiling b.ml, and that both definitions created fresh datatypes (which just happen to have the same name foo). These datatypes are distinct, hence in val y = x : foo, x has the old foo type, whereas the cast to foo is to the new foo type.

The remedy is to avoid useing header files, and to rather open them. So write the following in a.ml:
open "a_h";

val x = FOO 3;
and in b.ml:
open "a_h";

val y = x : foo;
Opening a_h produces a compiled module a_h.mlx, which holds the definition for foo and the declaration for x. In the compiled module, the datatype declaration for foo is precompiled, so that opening a_h does not re-generate a new datatype foo each time a_h is opened, rather it re-imports the same.

Technically, imagine that fresh datatypes are produced by pairing their name foo with a counter, so that each time we type datatype foo = FOO of int at the toplevel, we generate a type (foo, 1), then (foo, 2), and so on. This process is slightly changed when compiling modules, and the datatype name is paired with the name of the module instead, say, (foo, a_h). Opening a_h twice then reimports the same datatype.

The same works for exceptions, except there is no extern exception declaration. The reason is just that it would do exactly the same as what exception already does in a module. If you declare:
exception Bar of string;
in a_h.ml, and import a_h as above, by writing open "a_h" in a.ml and b.ml, then both a.ml and b.ml will be able to share the exception Bar. Typing the following in a_h.ml would not work satisfactorily, since Bar would not be recognized as a constructor in patterns:
extern val Bar : string -> exn;
That is, it would then become impossible to write expressions such as:
f(x) handle Bar message => #put stdout message
in a.ml. However, if you don't plan to use pattern matching on Bar, then the latter declaration is perfectly all right.

5.6.3  Summary

The following commands are available in HimML: It is easier to compile modules by typing the following under the shell:
himml -c foo.ml
which does exactly the same as launching HimML, and typing #compile "foo"; quit 0; under the HimML toplevel.

You can then use himml as a HimML standalone compiler, and compile each of your modules with himml -c. This is especially useful when using the make utility. A typical makefile would then look like:
.mlx : %.ml
        himml -c $<

a_h.mlx: a_h.ml
a.mlx: a.ml a_h.mlx
b.mlx: b.ml a_h.mlx
pack.mlx: pack.ml a.mlx b.mlx
The first lines define a rule how to make compiled HimML modules from source files ending in .ml. It has a syntax specific to GNU make. If your make utility does not support it, replace it by:
.SUFFIXES: .mlx .ml
.mlx.ml:
        himml -c $<
The last lines of the above makefile represent dependencies: that a.mlx depends on a.ml and a_h.mlx means that make should rebuild a.mlx (from a.ml, then) whenever it is older than a.ml or a_h.mlx. Such dependencies can be found automatically by the himmldep utility. For example, the dependency line for a.mlx was obtained by typing:
himmldep a.ml
at the shell prompt.

There is no specific way to link compiled modules together, since open already does a link phase. To link a.mlx and b.mlx, write a new module, say pack.ml, containing:
open "a";
open "b";
then compile pack.ml. The resulting pack.mlx file can also be executed, provided it has no pending imported identifiers, either by launching HimML, opening pack, and running main (); (provided pack.ml exports one such function), but it is even easier to type the following from the shell:
himmlrun pack
Under Unix, every module starts with the line:
#!/usr/local/bin/himmlrun
assuming that /usr/local/bin is the directory where himmlrun was installed, so that you can even make pack.mlx have an executable status:
chmod a+x pack.mlx
and then run it as though it were a proper executable file:
pack.mlx
This will launch himmlrun on module pack.mlx, find a function main and run it.

5.7  Editor Support



Any ASCII text editor can be used to write HimML sources. But an editor can also be used as an environment for HimML. In GNU Emacs, there is a special mode for Standard ML, called `sml-mode.el' and that comes with the Standard ML of New Jersey distribution, that can be adapted to deal with HimML: this is the `ml-mode.el' file. However, it was felt that it did not indent properly in all cases, because of the complicated nature of the ML syntax. A replacement version is in the works, called `himml-mode.el'; it is not yet operational.

5.8  Bugs

Remember: a feature is nothing but a documented bug! You may therefore consider the following as features :-).

5.9  Common Problems

5.9.1  Problems When Installing HimML

P:
When I type make, nothing happens except that I get a message telling me to type a sequence of commands.

This is normal. The installation procedure needs to make configuration files, for interpreting your favorite options (in file OPTIONS) or for determining system or compiler behaviours. So, just do as indicated.

P:
I don't understand the meaning of an option in file OPTIONS.

Then leave it alone. Most options have reasonable default values.

P:
When I run HimML, it just stops on abort: attempt to longjmp() to lower stack or a similar message.

See next question.

P:
After typing make, I get messages such as:
mksyscc: 20847 Abort - core dumped
longjmp() is brain-damaged (won't allow you to jump to a lower stack)
trying to find a standard patch...
Some operating systems (mostly BSD systems, although the only example I know is AIX) implement a “smart” longjmp() routine that first checks whether the current stack pointer is lower than the one it is trying to restore, and aborts if this is not the case. HimML needs to be able to do just that, in order to implement continuations (and continuations are heavily used internally, even if you don't plan to use them). The best solution I've come up with on AIX is to write a small patching utility (dpxljhak) that hunts for a specific piece of code in the prologue of the longjmp() function and puts no-ops instead. A better solution would be to rewrite the function in assembler, but I've been unable to do this.

If this happens to you, try to rewrite longjmp() so that it does not check for stack levels and link your new definition. Or write a patch, just like me; you'll need to experiment a bit.

Please also contribute your modification so that I can include it in the next HimML release. (See MAINTENANCE at the end of the OPTIONS file to know whom to write to.)

P:
My machine is a Cray/VMS machine/PC-Dos machine, and I cannot manage to make the darn thing compile or execute.

Cray machines have a weird stack format, and my scheme for capturing continuations has no hope of working on these machines. If it's absolutely necessary for you, I'll see what I can do, provided you promise to tell me whether it works or not. (See MAINTENANCE at the end of the OPTIONS file to know my address.)

I don't have any VMS machine handy, so I cannot test HimML on it. The HimML implementation is pretty much centered around Unix, so I would be surprised if it worked without changes. Please tell me what you have been forced to do to make it work.

PC-Dos machines won't do. 640K is not enough for HimML, and HimML has no knowledge of extended or expanded memory. HimML must run in one segment only, lest its sharing mechanism be defeated by one physical address having two distinct representations (from two different segments). This may work on 486's or higher, which can use large segments, but the operating system (Dos or Windows, any version until now) is the stumbling block. Your best bet is to change for Linux or any other Unix for PCs. Windows/NT or OS/2 is expected not to pose any problem.

P:
When I run HimML, it just core dumps.

Check the OPTIONS file: there is no safeguard against illegal values there (in particular stack values). Put back the default values; if this does not work, try to increase the stack parameters (notably SAFETY_SIZE and SECURITY). See also previous questions; it is quite likely that this is due to stack problems. If nothing works, mail me (goubault@lsv.ens-cachan.fr, see MAINTENANCE at the end of the OPTIONS file).

5.9.2  Problems When Running HimML

P:
I have typed a command line at the toplevel prompt, then typed return, but nothing happens.

Most probably, you have not terminated your command line with a semicolon (;). Although the syntax of Standard ML makes semicolons optional between declarations, the toplevel parser has no way of knowing that input is complete unless it finds a terminating semicolon (or an end of file). Consider also all the ways to complete input such as, say, 1: if you write a semicolon afterwards, then this is an abbreviation of val it=1;, but if you write +2;, even on the following line, then you really meant val it=1+2;, and if you type return just after 1, the parser has no way to know which possibility you intended.

It may happen that typing a semicolon does not cure the problem. This may happen is you have not closed all parentheses and brackets. Consider (frozzle (): if you type a semicolon afterwards, then your input is still incomplete, as you may want to write, say, (frozzle (); foo). The semicolon is not only a declaration separator, but also the sequence instruction.

Finally, it may be the case that you are in the middle of a conditionally compiled phrase. See Section 5.5 for details.
P:
When opening modules that open header modules, I keep getting type errors, and the explanation is that some datatypes are not the same in each type?

First, check that you are not defining or declaring datatypes (or dimensions) in header files that you use instead of opening. Each time you use a given file, it creates new versions of the datatypes or dimensions inside it. To avoid it, open the file instead; this creates unique stamps for the datatype (or dimension), which it records in a file of the same name, with .mlx at the end. This will work only if your header file can be compiled separately, so be prepared to modularize your code.

If the above does not apply, it may happen that your .ml files have inconsistent modification dates. The module system always tries to recompile a .ml file when the .ml file appears to be newer than the corresponding .mlx file. Therefore, if the last modification date of the .ml file is some future date, it will always recompile it, as many times as it is opened; and this leads to the same problem as above. A quick fix is to set the modification date manually (with touch on Unix, or setdate on Amigas; there's probably a public-domain utility to fix this on Macintoshes, but I don't know). In any case, there's probably something wrong with the way the date is set up on your system, and it's worth having a look at it.

5.10  Reporting Bugs, Making Suggestions



This is an alpha revision of HimML. This means that I do not consider it as a distributable version. This means that I deem the product robust enough to be given only to my friends, counting on their comprehensive support, mostly as far as bugs are concerned. This also means that I want some feedback on the usability of the language, and on reasonable ways to improve the implementation.

To help me improve the implementation (and possibly the language, though I am not eager to), you can submit a note to the person in charge of maintaining the system (type #maintenance features at the toplevel to know who, where and when). The preferred communication means is electronic mail, but others (snail-mail notably) are welcome. If you think you have found a bug in HimML, or if you want something changed in HimML, you should send the person in charge a message that should contain:

Appendix A  Precedence Table

This is the default set of precedences when you launch HimML:
infix   0 before
infix   3 o :=
infix   4 = <> < > <= >= #< #> #<= #>= inset inmap submap subset strless
infixr  5 @ O
infixr  5 ::
infix   6 + - ^ #+ #- ++ U <| <-| |> |->
infix   7 * div mod divmod #* #/ fdiv fmod fdivmod & \ delta intersects
infixr  8 #^
infix   9 nth to sub
Note that @ is declared infixr, that is, right-associative, although the Definition of Standard ML dictates that it is infix, i.e. left-associative. This should not make much differences.

References

[1]
Martín Abadi, Luca Cardelli, Benjamin Pierce, and Gordon Plotkin. Dynamic typing in a statically-typed language. In 16th ACM Symposium on Principles of Programming Languages, pages 213–227. ACM, January 1989.

[2]
Bruce F. Duba, Robert Harper, and David Mac Queen. Typing first-class continuations in ML. In 17th ACM Symposium on Principles of Programming Languages. ACM, January 1990.

[3]
Jean Goubault. Ensembles dans les langages et méthodes de spécification formelle : le cas de VDM et Z. Technical report, Bull S.A. Corporate Research Center, June 1992.

[4]
Jean Goubault. Implementing functional languages with fast equality, sets and maps: an exercise in hash consing. Technical report, Bull S.A. Corporate Research Center, June 1992.

[5]
Jean Goubault. Inférence d'unités physiques en ML. In Journées Francophones des Langages Applicatifs, pages 3–20. INRIA, janvier 1994.

[6]
Robert Harper, Robin Milner, and Mads Tofte. The Definition of Standard ML. MIT Press, 1990.

[7]
Xavier Leroy and Michel Mauny. Dynamics in ML. Rapport de Recherche 1491, INRIA, Domaine de Voluceau, Rocquencourt, BP 105, France, juillet 1991.

[8]
T.G. Lewis and W.H. Payne. Generalized feedback shift register pseudorandom number algorithm. Journal of the ACM, 20(3):456–468, July 1973.

[9]
Robin Milner and Mads Tofte. Commentary on Standard ML. MIT Press, 1991.

[10]
Peter L. Montgomery. Modular multiplication without trial division. Mathematical Computing, 44:519–521, 1985.

[11]
William Pugh and Tim Teitelbaum. Incremental computation via function caching. In 16th ACM Symposium on Principles of Programming Languages, pages 315–328. ACM, January 1989.

[12]
Jonathan A. Rees and William Clinger. The revised3 report of the algorithmic language scheme. ACM SIGPLAN Notices, 21(12):37–79, December 1986.

[13]
Didier Rémy. Typechecking records and variants in a natural extension of ML. In Proceedings of the 1989 Conference on Principles of Programming Languages, pages 77–88. ACM, 1989.

[14]
Jean-Pierre Talpin and Pierre Jouvelot. The type and effect discipline. In Proceedings of the 1992 Conference on Logics in Computer Science. IEEE, 1992.

Keyword Index

& ` ^ ^ ~ <= <> <-| = > >= | |> || |-> |[]| |] - -> -- ; := :: ! ? .:= .( () [| {} @ * * \ #^ #~ #<= #> #>= #- #/ #* #+ + ++ ++|[ abort abs ach acos advance all andalso app app append appendfile arg args args Arith Arith array array -array-hash-size arrayoflist Ascii ash asin atan ath b band band before Bind bool bor break break bxor c -c callcc card catch Catch cd ceil ch choose choose_rng chr classify clear_profiles close cmd -col #compile #compile Complex concat conj cont cont continuations continue convert cos creator d (*$D-*) (*$D+*) debugging delay delete delete delete delta denormal dimension dir div divmod dom down dynamic e elems #else #elsif empty Empty #endif erase exception exists exn exp exp1 explode extern fabs false fdiv fdivmod features filetype floatformat floor flush fmax fmin fmod fold force force frexp fsqr fsqrt gc -gctrace get getenv getline -grow h handle hd help himml -c himmldep himmldep himmllnk himmlrun himmlrun iarray iarrayoflist identity #if ilength im implode inc inc inds infile -init -inline-limit inline_limit inline_limit in list inmap in map inoutprocess inprocess inset in set instream instring int int int Int Int intarray integer inter intersects -int-hash-size Intofset intofstring inv IO iomsg irand isub it iterate iupdate l ldexp leftmargin len length Lip list list log log1 lsl lsl lsl -m> maintenance major_gc map mapadd mapaddunder MapGet mapoflist mapremove match Match matches matchsub max -maxcells -maxgrow max_int max_int maxprintlines maybe MDenormal -mem memofn memofun min MInf min_int min_int MNaN MNormal mod modf n natural nbits nbits next nil nil -nodebug NoMem NONE NonSense not nth Nth -nthreads nthtail null num num numbers numClass numformat numofstring o O open open open* option ord ordof orelse OS outfile outprocess outstream outstring overflown overwrite p (*$P-*) (*$P+*) pack -pair-hash-size ParSweep -path -path PDenormal pi PInf PNaN PNormal pretty primes print print profiling promise pwd -pwd-prompt q quit quit raise rand random Range re RE real real -real-hash-size rec recompiled ref ref reftyping regexp remsg rename -replay report_profiles reset_profiles ReturnToTheFuture rev revappend revfold rightmargin rng s sb seek seekend set setofInt sh showbreakpoints sin size some SOME split sqr sqrt Stack stamp_world started stderr stdin stdout step string -string-hash-size StringNth strless structures sub submap sub map Subscript subset subst substr such that system system_less table table tan t_collect tell t_get th -threadsize throw times t_iter tl to t_put t_put_behind t_remove t_reset true truncate u U underwrite union unit up update use usepath val version void VOID w where while zabs zadd zadd zaddbit zand zasl zasr zasr1 zbastoz zbezout zbit zcat zchirem zcompare zdivmod zecm Zero zgcd zgcde zhighbits zint zjacobi zlog zlowbits zlsl1 zm2exp zmadd zmakeodd zmdiv zmexp zmexp2 zmexpm zminv zmmul zmod zmontadd zmontdiv zmontexp zmontexp2 zmontexpm zmontinv zmontmul zmontsqr zmontstart zmontsub zmonttoz zmsexp zmsmul zmsqr zmsub znbits zneg znot znum zodd zor zpollardrho zprime zrand zrand1 zrandfprime zrandfprime1 zrandgprime zrandgprime1 zrandl zrandl1 zrandom zrandpprime zrandpprime1 zrandprime zrandprime1 zrembit zreverse zroot zsadd zsadd zsbastoz zsdivmod zsexp zsexp zsign zsjacobi zsmexp zsmmul zsmod zsmontmul zsnbits zsqr zsqrt zsreverse zstobas zstosymbas zsub zsweight ztobas ztomont ztosymbas ztrydiv zweight zxor


This document was translated from LATEX by HEVEA.