Before we submerge into the code (just under 600 lines), some “useful” Q/A:

  1. How do I print in SML? With the print : string -> unit function. When printing more than just a string use this notation (notice the parentheses): val () = print("hi:" ^ Int.toString x ^ "\n")
  2. When should I compile? Very often. Like after every line of code you write. Otherwise you’ll be fighting the compiler (and believe me, it prints out a ton of cryptic messages)
  3. Why is this so basic? Post comments on what you want to hear, useful things to cover, more detailed explanations, etc.

Ok, now where to start? As all things ML, we’ll start at the top and work our way down, noting things as we go along..

(*  1*) structure MiniML :> 
(*  2*)    sig 
(*  3*)       val interpreter : unit -> unit 
(*  4*)    end = 
(*  5*) 
(*  6*) struct

Already there are a couple of unfamiliar keywords. A structure in SML is loosely like a Class in Java/C++. They are a grouping of functions and values. Optionally they have a signature which acts like interfaces in Java (the sig ... end block). In this case, only one val, interpreter is exposed. A signature can be either defined in the structure (as in this case), or in an external file. The vals are accessed like static members of a Java class: MiniML.interpreter. Note the type unit in the signature. There is only one kind of unit and it is expressed as a zero-tuple (just an empty set of parentheses)

(*  7*)   structure T = TypeCheck 
(*  8*)   structure TC = TypeContext 
(*  9*)   structure A = AbSyn 
(* 10*)   structure PP = PrettyPrinter 
(* 11*)   structure TypePP = TypePP 
(* 12*) 
(* 13*)   exception UnboundVariable of string 
(* 14*)   exception InternalError of string 
(* 15*)

Lines 7-11 declare a shorthand for using other structures similar to java’s import keyword. Both Absyn.isNil and A.isNil are equivalent. Lines 13-14 declare exception types. They must be declared before use. The throw and catch keywords in Java have SML counterparts: raise and handle

(* 16*) (* values *) 
(* 17*) 
(* 18*) datatype conId = True_v | False_v | Nil_v | None_v | Some_v | Cons_v 
(* 19*) 
(* 20*) datatype env = Env of A.exp -> value 
(* 21*) and 
(* 22*)  value = 
(* 23*)     Int_v of int     (* an int or a char *) 
(* 24*)   | Real_v of real   (* a real *) 
(* 25*)   | String_v of string (* a string *) 
(* 26*)   | Char_v of char (* a char *) 
(* 27*)   | Tuple_v of value list (* a tuple *) 
(* 28*)   | Record_v of ( * value) list (* a record *) 
(* 29*)   | DataCon_v of (conId * (value option)) (* constructor *) 
(* 30*)   | Closure_v of A.exp * * (env ref) (* "code" * arg * runtime env *) 
(* 31*)   | Ref_v of (value ref) 
(* 32*) 
(* 33*)

Line 18 creates a type conId. This type works like a Java/C++ enumeration since none can take in a value. Note the use of [and] on line 21. This is because the definitions of [env] and [value] are dependent on each other, env requires the value type, and some values (lines 27-31) require an env.

This is also the first time we see [of]. [of] defines what type of arguments each member takes, like a constructor. Line 27 is the first list type definition. This type is read “Tuple_v takes a list of value’s”. Line 29 uses the option type.

Unlike other languages, SML does not allow null (no nullpointer exceptions!), so optional arguments are given in a option datatype, which can be either [SOME(x) | NONE].

Line 30 is the first use of [ref] datatype. It is the only mutable datatype in SML. One can assign/reassign values of references. Normally in ML, one would recreate the tuple/value, but sometimes it is much more convenient to just change the value of a ref. Later on we’ll see how to create a new ref. The type of “env ref” is read “A ref whose value is of type env”

(* 34*) (* Used to generate "fresh" variable names during pattern match 
(* 35*) (* compilation. *) 
(* 36*)     local 
(* 37*)         val counter = ref 0 
(* 38*)     in 
(* 39*)         fun gensym() = 
(* 40*)             let val c = Int.toString(!counter) 
(* 41*)             in 
(* 42*)                 counter := (!counter) + 1; 
(* 43*)                 "%temp"^c 
(* 44*)             end 
(* 45*)     end 
(* 46*)

Lines 36-45 define a counter function. The [local] keyword works like a [let], so the counter value can never be accessed outside of the [gensym] function.

Line 37 creates a new reference with initial value 0, like the new keyword in OO languages. All refs must have an initial value. Note the reuse of “ref”, earlier we used it as a datatype, and now it is used as a constructor. While this may be confusing it is worth noting.

The “!” in lines 40 and 42 is used to dereference a ref, much like * is used to dereference a pointer in C and its type is [int ref -> int] (it is polymorphic, so the int’s are actually ‘a, or arbitrary types). The “:=” in line 42 is an assignment and its type is [(int ref * int) -> unit] (like the arithmetic and string concatenations, it is in infix notation which you can look up on your own). The professor could have also used [ : int ref -> unit]. The “^” in line 43 is string concatenation operation (type: [string * string -> string]).

Next I’ll go over the helper functions in detail. Remember to post comments or I’ll stop writing!