August 2006

Here’s some random stuff I found useful after discussing with Phil on “how to start writing ML.”

1. use eval_be to handle binary operations and eval_ue for unary operations.

2. use conv() to handle A.Const_e and A.DataCon_e

3. patMatch() basically handle two cases: “Let val …” and “case … of …” both of these requires pattern matching (that’s why it is there).

4. Before you handling A.Let_e(A.Val_d(…)) in patMatch, it is a good idea to study how A.Case_e() works in patMatch. That should give you some hint on how to proceed.

5. Don’t worry about the A.Id_t(“pat”) or A.Id_t(“patType”) that you found in interp.sml. They are not used at all.

6. contains useful function to use when handling A.Tuple_p and A.Record_p in pat_compile. Specifically you need to walk down the tuple/records and perform matching one by one. ListPair is different from List where ListPair walk down two lists simultaneously (your root and p) whereas List do the same thing on one list.

7. Lastly, the assignment statement “x := 3” returns a unit “()”.


I’m leaving the country Tuesday Aug 29 until Sept 21. JC has volunteered to add everyone who wishes to contribute to the projects posts and post scribe notes while I’m gone.

Best of luck!

For Sale!!! 1 MiniML interpreter for the LOW LOW price of just countless hours!! Seriously though, here are some (possibly) useful notes on how to tackle this beast (Make sure you start early)

Basically there are two parts, evaluate and pattern_compile. For evaluate, I’d suggest the following (in order):

  1. Const/Id
  2. Unop/Binop
  3. Fn/App
  4. Ref/Deref/Assign
  5. Tuple/Ith
  6. Datacon/If
  7. Let(Fun)
  8. Case/Let(Val)
  9. Record/Field

You won’t need to touch patterns until Case/Let(Val). And you’ll need DataCon to do If, and everything before to do Case/Let(Val).

ML-Dev Screenshot

A lot of time with SML, at least initially, will probably be spent fighting with the compiler. Specifically the cryptic compiler errors that are spit out. For those that are not fearful of IDE’s here’s a nifty plugin for eclipse that parses your SML and puts thosse infamous squigglys under the BAD parts of code. It works (hehe) and has been tested but I offer little support for the plugin. Attached is the zip file that you extract unto your eclipse/plugins dir.

The plugin creates a SML perspective and does all the nifty syntax highlighting for your sml (and sig) files. If you have smlnj installed, you can also optionally run the program from withing eclipse, or use MLton to typecheck every time you save (useful, but possibly time consuming). If there is interest, I’ll post how to do all of those.

Unlike the java editor however, it has no debugging/breakpoint support. After all, it’s only written by two people (but eclipse hackers are welcome to contribute!)

So, without further ado, here’s the link to the zip file:

And the user guide (a little older):

If anyone wants to check their project, pasting the code into smlnj is the best way, but if you’re on a linux machine you can run my compiled binary. It has a debug option that will print out the expression tree on every iteration of eval (type “:d” into the interpreter to toggle)

Changes to get interpreter to work on MLton:

  1. Run MLton versions of ml-lex and ml-yacc (smlnj uses Unsafe structure)
  2. Change some matches in typecheck.sml
  3. Add mlb, and main SML file that just runs the interpreter

The following tar.gz file has the MLton modifications and my compiled interpreter:

OK, I think I’m just about done with the project (anyone else? or have you not started yet?)

You can run the miniML code (for sanity sake) in SML/NJ simply by doing the following:

  1. type [datatype 'a list = Nil | Cons of 'a * 'a list;] into SML/NJ
  2. In a separate editor, open the test case and replace “\nval” with “\n;val” (so the values are printed to stdout) (but don’t save the file since miniML doesn’t like “;”)
  3. Paste the code into SML/NJ.

Time permitting, I may compile compiled my interpreter (it’s NOT A REFERENCE IMPL) using MLton ( and post linux/windows binaries.

Before we submerge into the code (just under 600 lines), some “useful” Q/A:

  1. How do I print in SML? With the print : string -> unit function. When printing more than just a string use this notation (notice the parentheses): val () = print("hi:" ^ Int.toString x ^ "\n")
  2. When should I compile? Very often. Like after every line of code you write. Otherwise you’ll be fighting the compiler (and believe me, it prints out a ton of cryptic messages)
  3. Why is this so basic? Post comments on what you want to hear, useful things to cover, more detailed explanations, etc.

Ok, now where to start? As all things ML, we’ll start at the top and work our way down, noting things as we go along..

(*  1*) structure MiniML :> 
(*  2*)    sig 
(*  3*)       val interpreter : unit -> unit 
(*  4*)    end = 
(*  5*) 
(*  6*) struct

Already there are a couple of unfamiliar keywords. A structure in SML is loosely like a Class in Java/C++. They are a grouping of functions and values. Optionally they have a signature which acts like interfaces in Java (the sig ... end block). In this case, only one val, interpreter is exposed. A signature can be either defined in the structure (as in this case), or in an external file. The vals are accessed like static members of a Java class: MiniML.interpreter. Note the type unit in the signature. There is only one kind of unit and it is expressed as a zero-tuple (just an empty set of parentheses)

(*  7*)   structure T = TypeCheck 
(*  8*)   structure TC = TypeContext 
(*  9*)   structure A = AbSyn 
(* 10*)   structure PP = PrettyPrinter 
(* 11*)   structure TypePP = TypePP 
(* 12*) 
(* 13*)   exception UnboundVariable of string 
(* 14*)   exception InternalError of string 
(* 15*)

Lines 7-11 declare a shorthand for using other structures similar to java’s import keyword. Both Absyn.isNil and A.isNil are equivalent. Lines 13-14 declare exception types. They must be declared before use. The throw and catch keywords in Java have SML counterparts: raise and handle

(* 16*) (* values *) 
(* 17*) 
(* 18*) datatype conId = True_v | False_v | Nil_v | None_v | Some_v | Cons_v 
(* 19*) 
(* 20*) datatype env = Env of A.exp -> value 
(* 21*) and 
(* 22*)  value = 
(* 23*)     Int_v of int     (* an int or a char *) 
(* 24*)   | Real_v of real   (* a real *) 
(* 25*)   | String_v of string (* a string *) 
(* 26*)   | Char_v of char (* a char *) 
(* 27*)   | Tuple_v of value list (* a tuple *) 
(* 28*)   | Record_v of ( * value) list (* a record *) 
(* 29*)   | DataCon_v of (conId * (value option)) (* constructor *) 
(* 30*)   | Closure_v of A.exp * * (env ref) (* "code" * arg * runtime env *) 
(* 31*)   | Ref_v of (value ref) 
(* 32*) 
(* 33*)

Line 18 creates a type conId. This type works like a Java/C++ enumeration since none can take in a value. Note the use of [and] on line 21. This is because the definitions of [env] and [value] are dependent on each other, env requires the value type, and some values (lines 27-31) require an env.

This is also the first time we see [of]. [of] defines what type of arguments each member takes, like a constructor. Line 27 is the first list type definition. This type is read “Tuple_v takes a list of value’s”. Line 29 uses the option type.

Unlike other languages, SML does not allow null (no nullpointer exceptions!), so optional arguments are given in a option datatype, which can be either [SOME(x) | NONE].

Line 30 is the first use of [ref] datatype. It is the only mutable datatype in SML. One can assign/reassign values of references. Normally in ML, one would recreate the tuple/value, but sometimes it is much more convenient to just change the value of a ref. Later on we’ll see how to create a new ref. The type of “env ref” is read “A ref whose value is of type env”

(* 34*) (* Used to generate "fresh" variable names during pattern match 
(* 35*) (* compilation. *) 
(* 36*)     local 
(* 37*)         val counter = ref 0 
(* 38*)     in 
(* 39*)         fun gensym() = 
(* 40*)             let val c = Int.toString(!counter) 
(* 41*)             in 
(* 42*)                 counter := (!counter) + 1; 
(* 43*)                 "%temp"^c 
(* 44*)             end 
(* 45*)     end 
(* 46*)

Lines 36-45 define a counter function. The [local] keyword works like a [let], so the counter value can never be accessed outside of the [gensym] function.

Line 37 creates a new reference with initial value 0, like the new keyword in OO languages. All refs must have an initial value. Note the reuse of “ref”, earlier we used it as a datatype, and now it is used as a constructor. While this may be confusing it is worth noting.

The “!” in lines 40 and 42 is used to dereference a ref, much like * is used to dereference a pointer in C and its type is [int ref -> int] (it is polymorphic, so the int’s are actually ‘a, or arbitrary types). The “:=” in line 42 is an assignment and its type is [(int ref * int) -> unit] (like the arithmetic and string concatenations, it is in infix notation which you can look up on your own). The professor could have also used [ : int ref -> unit]. The “^” in line 43 is string concatenation operation (type: [string * string -> string]).

Next I’ll go over the helper functions in detail. Remember to post comments or I’ll stop writing!

Next Page »