Parametric polymorphism is a basic mechanism in ML for writing code that is generic, i.e., that can be used on multiple different types. To get a basic idea of what parametric polymorphism is, consider the following simple example.

  module M : sig
    (* Takes a list and returns a stuttered version, e.g., [1;2;3] is mapped to [1;1;2;2;3;3] *)
    val double : 'a list -> 'a list
  end = struct
    let rec double = function
      | [] -> []
      | hd :: tl -> hd :: hd :: double tl
  end
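To see the genericity concretely, here is a small sketch that applies the same function to lists of three different element types. The bindings `ints`, `strings`, and `nested` are names made up for this illustration.

```ocaml
(* The same definition as in M above, repeated so the snippet stands alone. *)
let rec double = function
  | [] -> []
  | hd :: tl -> hd :: hd :: double tl

(* One definition, three element types. *)
let ints = double [1; 2; 3]         (* [1; 1; 2; 2; 3; 3] *)
let strings = double ["a"; "b"]     (* ["a"; "a"; "b"; "b"] *)
let nested = double [[true]; []]    (* [[true]; [true]; []; []] *)
```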

In the type signature for double, the expression 'a is a type variable, meaning that this function can be used with an arbitrary type plugged in for 'a. The reason that the type variable shows up is that the code of double doesn't depend in any way on the properties of the elements of the list. At first glance, parametric polymorphism doesn't seem all that powerful. After all, how useful is it to write functions that can only be generalized over all possible types? More often than not you want to write functions that can be used over some narrow universe of types with particular properties. Object-oriented languages provide this functionality with subtyping, and Haskell lets you get at this with type classes. What do you do in ML?

It turns out that ML does allow you to write functions like this in a quite straightforward and ordinary way. A simple example can be found by examining the signature of a standard sort function:

  val sort : cmp:('a -> 'a -> int) -> 'a list -> 'a list

A function with this signature can sort a list of any element type, provided that you also supply a comparison function for that type. In other words, a polymorphic function can take advantage of the idiosyncratic capabilities of the types it deals with, but the ability to take advantage of those capabilities must be passed in along with the values in question.
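As a minimal sketch of this idea, the snippet below wraps the standard library's `List.sort` so that it has the labeled signature shown above, then calls it at three types. The wrapper `sort`, the `person` record, and `by_age` are hypothetical names introduced only for this example.

```ocaml
(* Hypothetical wrapper: gives Stdlib's List.sort the labeled signature
   cmp:('a -> 'a -> int) -> 'a list -> 'a list. *)
let sort ~cmp l = List.sort cmp l

type person = { name : string; age : int }

(* The "capability" for person is passed in explicitly as a function. *)
let by_age p q = Int.compare p.age q.age

let sorted_ints = sort ~cmp:Int.compare [3; 1; 2]
let sorted_strs = sort ~cmp:String.compare ["pear"; "apple"; "fig"]
let sorted_people =
  sort ~cmp:by_age [ { name = "Ann"; age = 41 }; { name = "Bob"; age = 29 } ]
```

Nothing in `sort` knows about integers, strings, or people; everything type-specific arrives through `cmp`.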

This technique is used in small ways throughout any non-trivial ML codebase. But we can use it in a more structured way by creating what are sometimes called type-indexed values. A type-indexed value is a value used to represent a set of capabilities associated with a type. Here's an example of a simple type-indexed value for capturing number-ness. In what follows, the type-indexed value is Num.Type.t, and the rest of the Num module is just utility functions to make the interface pretty.

  module Num : sig
    module Type : sig
      type 'a t
      val int : int t
      val float : float t
    end

    val (+) : 'a Type.t -> 'a -> 'a -> 'a
    val (-) : 'a Type.t -> 'a -> 'a -> 'a
    val ( * ) : 'a Type.t -> 'a -> 'a -> 'a
    val neg : 'a Type.t -> 'a -> 'a
    val zero : 'a Type.t -> 'a
    val sum : 'a Type.t -> 'a list -> 'a
    val sum_product : 'a Type.t -> 'a list -> 'a list -> 'a
  end = struct
    module Type = struct
      module T = struct
        type 'a t = {
          plus : 'a -> 'a -> 'a;
          mul : 'a -> 'a -> 'a;
          neg : 'a -> 'a;
          zero : 'a;
        }
      end
      (* include (rather than open) so that Type exports t, as the signature requires *)
      include T

      let int = { plus = Int.(+);
                  neg = Int.neg;  (* unary negation, not binary (-) *)
                  zero = Int.zero;
                  mul = Int.mul; }

      let float = { plus = Float.(+);
                    neg = Float.neg;
                    zero = Float.zero;
                    mul = Float.mul; }
    end
    open Type.T

    let (+) typ x y = typ.plus x y
    let neg typ x = typ.neg x
    let zero typ = typ.zero
    let ( * ) typ x y = typ.mul x y

    (* Some derived operations *)
    let (-) typ x y = typ.plus x (typ.neg y)
    let sum typ l = List.fold_left ~init:typ.zero ~f:typ.plus l
    let sum_product typ l1 l2 = sum typ (List.map2 ~f:typ.mul l1 l2)
  end
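To show what using such a value looks like, here is a condensed sketch that depends only on the OCaml standard library rather than the library the listing above is written against. It assumes the standard library's `Int.add`, `Int.mul`, `Float.add`, and so on, and it drops the signature and the operator wrappers for brevity. The names `dot_int` and `dot_float` are made up for the example.

```ocaml
(* Condensed, standard-library-only variant of Num; an illustration,
   not the module from the listing above. *)
module Num = struct
  type 'a t = {
    plus : 'a -> 'a -> 'a;
    mul : 'a -> 'a -> 'a;
    neg : 'a -> 'a;
    zero : 'a;
  }

  let int = { plus = Int.add; mul = Int.mul; neg = Int.neg; zero = Int.zero }
  let float =
    { plus = Float.add; mul = Float.mul; neg = Float.neg; zero = Float.zero }

  (* Derived operations, written once for every numeric type. *)
  let sum typ l = List.fold_left typ.plus typ.zero l
  let sum_product typ l1 l2 = sum typ (List.map2 typ.mul l1 l2)
end

(* The same function at two types; only the type-indexed value changes. *)
let dot_int = Num.sum_product Num.int [1; 2; 3] [4; 5; 6]         (* 32 *)
let dot_float = Num.sum_product Num.float [0.5; 2.0] [4.0; 0.25]  (* 2.5 *)
```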

You'll note that the definitions above of Type.int and Type.float are basically boilerplate. Because the modules in question themselves have a fairly standardized interface, we could instead use a functor to create these type-indexed values without the boilerplate:

  (* Assumes this code lives where the record fields of Type.t are visible *)
  module type Arith = sig
    type t
    val (+) : t -> t -> t
    val mul : t -> t -> t  (* needed to fill the mul field of Type.t *)
    val neg : t -> t
    val zero : t
  end

  module Build_type (M : Arith) = struct
    let typ = { Type.plus = M.(+);
                mul = M.mul;
                neg = M.neg;
                zero = M.zero;
              }
  end

  let int = let module Z = Build_type(Int) in Z.typ
  let int64 = let module Z = Build_type(Int64) in Z.typ
  let int32 = let module Z = Build_type(Int32) in Z.typ
  let native = let module Z = Build_type(Native_int) in Z.typ
  let float = let module Z = Build_type(Float) in Z.typ
  let complex = let module Z = Build_type(Complex) in Z.typ
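The functor approach can also be sketched against the OCaml standard library alone. This version assumes the standard library's naming, where the numeric modules expose `add`, `mul`, `neg`, and `zero` rather than a `(+)` operator, and it uses a top-level record type `num` (a name chosen for this sketch) in place of Num.Type.t.

```ocaml
type 'a num = {
  plus : 'a -> 'a -> 'a;
  mul : 'a -> 'a -> 'a;
  neg : 'a -> 'a;
  zero : 'a;
}

(* Assumed interface: the subset shared by Stdlib's Int, Int64, Float,
   and Complex modules. *)
module type Arith = sig
  type t
  val add : t -> t -> t
  val mul : t -> t -> t
  val neg : t -> t
  val zero : t
end

module Build_type (M : Arith) = struct
  let typ : M.t num =
    { plus = M.add; mul = M.mul; neg = M.neg; zero = M.zero }
end

(* One line per numeric type, with no hand-written records. *)
let int = let module Z = Build_type (Int) in Z.typ
let int64 = let module Z = Build_type (Int64) in Z.typ
let float = let module Z = Build_type (Float) in Z.typ
let complex = let module Z = Build_type (Complex) in Z.typ

let sum typ l = List.fold_left typ.plus typ.zero l
```

Any module that happens to match `Arith` can be plugged in the same way, which is where the standardized interfaces pay off.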

This is yet another advantage one gets from having standardized interfaces.

If type-indexed values look similar to Haskell's type classes, it's because they are. In my limited understanding of Haskell, the implementation is similar as well, in that under the covers, Haskell passes around dictionaries of functions which play the same role that the Type.ts play here.

The number typeclass described above is just an example, and not something I've felt the need for in practice. But here are some places where we've used type-indexed values to good effect:

Serialization

The latest (unreleased) version of the bin_prot macros that we use for binary serialization and deserialization now comes with a type-indexed value that ties together all the little bits that you need to use the library. Before we did that, one could only instantiate useful bin_prot functionality using the module language. Now, we can do it using ordinary polymorphic functions.
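The general shape of this can be sketched with a toy text-based serializer. Everything here is hypothetical and is not bin_prot's API: the `Serial` module, its fields, and `roundtrip` are invented to show how a type-indexed value lets an ordinary polymorphic function do work that would otherwise need a functor.

```ocaml
(* Hypothetical stand-in for a serialization type-indexed value;
   not bin_prot's actual interface. *)
module Serial = struct
  type 'a t = {
    to_string : 'a -> string;
    of_string : string -> 'a;
  }

  let int = { to_string = string_of_int; of_string = int_of_string }
  let bool = { to_string = string_of_bool; of_string = bool_of_string }

  (* Build a serializer for lists from a serializer for elements.
     The comma-separated format is deliberately naive: it does not
     handle nested lists or elements that contain commas. *)
  let list elt = {
    to_string = (fun l -> String.concat "," (List.map elt.to_string l));
    of_string = (fun s ->
      if s = "" then []
      else List.map elt.of_string (String.split_on_char ',' s));
  }
end

(* An ordinary polymorphic function, with no functor application needed. *)
let roundtrip typ x = typ.Serial.of_string (typ.Serial.to_string x)
```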

Little languages

Sometimes we design domain-specific languages embedded in the type system. It is often useful to have values representing the different types that can be generated in the language. For example, we use this as part of a set of SQL bindings to represent types that we know how to convert to and from SQL.

Containers

We've started experimenting with type-indexed values representing the container-hood of a given object. This is a little trickier than the previous examples, since the type-indexed value has two type parameters, one for the type of the container, and one for the type of the elements of the container. In the end, this lets you write functions with signatures like

  max : ('a, 'b) Container.t -> cmp:('a -> 'a -> int) -> 'b -> 'a

and use it to find the maximum element of a list (where the type-indexed value has type ('a, 'a list) Container.t) or an array (('a, 'a array) Container.t) or a string ((char, string) Container.t).
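One way such a Container.t could look is sketched below. It assumes that the only capability a container needs to expose is iteration over its elements; the `iter` field and the choice to raise on an empty container are assumptions of this sketch, not a description of the actual implementation.

```ocaml
module Container = struct
  (* 'a is the element type, 'b is the container type. *)
  type ('a, 'b) t = { iter : ('a -> unit) -> 'b -> unit }

  let list = { iter = List.iter }      (* ('a, 'a list) t *)
  let array = { iter = Array.iter }    (* ('a, 'a array) t *)
  let string = { iter = String.iter }  (* (char, string) t *)
end

(* Written once, usable on any container with a Container.t.
   Note that this shadows Stdlib.max in the rest of the file.
   Raises Invalid_argument on an empty container. *)
let max typ ~cmp c =
  let best = ref None in
  typ.Container.iter
    (fun x ->
      match !best with
      | Some b when cmp b x >= 0 -> ()
      | _ -> best := Some x)
    c;
  match !best with
  | Some x -> x
  | None -> invalid_arg "max: empty container"

let max_in_list = max Container.list ~cmp:compare [3; 1; 2]          (* 3 *)
let max_in_array = max Container.array ~cmp:compare [|2.5; 7.0; 1.0|] (* 7.0 *)
let max_in_string = max Container.string ~cmp:Char.compare "hello"    (* 'o' *)
```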

Type-indexed values do have their downsides: they can be somewhat inconvenient syntactically, since you need to pass them along explicitly, and they sacrifice some performance, because they lead you to call closures where you could otherwise call static functions that could be inlined. But overall, they are a flexible and elegant way of writing generic code in ML.