If you've ever wondered what it means in OCaml when there is a + or - in front of a type variable, read on.

OCaml has subtyping, which is a binary relation on types that says roughly, if type t1 is a subtype of type t2, then any value of type t1 can be used anywhere a value of type t2 was expected. In OCaml, subtyping arises from polymorphic variants. For example, [ A ] is a subtype of [ A | B ] (typographical note: in this post, polymorphic variants are written without the leading grave accent (backtick) due to formatting issues). Why? Because if you have a piece of code that knows how to deal with A and B, well then surely it can deal with just A.

One can ask OCaml to check subtyping using a coercion expression:

  1. (e : t1 :> t2)

For example:

  1. let f x = (x : [ A ] :> [ A | B ])

A coercion expression actually does two things; it verifies that t1 is a subtype of t2, and causes the type of the entire expression to be t2.

Just as the OCaml typechecker has rules for deciding when a let expression or a function call typechecks, it has rules for deciding when the subtyping in a coercion expression is allowed. The subtyping rule for polymorphic variant types is clear -- one polymorphic variant t1 is a subtype of another polymorphic variant t2 if every constructor in t1 also appears in t2 (and with the same type argument). OCaml then has rules to extend the subtyping relation to more complex types. E.g. for tuples the rule is

  1. if t1 :> t1' and t2 :> t2' then (t1, t2) :> (t1', t2')

For example:

  1. let f x = (x : [ A ] * [ B ] :> [ A | C ] * [ B | D ])

The rule for subtyping on tuples makes intuitive sense if you think about what code could do with a tuple -- it can take apart the pieces and look at them. So, if a tuple has fewer kinds of values in both its first and second components, then code dealing with the tuple would still be fine.

For arrow types the subtyping rule is:

if ta' :> ta and tr :> tr' then ta -> tr :> ta' -> tr'

For example:

  1. let f x = (x : [ A | B ] -> [ C ] :> [ A ] -> [ C | D ])

Again, the rule makes sense if you think about what code can do with a function that it has. It can feed the function arguments, and observe the results. So, if a function can accept more kinds of inputs or returns fewer kinds of outputs, then the code dealing with the function would still be fine.

For types in OCaml like tuple and arrow, one can use + and -, called "variance annotations", to state the essence of the subtyping rule for the type -- namely the direction of subtyping needed on the component types in order to deduce subtyping on the compound type. For tuple and arrow types, one can write:

  1. type (+'a, +'b) t = 'a * 'b
  2. type (-'a, +'b) t = 'a -> 'b

One sometimes hears the term "covariant" to describe to a type variable annotated with a + and "contravariant" to describe to a type variable annotated with a -. For example, the tuple type is covariant in all its arguments. The arrow type constructor is contravariant in its first argument and covariant in its second argument.

If you don't write the + and -, OCaml will infer them for you. So, why does one need to write them at all? Because module interfaces are designed to express the contract between implementor and user of a module, and because the variance of a type affects which programs using that type are type correct. For example, suppose you have the following:

  1. module M : sig
  2. type ('a, 'b) t
  3. end = struct
  4. type ('a, 'b) t = 'a * 'b
  5. end

Should the following typecheck or not?

  1. let f x = (x : ([ A ], [ B ]) M.t :> ([ A | C ], [ B | D ]) M.t)

If one knows that ('a, 'b) M.t = 'a * 'b, then yes, it should type check. But the whole point of an interface is that a user only knows what the interface says. And it does not say that ('a, 'b) M.t = 'a * 'b. So in fact it does not type check.

Variance annotations allow you to expose the subtyping properties of your type in an interface, without exposing the representation. For example, you can say:

  1. module M : sig
  2. type (+'a, +'b) t
  3. end = struct
  4. type ('a, 'b) t = 'a * 'b
  5. end

This will give enough information to the OCaml type checker to typecheck uses of M.t for subtyping so that the following will type check.

  1. let f x = (x : ([ A ], [ B ]) M.t :> ([ A | C ], [ B | D ]) M.t)

When you use variance annotations in an interface, OCaml will check that the implementation matches the interface, as always. For example the following will fail to typecheck:

  1. module M : sig
  2. type (+'a, +'b) t
  3. end = struct
  4. type ('a, 'b) t = 'a -> 'b
  5. end

Whereas the following will typecheck:

  1. module M : sig
  2. type (-'a, +'b) t
  3. end = struct
  4. type ('a, 'b) t = 'a -> 'b
  5. end

Hopefully this sheds some light on a somewhat obscure corner of the language.