People (myself included) like to say that OCaml isn't really an optimizing compiler, that it has a pretty straight-ahead compilation strategy, and that for the most part, you get what it looks like you get when you write the code.

But it turns out, OCaml does a little more magic than I'd counted on. Consider the following code:

    let f x y =
      match x,y with
      | (0,0) -> true
      | _, _ -> false

I had thought that this actually allocated a tuple, and I was getting ready to push to try to get this fixed in the compiler. Before making a fool of myself, I thought I'd go and look at the generated assembly first, and lo and behold, I was wrong! The compiler does what one would hope and avoids the needless allocation. To see what the code looked like if I forced the allocation of a tuple, I changed the code to pass the tuple to a tuple-taking function.

    let sum (x,y) = x + y

    let f x y =
      match x,y with
      | (0,0) as pair -> ignore (sum pair); true
      | _, _ -> false

I then generated the assembly and looked again, only to discover that sum had been inlined, which once more removed the need for the allocation. So I tried again, this time adding a string constant to the body of sum, which prevents inlining (a deficiency that OCamlPro is working on).

    let sum (x,y) =
      ignore "z";
      x + y

    let f x y =
      match x,y with
      | (0,0) as pair -> ignore (sum pair); true
      | _, _ -> false

I'd prevented the inlining, but there was still no allocation! Why? Well, it turns out that OCaml can optimize a tuple-taking function to get the elements of the tuple passed in via registers, which is exactly what happened. And again, the compiler realized that no allocation was required.
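To make that concrete, here is a rough source-level picture of the effect. This is only a conceptual sketch, not what the compiler literally emits: the names sum_untupled and sum_from_tuple are hypothetical, and the real transformation happens inside the compiler rather than in source code.

```ocaml
(* What is written in the source: a function taking one tuple argument. *)
let sum (x, y) = x + y

(* Hypothetical name: roughly what a direct caller ends up invoking.
   The two components arrive as separate arguments, so the caller
   never has to build a tuple just to make the call. *)
let sum_untupled x y = x + y

(* Hypothetical name: a wrapper for callers that only hold an
   already-built tuple. It unpacks the tuple and forwards the parts. *)
let sum_from_tuple p =
  let (x, y) = p in
  sum_untupled x y

let () =
  (* All three spellings compute the same value. *)
  Printf.printf "%d %d %d\n"
    (sum (1, 2)) (sum_untupled 1 2) (sum_from_tuple (1, 2))
```

Seen this way, a call like sum pair in f can pass x and y directly, which is why no tuple needs to exist at the call site.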

Finally, I was able to trigger an allocation by changing sum to refer to the tuplified form of its arguments explicitly:

    let sum ((x,y) as _p) =
      ignore "z";
      x + y

And this finally triggers the allocation.
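Reading assembly isn't the only way to check for allocation. Another option is to compare Gc.minor_words before and after a batch of calls. What follows is a minimal sketch, assuming native compilation with ocamlopt; the helper words_allocated and the names f_plain and f_tupled are mine, and the numbers it prints will depend on the compiler version and flags (inlining settings in particular), so I haven't included expected output.

```ocaml
(* First version from above: the pattern match on x,y alone. *)
let f_plain x y =
  match x, y with
  | (0, 0) -> true
  | _, _ -> false

(* Last version from above: sum names the whole tuple as _p. *)
let sum ((x, y) as _p) =
  ignore "z";
  x + y

let f_tupled x y =
  match x, y with
  | (0, 0) as pair -> ignore (sum pair); true
  | _, _ -> false

(* Hypothetical helper: minor-heap words allocated across [n] calls of [g].
   The measurement can include a small amount of overhead of its own,
   so compare variants against each other rather than against zero. *)
let words_allocated n g =
  let before = Gc.minor_words () in
  for _i = 1 to n do
    (* opaque_identity keeps the result from being optimized away *)
    ignore (Sys.opaque_identity (g ()))
  done;
  Gc.minor_words () -. before

let () =
  let n = 1_000 in
  (* opaque_identity hides the constants from the optimizer *)
  let zero = Sys.opaque_identity 0 in
  Printf.printf "f_plain:  %.0f words over %d calls\n"
    (words_allocated n (fun () -> f_plain zero zero)) n;
  Printf.printf "f_tupled: %.0f words over %d calls\n"
    (words_allocated n (fun () -> f_tupled zero zero)) n
```

If one variant allocates a tuple per call and the other doesn't, the difference should show up as a per-call gap between the two totals.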

Anyway, none of this is that surprising; indeed, other people at Jane Street knew perfectly well that OCaml did these optimizations. But it was a pleasant surprise for me nonetheless.