Quickcheck for Core

Automated testing is a powerful tool for finding bugs and specifying correctness properties of code. Haskell’s Quickcheck library is the most well-known automated testing library, based on over 15 years of research into how to write property-base tests, generate useful sources of inputs, and report manageable counterexamples. Jane Street’s Core library has not had anything comparable up until now; version 113.00 of Core finally has a version of Quickcheck, integrating automated testing with our other facilities like s-expression reporting for counterexample values, and support for asynchronous tests using Async.

Motivation

There are at least two other Quickcheck implementations on OPAM. Why write implementation N+1?

Before working at Jane Street, I did some research relating to randomized testing (Eastlund 2009, Klein et al. 2012). In both cases, users of the software packages involved found it easy to write tests that use random data, but hard to write random distributions that actually produce values that are useful to test.

This Quickcheck clone started as an attempt to address those concerns by building tools for tuning random distributions of values. One way I’ve done that is building in tools for “remembering” sets of chosen values; for example, random tests won’t repeat values, and it is easy to build distributions that generate sets or sorted lists and maintain uniqueness. This design is still in early stages, I don’t know if these are the sort of tools that will be needed most, and I’m eager to get feedback from more users. The library has also evolved to integrate Quickcheck-style testing with the Jane Street Core and Async libraries. Over time I hope this will also produce useful random distributions for more of the Core types, so they will be easily testable.

Example

For those unfamiliar with Haskell’s Quickcheck, the library allows users to write tests of some property of a function, then test that property on many automatically-generated input values. For example, we might want to test that the optimized implementation of list-append in Core is associative:

TEST_UNIT "associativity of list append" =
      Quickcheck.test Quickcheck.Generator.(tuple3 (list int) (list int) (list int))
        ~sexp_of:<:sexp_of>
        ~f:(fun (xs, ys, zs) ->
          <:test_eq>
            (List.append xs (List.append ys zs))
            (List.append (List.append xs ys) zs))

The test above randomly generates three lists of integers, appends them together two different ways, and tests that the results are equal. This process is repeated with different randomly chosen lists each time, until an error is reported or the default trial count is reached. Let’s break down the parts of the code here.

TEST_UNIT is a camlp4 syntax for unit tests.
Quickcheck.test is the main entry point for running a test using Quickcheck. It takes two required, unnamed arguments. The first is a generator, specifying the probability distribution of values to choose from when generating inputs for the test. The second is a function that consumes the generated input values and runs a test. The function returns () if successful and raises an exception otherwise.
Quickcheck.Generator.(tuple3 (list int) (list int) (list int)) constructs the generator that we want to use here. Most of the functions in the Quickcheck.Generator module are named after the types they generate; here, default probability distributions for lists of ints, combined using tuple3.
We provide the optional named argument ~sexp_of to Quickcheck.test. This argument is used to render the first generated value that triggers an error. The <:sexp_of< ... >> expression is camlp4 syntax for the default sexp conversion for a type.
The final argument to Quickcheck.test is a function that takes the tuples of lists produced by our generator, appends them two different ways, and compares the output. <:test_eq< ... >> is camlp4 syntax for an equality test.

The example above uses the s-expression conversions and camlp4 syntax extensions that are common in Jane Street’s libraries, but these things are not necessary for using Quickcheck. Quickcheck.test just needs a generator built from the functions in Quickcheck.Generator and a function that raises an exception on failure, and it will return () if successful or raise an exception describing the nature of the failure if not.

Generators

The primary data structure used by Quickcheck is the generator, or 'a Quickcheck.Generator.t. This corresponds to an implementation of the Arbitrary type class in Haskell’s Quickcheck. Primarily, a generator represents a random distribution of values of type 'a, although in our implementation there is a little more metadata besides that under the hood. The Quickcheck.Generator module provides default distributions of several types, and tools for creating more distributions or customizing the existing ones.

In our example above, we generated three lists of integers using the following expression.

Quickcheck.Generator.(tuple3 (list int) (list int) (list int))

Looking at the implementation of Core.Std.List.append, we can see that the implementation works in chunks of 5 elements, and changes behavior after 1000 chunks. So we might want to change our generator to make sure we get lists of the lengths we want to test.

let open Quickcheck.Generator in
let list_int = list int ~length:(`Between_inclusive (4900,5100)) in
tuple3 list_int list_int list_int

Some experimentation might show us that this still doesn’t hit the list lengths we want, as often as we want. The [Quickcheck.Generator.int_between] function, however, is documented as stressing boundary conditions, so we should be able to use it to get values at the upper and lower ends of the range we want. Here, it helps us that generators form a monad. If we combine generators using monadic bind, we get a weighted composition of their probability distributions. We can use that to first generate lengths for our lists, then use those randomly-generated lengths to build generators for the lists themselves.

let open Quickcheck.Generator in
let list_int =
  int_between ~lower_bound:(Incl 4900) ~upper_bound:(Incl 5100)
  >>= fun len ->
  list int ~length:(`Exactly len)
in
tuple3 list_int list_int list_int

Now we have a generator for three lists of integers, each list with a length between 4900 and 5100 inclusive, weighted toward the ends of that range. This is probably sufficient for our purposes. But if we want to go further down, if we decide that we need a very specific probability distribution, we can build one from scratch ourselves. Here is a rather idiosyncratic example that demonstrates the tools available in Quickcheck.Generator.

let open Quickcheck.Generator in
let rec ranges_of_five_between lower upper =
  if upper - lower < 10
  then of_list (List.range lower upper ~start:`inclusive ~stop:`inclusive)
  else weighted_union
         [ 5., singleton (lower + 0)
         ; 4., singleton (lower + 1)
         ; 3., singleton (lower + 2)
         ; 2., singleton (lower + 3)
         ; 1., singleton (lower + 4)
         ; 1., of_fun (fun () -> ranges_of_five_between (lower + 5) (upper - 5))
         ; 1., singleton (upper - 4)
         ; 2., singleton (upper - 3)
         ; 3., singleton (upper - 2)
         ; 4., singleton (upper - 1)
         ; 5., singleton (upper - 0)
         ]
in
let list_int =
  ranges_of_five_between 4900 5100
  >>= fun len ->
  list int ~length:(`Exactly len)
in
tuple3 list_int list_int list_int

This example uses a few more functions from Quickcheck.Generator. The of_list function takes a list of values and produces a generator that makes a uniform choice among them. weighted_union creates a probability distribution representing a weighted choice among the probability distributions of the associated sub-generators. singleton produces constant-valued generators, and of_fun produces a lazily-evaluated (but not memoized) generator. (Memoizing during random testing causes some unfortunate space leaks, it is important to be able to release resources after a batch of tests.) While this peculiar generator is probably not of practical use, it shows that when we need to, we can dig down into the interface and build whatever probability distribution we want.

Of course, it is also useful to construct generators for new types.

type bst = Leaf | Node of bst * int * bst
let gen : bst Quickcheck.Generator.t =
  let open Quickcheck.Generator in
  recursive (fun self ->
    let node =
      self >>= fun l ->
      int  >>= fun k ->
      self >>| fun r ->
      Node (l, k, r)
    in
    union [ singleton Leaf; node ])

The function Quickcheck.Generator.recursive is a fixed-point generator that helps build simply-recursive generators that need to invoke themselves and don’t have additional arguments. The function union is like weighted_union, but for uniform choice.

Observers

In Haskell’s Quickcheck, there is a duality between the type class Arbitrary for generating random values and Coarbitrary for observing inputs to random functions. Our version of Quickcheck mirrors Generator with Observer. Most tests using Quickcheck do not need an observer, but if you want to generate a random input for a higher-order function, you will need an observer for the function’s input type.

TEST_UNIT "function composition" =
  let open Quickcheck.Generator in
  Quickcheck.test
    (tuple3
       (fn Quickcheck.Observer.int    char)
       (fn Quickcheck.Observer.string int)
       string)
    ~f:(fun (f, g, x) ->
      <:test_eq< char >>
        ((Fn.compose f g) x)
        (f (g x)))

Here, Quickcheck.Generator.fn creates a generator for functions. It takes two arguments: an observer for the function’s input type and a generator for the function’s output type.

Think of an observer as a “generator of decision trees”. For instance, Quickcheck.Observer.int might randomly generate any of the following decision trees:

?

 x>5
 / \
?   ?

  x>4
  / \
 x>2 ?
 / \
?   ?

These decision trees control how a randomly-generated function will use its input. The generator for the function’s output is used to fill each of the ?s in with a concrete value. The result is a family of functions operating on the appropriate types, making randomly-chosen observations on the input and producing randomly-chosen outputs based on those observations.

If you need to build an observer for a custom type, there are tools for that as well.

type bst = Leaf | Node of bst * int * bst
let obs : bst Quickcheck.Observer.t =
  let open Quickcheck.Observer in
    recursive (fun self ->
      unmap (variant2 unit (tuple3 self int self))
        ~f:(function
          | Leaf -> `A ()
          | Node (l, k, r) -> `B (l, k, r))
        ~f_sexp:(fun () -> Atom "variant2_of_bst"))

As with generators, there is a fixed point function Quickcheck.Observer.recursive that helps for simply-recursive types. The function unmap transforms an input of some new type into an input for which we already have an observer. Variant types can be transformed to polymorphic variants, which have default observers variant2 through variant6. Records and constructor arguments can be transformed to tuples, which have default observers tuple2 through tuple6.

Work in Progress

Our OCaml adaptation of Quickcheck is new and still evolving. We already have some changes to the library internally which will be released over time, such as moving default generators and observers out of the Quickcheck module and into the modules for each type. For example, Quickcheck.Generator.int becomes Int.gen.

There are still some pragmatic lessons to learn about how best to use our formulation of the library, how to calibrate our default distributions, and what other distributions we might want to provide. As always, we hope to get feedback from anyone who tries out this library so that we can improve it.

Happy testing!