I've recently been thinking about the world of JavaScript and web applications. That's odd for me, since I know almost nothing about the web. Indeed, Jane Street's use of web technologies is quite minimal -- nearly all of our user interfaces are text based, and all told we've been pretty happy with that.

But there are real limitations to console apps, and if you need something richer that's cross-platform, the web is pretty appealing. For us it's made yet more appealing by the fact that OCaml, our language of choice, compiles into JavaScript via js_of_ocaml.

So recently, when a few people internally got interested in trying out JavaScript-based UIs, I dug in a little to try to understand the landscape, and help us figure out what approach to take.

Virtual DOM, all at once

I started by trying to understand more about the approaches taken in the wider world for building JavaScript apps. One idea that struck me as particularly interesting was virtual DOM. Virtual DOM showed up first in Facebook's React, but has since inspired other implementations, like Matt Esch's virtual-dom library, which in turn is the basis of the Mercury web framework and Elm's Html library. Other libraries have wrapped and extended React, like ClojureScript's Om framework.

To understand the appeal of virtual DOM, you first need to understand what the world looks like without it. JavaScript applications in the browser are fundamentally tied to the DOM, which is the tree of objects that reflects the structure of the HTML that the page is built from. The DOM is wired into the browser, and mutating the DOM is how you change what's shown on the screen, a necessity for a dynamic web app.

But working directly with the DOM can be awkward. For one thing, it encourages you to write your display logic twice: once to create the initial state of your page, and then again for the code that updates the DOM in response to external events.

But it's worse than just having to write your logic twice: the second time is also trickier. That's because, for performance reasons, you need to minimize changes to the DOM, since those changes can cause expensive reflows and the like. Doing this kind of change minimization by hand can be pretty painful.

The goal of virtual DOM is to let you write your display logic just once, and to do so in a convenient, straight-ahead style. The idea is simple: instead of modifying the DOM directly, you create immutable trees that represent the DOM you'd like to have, a virtual DOM. Every time your state changes, you recompute the virtual DOM, and then diff the new virtual DOM against the old. The resulting patch can then be applied to the actual DOM, which minimizes changes to the DOM proper.
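To make the diff-and-patch idea concrete, here's a toy sketch in OCaml. The vdom and patch types here are invented for illustration; real virtual DOM libraries also track attributes, event handlers, and keys, and use cleverer diff algorithms.

```ocaml
(* A toy virtual DOM. Invented for illustration: real libraries
   also carry attributes, keys, and event handlers. *)
type vdom =
  | Text of string
  | Node of string * vdom list  (* tag name, children *)

(* A patch describes the minimal work needed to turn the old tree
   into the new one. *)
type patch =
  | Replace of vdom                    (* rebuild this subtree *)
  | At_children of (int * patch) list  (* recurse into numbered children *)

(* Diff two trees, returning None when they already agree. *)
let rec diff old_tree new_tree =
  match old_tree, new_tree with
  | Text a, Text b when String.equal a b -> None
  | Node (t1, cs1), Node (t2, cs2)
    when String.equal t1 t2 && List.length cs1 = List.length cs2 ->
    let child_patches =
      List.combine cs1 cs2
      |> List.mapi (fun i (o, n) -> (i, diff o n))
      |> List.filter_map (fun (i, p) ->
        match p with None -> None | Some p -> Some (i, p))
    in
    if child_patches = [] then None else Some (At_children child_patches)
  | _ -> Some (Replace new_tree)
```

Applying the resulting patch to the real DOM then touches only the subtrees that actually changed, which is the change minimization that's painful to do by hand.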

This makes it easier to express your display logic cleanly. Elm uses virtual DOM as part of the Elm architecture. In that approach, the state of the application is kept in a model value, which abstractly represents the state of the application, omitting presentation details. A separate view function is used to convert the model into a virtual DOM tree that describes the page that should be shown to the user.

The application is made dynamic by adding an action type which summarizes the kinds of changes that can be made to the model. Actions are enqueued either from callbacks embedded in the virtual DOM, or by other jobs that communicate with the outside world via web requests and the like. When the action is applied to the model, a DOM update is done by recomputing the virtual DOM, and then diffing and patching the real DOM accordingly.
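A minimal sketch of that architecture, with invented names and a string standing in for the virtual DOM tree:

```ocaml
(* A model/action/view sketch of the Elm architecture. The names
   here are invented, and a string stands in for the virtual DOM. *)
type model = { count : int }

type action = Increment | Decrement

(* Applying an action yields a new model; nothing is mutated. *)
let apply action model =
  match action with
  | Increment -> { count = model.count + 1 }
  | Decrement -> { count = model.count - 1 }

(* The view is recomputed from the model alone, with no reference
   to the previous state of the display. *)
let view model = Printf.sprintf "<div>count: %d</div>" model.count

(* The runtime loop, collapsed: fold queued actions into the model,
   then render. In a real app, each action would also trigger the
   diff-and-patch step against the actual DOM. *)
let run actions init = List.fold_left (fun m a -> apply a m) init actions
```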

Virtual DOM, incrementally

As described, the above approach involves computing the virtual DOM from scratch after every action. This is done even if the change to the DOM implied by the action is small, which is the common case. Essentially, every key press and mouse click causes the entire virtual DOM to be recomputed.

In a world where DOM updates are the only expense that matters, this isn't so bad. And for sufficiently small web applications, that's almost right. But once you're creating large, dynamic UIs, this simple story falls apart, and the cost of recreating the virtual DOM every time matters.

That's why in all of these virtual DOM APIs and frameworks, there's some form of incrementalization built in, a way to avoid paying the full cost of rebuilding the virtual DOM when the logical changes are small.

In React, for example, the state is organized into a set of hierarchical components, each with its own render function. These components are structured to match the structure of the HTML that they generate, with the idea that you'll only have to re-render the few components whose input data has actually changed. React effectively memoizes the render computation at the component level.

Elm, rather than tying the incrementalization directly to a framework-level notion of component, lets you introduce memoization in the construction of individual virtual DOM nodes. To do this, Elm's Html module exposes a set of "lazy" functions with roughly these signatures (shown with OCaml syntax).

  val lazy1 : ('a -> Html.t) -> 'a -> Html.t
  val lazy2 : ('a -> 'b -> Html.t) -> 'a -> 'b -> Html.t
  val lazy3 : ('a -> 'b -> 'c -> Html.t) -> 'a -> 'b -> 'c -> Html.t

Here, the first argument is the render function, and the remaining arguments are the values to be passed to the render function.

The idea is that a call to one of these lazy functions won't call the render function immediately. Instead, it creates a special node that stores the render function and its arguments for later. The render function is only called as part of the process of diffing two virtual DOM trees. When the diff gets to the point of comparing two such nodes, it first compares the things that the node was built from, i.e., the render function and its arguments. If they're the same, then the diff is empty. If they differ, then the render function is run to compute more of the tree, and the diffing process continues from there.

It's worth noting that forcing the render function for a given node to run will create more of the virtual DOM tree, but it won't necessarily build everything below that node. In particular, the tree that's created may contain yet more lazy nodes, which won't be forced until the diffing process gets to them.

By making enough nodes lazy in this way, you can incrementalize the computation of the entire virtual DOM tree, only forcing the recomputation of parts of the virtual DOM that could have changed given the changes in the underlying data model.
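Here's a sketch of that short-circuiting, with invented types: a lazy node stores the render function and its argument unapplied, and the diff compares them (by physical equality) before forcing anything.

```ocaml
(* A lazy virtual DOM node: the render function and its argument
   are stored without being applied. Invented types, for
   illustration only. *)
type html = Leaf of string

type 'a lazy_node = { render : 'a -> html; arg : 'a }

(* Count forced renders so the short-circuit is observable. *)
let forced = ref 0

let render_greeting name = incr forced; Leaf ("hello, " ^ name)

(* Diffing two lazy nodes: if the stored function and argument are
   physically equal, the subtrees must agree, so the diff is empty
   and the render function never runs. Otherwise, force the new
   node and keep diffing from there. *)
let diff_lazy old_node new_node =
  if old_node.render == new_node.render && old_node.arg == new_node.arg
  then None
  else Some (new_node.render new_node.arg)
```

In a real library, the tree that forcing produces may itself contain more lazy nodes, which stay unforced until the diff reaches them.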

Elm's approach has some limitations. While it doesn't limit memoization to a particular notion of a component, it does tie it to nodes in the DOM tree. This can be limiting, since it prevents you from sharing other parts of the computation that don't result concretely in DOM nodes.

It's also a little anti-modular, in that you basically need to call your lazy function on simple values and top-level functions, so ordinary functional programming modularization techniques, which often rely on passing around closures, don't work as well as you'd hope.
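The underlying problem is that the laziness check compares the stored render function by physical equality, and a closure allocated fresh on each frame never matches the previous frame's. A small demonstration:

```ocaml
(* Top-level functions are allocated once, so the same function
   shows up, physically equal, frame after frame... *)
let top_level_render x = "<p>" ^ x ^ "</p>"

(* ...but a function that closes over an argument returns a fresh
   closure on every call, so equality-based caching always misses. *)
let make_render prefix = fun x -> "<p>" ^ prefix ^ x ^ "</p>"

let () =
  assert (top_level_render == top_level_render);
  assert (not (make_render "Dr. " == make_render "Dr. "))
```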

Beyond virtual DOM

Virtual DOM isn't the only approach to simplifying the process of programming the DOM. Another example I ran across is Mike Bostock's amazing D3 library. D3 has some of the same goals as virtual DOM, in that it aims to provide a nice way to construct complex web pages based on some more abstract data model. Like virtual DOM, D3's approach lets you specify the view logic once, while producing a view that responds efficiently to changing data. D3 is doing this in the service of data visualization, but the approach it takes is not limited to that domain.

Where virtual DOM encourages you to think of your view calculation as an all at once affair, D3 makes you think about incrementalization explicitly where it matters. In particular, when you specify how the view changes in response to data, you do so by explicitly specifying what happens in three cases: enter, update, and exit. The enter case corresponds to new data points arriving, update corresponds to data points that are changing, and exit corresponds to data being removed.

These transformations are specified using a spiffed-up version of the DOM selectors API, which lets you select a collection of nodes by stating conditions that those nodes satisfy. You can then specify ways of transforming those nodes, and, somewhat surprisingly, specify the creation of nodes that don't exist yet. This is done using the append operation, and is all part of what's called data binding in the D3 world.
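To make the three cases concrete, here's the enter/update/exit split expressed as a plain function over keyed data. The types are invented for illustration; D3 itself operates on DOM selections rather than lists.

```ocaml
(* Classify each key in a keyed data set against the previous
   frame's data: Enter for new keys, Update for surviving keys,
   Exit for vanished ones. Invented types, for illustration. *)
type 'a change =
  | Enter of 'a        (* new datum: create its node *)
  | Update of 'a * 'a  (* surviving datum: old and new values *)
  | Exit of 'a         (* removed datum: delete its node *)

let classify old_data new_data =
  let enters_and_updates =
    List.map
      (fun (key, value) ->
        match List.assoc_opt key old_data with
        | Some old_value -> (key, Update (old_value, value))
        | None -> (key, Enter value))
      new_data
  in
  let exits =
    List.filter_map
      (fun (key, value) ->
        if List.mem_assoc key new_data then None else Some (key, Exit value))
      old_data
  in
  enters_and_updates @ exits
```

Each case can then be handled with its own transformation, which is what makes it natural to attach, say, a fade-in to Enter and a fade-out to Exit.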

If this sounds confusing, well, I found it confusing too. But the D3 approach has some good things going for it. For one thing, it gives you a natural way of thinking about animations, since you can specify simple animations to run on the enter/exit/update actions, which is more awkward in virtual DOM based approaches.

To borrow an analogy from circuits, virtual DOM is level-triggered, meaning the view depends only on the current value of the state; but D3 is edge-triggered, meaning that the display logic can depend on the precise transition that's occurring. This is a real difference in the models, but I'm not sure how important it is in practice.

To some degree, you can get around this issue on the virtual DOM side by expressing more time-dependent information in the model. Also, you can add edge-triggered events on top of your virtual DOM, which React does. That said, it's not as front-and-center in the virtual DOM API as it is with D3, where edge-triggered animations are an easy and natural part of the design.

Incrementality everywhere

Given that incrementality seems to show up in one form or another in all of these web frameworks, it's rather striking how rarely it's talked about. Certainly, when discussing virtual DOM, people tend to focus on the simplicity of just blindly generating your virtual DOM and letting the diff algorithm sort out the problems. The subtleties of incrementalization are left as a footnote.

That's understandable, since for many applications you can get away without worrying about incrementalizing the computation of the virtual DOM. But it's worth paying attention to nonetheless, since more complex UIs need incrementalization, and the incrementalization strategy affects the design of a UI framework quite deeply.

The other benefit of thinking about incrementalization as a first class part of the design is it can lead you in new directions. In that vein, I've been experimenting with using self-adjusting computations, as embodied by our Incremental library, as another approach to incrementalizing computation of the virtual DOM.

Self-adjusting computations is a general-purpose approach to building efficient on-line computations developed by Umut Acar in his dissertation. Thinking about Incremental in the context of GUI development has led us to some new ideas about how to build efficient JavaScript GUIs, and some new ideas about how Incremental itself should work. I hope to write more about that in an upcoming post.

(You can check out the next post here.)

Thanks

Most of what I've written here comes from talking to people who know a lot more about the web than I do, and I wanted to thank them. I had some very enlightening conversations with Jordan Walke about React's design and history. I've also talked a bunch to Evan Czaplicki about Elm, and I'm indebted to Spiros Eliopoulos, who helped me learn a bit about D3, in part through his ocaml-d3 bindings. Also thanks to Hugo Heuzard and Izzy Meckler for writing a bunch of useful code and helping me learn about js_of_ocaml and more generally about various details of JavaScript and modern browsers.