Hydration Design Documentation

Jason Miller edited this page Jan 30, 2020 · 2 revisions

Hydrating Text

Hydrating DOM Text nodes is tricky, primarily because HTML cannot express adjacent text:

<div id="p">ab</div>
<script>
  p.childNodes.length // 1
  p.firstChild.data   // "ab"
</script>

In Preact, it's extremely common to render adjacent Text (string) children. This means hydrating from server-rendered or pre-rendered HTML requires potentially reconciling multiple Text children with a single DOM Text representation. How do we do this?

Preact 8

In Preact versions 8 and prior, adjacent Text children were merged together by h(). This meant that Preact's client-side diffing logic was doing the same normalization as serialized HTML: collapsing all Text siblings into a single "node" representation. Here's how that worked in Preact 8:

const tree = <div id="p">{"a"}{"b"}</div>
tree.children  // ["ab"]
scratch.innerHTML = renderToString(tree)
render(tree, scratch) // no mutations, since p.firstChild.data is already "ab"
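
As a rough illustration of that normalization, here is a minimal sketch of an h()-style function that merges adjacent string children. It is not Preact 8's actual source: the name h8 and the shape of the returned object are hypothetical, and null/boolean children are simply skipped to keep the example short.

```javascript
// Hypothetical, simplified h(): merges adjacent string/number children
// into one string, the way serialized HTML collapses Text siblings.
function h8(type, props, ...rest) {
  const children = [];
  for (const child of rest.flat(Infinity)) {
    // Simplification for this sketch: drop empty children entirely.
    if (child == null || typeof child === 'boolean') continue;
    const isText = typeof child === 'string' || typeof child === 'number';
    const last = children.length - 1;
    if (isText && typeof children[last] === 'string') {
      // Previous child is text too: append instead of adding a sibling.
      children[last] += String(child);
    } else {
      children.push(isText ? String(child) : child);
    }
  }
  return { nodeName: type, attributes: props || {}, children };
}

const tree8 = h8('div', { id: 'p' }, 'a', 'b');
console.log(tree8.children); // [ 'ab' ]
```

Because the merge happens when the tree is built, the diff only ever sees one Text child here, which is also what the parsed HTML contains.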

Preact X

Preact X moved to granular Text nodes, likely as a performance improvement:

const tree = <div id="p">{"a"}{"b"}</div>
tree.props.children // ["a", "b"]
tree._children // [{props:"a"}, {props:"b"}]

This allows the renderer to perform fine-grained updates on Text. When re-rendering, rather than merging changed and unchanged text together and re-applying the result to the DOM, Preact X updates only the DOM Text node corresponding to each changed string value in the Virtual DOM.

render(<div id="p">{"a"}{"b"}</div>, scratch)
render(<div id="p">{"a"}{"c"}</div>, scratch) // p.lastChild.data = "c"
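
The granular update can be modeled without a DOM. The sketch below is an assumption-laden illustration, not Preact's diff: patchText is a hypothetical helper, plain objects with a data field stand in for DOM Text nodes, and the old and new child lists are assumed to have the same length.

```javascript
// Hypothetical stand-ins for DOM Text nodes: plain objects with `data`.
// Writes only to the nodes whose string actually changed and returns
// the indices that were touched.
function patchText(textNodes, oldStrings, newStrings) {
  const touched = [];
  for (let i = 0; i < newStrings.length; i++) {
    if (oldStrings[i] !== newStrings[i]) {
      textNodes[i].data = newStrings[i];
      touched.push(i);
    }
  }
  return touched;
}

const nodes = [{ data: 'a' }, { data: 'b' }];
const touched = patchText(nodes, ['a', 'b'], ['a', 'c']);
console.log(touched); // [ 1 ]
```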

Before we get into its hydration implications, it's worth calling out that this design decision was a tradeoff at the time, and is worth continually revisiting. There is memory overhead associated with constructing more Text nodes, even if the resulting rendered Text is identical. There are also potentially cases where granular Text updates could be slower than the previous merged-siblings approach, such as multiple updated adjacent Text values. Finally, it's possible that preserving Arrays of Text-only VNodes causes the containing Element VNode to fail memoization checks and re-render unnecessarily, since the children Array is reconstructed every render.

The Decision

However, the primary reason this design is important is that HTML merges adjacent Text siblings, but Preact X does not. When hydrating, the renderer should never perform DOM mutations (for performance reasons). Yet the renderer will generally encounter mismatched Text as a result of this difference. There are two solutions to this problem:

  1. bring back adjacent Text sibling merging; or
  2. "backport" the DOM's merged structure into the VDOM tree being hydrated

Option 1. Merge Adjacent Text Siblings

Preact X is actually very well-situated to revisit the adjacent Text sibling merging approach. Instead of mutating developer-visible VNodes, Preact now maintains a separate internal tree representation within a private vnode._children property. String children are converted into Text VNodes with a type value of null and vnode.props corresponding to their content. The renderer's flattened and normalized vnode._children Array is an ideal place to also perform adjacent Text merging, since it's hidden from developers and contains the necessary type information to determine when merging can occur.

Here's the change that would be required in the toChildArray() function to implement adjacent text merging:

export function toChildArray(children, callback, flattened) {
  if (children == null || typeof children === 'boolean') {
    /* snip */
  } else if (typeof children === 'string' || typeof children === 'number') {
    // this is where we convert strings to Text VNodes
-   flattened.push(callback(createVNode(null, children, null, null)));
+   const len = flattened.length;
+   const last = len && flattened[len - 1];
+   // if the previous VNode is Text, append this string child to it:
+   if (last && last.type === null) {
+     // Ensure it's a string, to avoid toChildArray([1,2]) returning [3]
+     last.props += String(children);
+   }
+   else {
+     // Same as today for Text nodes not preceded by other Text:
+     flattened.push(callback(createVNode(null, children, null, null)));
+   }
  } else if (children._dom != null || children._component != null) {
    /* snip */
  }
  return flattened;
}

One important note: diffChildren() currently relies on the above callback being invoked for every Virtual DOM node, since it uses toChildArray both to produce vnode._children and to iterate over it. This breaks with adjacent text merging, since callback is invoked on the first Text child before its Text siblings have been merged into it. Because callback triggers diffing for each child when it is invoked, it's too late to append content to the previous Text sibling in the next iteration of toChildArray's loop.

This can be fixed by removing the callback codepath from toChildArray and modifying diffChildren to use a for loop. This was how the new diff in Preact X worked prior to release, so we know it's possible.
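
To make that restructuring concrete, here is a self-contained sketch of a callback-free toChildArray paired with a for-loop diffChildren. This is an assumption about how the pieces could fit together, not the pre-release Preact X code: createVNode is reduced to two fields, empty children are skipped rather than kept as holes, and the diffChild parameter is a hypothetical hook standing in for the real per-child diff.

```javascript
// Hypothetical minimal VNode factory. Text VNodes use type === null and
// keep their string in `props`, as described above.
function createVNode(type, props) {
  return { type, props };
}

// Callback-free toChildArray: flattens children and merges adjacent Text.
function toChildArray(children, flattened = []) {
  if (children == null || typeof children === 'boolean') {
    // Simplification for this sketch: empty children are skipped.
  } else if (Array.isArray(children)) {
    for (let i = 0; i < children.length; i++) {
      toChildArray(children[i], flattened);
    }
  } else if (typeof children === 'string' || typeof children === 'number') {
    const len = flattened.length;
    const last = len ? flattened[len - 1] : null;
    if (last && last.type === null) {
      // Previous VNode is Text: append rather than push a sibling.
      last.props += String(children);
    } else {
      flattened.push(createVNode(null, String(children)));
    }
  } else {
    flattened.push(children);
  }
  return flattened;
}

// diffChildren iterates with a plain for loop, so every child is diffed
// only after all merging has finished.
function diffChildren(newParentVNode, diffChild) {
  const newChildren = toChildArray(newParentVNode.props.children);
  newParentVNode._children = newChildren;
  for (let i = 0; i < newChildren.length; i++) {
    diffChild(newChildren[i], i);
  }
}

const parent = createVNode('div', {
  children: ['a', 'b', createVNode('span', {}), 1, 2]
});
const seen = [];
diffChildren(parent, (child) => {
  seen.push(child.type === null ? child.props : child.type);
});
console.log(seen); // [ 'ab', 'span', '12' ]
```

Note that the numeric children come out as "12" rather than 3, which is the case the String() coercion in the diff above guards against.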

Option 2. Coerced Adjacent Text Hydration

First, let's illustrate how Text hydration currently works. In particular, notice that hydration must perform DOM mutations in order to "split apart" HTML-serialized Text nodes to match the Virtual DOM tree:

const tree = <div id="p">{"a"}{"b"}</div>
tree.props.children  // ["a", "b"]

scratch.innerHTML = renderToString(tree)
p.childNodes // ["ab"]  // HTML (from SSR/prerendering) merges them together

// hydrate from the HTML:
hydrate(tree, scratch) // p.firstChild.data = "a", p.append("b")

Option 2 applies a specific patch to work around this during hydration, effectively deferring the DOM tree "correction" until the next non-hydrate render. Leveraging the fact that hydration should never mutate the DOM, we can ignore all Text differences encountered during hydration. This is accomplished relatively easily, because there are already separate codepaths for handling Text updates and creating new Text nodes.

function diffElementNodes( /* snip */ ) {
  /* snip */
  if (dom == null) {
    if (newVNode.type === null) {
+     // This is the codepath for creating new Text nodes, however during
+     // hydration we never want to create nodes, so we bail out.
+     // As a result, hydration only attaches/hydrates to the first Text VNode.
+     // Any subsequent updates will trigger diffing to split it into separate nodes.
+     if (isHydrating) return null;
+
      return document.createTextNode(newProps);
    }
    /* snip */
  }
  if (newVNode.type === null) {
-   if (oldProps !== newProps && dom.data != newProps) {
-     dom.data = newProps;
-   }
+   // Any Text *mutations* during hydration are also now ignored.
+   // Instead, we modify the VNode's text to reflect the DOM's merged text.
+   // Any future update will see this "incorrect" text value, and trigger
+   // diffing to "correct" the DOM value and create any additional Text nodes.
+   if (isHydrating) {
+     newVNode.props = dom.data;
+   }
+   else if (oldProps !== newProps) {
+     dom.data = newProps;
+   }
  }
  /* snip */
}

With this approach, diffElementNodes actually replaces the VNode's text value (vnode.props) with the DOM's value (dom.data), which we know contains all adjacent Text. Since we've also prevented new DOM Text nodes from ever being created during hydration, this means the initial Virtual DOM tree is partially derived from the DOM tree and reflects its text-merged state.
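
The effect of the patch can be simulated without a browser. The following is a minimal sketch under stated assumptions: diffText is a hypothetical reduction of the Text branches of diffElementNodes, isHydrating is passed as an argument rather than read from renderer state, and plain objects with a data field stand in for DOM Text nodes.

```javascript
// Hypothetical, simplified model of Option 2's Text handling.
// `dom` stands in for an existing DOM Text node ({ data }) or null.
function diffText(dom, newVNode, oldProps, isHydrating) {
  const newProps = newVNode.props;
  if (dom == null) {
    // Creation path: never create nodes while hydrating.
    if (isHydrating) return null;
    return { data: newProps };
  }
  if (isHydrating) {
    // Backport the DOM's merged text into the VNode; leave the DOM alone.
    newVNode.props = dom.data;
  } else if (oldProps !== newProps) {
    dom.data = newProps;
  }
  return dom;
}

// Server HTML "<div>ab</div>" yields one Text node for two Text VNodes.
const serverText = { data: 'ab' };
const existing = [serverText];
const vnodes = [
  { type: null, props: 'a' },
  { type: null, props: 'b' }
];

const attached = vnodes.map((vnode, i) =>
  diffText(existing[i] || null, vnode, undefined, true)
);

console.log(serverText.data); // "ab" (the DOM was not mutated)
console.log(vnodes[0].props); // "ab" (the VNode now mirrors the DOM)
console.log(attached[1]);     // null (no node created for the second VNode)
```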

It's easiest to understand this by pretending Preact's renderer has a "pre-processing step". Essentially, this approach turns a hydrate call with adjacent Text nodes into a hydrate call with one Text node and adjacent holes:

// what the developer authored:
hydrate(<div>{"a"}{"b"}</div>)

// what the renderer turns it into:
hydrate(<div>{"ab"}{null}</div>)

Eventually, the renderer needs to correct this and create the Text nodes as defined in the Virtual DOM. Because our hydration took values out of the DOM and put them back into the Virtual DOM tree, when the next non-hydrate render touches these Text nodes, it will see their "incorrect" state. The first Text node's value will be updated to reflect its VDOM string value, and the remaining Text nodes will be constructed with their parts of the string.

Effectively, the next render() will explode the collapsed Text node back out into its parts automatically, because it will see that there is a difference between the old and new children:

render(<div id="p">{"a"}{"b"}</div>, scratch) // mutations:  p.firstChild.data="a", p.append("b")

Notice that the first Text VNode's text is corrected in-place, whereas the remaining VNodes are created anew. These are exactly the same operations that are currently performed during hydrate(); they're just deferred until the first side-effecting render pass after hydration has completed.
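
The deferred correction can be sketched the same way. This is an illustrative model rather than Preact's renderer: renderTextChildren is a hypothetical helper, parent.childNodes is a plain array standing in for the DOM, and each old child records the text the VDOM believes is rendered (props) along with its DOM node (dom, or null if hydration attached nothing).

```javascript
// Hypothetical model of the first non-hydrate render after Option 2
// hydration. Returns the new child records and a log of DOM mutations.
function renderTextChildren(parent, oldChildren, newStrings) {
  const mutations = [];
  const next = newStrings.map((text, i) => {
    const old = oldChildren[i] || { props: undefined, dom: null };
    let dom = old.dom;
    if (dom == null) {
      // No DOM node was attached during hydration: create one now.
      dom = { data: text };
      parent.childNodes.push(dom);
      mutations.push(`append("${text}")`);
    } else if (old.props !== text) {
      // The backported "ab" differs from "a": correct it in place.
      dom.data = text;
      mutations.push(`childNodes[${i}].data="${text}"`);
    }
    return { props: text, dom };
  });
  return { next, mutations };
}

// State left behind by hydrating two Text VNodes against "<div>ab</div>":
const textNode = { data: 'ab' };
const p = { childNodes: [textNode] };
const afterHydrate = [
  { props: 'ab', dom: textNode },
  { props: 'b', dom: null }
];

const { next, mutations } = renderTextChildren(p, afterHydrate, ['a', 'b']);
console.log(mutations); // [ 'childNodes[0].data="a"', 'append("b")' ]
```

Running the same function again with the returned next records and the same strings produces no mutations, which matches the expectation that the correction happens only once.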