Space Cowboy

生死去来 棚头傀儡 一线断时 落落磊磊

0%

Vue3 源码阅读笔记(五)—— diff 算法解析

Sketch

If you have watched the Vue3 Live Release video, you will know the Vue core team rewrote the diff algorithm.
The new diff algorithm refers to the ivi and inferno, whitch make virtual nodes diff quicker. You can visit this test result to check the accurate data.

As a whole, the algorithm can divide into three parts:

  • patch the repetitive prefix & suffix nodes
  • sync the old & new virtual nodes lengths
  • diff subsequence of the same length

Next, we will explain the algorithm with the source code of Vue3.

prefix & suffix

At first, we need to patch the repetitive prefix & suffix virtual nodes. It is not strictly part of the algorithm, but it can avoid running the diff algorithm during some situations. This strategy can improve the program’s performance.

So, what is the repetitive prefix & suffix? Suppose we have the following two texts:

1
2
I want golden wind
I want silver wind

Look at these two texts, and we will find that each of these texts has the same piece of text at the head & end. These texts are what we call repetitive prefix & suffix. After we patched the repetitive part, we can only just diff the middle part which we call it unknown subsequence.

1
2
golden
silver

In this example, the diff algorithm can just replace the words to complete the diff process. But it is not always such simple.

We can show this process through a figure.

prefix & suffix

  • start: the last index of repetitive prefix nodes
  • prevEnd: the first index of previous virtual nodes’s repetitive suffix
  • nextEnd: the first index of next virtual nodes’s repetitive suffix

The question is: Why just one index at left, but two indexes at right? Because the previous virtual nodes’ length is not always the same as the next virtual nodes.

Talk is cheap, show me the code. The following code is from Vue3, but I change some place to make it readable.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
let start = 0;
let prevEnd = prevChildren.length - 1;
let nextEnd = nextChildren.length - 1;

/**
* 1. sync from start
* (a b) c
* (a b) d e
**/
while (start <= prevEnd && start <= nextEnd) {
const prevVNode = prevChildren[start];
const nextVNode = nextChildren[start];

if (isSameVNodeType(prevVNode, nextVNode)) {
patch(prevVNode, nextVNode, container);
} else {
break;
}

start++;
}

As the code shows, we traverse both the old and new virtual nodes. In this process, we compare the old and new virtual node. If they are the same type, we invoke the patch function to handle them.
In Vue3, the isSameVNodeType function is:

1
2
3
export function isSameVNodeType(n1: VNode, n2: VNode): boolean {
return n1.type === n2.type && n1.key === n2.key
}

Next, we need to handle repetitive suffix virtual nodes:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
/**
* 2. sync from end
* a (b c)
* d e (b c)
**/
while (start <= prevEnd && start <= nextEnd) {
const prevVNode = prevChildren[prevEnd];
const nextVNode = nextChildren[nextEnd];

if (isSameVNodeType(prevVNode, nextVNode)) {
patch(prevVNode, nextVNode, container);
} else {
break;
}

prevEnd--;
nextEnd--;
}

The logic is similar to the previous code, but we start from the end of virtual nodes this time.

After the above process, we will get a pair of virtual nodes whose lengths are not necessarily equal. This means that we may need to add or remove virtual nodes.

First type: common sequence + mount / unmount

Before we explain code, we need to research the shape of virtual nodes. The shape is formed by the head and tail lines of the old and new virtual node groups.

For example, suppose we have two sets of virtual nodes:

real virtual nodes

The shape of these will like this:
virtual nodes shape

So, if we have a pair of virtual nodes which shape is like this. This means we need to mount nodes.
mount nodes

Also, you need to pay attention to the current index which need to be within the prevEnd and nextEnd scope.
The following code is to mount new virtual nodes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
if (start > prevEnd) {
/**
* 3. common sequence + mount
* (a b)
* (a b) c
* start = 2, prevEnd = 1, prevEnd = 2
*
* (a b)
* c (a b)
* start = 0, prevEnd = -1, nextEnd = 0
**/
if (start <= nextEnd) {
// fetch the real DOM node
const ref = nextChildren?.[nextEnd + 1]?.elm;
while(start <= nextEnd) {
mount(nextChildren[start], container, ref || undefined);
start++;
}
}
}

Same logic, for this shape, you need to unmount nodes.
unmount nodes

The code is like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
/**
* 4. common sequence + unmount
* (a b) c
* (a b)
* start = 2, prevEnd = 2, prevEnd = 1
*
* c (a b)
* (a b)
* start = 0, prevEnd = 0, nextEnd = -1
**/
else if (start > nextEnd) {
while (start <= prevEnd) {
unmount(prevChildren[start], container);
start++;
}
}

Second type: unknown subsequence

After the previous treatment, we will get an unknown subsequence. But before we start coding, we need to analyze the unknown subsequence.

Specifically, we need to deal with the following possible situations.

  • unmount the node which does not exist in the new subsequence
  • mount new node
  • reorder the remaining node

So, the first question is: How can I know the situation of the node?
We can iterate over the old node group and determine whether the node still exists in the new node group.
But if you have thought over this logic, you will find the most performance-consuming part is the determine process. If we use the traditional way, like iterate over new node group for every node from old node group and determine it, the complexity is going to be O(n^2). We certainly don’t accept that.
However, if we can determine the node quickly, there will be no problems with this scheme.
The solution is: Add unique key for each node, and determine node through hash table.
Why does this work? Because the hash table lookup is O(1), and determine the node’s key also be O(1). This improvement can greatly increase the speed of finding the same node.

The user passes in the node’s key, and we will store it in the h function, which generates a virtual node.

We can use a Map to store the key-index pair. The index is the index of the node in the old node group.

1
2
3
4
5
6
7
8
9
// the length of new or old virtual node group is not always the same
const prevStart = start;
const nextStart = start;

const keyToIndexMap: Map<string | number, number> = new Map();

for (let i = nextStart; i <= nextEnd; i++) {
keyToIndexMap.set(nextChildren[i].key, i)
}

Now, we need to loop through old node group left to be patched and try to patch matching nodes & remove nodes that are no longer present.

But why should we loop through the old node group? This because all node move operations operate on the old node group, so this operation can minimize the number of moves.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
const needPatch = nextEnd - nextStart + 1;
let patched = 0;

let moved = false;
let maxNewindexSoFar = 0;

// works as Map<newIndex, oldIndex>
// 1. determin if the node need to unmount
// 2. determin longest stable subsequence for node movement
const newIndexToOldIndexMap = new Array(needPatch).fill(0);

for (let i = prevStart; i <= prevEnd; i++) {
const prevVNode = prevChildren[i];
if (patched >= needPatch) {
unmount(prevVNode, container);
continue;
}

let newIndex;
if (!!prevVNode.key) {
newIndex = keyToIndexMap.get(prevVNode.key);
} else {
// handle key-less node
for (let j = prevStart; j <= prevEnd; j++) {
if (newIndexToOldIndexMap[j - prevStart] && isSameVNodeType(prevVNode, nextChildren[j])) {
newIndex = j;
break;
}
}
}

if (newIndex) {
newIndexToOldIndexMap[newIndex - nextStart] = i + 1;

if (newIndex >= maxNewIndexSoFar) {
maxNewIndexSoFar = newIndex;
} else {
moved = true;
}

patch(prevVNode, nextChildren[newIndex], container);
patched++;
} else {
unmount(prevVNode, container);
}
}

The second question is: How do we deal with nodes which is in a different order?
Well, our goal is to minimize DOM movement. So, we need to find a way to extract information about the virtual node’s relative position.
Plus, if you read the code above, you’ll notice the moved variable. The variable’s type is boolean, which means if the node has moved. Specifically, if the newIndexToOldIndexMap’s value is increasing, that means the node has not moved, the other way around, it moves.
So, we need to find out how many nodes in the node group are not moving. The solution is: Longest Increasing Subsequence.

In computer science, the Longest Increasing Subsequence problem is to find a subsequence of a given sequence in which the subsequence’s element are in sorted order, lowest to highest, and in which the subsequence is as long as possible.

Why is it work? Because Longest Increasing Subsequence’s mathematical sense is: whether the node remains locally ordered. The problem is complicated, so I’ll treat it as a black box for now.

The code will be:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
/**
* 3. common sequence + mount
* (a b)
* (a b) c
* start = 2, prevEnd = 1, prevEnd = 2
*
* (a b)
* c (a b)
* start = 0, prevEnd = -1, nextEnd = 0
**/

// lis function will generate longest increasing subsequence
const seq = moved ? lis(newIndexToOldIndexMap) : [];
let remaining = seq.length - 1;

for(let i = needPatch - 1; i >= 0; i--) {
const nextVNode = nextChildren[nextStart + i];
// If the value is 0, that means this is a new node
if (newIndexToOldIndexMap[i] === 0) {
mount(nextVNode, container, nextChildren?.[nextStart + i + 1]?.elm || undefined);
} else if (moved) {
if (remaining < 0 || i != seq[remaining]) {
container.insertBefore(nextVNode.elm as Node, nextChildren?.[i + nextStart + 1]?.elm || null);
} else {
remaining--;
}
}
}

Conclusion

In this article, we talk about diff algorithm in Vue3. Here is the point:

  • Before the algorithm execute, we can patch the repetitive prefix & suffix nodes first.
  • If we get common sequence + other node, we need to mount or unmount node.
  • If we get unknown sequence, we need to find its Longest Increasing Subsequence. Move the node that need to move, mount the node that need to mount.

Reference

  1. vue-next: https://github.com/vuejs/vue-next
  2. 渲染器: http://hcysun.me/vue-design/zh/renderer-diff.html#inferno-所采用的核心-diff-算法及原理
  3. Longest increasing subsequence wikipedia: https://en.wikipedia.org/wiki/Longest_increasing_subsequence