9.5.4 逃逸分析

遍历有向图

当数据流有向图构建完成之后，编译器便会基于该结构进行逃逸分析。编译器首先遍历所有顶点，依次分析指向该顶点的路径（Path）上的所有顶点是否需要逃逸，编译器内部通过两个 LIFO Queue 来实现该双重循环。入口函数 walkAll() 也是外层循环，其代码如下：

func (b *batch) walkAll() {
    todo := make([]*location, 0, len(b.allLocs)+1)
    enqueue := func(loc *location) {
        if !loc.queued {
            todo = append(todo, loc)
            loc.queued = true
        }
    }

    // 将所有的顶点压入队列
    for _, loc := range b.allLocs {
        enqueue(loc)
    }
    // 将代表堆地址的顶点压入队列
    enqueue(&b.heapLoc)

    var walkgen uint32
    // 从队列尾部开始遍历队列，并调用方法 walkOne 对当前顶点进行分析
    for len(todo) > 0 {
        root := todo[len(todo)-1]
        todo = todo[:len(todo)-1]
        root.queued = false

        walkgen++
        b.walkOne(root, walkgen, enqueue)
    }
}

该方法用于构建外层队列，内层循环及分析逻辑在方法 walkOne() 中。可以发现入队函数 enqueue() 也作为参数传递给了 walkOne(), 这意味着对顶点 root 的分析过程中，有可能会将某些顶点再次压入队列（这里说再次是因为前面通过遍历 b.allLocs 列表，已经将所有的顶点压入队列了）。

方法 walkOne() 以参数 root 为根节点，逐个分析指向该顶点的路径（Path）上的所有顶点，例如路径：root <- A … <- P … <- Z 中，如果 root 需要逃逸，而 root 实际上持有的是 P 的指针的话，那么 P 也需要逃逸。而此时我们就需要通过参数 enqueue 将顶点 P 再次压入外层队列，因为 walkOne() 只分析 root 与路径上各个顶点的关系，如果我们发现路径上某节点 P 需要逃逸的话，那么就有必要将 P 作为根节点再次进行分析（可能外围队列之前已经遍历过顶点 P 了，而当时 P 可能不需要逃逸，所以这里是再次分析），所以我们需要将顶点 P 再次压入外层队列。

内部队列的作用是让我们能够遍历指向根节点 root 路径上的所有顶点，抛开逃逸分析的逻辑，构建以及遍历逻辑如下：

func (b *batch) walkOne(root *location, walkgen uint32, enqueue func(*location)) {
    root.walkgen = walkgen
    root.derefs = 0
    root.dst = nil

    todo := []*location{root} // LIFO queue
    // 遍历队列
    for len(todo) > 0 {
        l := todo[len(todo)-1]
        todo = todo[:len(todo)-1]

        // 删除对 l 进行逃逸分析的代码

        // 在 for 循环末尾将指向该顶点的所有顶点压入队列
        for i, edge := range l.edges {
            if edge.src.escapes {
                continue
            }
            d := derefs + edge.derefs
            if edge.src.walkgen != walkgen || edge.src.derefs > d {
                edge.src.walkgen = walkgen
                edge.src.derefs = d
                edge.src.dst = l
                edge.src.dstEdgeIdx = i
                todo = append(todo, edge.src)
            }
        }
    }
}

对每一个顶点的逃逸分析都发生在内层循环中，弄清楚了整个有向图的遍历方式之后，我们可以看看对于根节点 root 路径上的某节点 l 如何进行逃逸分析了。

逃逸分析逻辑

简单来说，对于根节点 root 及其路径上的节点 l, 如果下列两个关系均满足的话，那么 l 就需要逃逸：

root 的生命周期长于 l 判断该逻辑的方法是 batch.outlives(), 其代码如下：

// outlives reports whether values stored in l may survive beyond
// other's lifetime if stack allocated.
func (b *batch) outlives(l, other *location) bool {
    // The heap outlives everything.
    if l.escapes {
        return true
    }

    // We don't know what callers do with returned values, so
    // pessimistically we need to assume they flow to the heap and
    // outlive everything too.
    if l.isName(ir.PPARAMOUT) {
        // Exception: Directly called closures can return
        // locations allocated outside of them without forcing
        // them to the heap. For example:
        //
        //    var u int  // okay to stack allocate
        //    *(func() *int { return &u }()) = 42
        if containsClosure(other.curfn, l.curfn) && l.curfn.ClosureCalled() {
            return false
        }

        return true
    }

    // If l and other are within the same function, then l
    // outlives other if it was declared outside other's loop
    // scope. For example:
    //
    //    var l *int
    //    for {
    //        l = new(int)
    //    }
    if l.curfn == other.curfn && l.loopDepth < other.loopDepth {
        return true
    }

    // If other is declared within a child closure of where l is
    // declared, then l outlives it. For example:
    //
    //    var l *int
    //    func() {
    //        l = new(int)
    //    }
    if containsClosure(l.curfn, other.curfn) {
        return true
    }

    return false
}

该函数的实现还是比较清晰的，特别的可以看到：对于函数的返回值，由于无法知道函数调用者会如何处理返回值，所以当root节点为函数返回值时，编译器会保守得返回true。

root 持有的指针链包含 l 这里我们引入了“指针链条”这个词，对于赋值关系：a = &b, b = &c, c = &d, 那么我们就说赋值链 a <- b <- c <- d 中从 a 到 d 是一个指针链条。显然，如果 a 需要逃逸的话，整个赋值链上的所有变量都需要逃逸。
对应到有向图上， root 持有的指针链条是否包含 l 是通过有向图中 l 到 root 路径上的各个边的权重（Derefs）来确定的。此处我们定义 W(a, b) 为赋值链 a <- … <- b 中 b 相对于 a 的总权重。对于如下调用链：

W(~r1, l2) = 0 + 1 + -1 = 0, 所以 ~r1 实际持有的指针链中没有包含 l2；而 W(~r1, l1) = W(~r1, l2) + -1 = -1, 表明 ~r1 实际持有的是 l1 的指针。

由于 Go 语言中表达式 &l 本身是不可寻址的，即 &&l 为非法表达式，因此在计算 W(a, b) 时，不能简单得将整个路径上所有边的权重（Derefs）直接加起来，例如上图中，我们不能说 W(~r1, t) = -2，因为根据算法思路中提到的权重计算方法，-2 代表的意义是 &&t 这样的操作，这也不符合实际情况。所以对于赋值链 root <- … <- x <- t，当 W(root, x) = -1 时，我们规定 W(root, t) = min(-1, -1 + W(x, t)), 由此可以推论出对于任意两个顶点a 与 b, W(a, b) 的最小值为 -1. 如果 W(a, b) = -1, 我们就知道 b 在 a 的指针链中，对于逃逸分析而言这就够了。

但 Go 中解引用操作（*）却可以多次应用，例如 ***x 是合法的表达式，代表着对 x 所持有的指针进行三次解引用操作，因此对于赋值链 root <- … <- x <- t, 如果 W(root, x) >= 0, 那么易得 W(root, t) = W(root, x) + W(x, t)

在计算赋值链的权重时，不会出现 w(a, b) < -1 的情况实际上并不是一种规定，而是一种结果。虽然在上图的链条 l3<-l2<-l1<-t 中每条边的权重都是 -1, 但并不能说 W(l3, t) = -3, 因为 l3 并没有对 t 取三次地址，而是对 l2 取了一次地址，然后 l2 又对 l1 取了一次地址，这样依次下去。因此 W(l3, t) 本身就应该等于 -1, 代表着这条赋值链的最终效果就是对 t 取了一次地址。

总体来说就是在一条赋值链中，取址运算（&）的效果是不会传递并叠加的，不管有多少中间变量存在，Go 的按值拷贝（copy-by-value）特性保证了每次取址操作的对象都是一个新的变量。

但运算符 * 的效果却是可以传递和叠加的，所以如果权重为正，总赋值链的总权重直接相加即可。

基于以上两点的分析，在内层队列循环中，逃逸分析的代码可以精简成如下如下样子：

func (b *batch) walkOne(root *location, walkgen uint32, enqueue func(*location)) {
    root.walkgen = walkgen
    root.derefs = 0
    root.dst = nil

    todo := []*location{root} // LIFO queue
    for len(todo) > 0 {
        l := todo[len(todo)-1]
        derefs := l.derefs

        // If l.derefs < 0, then l's address flows to root.
        addressOf := derefs < 0

        if addressOf {
            // For a flow path like "root = &l; l = x",
            // l's address flows to root, but x's does
            // not. We recognize this by lower bounding
            // derefs at 0.
            derefs = 0
        }

        if b.outlives(root, l) {
            if addressOf && !l.escapes {
                l.escapes = true
                enqueue(l)
                continue
            }
        }

        // 删除掉构建内部队列的代码
    }
}

示例分析

我们来看看如下代码的逃逸分析：

package main

type T struct {
    Name string
}

func GetT() **T {
    var t T
    l1 := &t
    l2 := &l1
    r3 := l1
    r4 :=&r3

    var l4 **T
    l4 = l2
    l4 = r4

    return l4
}

函数 GetT() 的数据流有向图为：

我们在这里只标记了分析出逃逸结果的几条路径，编译器分别根据路径 A, B, C 分析出 r3, l1 与 t 需要逃逸。将以上代码保存至 main.go, 通过命令 go tool compile -G=3 -l -m -json "0,file:///tmp/main.json" main.go 可以得出同样结论：

main.go:8:6: moved to heap: t
main.go:9:2: moved to heap: l1
main.go:11:2: moved to heap: r3

上一页9.5.3 构建数据流有向图下一页9.6 编译日志

最后更新于3年前

这有帮助吗？