go run -gcflags='-m -m' sample2.go
./sample2.go:5: make([]interface {}, lc) escapes to heap
./sample2.go:6: lc escapes to heap
It is no surprise that make allocates from the heap. After changing interface{} to int:
go run -gcflags='-m -m' sample2.go
./sample2.go:5: make([]interface {}, lc) escapes to heap
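The slice returned by make is allocated on the heap and outlives the function. When the slice holds references, the referenced variables are moved to the heap as well. Because interface{} can hold a value of any type, escape analysis conservatively assumes that the stored value may be a reference and moves the variable to the heap. Related material from Stack Overflow: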
make for a slice returns a slice descriptor struct (pointer to underlying array, length, and capacity) and allocates an underlying slice element array. The underlying array is generally allocated on the heap: make([]int, lc) escapes to heap from make([]int, lc).
s[0] = &v stores a reference to the variable v (&v) in the underlying array on the heap: &v escapes to heap from s[0] (slice-element-equals), moved to heap: v. The reference remains on the heap, after the function ends and its stack is reclaimed, until the underlying array is garbage collected.
If the make slice capacity is a small (compile time) constant, make([]*int, 1) in your example, the underlying array may be allocated on the stack. However, escape analysis does not take this into account.
func Append(slice, data []byte) []byte {
l := len(slice)
if l + len(data) > cap(slice) { // reallocate
// Allocate double what’s needed, for future growth.
newSlice := make([]byte, (l+len(data))*2)
// The copy function is predeclared and works for any slice type.
copy(newSlice, slice)
slice = newSlice
}
slice = slice[0:l+len(data)]
copy(slice[l:], data)
return slice
}
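As a usage sketch of the Append function above (the main function and the sample data are illustrative, not from the original), the following program shows one call that has to reallocate and one that fits into the spare capacity:

```go
package main

import "fmt"

// Append is the function from the text: it grows the slice by hand when
// the existing capacity cannot hold the extra data.
func Append(slice, data []byte) []byte {
	l := len(slice)
	if l+len(data) > cap(slice) { // reallocate
		// Allocate double what is needed, for future growth.
		newSlice := make([]byte, (l+len(data))*2)
		copy(newSlice, slice)
		slice = newSlice
	}
	slice = slice[0 : l+len(data)]
	copy(slice[l:], data)
	return slice
}

func main() {
	base := make([]byte, 2, 2) // len 2, cap 2: no spare room
	copy(base, "ab")

	grown := Append(base, []byte("cd")) // reallocates: cap becomes (2+2)*2
	fmt.Println(string(grown), len(grown), cap(grown)) // abcd 4 8

	again := Append(grown, []byte("ef")) // fits in the spare capacity
	fmt.Println(string(again), len(again), cap(again)) // abcdef 6 8
	fmt.Println(&grown[0] == &again[0])                // true: same backing array
}
```

The second call reuses the array allocated by the first, which is why doubling the allocation pays off for repeated appends.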
The prototype of append is shown below, where T stands for any element type.
func append(s []T, x ...T) []T
Detailed analysis
To make the program easier to analyze, we add print statements to it. The code and the results are as follows:
./q.go:7:6: cannot inline main: non-leaf function
./q.go:14:13: s1 escapes to heap
./q.go:14:13: from ... argument (arg to ...) at ./q.go:14:13
./q.go:14:13: from *(... argument) (indirection) at ./q.go:14:13
./q.go:14:13: from ... argument (passed to call[argument content escapes]) at ./q.go:14:13
./q.go:8:13: ([]byte)("") escapes to heap
./q.go:8:13: from s (assigned) at ./q.go:8:4
./q.go:8:13: from s1 (assigned) at ./q.go:11:5
./q.go:8:13: from s1 (interface-converted) at ./q.go:14:13
./q.go:8:13: from ... argument (arg to ...) at ./q.go:14:13
./q.go:8:13: from *(... argument) (indirection) at ./q.go:14:13
./q.go:8:13: from ... argument (passed to call[argument content escapes]) at ./q.go:14:13
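In the output above, s1 escapes to heap and ([]byte)("") escapes to heap show that introducing the line fmt.Println(s1, "===", s2) changed how the variables are allocated. Put simply, they escaped from the stack to the heap. Escape analysis is covered in detail in a later section. At this point we have a rough idea of the cause, but how can we verify it at the code level? We start from the places in the Go source that call runtime.stringtoslicebyte. A search shows that the call is generated in cmd/compile/internal/gc/walk.go.
The code that handles the string to []byte conversion is as follows: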
case OSTRARRAYBYTE:
a := nodnil() // default behavior: allocate on the heap
if n.Esc == EscNone {
// Create temporary buffer for slice on stack.
t := types.NewArray(types.Types[TUINT8], tmpstringbufsize)
a = nod(OADDR, temp(t), nil) // allocate on the stack, size 32
}
n = mkcall("stringtoslicebyte", n.Type, init, a, conv(n.Left, types.Types[TSTRING]))
Definition of OSTRARRAYBYTE:
OSTRARRAYBYTE // Type(Left) (Type is []byte, Left is a string)
The n.Esc == EscNone condition above shows that the slice is initialized differently depending on whether or not it escapes. The definition of EscNone:
EscNone // Does not escape to heap, result, or parameters.
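With the analysis above, we have finally found the answer to the mystery. This analysis is based on Go 1.9.2; memory allocation may work differently in other Go versions. For details, see my colleague's more thorough analysis, Go中string转[]byte的陷阱.md.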
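Since the capacity of a converted []byte depends on whether the stack buffer is used, code that appends to the same converted slice more than once can pin the capacity itself. The following is a minimal sketch under that assumption (the function name appendIndependently is made up here); it uses a full slice expression so that the result does not depend on escape analysis:

```go
package main

import "fmt"

// appendIndependently builds two slices from the same empty []byte.
// The full slice expression s[:0:0] sets the capacity to 0, so each
// append must allocate its own backing array and the results never alias.
func appendIndependently() (string, string) {
	s := []byte("")
	s = s[:0:0]
	s1 := append(s, 'a')
	s2 := append(s, 'b')
	return string(s1), string(s2)
}

func main() {
	a, b := appendIndependently()
	fmt.Println(a, b) // a b
}
```

Without the s[:0:0] step, whether s1 and s2 share storage is an implementation detail of the compiler version in use.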
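Go memory management
Go manages memory automatically, which avoids the trouble of manual memory management in C. At the same time, it hides the details of memory management and reclamation, which can make debugging and tuning harder. Automatic memory management is also a hard problem: when multi-level function calls, closures, repeated assignment of structs or channels, slices and maps, cgo calls, and similar cases are combined, the automatic optimizations often stop applying and allocation falls back to the basic, unoptimized behavior. Go's garbage collection (GC) strategy is also being optimized continuously. Since the first release, GC has been the most criticized part of Go, but almost every release has brought GC improvements. Some of the more important changes are listed below.
v1.1 STW
v1.3 Mark STW, concurrent sweep
v1.5 tri-color marking
v1.8 hybrid write barrier
Background reading: How do I know whether a variable is allocated on the heap or the stack?
Escape analysis
For a deeper and more detailed treatment, read William Kennedy's series of four posts.
Unlike C, Go does not give precise control over heap versus stack allocation. Because memory is managed automatically, the boundary between the heap and the stack is largely blurred. Consider the following code: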
package main
func main() {
str := GetString()
_ = str
}
func GetString() *string {
var s string
s = "hello"
return &s
}
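Although the variable s (assigned by s = "hello") is declared inside GetString(), main can still access it through the returned pointer. When a local variable defined in a function outlives that function's scope and is accessed from outside, it is said to escape: the variable is allocated on the heap, so the data can be shared across function boundaries.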
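A small runnable variation of the program above (the second call and the assignment through the pointer are additions for illustration) shows that each call yields its own, still-valid variable:

```go
package main

import "fmt"

// GetString returns the address of a local variable. Escape analysis sees
// that s must outlive the call, so it cannot live in GetString's stack frame.
func GetString() *string {
	s := "hello"
	return &s
}

func main() {
	p1 := GetString()
	p2 := GetString()
	*p1 = "changed" // writes to the first variable only
	fmt.Println(*p1, *p2, p1 == p2) // changed hello false
}
```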
stack allocation is cheap and heap allocation is expensive.
Escape analysis implementation in Go
For more on memory, see Allocation efficiency in high-performance Go services.
2.go
package main
import "fmt"
func main() {
x := 42
fmt.Println(x)
}
The -gcflags '-m' flag of go build reports escape analysis decisions. Up to four -m flags can be given; the more there are, the more detailed the output. In most cases two are enough.
$ go build -gcflags '-m -l' 2.go
# command-line-arguments
./2.go:7:13: x escapes to heap
./2.go:7:13: main ... argument does not escape
-l disables inlining; alternatively, add a comment directive before the function being called.
$ go build -gcflags '-m -m -l' 2.go
# command-line-arguments
./2.go:7:13: x escapes to heap
./2.go:7:13: from ... argument (arg to ...) at ./2.go:7:13
./2.go:7:13: from *(... argument) (indirection) at ./2.go:7:13
./2.go:7:13: from ... argument (passed to call[argument content escapes]) at ./2.go:7:13
./2.go:7:13: main ... argument does not escape
In the example above, x escapes to heap shows that the variable x escaped to the heap. The -l flag disables inlining, which would otherwise complicate the analysis; inlining of a single function can also be disabled by adding the //go:noinline comment above it. For why calling fmt.Println() causes x to escape to the heap, see Issue #19720 and Issue #8618. The behavior of fmt.Println() can be roughly simulated with the following code, which has essentially the same effect:
./3.go:13:9: a escapes to heap
./3.go:13:9: from pp.arg (star-dot-equals) at ./3.go:13:9
./3.go:11:45: leaking param: a
./3.go:11:45: from a (interface-converted) at ./3.go:13:9
./3.go:11:45: from pp.arg (star-dot-equals) at ./3.go:13:9
./3.go:12:11: Fprintln new(pp) does not escape
./3.go:7:21: leaking param: a
./3.go:7:21: from a (passed to call[argument escapes]) at ./3.go:8:10
./3.go:19:11: ... argument escapes to heap
./3.go:19:11: from ... argument (passed to call[argument escapes]) at ./3.go:19:11
./3.go:19:11: x escapes to heap
./3.go:19:11: from ... argument (arg to ...) at ./3.go:19:11
./3.go:19:11: from ... argument (passed to call[argument escapes]) at ./3.go:19:11
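For an analysis of common escape cases, see: http://www.agardner.me/golang/garbage/collection/gc/escape/analysis/2015/10/18/go-escape-analysis.html
The main reason is as follows. Although x is an int, it is converted to interface{} when passed to MyPrintln. Because an interface{} value contains a pointer to its data, x is copied into newly allocated memory when it is passed to MyPrintln, and the struct field assignment pp.arg = a keeps a reference to that value, which makes it escape to the heap. If pp.arg = a is commented out, nothing escapes.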
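The effect of storing an interface value can be sketched with a much smaller program. This is a simplified stand-in, not the 3.go file the diagnostics refer to: the pp type and its arg field follow the names in the text, while keep and discard are hypothetical helpers. Building it with go build -gcflags '-m -l' lets you compare how the compiler treats the two parameters.

```go
package main

// pp is a simplified stand-in for the printer state used inside fmt.
type pp struct {
	arg interface{}
}

// keep stores a in a pp that is returned to the caller, so the interface
// value and the data it points to must outlive the call.
func keep(a interface{}) *pp {
	p := new(pp)
	p.arg = a
	return p
}

// discard inspects the interface value but never stores it anywhere.
func discard(a interface{}) bool {
	_, ok := a.(int)
	return ok
}

func main() {
	x := 42
	p := keep(x) // x is boxed into an interface{} that is retained
	y := 7
	ok := discard(y) // y is boxed, but nothing keeps a reference to it
	println(p.arg.(int), ok)
}
```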
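For interfaces, Go statically checks implicit conversions at compile time, but an explicit interface-to-interface conversion queries the method set at run time and is checked dynamically. For example: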
type Stringer interface {
String() string
}
if v, ok := any.(Stringer); ok {
return v.String()
}
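As for computing the itab structure: because the set of (interface, type) pairs is not known in advance, the Go compiler or linker cannot precompute the mapping between them at compile time, and even if it could, most of the computed pairs would probably never be used in practice. Instead, the compiler generates descriptors for each interface and each type that record their respective method sets. When a type is actually converted to an interface at run time, Go looks up each method defined by the interface in the type's method set, fills in the itab, and caches it to improve performance.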
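The run-time check described above can be tried with a complete program. This is a minimal sketch; the Celsius type and the describe function are invented for the example:

```go
package main

import "fmt"

type Stringer interface {
	String() string
}

// Celsius is a hypothetical type that implements Stringer.
type Celsius float64

func (c Celsius) String() string { return fmt.Sprintf("%.1fC", float64(c)) }

// describe checks at run time whether the dynamic type of v implements
// Stringer. The compiler cannot decide this statically because v may
// hold a value of any type.
func describe(v interface{}) string {
	if s, ok := v.(Stringer); ok {
		return s.String()
	}
	return "no String method"
}

func main() {
	fmt.Println(describe(Celsius(21.5))) // 21.5C
	fmt.Println(describe(42))            // no String method
}
```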
./5.go:9:18: (*S).M1 s does not escape
./5.go:23:11: leaking param: s
./5.go:23:11: from s.M1(42) (receiver in indirect call) at ./5.go:23:21
./5.go:24:12: f2 s does not escape
./5.go:19:5: &s1 escapes to heap
./5.go:19:5: from &s1 (passed to call[argument escapes]) at ./5.go:19:4
./5.go:19:5: &s1 escapes to heap
./5.go:19:5: from &s1 (interface-converted) at ./5.go:19:5
./5.go:19:5: from &s1 (passed to call[argument escapes]) at ./5.go:19:4
./5.go:16:6: moved to heap: s1
./5.go:20:5: main &s2 does not escape
:1:0: leaking param: .this
:1:0: from .this.M1(.anon0) (receiver in indirect call) at :1:0
Performance test analysis:
package main_test
import "testing"
// go test -bench . --benchmem -gcflags "-N -l" 5_test.go
func main() {
s := make([]byte, 1, 1*1024)
_ = s
}
$ go build -gcflags "-m -m" slice_esc.go
# command-line-arguments
./slice_esc.go:3:6: can inline main as: func() { s := make([]byte, 1, 1 * 1024); _ = s }
./slice_esc.go:4:11: main make([]byte, 1, 1 * 1024) does not escape
If the slice is 64 KB or larger, it is allocated on the heap (go 1.9.2)
package main
func main() {
s := make([]byte, 1, 64*1024) // 64k
_ = s
}
$ go build -gcflags "-m -m" slice_esc.go
# command-line-arguments
./slice_esc.go:3:6: can inline main as: func() { s := make([]byte, 1, 64 * 1024); _ = s }
./slice_esc.go:4:11: make([]byte, 1, 64 * 1024) escapes to heap
./slice_esc.go:4:11: from make([]byte, 1, 64 * 1024) (too large for stack) at ./slice_esc.go:4:11
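Besides reading the compiler diagnostics, the difference can be observed at run time by counting heap allocations. The sketch below uses testing.AllocsPerRun from the standard library; the function names smallMake, largeMake and allocsOf are made up for the example, and the exact numbers depend on the Go version, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

// smallMake creates a slice well below the 64 KB limit discussed above.
//
//go:noinline
func smallMake() {
	s := make([]byte, 1, 1*1024)
	_ = s
}

// largeMake creates a slice at the limit, which the diagnostics above
// report as "too large for stack".
//
//go:noinline
func largeMake() {
	s := make([]byte, 1, 64*1024)
	_ = s
}

// allocsOf reports the average number of heap allocations per call of f.
func allocsOf(f func()) float64 {
	return testing.AllocsPerRun(100, f)
}

func main() {
	fmt.Println("small:", allocsOf(smallMake))
	fmt.Println("large:", allocsOf(largeMake))
}
```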
Verifying a slice of pointers
package main
func main() {
s := make([]*string, 1, 100)
str := "hello"
s = append(s, &str)
_ = s
}
$ go build -gcflags "-m -m -l" slice_esc.go
# command-line-arguments
./slice_esc.go:6:16: &str escapes to heap
./slice_esc.go:6:16: from append(s, &str) (appended to slice) at ./slice_esc.go:6:12
./slice_esc.go:5:9: moved to heap: str
./slice_esc.go:4:11: main make([]*string, 1, 100) does not escape
Any string whose address is stored in a []*string is allocated directly on the heap.
package main
import "math/rand"
func main() {
randSize := rand.Int()
s := make([]*string, 0, randSize)
str := "hello"
s = append(s, &str)
_ = s
}
$ go build -gcflags "-m -m -l" slice_esc.go
# command-line-arguments
./slice_esc.go:7:11: make([]*string, 0, randSize) escapes to heap
./slice_esc.go:7:11: from make([]*string, 0, randSize) (too large for stack) at ./slice_esc.go:7:11
./slice_esc.go:9:16: &str escapes to heap
./slice_esc.go:9:16: from append(s, &str) (appended to slice) at ./slice_esc.go:9:12
./slice_esc.go:8:9: moved to heap: str
Because the size of s := make([]*string, 0, randSize) cannot be determined at compile time, it escapes to the heap.
References
Golang 内存逃逸分析
深入解析 Go 中 Slice 底层实现 ***
以C视角来理解Go内存逃逸
golang string和[]byte的对比
Go Slices: usage and internals
Where is append() implementation?
SliceTricks ***
Variadic func changes []byte(s) cap #24972
spec: clarify that conversions to slices don't guarantee slice capacity? #24163
Golang escape analysis ***
Go Escape Analysis Flaws
Escape Analysis for Java
Language Mechanics On Escape Analysis (Chinese translation 1, Chinese translation 2)
Allocation efficiency in high-performance Go services ***
Profiling Go Programs
https://github.com/mushroomsir/blog/blob/master/Go%E4%B8%ADstring%E8%BD%AC%5B%5Dbyte%E7%9A%84%E9%99%B7%E9%98%B1.md
the-go-programming-language-report
https://golang.org/doc/faq
年终盘点!2017年超有价值的Golang文章
Golang 垃圾回收剖析
深入Golang之垃圾回收
// Delete one element from the middle
a := len(numbers) / 2
numbers = append(numbers[:a], numbers[a+1:]...)
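Wrapped in a function, the deletion idiom above can be run as follows (deleteAt is a hypothetical helper name; the sample data is illustrative):

```go
package main

import "fmt"

// deleteAt removes the element at index i by shifting the tail left.
// It reuses the backing array of s, so the caller's original slice
// contents are modified as well.
func deleteAt(s []int, i int) []int {
	return append(s[:i], s[i+1:]...)
}

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	a := len(numbers) / 2
	numbers = deleteAt(numbers, a)
	fmt.Println(numbers, len(numbers), cap(numbers)) // [1 2 4 5] 4 5
}
```

The length shrinks by one while the capacity stays the same, because no new array is allocated.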
Classic example: reversing a slice
// reverse reverses a slice of ints in place.
func reverse(s []int) {
for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 {
s[i], s[j] = s[j], s[i]
}
}
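Because a slice shares its backing array with any slice derived from it, reverse can also be applied to part of a slice. A short usage sketch (the sample data is illustrative):

```go
package main

import "fmt"

// reverse reverses a slice of ints in place.
func reverse(s []int) {
	for i, j := 0, len(s)-1; i < j; i, j = i+1, j-1 {
		s[i], s[j] = s[j], s[i]
	}
}

func main() {
	nums := []int{1, 2, 3, 4, 5}
	reverse(nums[1:4]) // reverse only the middle three elements
	fmt.Println(nums)  // [1 4 3 2 5]
}
```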
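Compile-time characteristics of slices
When a new slice type is created at compile time, its element type is determined during compilation.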
func NewSlice(elem *Type) *Type {
if t := elem.Cache.slice; t != nil {
if t.Elem() != elem {
Fatalf("elem mismatch")
}
return t
}
t := New(TSLICE)
t.Extra = Slice{Elem: elem}
elem.Cache.slice = t
return t
}
The slice type
// Slice contains Type fields specific to slice types.
type Slice struct {
Elem *Type // element type
}
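Compile time: literal initialization
When a slice is created from a literal such as []int{1, 2, 3}, an array ([3]int{1,2,3}) is created and stored in the static data area. A variable pointing to a newly allocated copy of that array is created as well, as the recipe below describes.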
// recipe for var = []t{...}
// 1. make a static array
// var vstat [...]t
// 2. assign (data statements) the constant part
// vstat = constpart{}
// 3. make an auto pointer to array and allocate heap to it
// var vauto *[...]t = new([...]t)
// 4. copy the static array to the auto array
// *vauto = vstat
// 5. for each dynamic part assign to the array
// vauto[i] = dynamic part
// 6. assign slice of allocated heap to var
// var = vauto[:]
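One visible consequence of this recipe is that every evaluation of a slice literal yields a fresh backing array. A minimal sketch (the function name primes is made up for the example):

```go
package main

import "fmt"

// primes evaluates a slice literal. Each call copies the constant data
// into a new array before slicing it, so callers never share storage.
func primes() []int {
	return []int{2, 3, 5}
}

func main() {
	a := primes()
	a[0] = 99 // changes only a's own backing array
	b := primes()
	fmt.Println(a, b) // [99 3 5] [2 3 5]
}
```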
Compile time: initialization with make
For example, make([]int, 3, 4)
func typecheck1(n *Node, top int) (res *Node) {
switch t.Etype {
case TSLICE:
if i >= len(args) {
yyerror("missing len argument to make(%v)", t)
n.Type = nil
return n
}
l = args[i]
i++
l = typecheck(l, ctxExpr)
var r *Node
if i < len(args) {
r = args[i]
i++
r = typecheck(r, ctxExpr)
}
if l.Type == nil || (r != nil && r.Type == nil) {
n.Type = nil
return n
}
if !checkmake(t, "len", l) || r != nil && !checkmake(t, "cap", r) {
n.Type = nil
return n
}
n.Left = l
n.Right = r
n.Op = OMAKESLICE
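Next, consider escape analysis at compile time. If make creates a slice that is too large, its storage escapes to the heap and is allocated by the runtime. If it is small enough, it is allocated on the stack.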
// maximum size of implicit variables that we will allocate on the stack.
// p := new(T) allocating T on the stack
// p := &T{} allocating T on the stack
// s := make([]T, n) allocating [n]T on the stack
// s := []byte("...") allocating [n]byte on the stack
// Note: the flag smallframes can update this value.
maxImplicitStackVarSize = int64(64 * 1024)
The core logic is in go/src/cmd/compile/internal/gc/walk.go; n.Esc records whether the variable escapes.
func walkexpr(n *Node, init *Nodes) *Node{
case OMAKESLICE:
...
if n.Esc == EscNone {
// var arr [r]T
// n = arr[:l]
i := indexconst(r)
if i < 0 {
Fatalf("walkexpr: invalid index %v", r)
}
t = types.NewArray(t.Elem(), i) // [r]T
var_ := temp(t)
a := nod(OAS, var_, nil) // zero temp
a = typecheck(a, ctxStmt)
init.Append(a)
r := nod(OSLICE, var_, nil) // arr[:l]
r.SetSliceBounds(nil, l, nil)
r = conv(r, n.Type) // in case n.Type is named.
r = typecheck(r, ctxExpr)
r = walkexpr(r, init)
n = r
} else {
if t.Elem().NotInHeap() {
yyerror("%v is go:notinheap; heap allocation disallowed", t.Elem())
}
// go/src/runtime/slice.go
func makeslice(et *_type, len, cap int) unsafe.Pointer {
mem, overflow := math.MulUintptr(et.size, uintptr(cap))
if overflow || mem > maxAlloc || len < 0 || len > cap {
// NOTE: Produce a 'len out of range' error instead of a
// 'cap out of range' error when someone does make([]T, bignumber).
// 'cap out of range' is true too, but since the cap is only being
// supplied implicitly, saying len is clearer.
// See golang.org/issue/4085.
mem, overflow := math.MulUintptr(et.size, uintptr(len))
if overflow || mem > maxAlloc || len < 0 {
panicmakeslicelen()
}
panicmakeslicecap()
}
return mallocgc(mem, et, true)
}
func makeslice64(et *_type, len64, cap64 int64) unsafe.Pointer {
len := int(len64)
if int64(len) != len64 {
panicmakeslicelen()
}
cap := int(cap64)
if int64(cap) != cap64 {
panicmakeslicecap()
}
return makeslice(et, len, cap)
}
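The length and capacity checks in makeslice surface as run-time panics when the arguments are not valid. The sketch below (tryMake is a hypothetical helper) turns those panics into errors; it only checks that a panic happens and makes no claim about the exact message:

```go
package main

import "fmt"

// tryMake calls make with run-time values and converts a panic raised
// by the size checks into an error.
func tryMake(length, capacity int) (s []int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("make failed: %v", r)
		}
	}()
	return make([]int, length, capacity), nil
}

func main() {
	s, err := tryMake(3, 4)
	fmt.Println(len(s), cap(s), err) // 3 4 <nil>

	_, err = tryMake(5, 3) // len > cap
	fmt.Println(err != nil) // true

	_, err = tryMake(-1, 3) // negative len
	fmt.Println(err != nil) // true
}
```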
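Growing a slice
In Go, append adds elements to a slice, but calling append does not always require growing it: when the existing capacity is sufficient, no growth takes place. When growth is needed, growslice allocates the new backing array as follows: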
if et.ptrdata == 0 {
p = mallocgc(capmem, nil, false)
// The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
// Only clear the part that will not be overwritten.
memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
} else {
// Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
p = mallocgc(capmem, et, true)
if lenmem > 0 && writeBarrier.enabled {
// Only shade the pointers in old.array since we know the destination slice p
// only contains nil pointers because it has been cleared during alloc.
bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem)
}
}
memmove(p, old.array, lenmem)
init {
n := len(a)
if n > len(b) { n = len(b) }
if a.ptr != b.ptr { memmove(a.ptr, b.ptr, n*sizeof(elem(a))) }
}
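The exceptions are when copy is started as a goroutine, as in go copy(numbers1, numbers), or when instrumentation such as the race detector is enabled and the code being compiled is not the Go runtime itself; in those cases the compiler calls the runtime functions slicestringcopy or slicecopy instead.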
case OCOPY:
n = copyany(n, init, instrumenting && !compiling_runtime)
case OGO:
switch n.Left.Op {
case OCOPY:
n.Left = copyany(n.Left, &n.Ninit, true)
slicestringcopy and slicecopy still call memmove in the end; they only add extra checks such as race detection.
func slicecopy(to, fm slice, width uintptr) int {
...
if raceenabled {
callerpc := getcallerpc()
pc := funcPC(slicecopy)
racewriterangepc(to.array, uintptr(n*int(width)), callerpc, pc)
racereadrangepc(fm.array, uintptr(n*int(width)), callerpc, pc)
}
if msanenabled {
msanwrite(to.array, uintptr(n*int(width)))
msanread(fm.array, uintptr(n*int(width)))
}
size := uintptr(n) * width
if size == 1 { // common case worth about 2x to do here
// TODO: is this still worth it with new memmove impl?
*(*byte)(to.array) = *(*byte)(fm.array) // known to be a byte pointer
} else {
memmove(to.array, fm.array, size)
}
return n
}
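From the caller's side, these rules mean that copy transfers min(len(dst), len(src)) elements, returns that count, and tolerates overlapping ranges. A short sketch (shiftRight is a hypothetical helper):

```go
package main

import "fmt"

// shiftRight copies s onto itself one position to the right. The ranges
// overlap, which is safe because copy behaves like memmove.
func shiftRight(s []int) int {
	return copy(s[1:], s)
}

func main() {
	dst := make([]int, 2)
	n := copy(dst, []int{1, 2, 3}) // limited by the shorter slice
	fmt.Println(n, dst)            // 2 [1 2]

	b := make([]byte, 5)
	n = copy(b, "hi") // a string source is allowed for a []byte destination
	fmt.Println(n, string(b[:n])) // 2 hi

	s := []int{1, 2, 3, 4}
	n = shiftRight(s)
	fmt.Println(n, s) // 3 [1 1 2 3]
}
```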
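Summary
Slices are an important data structure in Go. Unlike their counterparts in some other languages, a slice keeps track of its underlying memory along with its length and capacity.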