Go benchmark and GC: B/op and allocs/op

Benchmark code:

func BenchmarkSth(b *testing.B) {
    var x []int
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        x = append(x, i)
    }
}

Result:

BenchmarkSth-4 50000000 20.7 ns/op 40 B/op 0 allocs/op

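Questions:

1. Where do the 40 B/op come from? (Any way of tracing this, with instructions, is greatly appreciated.)
2. How is it possible to have 40 B/op with 0 allocs/op?
3. Which one affects the GC, and how? (B/op or allocs/op)
4. Is it really possible to have 0 B/op when using append?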

Source

2017-02-11 John Ballesteros

The Go Programming Language Specification

Appending to and copying slices

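The variadic function append appends zero or more values x to s of type S, which must be a slice type, and returns the resulting slice, also of type S.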
append(s S, x ...T) S // T is the element type of S 
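If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.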

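For your example, on average, [40, 41) bytes per operation are allocated to increase the capacity of the slice when necessary. The capacity is increased using an amortized constant-time algorithm: up to a length of 1024 the capacity is doubled, and beyond that it grows by a factor of 1.25. On average, there are [0, 1) allocations per operation.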

For example,

func BenchmarkMem(b *testing.B) {
    b.ReportAllocs()
    var x []int64
    var a, ac int64 // a: number of reallocations, ac: sum of the new capacities
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        c := cap(x)
        x = append(x, int64(i))
        if cap(x) != c {
            a++
            ac += int64(cap(x))
        }
    }
    b.StopTimer()
    sizeInt64 := int64(8)
    B := ac * sizeInt64 // bytes
    b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}

Output:

BenchmarkMem-4  50000000   26.6 ns/op  40 B/op   0 allocs/op 
--- BENCH: BenchmarkMem-4 
    bench_test.go:32: op 1 B 8 alloc 1 lx 1 cx 1 
    bench_test.go:32: op 100 B 2040 alloc 8 lx 100 cx 128 
    bench_test.go:32: op 10000 B 386296 alloc 20 lx 10000 cx 12288 
    bench_test.go:32: op 1000000 B 45188344 alloc 40 lx 1000000 cx 1136640 
    bench_test.go:32: op 50000000 B 2021098744 alloc 57 lx 50000000 cx 50539520

For op = 50000000,

B/op = floor(2021098744/50000000) = floor(40.42197488) = 40

allocs/op = floor(57/50000000) = floor(0.00000114) = 0
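The same truncation can be reproduced with the testing package itself. The following is a minimal sketch, assuming the totals logged above for op = 50000000; the helper name perOp is hypothetical and only wraps the AllocedBytesPerOp and AllocsPerOp methods of testing.BenchmarkResult, which use integer division.

```go
package main

import (
	"fmt"
	"testing"
)

// perOp (a hypothetical helper) builds a testing.BenchmarkResult from raw
// totals and returns the truncated per-operation figures.
func perOp(n int, totalBytes, totalAllocs uint64) (bytesPerOp, allocsPerOp int64) {
	r := testing.BenchmarkResult{N: n, MemBytes: totalBytes, MemAllocs: totalAllocs}
	// Both methods divide the total by N using integer division.
	return r.AllocedBytesPerOp(), r.AllocsPerOp()
}

func main() {
	// Totals taken from the log line for op = 50000000 above.
	b, a := perOp(50000000, 2021098744, 57)
	fmt.Println(b, "B/op", a, "allocs/op") // prints "40 B/op 0 allocs/op"
}
```

This is why a benchmark can report 40 B/op alongside 0 allocs/op: the 57 allocations are real, but their average per operation is less than one and truncates to zero.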

Reading:

Go Slices: usage and internals

Arrays, slices (and strings): The mechanics of 'append'

'append' complexity

To have zero B/op (and zero allocs/op) for append, allocate a slice with sufficient capacity before appending.

For example, with var x = make([]int64, 0, b.N),

func BenchmarkZero(b *testing.B) {
    b.ReportAllocs()
    var x = make([]int64, 0, b.N) // capacity reserved before the timer starts
    var a, ac int64
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        c := cap(x)
        x = append(x, int64(i))
        if cap(x) != c {
            a++
            ac += int64(cap(x))
        }
    }
    b.StopTimer()
    sizeInt64 := int64(8)
    B := ac * sizeInt64 // bytes
    b.Log("op", b.N, "B", B, "alloc", a, "lx", len(x), "cx", cap(x))
}

Output:

BenchmarkZero-4  100000000   11.7 ns/op   0 B/op   0 allocs/op 
--- BENCH: BenchmarkZero-4 
    bench_test.go:51: op 1 B 0 alloc 0 lx 1 cx 1 
    bench_test.go:51: op 100 B 0 alloc 0 lx 100 cx 100 
    bench_test.go:51: op 10000 B 0 alloc 0 lx 10000 cx 10000 
    bench_test.go:51: op 1000000 B 0 alloc 0 lx 1000000 cx 1000000 
    bench_test.go:51: op 100000000 B 0 alloc 0 lx 100000000 cx 100000000

Note the reduction in benchmark CPU time from around 26.6 ns/op to around 11.7 ns/op.
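Outside of a benchmark, the difference between a growing slice and a preallocated one can also be checked with testing.AllocsPerRun. The following is a rough sketch, not part of the original answer: the function names, the sink variable, and the run count of 10 are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the compiler from optimizing the appends away.
var sink []int64

// allocsGrowing reports the average number of allocations made when
// appending n values to a slice that starts out nil.
func allocsGrowing(n int) float64 {
	return testing.AllocsPerRun(10, func() {
		var x []int64
		for i := 0; i < n; i++ {
			x = append(x, int64(i))
		}
		sink = x
	})
}

// allocsPreallocated performs the same appends into a buffer whose
// capacity was reserved once, outside the measured function.
func allocsPreallocated(n int) float64 {
	buf := make([]int64, 0, n)
	return testing.AllocsPerRun(10, func() {
		x := buf[:0] // reuse the reserved capacity, length reset to zero
		for i := 0; i < n; i++ {
			x = append(x, int64(i))
		}
		sink = x
	})
}

func main() {
	fmt.Println("growing:     ", allocsGrowing(10000), "allocs per run")
	fmt.Println("preallocated:", allocsPreallocated(10000), "allocs per run")
}
```

The exact count for the growing case depends on the growth policy of the Go version in use, so only the preallocated case is expected to report zero.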

Source

2017-02-11 13:49:37 peterSO

Any thoughts on question 3?
