This article is based on Go 1.10.
Slice structure
A slice is really just a struct. It is defined in runtime/slice.go as follows:
type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

// An notInHeapSlice is a slice backed by go:notinheap memory.
type notInHeapSlice struct {
    array *notInHeap
    len   int
    cap   int
}
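As the definition shows, a slice is built on top of an array and is essentially a thin wrapper around it. It has three parts:
- Pointer: the address of the underlying array element that corresponds to the slice's first element.
- Length: the number of elements in the slice.
- Capacity: the number of elements from the slice's start position to the end of the underlying array.
The built-in functions len and cap return a slice's length and capacity. An index into a slice must be less than len, and reslicing can extend a slice no further than cap. Several slices can share the same underlying data with different start addresses and lengths, so the first element of a slice is not necessarily the first element of the array.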
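The sharing described above can be seen in a small sketch. The names backing, a, b, and shareDemo are illustrative only and do not come from the original article.

```go
package main

import "fmt"

// shareDemo builds two slices over one array and writes through one of them.
// It returns the value seen through the other slice, plus len and cap of b.
func shareDemo() (seenByA, lenB, capB int) {
	backing := [6]int{0, 1, 2, 3, 4, 5}
	a := backing[0:4] // len 4, cap 6
	b := backing[2:5] // len 3, cap 4 (index 2 up to the end of backing)
	b[0] = 99         // writes backing[2], which a also sees as a[2]
	return a[2], len(b), cap(b)
}

func main() {
	v, l, c := shareDemo()
	fmt.Println(v, l, c) // 99 3 4
}
```

Note that b starts at the third array element, so its capacity counts only the four elements from there to the end of the array.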
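Using slices
A slice represents a variable-length sequence whose elements all have the same type. It behaves like a reference type and is usually declared in one of these ways:
var name []T                       // no initializer: a nil slice, no underlying array is allocated
var name = []T{values}             // allocates an underlying array; len and cap both equal the number of values
var name []T = make([]T, len, cap) // allocates an underlying array
Note that slices cannot be compared with each other using == (the only comparison allowed is against nil). For byte slices the standard library provides bytes.Equal, but for slices of other element types you have to compare element by element yourself.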
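An element-by-element comparison might look like the sketch below. equalInts is a hypothetical helper written for this illustration, not a standard library function. It treats a nil slice and an empty slice as equal, which matches the behavior of bytes.Equal.

```go
package main

import (
	"bytes"
	"fmt"
)

// equalInts is a hypothetical helper: it reports whether a and b
// have the same length and the same elements in the same order.
func equalInts(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(bytes.Equal([]byte("go"), []byte("go"))) // true
	fmt.Println(equalInts([]int{1, 2}, []int{1, 2}))     // true
	fmt.Println(equalInts([]int{1, 2}, []int{1, 3}))     // false
	fmt.Println(equalInts(nil, []int{}))                 // true
}
```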
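Besides declarations, a new slice can be created from an array or slice with the slice expression [begin:end:cap]. The range is half-open (begin is included, end is excluded), the new slice's length is end-begin, its capacity is cap-begin, and the underlying data is shared.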
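The effect of the third index can be sketched as follows. The names base, two, three, and fullSliceDemo are made up for this example. Limiting the capacity forces the next append to allocate a new array, so the elements of base that lie beyond the slice are not overwritten.

```go
package main

import "fmt"

// fullSliceDemo compares a two-index and a three-index slice expression.
// It returns both capacities, base[3] after an append, and the appended value.
func fullSliceDemo() (capTwo, capThree, baseAt3, appended int) {
	base := []int{1, 2, 3, 4, 5, 6}
	two := base[1:3]     // len 2, cap 6-1 = 5
	three := base[1:3:3] // len 2, cap 3-1 = 2
	capTwo, capThree = cap(two), cap(three)

	// three has no spare capacity, so append copies its elements
	// into a new array and base[3] keeps its old value.
	three = append(three, 100)
	return capTwo, capThree, base[3], three[2]
}

func main() {
	fmt.Println(fullSliceDemo()) // 5 2 4 100
}
```

Appending to two instead would have written 100 into base[3], because two still has spare capacity in the shared array.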
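For a string, an operation such as [:] returns a string rather than a slice. Whether the new string shares memory with the original is not guaranteed by the language; it depends on the compiler, although in practice the standard toolchain usually shares the bytes.
To add data to the end of a slice, use the built-in function append.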
package main

import "fmt"

func f(a []int) {
    a = append(a, 8)
    fmt.Printf("f cap:%d addr:%p value:%v\n", cap(a), &a[0], a)
}

func main() {
    var slice1 = []int{1, 2, 3, 4, 5, 6}
    fmt.Printf("%v %d %d\n", slice1, len(slice1), cap(slice1))
    slice2 := slice1[2:2:3]
    fmt.Printf("%v %d %d\n", slice2, len(slice2), cap(slice2))
    var a [5]int = [5]int{1, 2, 3, 4, 5} // define an array first
    array_slice := a[:]
    fmt.Printf("cap:%d a_addr:%p slice_addr:%p slice_type:%T\n", cap(array_slice), &a, &array_slice[0], array_slice)
    array_slice[1] = 9
    fmt.Printf("cap:%d addr:%p value:%v\n", cap(array_slice), &array_slice[0], a)
    array_slice = append(array_slice, 6)
    fmt.Printf("cap:%d addr:%p value:%v\n", cap(array_slice), &array_slice[0], array_slice)
    array_slice = append(array_slice, 7)
    fmt.Printf("cap:%d addr:%p value:%v\n", cap(array_slice), &array_slice[0], array_slice)
    f(array_slice)
    fmt.Printf("cap:%d addr:%p value:%v\n", cap(array_slice), &array_slice[0], array_slice)
    array_slice = array_slice[:8]
    fmt.Printf("cap:%d addr:%p value:%v\n", cap(array_slice), &array_slice[0], array_slice)
    var s string = "abcdefg"
    string_slice := s[0:5]
    fmt.Printf("%p %T\n", &s, string_slice)
}
Output:
[1 2 3 4 5 6] 6 6
[] 0 1
cap:5 a_addr:0xc042086090 slice_addr:0xc042086090 slice_type:[]int
cap:5 addr:0xc042086090 value:[1 9 3 4 5]
cap:10 addr:0xc04209a000 value:[1 9 3 4 5 6]
cap:10 addr:0xc04209a000 value:[1 9 3 4 5 6 7]
f cap:10 addr:0xc04209a000 value:[1 9 3 4 5 6 7 8]
cap:10 addr:0xc04209a000 value:[1 9 3 4 5 6 7]
cap:10 addr:0xc04209a000 value:[1 9 3 4 5 6 7 8]
0xc04205c1c0 string
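As shown, a slice created by a slice expression shares its underlying data with the original, so modifying an element through the slice modifies the original data. When appending to a slice, the existing storage is reused if there is enough capacity; otherwise new storage is allocated and the existing elements are copied into it (this is why the address changes from 0xc042086090 to 0xc04209a000 after the first append). Passing a slice to a function is also pass-by-value, not pass-by-reference: what is copied is the three fields of the slice struct, so the elements themselves are not copied. If the function appends and cap is sufficient, the new element is written into the shared underlying array, but only the len of the function's own copy of the slice changes; the caller's len stays the same. That is why the 8 appended inside f only becomes visible in main after reslicing with array_slice[:8].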
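A common way to avoid losing an append made inside a function is to return the resulting slice and assign it back in the caller. The sketch below contrasts the two styles; appendLost and appendKept are hypothetical names chosen for this example.

```go
package main

import "fmt"

// appendLost appends to its own copy of the slice header.
// The caller's len is unchanged, although the shared array is written.
func appendLost(s []int) {
	s = append(s, 8)
}

// appendKept returns the updated slice header so the caller can keep it.
func appendKept(s []int) []int {
	return append(s, 8)
}

func main() {
	s := make([]int, 3, 10)
	appendLost(s)
	fmt.Println(len(s), s[:4]) // 3 [0 0 0 8]

	s = appendKept(s)
	fmt.Println(len(s), s) // 4 [0 0 0 8]
}
```

Passing a pointer to the slice (*[]int) would also work, but returning the slice mirrors how the built-in append itself is used.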
nil and empty values
A slice has two special values that are worth telling apart:
package main

import "fmt"

func main() {
    var s []int                // nil
    var t = []int{}            // empty
    var u = make([]int, 3)[3:] // empty
    fmt.Printf("value of s: %#v\n", s)     // value of s: []int(nil)
    fmt.Printf("value of t: %#v\n", t)     // value of t: []int{}
    fmt.Printf("value of u: %#v\n", u)     // value of u: []int{}
    fmt.Printf("s is nil? %v\n", s == nil) // true
    fmt.Printf("t is nil? %v\n", t == nil) // false
    fmt.Printf("u is nil? %v\n", u == nil) // false
}
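The difference is that a nil slice's underlying array pointer is nil, whereas an empty slice has a non-nil pointer and a length of 0.
So to test whether a slice contains any data, check len(s) == 0 rather than s == nil.
By convention, a nil slice means the collection does not exist, and an empty slice means the collection exists but has no elements. When encoding to JSON, a nil slice becomes null and an empty slice becomes [].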
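The JSON difference is easy to check with encoding/json. This is a minimal sketch; encode is a helper name introduced only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode marshals a slice of ints and returns the JSON text.
func encode(s []int) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var nilSlice []int
	emptySlice := []int{}

	// Both have length 0, but they encode differently.
	fmt.Println(encode(nilSlice), len(nilSlice) == 0)     // null true
	fmt.Println(encode(emptySlice), len(emptySlice) == 0) // [] true
}
```

If an API consumer expects an array in every response, initializing the slice with []int{} or make before encoding avoids the null.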
Source code analysis
Creating a slice
// maxElems is a lookup table containing the maximum capacity for a slice.
// The index is the size of the slice element.
var maxElems = [...]uintptr{
    ^uintptr(0),
    maxAlloc / 1, maxAlloc / 2, maxAlloc / 3, maxAlloc / 4,
    maxAlloc / 5, maxAlloc / 6, maxAlloc / 7, maxAlloc / 8,
    maxAlloc / 9, maxAlloc / 10, maxAlloc / 11, maxAlloc / 12,
    maxAlloc / 13, maxAlloc / 14, maxAlloc / 15, maxAlloc / 16,
    maxAlloc / 17, maxAlloc / 18, maxAlloc / 19, maxAlloc / 20,
    maxAlloc / 21, maxAlloc / 22, maxAlloc / 23, maxAlloc / 24,
    maxAlloc / 25, maxAlloc / 26, maxAlloc / 27, maxAlloc / 28,
    maxAlloc / 29, maxAlloc / 30, maxAlloc / 31, maxAlloc / 32,
}

// maxSliceCap returns the maximum capacity for a slice.
func maxSliceCap(elemsize uintptr) uintptr {
    if elemsize < uintptr(len(maxElems)) {
        return maxElems[elemsize]
    }
    return maxAlloc / elemsize
}

func makeslice(et *_type, len, cap int) slice {
    // NOTE: The len > maxElements check here is not strictly necessary,
    // but it produces a 'len out of range' error instead of a 'cap out of range' error
    // when someone does make([]T, bignumber). 'cap out of range' is true too,
    // but since the cap is only being supplied implicitly, saying len is clearer.
    // See issue 4085.
    maxElements := maxSliceCap(et.size)
    if len < 0 || uintptr(len) > maxElements {
        panicmakeslicelen()
    }

    if cap < len || uintptr(cap) > maxElements {
        panicmakeslicecap()
    }

    p := mallocgc(et.size*uintptr(cap), et, true)
    return slice{p, len, cap}
}
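As shown, creating a slice is straightforward: makeslice uses the element size to work out the maximum number of elements that can be allocated, validates the arguments and panics if len or cap is out of range, then allocates et.size*cap bytes with mallocgc, stores the pointer in a slice struct together with len and cap, and returns it.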
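The argument checks can be observed from ordinary Go code when the sizes are only known at run time (constant arguments that are out of range are rejected at compile time instead). The sketch below uses a hypothetical helper, tryMake, that recovers from the panic; it does not assert the exact panic message.

```go
package main

import "fmt"

// tryMake calls make with run-time sizes and reports whether it succeeded.
// It returns false if make panicked because length or capacity was invalid.
func tryMake(length, capacity int) (ok bool) {
	defer func() {
		if r := recover(); r != nil {
			ok = false
		}
	}()
	s := make([]int, length, capacity)
	return len(s) == length && cap(s) == capacity
}

func main() {
	fmt.Println(tryMake(2, 5))  // true
	fmt.Println(tryMake(5, 2))  // false: cap < len
	fmt.Println(tryMake(-1, 0)) // false: negative len
}
```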
Growing a slice
// growslice handles slice growth during append.
// It is passed the slice element type, the old slice, and the desired new minimum capacity,
// and it returns a new slice with at least that capacity, with the old data
// copied into it.
// The new slice's length is set to the old slice's length,
// NOT to the new requested capacity.
// This is for codegen convenience. The old slice's length is used immediately
// to calculate where to write new values during an append.
// TODO: When the old backend is gone, reconsider this decision.
// The SSA backend might prefer the new length or to return only ptr/cap and save stack space.
func growslice(et *_type, old slice, cap int) slice {
    if raceenabled {
        callerpc := getcallerpc()
        racereadrangepc(old.array, uintptr(old.len*int(et.size)), callerpc, funcPC(growslice))
    }
    if msanenabled {
        msanread(old.array, uintptr(old.len*int(et.size)))
    }

    if et.size == 0 {
        if cap < old.cap {
            panic(errorString("growslice: cap out of range"))
        }
        // append should not create a slice with nil pointer but non-zero len.
        // We assume that append doesn't need to preserve old.array in this case.
        return slice{unsafe.Pointer(&zerobase), old.len, cap}
    }

    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        if old.len < 1024 {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }

    var overflow bool
    var lenmem, newlenmem, capmem uintptr
    // Specialize for common values of et.size.
    // For 1 we don't need any division/multiplication.
    // For sys.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
    // For powers of 2, use a variable shift.
    switch {
    case et.size == 1:
        lenmem = uintptr(old.len)
        newlenmem = uintptr(cap)
        capmem = roundupsize(uintptr(newcap))
        overflow = uintptr(newcap) > maxAlloc
        newcap = int(capmem)
    case et.size == sys.PtrSize:
        lenmem = uintptr(old.len) * sys.PtrSize
        newlenmem = uintptr(cap) * sys.PtrSize
        capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
        overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
        newcap = int(capmem / sys.PtrSize)
    case isPowerOfTwo(et.size):
        var shift uintptr
        if sys.PtrSize == 8 {
            // Mask shift for better code generation.
            shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
        } else {
            shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
        }
        lenmem = uintptr(old.len) << shift
        newlenmem = uintptr(cap) << shift
        capmem = roundupsize(uintptr(newcap) << shift)
        overflow = uintptr(newcap) > (maxAlloc >> shift)
        newcap = int(capmem >> shift)
    default:
        lenmem = uintptr(old.len) * et.size
        newlenmem = uintptr(cap) * et.size
        capmem = roundupsize(uintptr(newcap) * et.size)
        overflow = uintptr(newcap) > maxSliceCap(et.size)
        newcap = int(capmem / et.size)
    }

    // The check of overflow (uintptr(newcap) > maxSliceCap(et.size))
    // in addition to capmem > _MaxMem is needed to prevent an overflow
    // which can be used to trigger a segfault on 32bit architectures
    // with this example program:
    //
    //   type T [1<<27 + 1]int64
    //
    //   var d T
    //   var s []T
    //
    //   func main() {
    //     s = append(s, d, d, d, d)
    //     print(len(s), "\n")
    //   }
    if cap < old.cap || overflow || capmem > maxAlloc {
        panic(errorString("growslice: cap out of range"))
    }

    var p unsafe.Pointer
    if et.kind&kindNoPointers != 0 {
        p = mallocgc(capmem, nil, false)
        memmove(p, old.array, lenmem)
        // The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
        // Only clear the part that will not be overwritten.
        memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
    } else {
        // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
        p = mallocgc(capmem, et, true)
        if !writeBarrier.enabled {
            memmove(p, old.array, lenmem)
        } else {
            for i := uintptr(0); i < lenmem; i += et.size {
                typedmemmove(et, add(p, i), add(old.array, i))
            }
        }
    }

    return slice{p, old.len, newcap}
}
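When growing, the new capacity is computed first. The algorithm is:
- If the requested capacity (cap) is more than twice the old capacity (old.cap), grow to cap.
- Otherwise, if the old length (old.len) is less than 1024, grow to old.cap x 2.
- Otherwise, start with newcap = old.cap and repeat newcap += newcap/4 until newcap is no smaller than cap; that value is the new capacity. There is also overflow protection here: if the computation overflows, newcap = cap.
Once the element count is known, growslice converts it to a byte size, rounds that up with roundupsize to the size of the block the allocator will actually hand out (so the final capacity can be larger than the value computed above), allocates the new memory, and copies the old elements over. The details are not covered here.
Note that growslice itself always allocates a new array (except for zero-size element types). A slice keeps its original array after append only when the existing capacity is already large enough, in which case growslice is not called at all.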
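The three rules above can be restated as a small standalone function. This is only a sketch of the capacity-selection branch in the growslice listing shown earlier; nextCap is a hypothetical name, the roundupsize step is deliberately left out, and other Go versions use different thresholds, so the capacity actually observed from append may be larger or different.

```go
package main

import "fmt"

// nextCap mirrors the capacity-selection logic of the growslice listing:
// oldLen and oldCap describe the old slice, want is the requested minimum.
// It does NOT apply the allocator's size-class rounding.
func nextCap(oldLen, oldCap, want int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if want > doublecap {
		return want
	}
	if oldLen < 1024 {
		return doublecap
	}
	// Grow by 25% per step; 0 < newcap guards against overflow.
	for 0 < newcap && newcap < want {
		newcap += newcap / 4
	}
	if newcap <= 0 {
		return want
	}
	return newcap
}

func main() {
	fmt.Println(nextCap(5, 5, 6))          // 10
	fmt.Println(nextCap(0, 0, 5))          // 5
	fmt.Println(nextCap(1024, 1024, 1025)) // 1280
}
```

The first call corresponds to the earlier example in which appending a sixth element to a slice with len 5 and cap 5 produced cap 10.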
slice copy
func slicecopy(to, fm slice, width uintptr) int {
    if fm.len == 0 || to.len == 0 {
        return 0
    }

    n := fm.len
    if to.len < n {
        n = to.len
    }

    if width == 0 {
        return n
    }

    if raceenabled {
        callerpc := getcallerpc()
        pc := funcPC(slicecopy)
        racewriterangepc(to.array, uintptr(n*int(width)), callerpc, pc)
        racereadrangepc(fm.array, uintptr(n*int(width)), callerpc, pc)
    }
    if msanenabled {
        msanwrite(to.array, uintptr(n*int(width)))
        msanread(fm.array, uintptr(n*int(width)))
    }

    size := uintptr(n) * width
    if size == 1 { // common case worth about 2x to do here
        // TODO: is this still worth it with new memmove impl?
        *(*byte)(to.array) = *(*byte)(fm.array) // known to be a byte pointer
    } else {
        memmove(to.array, fm.array, size)
    }
    return n
}
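This is the ordinary copy. It compares the len of the two slices, takes the smaller one, and copies that many elements from fm into to, using memmove on the underlying arrays. Because the smaller len is used, neither the len nor the cap of to needs to change. If fm has the smaller len, only the first len(fm) positions of to are overwritten and the rest are left untouched. For example:
package main

import "fmt"

func main() {
    var slice1 = []int{1, 2, 3, 4, 5, 6}
    var slice2 = []int{8, 9, 10, 11, 12, 13, 14, 15}
    copy(slice2, slice1)
    fmt.Printf("len:%d cap:%d %#v\n", len(slice2), cap(slice2), slice2)
}
Output: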
len:8 cap:8 []int{1, 2, 3, 4, 5, 6, 14, 15}
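Two further cases follow from the same logic: a destination shorter than the source, and a source and destination that overlap in the same array. The sketch below covers both; copyDemo and its variable names are illustrative only.

```go
package main

import "fmt"

// copyDemo shows copy with a short destination and with overlapping slices.
// It returns the element counts reported by copy and the resulting slices.
func copyDemo() (n1 int, dst []int, n2 int, shifted []int) {
	src := []int{1, 2, 3, 4, 5}

	// Destination shorter than source: only len(dst) elements are copied.
	dst = make([]int, 3)
	n1 = copy(dst, src)

	// Overlapping source and destination in the same array:
	// the result is as if the source had been read in full first.
	shifted = []int{1, 2, 3, 4, 5}
	n2 = copy(shifted[1:], shifted)
	return n1, dst, n2, shifted
}

func main() {
	n1, dst, n2, shifted := copyDemo()
	fmt.Println(n1, dst)     // 3 [1 2 3]
	fmt.Println(n2, shifted) // 4 [1 1 2 3 4]
}
```

The overlapping case works because memmove, unlike a naive forward loop, handles overlapping regions correctly.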