k8s list-watch
Background
Reference: "kubernetes設(shè)計理念分析 | 從運行流程和list-watch看kubernetes系統(tǒng)的設(shè)計理念" (an analysis of the Kubernetes design philosophy through its run flow and list-watch).
The k8s components talk to the api-server through the list-watch mechanism. list-watch has to solve the following problems:
- Real-time: every component must learn about data changes promptly;
- Ordering: changes must arrive in order; a delete that shows up before the corresponding create would be a mess;
- Reliability: despite network jitter and other failures, no change may be lost. Do we need something like AMQP?
Solutions
Real-time
HTTP streaming: the client issues a long-lived HTTP request, and the server sends a response whenever the data changes. With HTTP/2, connection multiplexing lets multiple such HTTP streams share a single TCP connection.
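Below is a minimal sketch of what such a streaming watch looks like on the wire, using plain net/http against a local kubectl proxy. The localhost:8001 address and the pods resource are assumptions for illustration; the real components use client-go rather than raw HTTP.

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// watchEvent mirrors the JSON objects the apiserver streams on a watch:
// {"type": "ADDED|MODIFIED|DELETED", "object": {...}}.
type watchEvent struct {
	Type   string          `json:"type"`
	Object json.RawMessage `json:"object"`
}

func main() {
	// The request stays open; the server writes one JSON event per change.
	resp, err := http.Get("http://localhost:8001/api/v1/pods?watch=true")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	dec := json.NewDecoder(resp.Body)
	for {
		var ev watchEvent
		if err := dec.Decode(&ev); err != nil {
			break // stream closed: timeout, network error, ...
		}
		fmt.Println("event:", ev.Type)
	}
}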
Ordering
Every resource carries a resourceVersion, which increases whenever the resource changes. Its consistency is backed by etcd, whose revision is globally and monotonically increasing, similar to Redis INCR.
So the responses a client receives from a watch are already ordered by resourceVersion.
Explanation of the resourceVersion parameter:
When specified with a watch call, shows changes that occur after that particular version of a resource. Defaults to changes from the beginning of history. When specified for list: - if unset, then the result is returned from remote storage based on quorum-read flag; - if it's 0, then we simply return what we currently have in cache, no guarantee; - if set to non zero, then the result is at least as fresh as given rv. (optional)
Reliability
list-watch always lists first, fetching all the data currently in the apiserver cache, and then watches starting from the last resourceVersion. If the network hiccups, the client lists again to pick up everything it has not yet processed and then watches for further updates, so no data is lost.
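A sketch of this list-then-watch pattern with client-go follows; the clientset construction is omitted, the pods resource and namespace argument are only examples, and a recent client-go (where List/Watch take a context) is assumed.

package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listThenWatch lists pods once and then watches from the version the list covered.
func listThenWatch(ctx context.Context, cs *kubernetes.Clientset, ns string) error {
	// 1. List: ResourceVersion "0" lets the apiserver answer from its watch cache.
	pods, err := cs.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{ResourceVersion: "0"})
	if err != nil {
		return err // caller relists after a backoff
	}
	rv := pods.ResourceVersion // the last version covered by the list

	// 2. Watch: only changes newer than the list are delivered, ordered by resourceVersion.
	w, err := cs.CoreV1().Pods(ns).Watch(ctx, metav1.ListOptions{ResourceVersion: rv})
	if err != nil {
		return err
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		_ = ev // handle ADDED / MODIFIED / DELETED here
	}
	// The channel closed (timeout or network error); the caller lists again, so nothing is lost.
	return nil
}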
Watch optimizations
Problems
- Watch requests used to watch etcd directly; that many long-lived connections put pressure on both etcd and the apiserver;
- Many identical watch requests caused a lot of duplicated serialization/deserialization work.
Optimizations
- For each REST resource, the apiserver watches etcd once and caches the result in the corresponding storage;
- When the apiserver receives a watch request, it reads only from that REST storage, avoiding a direct connection to etcd;
- list returns the full data set, and every watch failure triggers a relist. At large scale, if all clients relist at the same time, the server cannot cope; EtcdResync was introduced for this case;
- To get rid of useless long-lived connections (from clients that have died), the apiserver attaches a random timeout to every watch.
Reflector
In the k8s components, NewInformer() in k8s.io/client-go/tools/cache/controller.go is used to monitor REST resources, and its core is the Reflector. The Reflector watches the specified REST resource and records every change in a store, usually a DeltaFIFO ("DeltaFIFO is like FIFO, but allows you to process deletes").
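As a usage sketch (not the in-tree wiring), a pod informer can be created like this; the clientset is assumed to be built elsewhere, and the 30-second resync period is arbitrary.

package main

import (
	"fmt"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// runPodInformer wires a ListWatch (driven by a Reflector) to a local store and event handlers.
func runPodInformer(cs *kubernetes.Clientset, stopCh <-chan struct{}) {
	// The ListerWatcher the Reflector drives: list/watch pods in all namespaces.
	lw := cache.NewListWatchFromClient(
		cs.CoreV1().RESTClient(), "pods", v1.NamespaceAll, fields.Everything())

	// NewInformer returns a thread-safe store plus a controller that keeps it in sync.
	store, controller := cache.NewInformer(lw, &v1.Pod{}, 30*time.Second,
		cache.ResourceEventHandlerFuncs{
			AddFunc:    func(obj interface{}) { fmt.Println("added") },
			UpdateFunc: func(oldObj, newObj interface{}) { fmt.Println("updated") },
			DeleteFunc: func(obj interface{}) { fmt.Println("deleted") },
		})

	go controller.Run(stopCh) // ListAndWatch runs inside the controller
	_ = store                 // store.List() / store.GetByKey() serve the cached objects
}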
k8s.io/client-go/tools/cache/reflector.go
// ListAndWatch first lists all items and gets the resource version at the moment of call,
// and then uses that resource version to watch.
// It returns an error if ListAndWatch didn't even try to initialize the watch.
// (Abridged excerpt: error handling, the extraction of listMetaInterface from the
// list result, and the resync goroutine are elided.)
func (r *Reflector) ListAndWatch(stopCh <-chan struct{}) error {
	// List from the apiserver cache and remember the resourceVersion it covers.
	options := metav1.ListOptions{ResourceVersion: "0"}
	list, err := r.listerWatcher.List(options)

	resourceVersion := listMetaInterface.GetResourceVersion()
	r.setLastSyncResourceVersion(resourceVersion)
	for {
		// Jitter the timeout so that watchers do not all expire at the same moment.
		timeoutSeconds := int64(minWatchTimeout.Seconds() * (rand.Float64() + 1.0))
		options = metav1.ListOptions{
			ResourceVersion: resourceVersion,
			// We want to avoid situations of hanging watchers. Stop any watchers that do not
			// receive any events within the timeout window.
			TimeoutSeconds: &timeoutSeconds,
		}
		// Watch from the last seen resourceVersion; watchHandler keeps it up to date
		// for the next iteration.
		w, err := r.listerWatcher.Watch(options)
		r.watchHandler(w, &resourceVersion, resyncerrc, stopCh)
	}
}
// watchHandler watches w and keeps *resourceVersion up to date.
// (Abridged excerpt: error handling and the stop/timeout cases of the select are elided.)
func (r *Reflector) watchHandler(w watch.Interface, resourceVersion *string, errc chan error, stopCh <-chan struct{}) error {
	for {
		select {
		// One event arriving on the HTTP watch stream.
		case event, ok := <-w.ResultChan():
			meta, err := meta.Accessor(event.Object)
			newResourceVersion := meta.GetResourceVersion()
			// Apply the change to the local store (typically a DeltaFIFO).
			switch event.Type {
			case watch.Added:
				err := r.store.Add(event.Object)
			case watch.Modified:
				err := r.store.Update(event.Object)
			case watch.Deleted:
				// TODO: Will any consumers need access to the "last known
				// state", which is passed in event.Object? If so, may need
				// to change this.
				err := r.store.Delete(event.Object)
			default:
				utilruntime.HandleError(fmt.Errorf("%s: unable to understand watch event %#v", r.name, event))
			}
			// Remember the version of the last event so the next watch resumes from it.
			*resourceVersion = newResourceVersion
			r.setLastSyncResourceVersion(newResourceVersion)
		}
	}
}
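For completeness, here is a sketch of driving a Reflector by hand with a DeltaFIFO as its store; this mirrors what NewInformer's controller does internally. The pods ListWatch, the 30-second resync period, and the NewDeltaFIFOWithOptions constructor (present in recent client-go versions) are assumptions for illustration.

package main

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// runReflector pushes every change of the watched resource into a DeltaFIFO.
func runReflector(cs *kubernetes.Clientset, stopCh <-chan struct{}) {
	lw := cache.NewListWatchFromClient(
		cs.CoreV1().RESTClient(), "pods", v1.NamespaceAll, fields.Everything())

	// DeltaFIFO records Added/Updated/Deleted deltas instead of only the latest object.
	fifo := cache.NewDeltaFIFOWithOptions(cache.DeltaFIFOOptions{
		KeyFunction: cache.MetaNamespaceKeyFunc,
	})

	// The Reflector runs ListAndWatch in a loop and feeds every event into the fifo.
	r := cache.NewReflector(lw, &v1.Pod{}, fifo, 30*time.Second)
	go r.Run(stopCh)

	// A consumer would Pop() cache.Deltas from the fifo and apply them to its own
	// indexed store; that consumer loop is what NewInformer's controller implements.
}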