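Lead-in: Anyone familiar with k8s knows that Deployment currently supports only two strategies, RollingUpdate and Recreate. For operations engineers, however, canary releases and blue-green deployments are usually what a production environment calls for. I had planned to build my own enhanced version, but a web search turned up Argo-Rollout, which matches what I had in mind, so there is no need to reinvent the wheel. This article is a hands-on first look at Argo-Rollout.
Introduction
Argo-Rollout is a Kubernetes controller plus a set of CRDs that provide more powerful deployment capabilities than Deployment, including canary releases, blue-green deployments, experimentation, and progressive delivery.
Supported features:
- Blue-green deployment
- Canary release
- Fine-grained, weighted traffic shifting
- Automated rollback and promotion
- Manual control
- Customizable metric queries and KPI analysis
- Ingress controller integration: nginx, alb
- Service Mesh integration: Istio, Linkerd, SMI
- Metric provider integration: Prometheus, Wavefront, Kayenta, Web, Kubernetes Jobs
How it works:
Argo-Rollout works much like Deployment; it just adds richer rollout strategies and traffic control. When spec.template changes, Argo-Rollout performs a rollout according to spec.strategy: it typically creates a new ReplicaSet and gradually scales down the Pods of the previous ReplicaSet.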
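To make the trigger concrete, here is a minimal sketch of the idea that a change to spec.template leads to a new ReplicaSet. The hashing scheme and the function names (template_hash, needs_new_replicaset) are assumptions made for illustration; the real controller has its own hashing logic.

```python
import hashlib
import json


def template_hash(template: dict) -> str:
    """Return a short, stable digest of a pod template (illustrative only)."""
    canonical = json.dumps(template, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:10]


def needs_new_replicaset(current_template: dict, desired_template: dict) -> bool:
    """A rollout is needed when the desired template differs from the current one."""
    return template_hash(current_template) != template_hash(desired_template)


blue = {"containers": [{"name": "rollouts-demo", "image": "argoproj/rollouts-demo:blue"}]}
yellow = {"containers": [{"name": "rollouts-demo", "image": "argoproj/rollouts-demo:yellow"}]}

print(needs_new_replicaset(blue, blue))    # same template: no rollout
print(needs_new_replicaset(blue, yellow))  # image changed: rollout needed
```

Fields outside the template, such as replicas, are deliberately left out of the digest in this sketch, since only a template change starts a rollout.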
Installation
1. Install the argo-rollouts controller and CRDs
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://raw.githubusercontent.com/argoproj/argo-rollouts/stable/manifests/install.yaml
2. Install the argo-rollouts kubectl plugin
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x ./kubectl-argo-rollouts-linux-amd64
mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
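Usage
A canary release consists of two processes: Replica Shifting and Traffic Shifting.
Replica Shifting
Let's use the example from the official site to try out Replica Shifting.
1. Deploy a demo application
First create a Rollout CR and a Service that exposes it: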
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/rollout.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/service.yaml
In the Rollout CR, everything except apiVersion, kind, and strategy is the same as in a Deployment; in fact, the source code largely reuses the Deployment data structures:
# cat rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {}
      - setWeight: 40
      - pause: {duration: 10}
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m
The Service that exposes it:
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
The plugin provided by Argo-Rollout shows the Rollout's status, which is very handy:
kubectl argo rollouts get rollout rollouts-demo
2. Update the spec to trigger a rollout
Next, change the image in the spec to trigger a rollout:
kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
The Rollout is expected to create a new ReplicaSet, then gradually scale it up while scaling down the old one. Check with the plugin:
kubectl argo rollouts get rollout rollouts-demo --watch
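The Rollout has created a new ReplicaSet, rollouts-demo-789746c88d, and has started moving Pods from the old ReplicaSet to the new one. The ratio of new to old Pods is 1:4 and the status is Paused; no further Pods are being upgraded. Why?
The reason lies in spec.strategy, which defines steps for the upgrade. Because steps is a list, its entries are executed in order. The first step, setWeight: 20, means 20% of the Pods should be updated to the new version (1 of the 5 replicas here). The second step, pause: {}, pauses indefinitely until someone resumes the rollout manually through the plugin: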
strategy:
  canary:
    steps:
    - setWeight: 20
    - pause: {}
    - setWeight: 40
    - pause: {duration: 10} # pause for 10s
    - setWeight: 60
    - pause: {duration: 10}
    - setWeight: 80
    - pause: {duration: 10}
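The arithmetic behind the 1:4 ratio can be sketched as follows. This is a simplified model, not the controller's actual algorithm: rounding the canary count up and giving the remainder to the old ReplicaSet are assumptions of this sketch, and the helper names are made up for illustration.

```python
def canary_replicas(replicas: int, weight: int) -> int:
    """Pods that should run the new version at a given setWeight (rounded up)."""
    if not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    return -(-replicas * weight // 100)  # integer ceiling division


def split(replicas: int, weight: int) -> tuple:
    """Return (new, old) Pod counts; the old ReplicaSet keeps the remainder."""
    canary = canary_replicas(replicas, weight)
    return canary, replicas - canary


# Walk the setWeight values from the strategy above with replicas: 5.
for weight in (20, 40, 60, 80):
    new, old = split(5, weight)
    print(f"setWeight {weight}: {new} new, {old} old")
```

With 5 replicas each 20% step corresponds to exactly one Pod, which is why the demo shows clean ratios; with other replica counts the rounding choice starts to matter.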
Use the promote command to move on to the next step:
kubectl argo rollouts promote rollouts-demo
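Why a single promote is enough to finish this rollout can be seen in a small model of the step list. The functions run_steps and promote are hypothetical names for this sketch; timed pauses are treated as already elapsed, and the controller's real state machine is more involved.

```python
STEPS = [
    {"setWeight": 20},
    {"pause": {}},
    {"setWeight": 40},
    {"pause": {"duration": 10}},
    {"setWeight": 60},
    {"pause": {"duration": 10}},
    {"setWeight": 80},
    {"pause": {"duration": 10}},
]


def run_steps(steps, index=0, weight=0):
    """Advance from `index` until an indefinite pause or the end of the list.

    Returns (index, weight, paused). Timed pauses are treated as elapsed.
    """
    while index < len(steps):
        step = steps[index]
        if "setWeight" in step:
            weight = step["setWeight"]
        elif step.get("pause") == {}:
            return index, weight, True  # wait here until someone promotes
        index += 1
    return index, 100, False  # all steps done: the new version takes over


def promote(steps, index, weight):
    """Skip the pause at `index` and keep going (a manual promote in this model)."""
    return run_steps(steps, index + 1, weight)


state = run_steps(STEPS)
print(state)                          # stops at the indefinite pause
print(promote(STEPS, state[0], state[1]))  # runs through the remaining steps
```

Only the second step is an indefinite pause; the later pauses carry a duration, so after one promote the remaining steps run to completion on their own.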
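Checking again, all Pods now belong to the new ReplicaSet.
Traffic Shifting
The example above shows how Argo-Rollout controls Replica Shifting, but a proper canary process should include both Replica Shifting and Traffic Shifting.
Argo-Rollout currently integrates two main kinds of traffic control, Ingress and Service Mesh. My test environment only has the Nginx controller deployed, so I use Ingress for this demonstration.
1. Deploy the resources
First delete the previous example: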
kubectl delete -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/rollout.yaml
kubectl delete -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/service.yaml
Then deploy the example from the official site:
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/rollout.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/services.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/ingress.yaml
The files above deploy one Rollout, two Services, and one Ingress.
The Rollout uses canaryService and stableService to specify the Service name for the canary version (rollouts-demo-canary) and for the current version (rollouts-demo-stable):
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 1
  strategy:
    canary:
      canaryService: rollouts-demo-canary
      stableService: rollouts-demo-stable
      trafficRouting:
        nginx:
          stableIngress: rollouts-demo-stable
      steps:
      - setWeight: 5
      - pause: {}
  ...
The Services rollouts-demo-canary and rollouts-demo-stable have identical content. Their selectors do not include a pod-template-hash yet; the Argo-Rollout controller updates each selector with the hash of the actual ReplicaSet:
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo-canary
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
    # This selector will be updated with the pod-template-hash of the canary ReplicaSet. e.g.:
    # rollouts-pod-template-hash: 7bf84f9696
---
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo-stable
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
    # This selector will be updated with the pod-template-hash of the stable ReplicaSet. e.g.:
    # rollouts-pod-template-hash: 789746c88d
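The selector update described in the comments above can be pictured as a small dictionary operation. This is only a sketch of the resulting selectors; pin_selector is a hypothetical helper, and the two hash values are the examples taken from the YAML comments.

```python
HASH_LABEL = "rollouts-pod-template-hash"


def pin_selector(selector: dict, replicaset_hash: str) -> dict:
    """Return a copy of a Service selector pinned to a single ReplicaSet."""
    pinned = dict(selector)  # leave the caller's selector untouched
    pinned[HASH_LABEL] = replicaset_hash
    return pinned


base = {"app": "rollouts-demo"}
canary_selector = pin_selector(base, "7bf84f9696")
stable_selector = pin_selector(base, "789746c88d")

print(canary_selector)
print(stable_selector)
```

Both Services start from the same app label, so the extra hash label is the only thing that keeps canary and stable traffic on separate sets of Pods.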
The Ingress defines a rule that makes nginx forward requests for the rollouts-demo.local domain to the Service of the current version (rollouts-demo-stable):
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rollouts-demo-stable
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: rollouts-demo.local
    http:
      paths:
      - path: /
        backend:
          # Reference to a Service name, also specified in the Rollout spec.strategy.canary.stableService field
          serviceName: rollouts-demo-stable
          servicePort: 80
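Based on the Ingress rollouts-demo-stable, the Rollout controller automatically creates a second Ingress for canary traffic, named <ROLLOUT-NAME>-<INGRESS-NAME>-canary. That is why there is an additional Ingress here, rollouts-demo-rollouts-demo-stable-canary, which directs traffic to the canary Service (rollouts-demo-canary):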
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  generation: 1
  name: rollouts-demo-rollouts-demo-stable-canary
  namespace: default
  ownerReferences:
  - apiVersion: argoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Rollout
    name: rollouts-demo
    uid: 2d5b728b-2f71-4bf2-8283-323acf8ef573
spec:
  rules:
  - host: rollouts-demo.local
    http:
      paths:
      - backend:
          serviceName: rollouts-demo-canary
          servicePort: 80
        path: /
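The naming pattern <ROLLOUT-NAME>-<INGRESS-NAME>-canary explains the rather long name above. A one-line sketch (canary_ingress_name is a hypothetical helper, not part of any API) reproduces it:

```python
def canary_ingress_name(rollout_name: str, stable_ingress: str) -> str:
    """Build the canary Ingress name from the Rollout and stable Ingress names."""
    return f"{rollout_name}-{stable_ingress}-canary"


# Rollout "rollouts-demo" with stableIngress "rollouts-demo-stable"
print(canary_ingress_name("rollouts-demo", "rollouts-demo-stable"))
```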
2. Trigger an update
kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
kubectl argo rollouts get rollout rollouts-demo
The Rollout status now shows SetWeight as 5.
Looking at the Ingress as well, two annotations have been added, nginx.ingress.kubernetes.io/canary and nginx.ingress.kubernetes.io/canary-weight:
# Ingress for the current version
# Canary Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"
  creationTimestamp: "2020-07-05T06:31:54Z"
  generation: 1
  name: rollouts-demo-rollouts-demo-stable-canary
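The relationship between a setWeight step and these two annotations can be sketched like this. The annotation keys are the ones shown in the output above; the helper functions and the idea of computing an expected request split are illustrative assumptions, since the real distribution is probabilistic and decided by nginx.

```python
CANARY = "nginx.ingress.kubernetes.io/canary"
CANARY_WEIGHT = "nginx.ingress.kubernetes.io/canary-weight"


def canary_annotations(weight: int) -> dict:
    """Annotations for the canary Ingress at a given weight (values are strings)."""
    if not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    return {CANARY: "true", CANARY_WEIGHT: str(weight)}


def expected_split(requests: int, weight: int) -> tuple:
    """Expected (canary, stable) request counts for a given weight."""
    canary = requests * weight // 100
    return canary, requests - canary


print(canary_annotations(5))
print(expected_split(1000, 5))  # roughly 50 of 1000 requests go to the canary
```

Note that the weight here applies to requests, not Pods: with replicas: 1 there is no way to express 5% by Pod count, which is exactly what Traffic Shifting adds over Replica Shifting.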
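If you look closely, the result above has a small problem: it shows ActualWeight: 50, whereas it should be 5 or 95, so I filed an issue with the community while I was at it.
Summary
Argo-Rollout offers a more powerful alternative to Deployment, including canary and blue-green releases that suit operations work well. This article only took a brief look at its canary release feature.
Features not covered in this article include:
- Experiments, which can be added to the steps to verify that each step behaves as expected;
- Analysis, which gathers metrics during a Rollout, such as the time each step takes.
One more requirement came to mind that Argo-Rollout does not support yet:
For traffic shifting, a canary release should be able to route a fixed set of users or URLs to the new version, which Argo-Rollout does not support at the time of writing.
Of course, this could be worked around by adding an Experiment that modifies the Ingress or SMI resources.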
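As a rough idea of what such a modification might look like on the nginx side, the sketch below builds header-based canary annotations. The canary-by-header and canary-by-header-value keys are ingress-nginx annotations; the header name X-Canary-User, the helper functions, and the simplified routing decision are all assumptions for illustration, and nothing here is something Argo-Rollout does on its own.

```python
BY_HEADER = "nginx.ingress.kubernetes.io/canary-by-header"
BY_HEADER_VALUE = "nginx.ingress.kubernetes.io/canary-by-header-value"


def header_canary_annotations(header: str, value: str) -> dict:
    """Annotations that send requests carrying header=value to the canary."""
    return {
        "nginx.ingress.kubernetes.io/canary": "true",
        BY_HEADER: header,
        BY_HEADER_VALUE: value,
    }


def pick_service(request_headers: dict, annotations: dict) -> str:
    """Simplified model of the routing decision, not nginx's full rule set."""
    header = annotations[BY_HEADER]
    if request_headers.get(header) == annotations[BY_HEADER_VALUE]:
        return "rollouts-demo-canary"
    return "rollouts-demo-stable"


# "X-Canary-User" is a hypothetical header name chosen for this example.
annotations = header_canary_annotations("X-Canary-User", "beta")
print(pick_service({"X-Canary-User": "beta"}, annotations))
print(pick_service({}, annotations))
```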
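Features aside, Argo-Rollout is also a good project to learn from at the source level: its structure is clear, which makes it suitable for learning how to write a controller and a plugin.
The code is not very complex, so why doesn't k8s implement this itself? Perhaps k8s wants to encourage everyone to write more CRDs, haha!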