Cilium Metrics and Their Collection

Overview

The metrics collected here cover those exposed by the Cilium Operator, by Cilium itself (the agent), and by Hubble.

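Installation

Cilium is deployed with Helm. A large number of chart values were changed to fit the Staging cluster: mainly the taint tolerations, the NodeSelector used for scheduling, and the SecurityContext settings. In addition, the metrics-related values have to be enabled. The differences between the modified values file and the chart defaults are shown below.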

# diff values.yaml vip-proxy-metrics.yaml
150c150
<   useDigest: true
---
>   useDigest: false
163a164
>   kubernetes.io/hostname: 10.189.212.125
168,172c169,172
< - operator: Exists
<   # - key: "key"
<   #   operator: "Equal|Exists"
<   #   value: "value"
<   #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
>   key: "key"
>   value: "cilium"
>   effect: "NoExecute"
188c188,190
< extraEnv: []
---
> extraEnv:
>   - name: KUBERNETES_SERVICE_HOST
>     value: hh-k8s-noah-sc-staging001-master.api.vip.com
215c217
< podSecurityContext:
---
> podSecurityContext: {}
237c239
<   privileged: false
---
>   privileged: true
538c540
<   chainingMode: none
---
>   chainingMode: generic-veth
559c561
<   customConf: false
---
>   customConf: true
577c579
<   # configMap: cni-configuration
---
>   configMap: cni-configuration
940c942
<     useDigest: true
---
>     useDigest: false
952c954,958
<   tolerations: []
---
>   tolerations:
>     - operator: Equal
>       key: "key"
>       value: "cilium"
>       effect: "NoExecute"
982,988c988,994
<     #   enabled:
<     #   - dns:query;ignoreAAAA
<     #   - drop
<     #   - tcp
<     #   - flow
<     #   - icmp
<     #   - http
---
>     enabled:
>     - dns:query;ignoreAAAA
>     - drop
>     - tcp
>     - flow
>     - icmp
>     - http
994d999
<     enabled: ~
1059c1064
<     enabled: true
---
>     enabled: false
1105c1110
<     enabled: false
---
>     enabled: true
1117c1122
<       useDigest: true
---
>       useDigest: false
1144a1150
>       kubernetes.io/hostname: 10.189.212.125
1148c1154,1158
<     tolerations: []
---
>     tolerations:
>       - operator: Equal
>         key: "key"
>         value: "cilium"
>         effect: "NoExecute"
1151c1161,1163
<     extraEnv: []
---
>     extraEnv:
>       - name: KUBERNETES_SERVICE_HOST
>         value: hh-k8s-noah-sc-staging001-master.api.vip.com
1257c1269
<       enabled: false
---
>       enabled: true
1293c1305
<     enabled: false
---
>     enabled: true
1338c1350
<         useDigest: true
---
>         useDigest: false
1342c1354,1356
<       securityContext: {}
---
>       securityContext:
>         privileged: true
>
1345c1359,1361
<       extraEnv: []
---
>       extraEnv:
>         - name: KUBERNETES_SERVICE_HOST
>           value: hh-k8s-noah-sc-staging001-master.api.vip.com
1369c1385
<         useDigest: true
---
>         useDigest: false
1373c1389,1391
<       securityContext: {}
---
>       securityContext:
>         privileged: true
>
1376c1394,1396
<       extraEnv: []
---
>       extraEnv:
>         - name: KUBERNETES_SERVICE_HOST
>           value: hh-k8s-noah-sc-staging001-master.api.vip.com
1395c1415
<           enabled: true
---
>           enabled: false
1429a1450
>       kubernetes.io/hostname: 10.189.212.125
1433c1454,1458
<     tolerations: []
---
>     tolerations:
>       - operator: Equal
>         key: "key"
>         value: "cilium"
>         effect: "NoExecute"
1585c1610
< # kubeProxyReplacement: "true"
---
> kubeProxyReplacement: "true"
1625c1650
< enableIPv4Masquerade: true
---
> enableIPv4Masquerade: false
1773c1798
<   enabled: false
---
>   enabled: true
1855c1880
<     useDigest: true
---
>     useDigest: false
1864c1889,1891
<   extraEnv: []
---
>   extraEnv:
>     - name: KUBERNETES_SERVICE_HOST
>       value: hh-k8s-noah-sc-staging001-master.api.vip.com
1970a1998
>     kubernetes.io/hostname: 10.189.212.125
1975,1979c2003,2006
<   - operator: Exists
<     # - key: "key"
<     #   operator: "Equal|Exists"
<     #   value: "value"
<     #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
>     - operator: Equal
>       key: "key"
>       value: "cilium"
>       effect: "NoExecute"
2120c2147
< routingMode: ""
---
> routingMode: "native"
2146c2173
<     useDigest: true
---
>     useDigest: false
2164,2168c2191,2194
<   - operator: Exists
<     # - key: "key"
<     #   operator: "Equal|Exists"
<     #   value: "value"
<     #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
>   - operator: Equal
>     key: "key"
>     value: "cilium"
>     effect: "NoExecute"
2179a2206
>     kubernetes.io/hostname: 10.189.212.125
2258c2285
<     useDigest: true
---
>     useDigest: false
2263c2290
<   replicas: 2
---
>   replicas: 1
2297a2325
>     kubernetes.io/hostname: 10.189.212.125
2302,2306c2330,2333
<   - operator: Exists
<     # - key: "key"
<     #   operator: "Equal|Exists"
<     #   value: "value"
<     #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
>   - operator: Equal
>     key: "key"
>     value: "cilium"
>     effect: "NoExecute"
2312c2339,2341
<   extraEnv: []
---
>   extraEnv:
>     - name: KUBERNETES_SERVICE_HOST
>       value: hh-k8s-noah-sc-staging001-master.api.vip.com
2389c2418
<     enabled: false
---
>     enabled: true
2434c2463
<     restart: true
---
>     restart: false
2458c2487,2489
<   extraEnv: []
---
>   extraEnv:
>     - name: KUBERNETES_SERVICE_HOST
>       value: hh-k8s-noah-sc-staging001-master.api.vip.com
2472a2504
>     kubernetes.io/hostname: 10.189.212.125
2477,2481c2509,2512
<   - operator: Exists
<     # - key: "key"
<     #   operator: "Equal|Exists"
<     #   value: "value"
<     #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
>   - operator: Equal
>     key: "key"
>     value: "cilium"
>     effect: "NoExecute"
2498c2529
<     privileged: false
---
>     privileged: true
2539c2570
<     useDigest: true
---
>     useDigest: false
2550c2581,2583
<   extraEnv: []
---
>   extraEnv:
>     - name: KUBERNETES_SERVICE_HOST
>       value: hh-k8s-noah-sc-staging001-master.api.vip.com
2570a2604
>     kubernetes.io/hostname: 10.189.212.125
2586,2589c2620,2623
<     # - key: "key"
<     #   operator: "Equal|Exists"
<     #   value: "value"
<     #   effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
>   - operator: Equal
>     key: "key"
>     value: "cilium"
>     effect: "NoExecute"
2689c2723
<       useDigest: true
---
>       useDigest: false
2699c2733
<         useDigest: true
---
>         useDigest: false
2736c2770
<         useDigest: true
---
>         useDigest: false
2743c2777,2779
<       extraEnv: []
---
>       extraEnv:
>         - name: KUBERNETES_SERVICE_HOST
>           value: hh-k8s-noah-sc-staging001-master.api.vip.com
2798c2834,2836
<     extraEnv: []
---
>     extraEnv:
>       - name: KUBERNETES_SERVICE_HOST
>         value: hh-k8s-noah-sc-staging001-master.api.vip.com
2864a2903
>       kubernetes.io/hostname: 10.189.212.125
2868c2907,2911
<     tolerations: []
---
>     tolerations:
>       - operator: Equal
>         key: "key"
>         value: "cilium"
>         effect: "NoExecute"
3138c3181,3185
<           tolerations: []
---
>           tolerations:
>             - operator: Equal
>               key: "key"
>               value: "cilium"
>               effect: "NoExecute"
3167c3214,3218
<           tolerations: []
---
>           tolerations:
>             - operator: Equal
>               key: "key"
>               value: "cilium"
>               effect: "NoExecute"

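The diff does not show the install command itself. A minimal sketch of how the modified file might be applied follows. The chart repository URL is the public Cilium one; the release name `cilium`, the function names, and the explicit `--set` switches are assumptions. The diff shows several `enabled` flags being flipped but not their key paths, so the metrics switches named in the upstream chart are repeated here for clarity.

```shell
#!/bin/sh
# Sketch only: release name, function names and the explicit --set flags are
# assumptions, not taken from the original deployment.

VALUES_FILE="${VALUES_FILE:-vip-proxy-metrics.yaml}"

# Metrics switches as named in the upstream Cilium chart. If they are already
# set in the values file, repeating them on the command line is harmless.
metrics_flags() {
  printf '%s\n' \
    '--set prometheus.enabled=true' \
    '--set operator.prometheus.enabled=true' \
    '--set hubble.enabled=true'
}

# Install or upgrade Cilium in kube-system with the customised values file.
install_cilium() {
  helm repo add cilium https://helm.cilium.io/
  helm repo update
  # Word splitting of the flag list is intended here.
  helm upgrade --install cilium cilium/cilium \
    --namespace kube-system \
    --values "$VALUES_FILE" \
    $(metrics_flags)
}

# Nothing runs until the function is called explicitly:
#   install_cilium
```

The Hubble metric list itself (`hubble.metrics.enabled`) is left to the values file, since the diff shows it being set there.
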
Grafana and Prometheus also need to be deployed in order to verify that the metrics are actually being collected.

k apply -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/kubernetes/addons/prometheus/monitoring-example.yaml

The final deployment looks like this.

# k get pods -n kube-system -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP               NODE             
cilium-c987r                       1/1     Running   0          3h22m   10.189.212.125   10.189.212.125   
cilium-operator-7df8cb69b8-2h4gm   1/1     Running   0          3h22m   10.189.212.125   10.189.212.125   
hubble-ui-7b4bcf6bcf-d4fpb         2/2     Running   0          3h22m   10.189.82.106    10.189.212.125   
# k get po -n cilium-monitoring -o wide
NAME                          READY   STATUS    RESTARTS   AGE    IP             NODE             
grafana-7457fdc76-xhg8l       1/1     Running   0          3h7m   10.189.83.14   10.189.212.125   
prometheus-547b7d9856-zl8lp   1/1     Running   0          3h7m   10.189.83.12   10.189.212.125   
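In the output above, the cilium and cilium-operator pods report the node address as their pod IP, which suggests they run on the host network, so their metrics endpoints should be reachable directly on the node. The sketch below builds the endpoint URLs and fetches a few samples from each. The port numbers (9962 for the agent, 9963 for the operator, 9965 for Hubble) are believed to be the upstream chart defaults and are an assumption about this cluster; the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: ports are assumed chart defaults; adjust them if the values
# file overrides the metrics ports.

# metrics_url HOST COMPONENT -> URL of the Prometheus endpoint for that component.
metrics_url() {
  case "$2" in
    agent)    port=9962 ;;
    operator) port=9963 ;;
    hubble)   port=9965 ;;
    *) echo "unknown component: $2" >&2; return 1 ;;
  esac
  echo "http://$1:$port/metrics"
}

# check_metrics HOST: print the first few samples exposed by each component.
check_metrics() {
  for component in agent operator hubble; do
    echo "== $component =="
    curl -s --max-time 5 "$(metrics_url "$1" "$component")" | grep -v '^#' | head -n 5
  done
}

# Example (run from a machine that can reach the node):
#   check_metrics 10.189.212.125
```

An empty section for one component usually means its metrics switch is still off or the port differs from the assumed default.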

Viewing the Dashboard

img.png

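Metric Labels

Without suitable labels, a metric cannot express precisely what it measures. On the other hand, every distinct combination of label values is stored as a separate time series, so a large number of labels increases storage requirements. The label set therefore has to be designed to match actual needs.

In the default installation method, Hubble metrics are configured in the section shown below. Besides listing metrics such as dns, drop and tcp, attaching traffic context to a metric requires extra options on that metric; see Hubble Metrics for details.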
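The storage cost follows from simple arithmetic: the worst case for one metric is the product of the number of values each of its labels can take. A small sketch of that estimate, plus a helper that counts the values actually observed in a scrape. The function names are hypothetical, and the figures in the usage note are made-up, not measurements from this cluster.

```shell
#!/bin/sh
# max_series N1 N2 ...: upper bound on the number of time series for one
# metric, given the number of distinct values of each label.
max_series() {
  total=1
  for n in "$@"; do
    total=$((total * n))
  done
  echo "$total"
}

# label_cardinality LABEL: read Prometheus text format on stdin and count the
# distinct values observed for LABEL. Escaped quotes inside label values are
# not handled.
label_cardinality() {
  grep -o "[{,]$1=\"[^\"]*\"" | sort -u | wc -l | tr -d ' '
}
```

For example, `max_series 500 3 2 4` prints 12000, the bound for a metric with 500 destinations, 3 protocols, 2 verdicts and 4 flow types. Piping a saved scrape into `label_cardinality destination` shows how many destination values are really in use.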

hubble:
  enabled: true
  metrics:
    enabled:
    - dns:query;ignoreAAAA
    - drop
    - tcp
    - flow:destinationContext=dns|ip
    - icmp
    - http

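With the configuration above, flow:destinationContext=dns|ip adds destination context to the flow metric: the domain name is used when one is known, otherwise the IP address. The resulting metrics look like this.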

hubble_flows_processed_total{destination="10.189.94.59",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159
hubble_flows_processed_total{destination="10.190.135.235",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159
hubble_flows_processed_total{destination="10.190.56.61",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 1
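Once the destination label is present, traffic can be grouped per destination, which is the kind of aggregation a dashboard query would perform. The sketch below does the same offline on a saved scrape. It assumes samples carry no trailing timestamp, so the last field on each line is the value; the function name is hypothetical.

```shell
#!/bin/sh
# flows_by_destination: Prometheus text on stdin -> "<total> <destination>"
# lines, largest total first. Assumes the last field of each sample is its value.
flows_by_destination() {
  awk '
    /^hubble_flows_processed_total[{]/ {
      if (match($0, /destination="[^"]*"/)) {
        # Skip the 13 characters of destination=" and drop the closing quote.
        dest = substr($0, RSTART + 13, RLENGTH - 14)
        total[dest] += $NF
      }
    }
    END { for (d in total) print total[d], d }' | sort -rn
}
```

Fed the three sample lines above, it reports 159 flows each for 10.189.94.59 and 10.190.135.235 and 1 flow for 10.190.56.61. Samples that differ only in other labels (protocol, verdict and so on) are summed into the same destination.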

Service Map

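The official Hubble Grafana plugin is a paid product, so building a Service Map requires developing a converter plugin that turns Hubble metrics into the format required by the Node Graph plugin.

References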
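As a rough illustration of what such a converter has to do, the sketch below turns flow samples into the two tables a Node Graph panel works with: edges carrying `id`, `source` and `target`, and nodes carrying `id`. It assumes the flow metric also has a `source` label, which would need something like `flow:sourceContext=dns|ip;destinationContext=dns|ip` in the Hubble metrics list. The CSV output, the use of the flow count as `mainstat`, and the function names are illustrative assumptions; a real plugin would emit Grafana data frames instead.

```shell
#!/bin/sh
# Sketch only: assumes each sample has both source and destination labels and
# that label values contain no commas.

# hubble_to_edges: Prometheus text on stdin -> CSV of edges
# (id,source,target,mainstat), flow counts summed per source/destination pair.
hubble_to_edges() {
  echo "id,source,target,mainstat"
  awk '
    # Return the value of one label on the current line, or "" if absent.
    function label(name,    re) {
      re = "[{,]" name "=\"[^\"]*\""
      if (match($0, re))
        return substr($0, RSTART + length(name) + 3, RLENGTH - length(name) - 4)
      return ""
    }
    /^hubble_flows_processed_total[{]/ {
      src = label("source"); dst = label("destination")
      if (src != "" && dst != "") flows[src "," dst] += $NF
    }
    END {
      for (pair in flows) {
        split(pair, p, ",")
        print p[1] "->" p[2] "," p[1] "," p[2] "," flows[pair]
      }
    }' | sort
}

# edges_to_nodes: edge CSV on stdin -> CSV of unique nodes (id,title).
edges_to_nodes() {
  echo "id,title"
  awk -F, 'NR > 1 { print $2 "," $2; print $3 "," $3 }' | sort -u
}
```

A saved scrape can be converted with `hubble_to_edges < scrape.txt > edges.csv` followed by `edges_to_nodes < edges.csv > nodes.csv`, where `scrape.txt` is a hypothetical file name. Samples without a source label are skipped, so with the destination-only configuration shown earlier the edge list would be empty.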

  1. Monitoring & Metrics
  2. Hubble Service Map
  3. Cilium Hubble Series (Part 3): Hubble and Grafana Better Together