cilium-chain and Flannel

Overview

This article tests a setup where Flannel is already deployed as the primary CNI and cilium-chain is then deployed on top of it, so that containers gain network observability. Two things are covered:

  1. kube-proxy replacement
  2. CNI chaining

Deployment

Flannel configuration

The host-gw backend is recommended for this test. When deploying Flannel, the configuration needs a small change: the key setting is the Backend parameter inside net-conf.json, which has to be changed to host-gw.

kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }

After the change, redeploy Flannel and check that the routes have actually been updated. Since this article is not an analysis of Flannel, only a brief look at the routing tables is given.
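
A minimal sketch of the redeploy and the route check, assuming the ConfigMap above is saved as kube-flannel-cfg.yaml and the DaemonSet uses the common name kube-flannel-ds in kube-system (both names vary between Flannel releases):

# Apply the updated ConfigMap, then restart Flannel so it re-renders the node routes
kubectl -n kube-system apply -f kube-flannel-cfg.yaml
kubectl -n kube-system rollout restart ds/kube-flannel-ds
# On any node, compare the routing table before and after
ip route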

# Before the change: routes to other nodes go through flannel.1
default via 192.168.1.1 dev eno1 proto static metric 100
10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink
10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
10.244.3.0/24 via 10.244.3.0 dev flannel.1 onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.1.0/24 dev eno1 proto kernel scope link src 192.168.1.202 metric 100
# After the change: routes to other nodes go directly through the physical NIC
default via 192.168.1.1 dev eno1 proto static metric 100
10.244.0.0/24 via 192.168.1.200 dev eno1
10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
10.244.3.0/24 via 192.168.1.201 dev eno1
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.1.0/24 dev eno1 proto kernel scope link src 192.168.1.202 metric 100

cilium-chain configuration

The following walks through deploying cilium-chain on top of Flannel.

# It is best to back up the kube-proxy resources first
k delete ds kube-proxy -n kube-system
k delete cm kube-proxy -n kube-system
k delete svc kubernetes -n default
# On every node, flush the rules kube-proxy left behind
iptables-save | grep -v KUBE | iptables-restore
ipvsadm -D -t 10.96.0.1:443
# Install cilium-chain
helm repo add cilium https://helm.cilium.io/
helm pull cilium/cilium --version 1.14.4
tar zxvf cilium-1.14.4.tgz
cd cilium
# The cni-configuration ConfigMap has to be created first
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cni-configuration
  namespace: kube-system
data:
  cni-config: |-
    {
      "name": "generic-veth",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "cilium-cni"
        }
      ]
    }
EOF
# The following parameters are required for cilium-chain
helm install cilium . --version 1.14.4 --namespace=kube-system --set cni.chainingMode=generic-veth --set cni.customConf=true --set cni.configMap=cni-configuration --set routingMode=native --set enableIPv4Masquerade=false --set k8sServiceHost=192.168.1.200 --set k8sServicePort=6443 

Note that the ConfigMap above actually ends up as the following file under /etc/cni/net.d/ on every node (a quick check on a node is sketched after the listing).

# cat 05-cilium.conflist
{
  "name": "generic-veth",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "cilium-cni"
    }
  ]
}
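
This file takes effect because the container runtime picks the lexicographically first CNI config in /etc/cni/net.d/, so 05-cilium.conflist wins over Flannel's own file (typically 10-flannel.conflist). A quick check on a node, assuming the default file names:

# The cilium-chain config should sort before the Flannel one
ls /etc/cni/net.d/
cat /etc/cni/net.d/05-cilium.conflist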

For reference, the user-supplied values for the Helm install are as follows.

# helm get values cilium
USER-SUPPLIED VALUES:
cni:
  chainingMode: generic-veth
  configMap: cni-configuration
  customConf: true
enableIPv4Masquerade: false
k8sServiceHost: 192.168.1.200
k8sServicePort: 6443
routingMode: native

After the deployment, the Cilium-related Pods come up, and right after they start the CoreDNS Pods are restarted immediately, so that the corresponding eBPF programs under cilium-chain can be created for the existing containers (an agent health check is sketched after the listing).

[root@master cilium]# k get pods -o wide
NAME                               READY   STATUS              RESTARTS   AGE   IP              NODE   
cilium-d4tw8                       0/1     Running             0          25s   192.168.1.200   master 
cilium-operator-6dcf5cdc6f-nkj2g   1/1     Running             0          25s   192.168.1.201   node1  
cilium-operator-6dcf5cdc6f-sjlb9   1/1     Running             0          25s   192.168.1.202   node2  
cilium-tjnz2                       0/1     Running             0          25s   192.168.1.202   node2  
cilium-vbc59                       1/1     Running             0          25s   192.168.1.201   node1  
coredns-74ff55c5b-d7ltz            0/1     ContainerCreating   0          3s    <none>          node1  
coredns-74ff55c5b-gdzqw            1/1     Terminating         0          15d   10.244.1.2      node2  
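
Besides watching the Pods come up, the agent's own health can be queried once it is Ready. A small sketch, assuming the chart's default DaemonSet name cilium and the agent CLI available inside the container:

# Ask one of the agents for a brief health summary
kubectl -n kube-system exec ds/cilium -- cilium status --brief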

Looking at the logs: once cilium-chain is installed, the endpoints, including the containers' veth interfaces, are recreated, and the cilium-agent attaches eBPF programs to them (a way to inspect these endpoints is sketched after the log excerpt).

level=info msg="Create endpoint request" addressing="&{10.244.1.4   fe80::807a:4ff:fe33:2558  }" containerID=829c155460f8e3af06fd3be64fdf1c9f84a69cf73a0d63259906555f1ca3a4a2 datapathConfiguration="&{false true false true true 0xc001127c3a}" interface=veth72f16d18 k8sPodName=kube-system/nm-mp24m labels="[]" subsys=daemon sync-build=true
level=info msg="New endpoint" containerID=829c155460 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=3120 ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=endpoint
level=info msg="Resolving identity labels (blocking)" containerID=829c155460 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=3120 identityLabels="k8s:app=network-multitool,k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system,k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=kube-system" ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=endpoint
level=info msg="Reusing existing global key" key="k8s:app=network-multitool;k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system;k8s:io.cilium.k8s.policy.cluster=default;k8s:io.cilium.k8s.policy.serviceaccount=default;k8s:io.kubernetes.pod.namespace=kube-system;" subsys=allocator
level=info msg="Identity of endpoint changed" containerID=829c155460 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=3120 identity=48596 identityLabels="k8s:app=network-multitool,k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system,k8s:io.cilium.k8s.policy.cluster=default,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=kube-system" ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m oldIdentity="no identity" subsys=endpoint
level=info msg="Waiting for endpoint to be generated" containerID=829c155460 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=3120 identity=48596 ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=endpoint
level=info msg="Compiled new BPF template" BPFCompilationTime=1.084117166s file-path=/var/run/cilium/state/templates/73bcbef6260920cfb00a0de97e364d357cb1b1b36ed82f6c301db7bda0678ce1/bpf_lxc.o subsys=datapath-loader
level=info msg="Rewrote endpoint BPF program" containerID=829c155460 datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=3120 identity=48596 ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=endpoint
level=info msg="Successful endpoint creation" containerID=829c155460 datapathPolicyRevision=1 desiredPolicyRevision=1 endpointID=3120 identity=48596 ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=daemon
level=info msg="Policy Add Request" ciliumNetworkPolicy="[&{EndpointSelector:{\"matchLabels\":{\"any:app\":\"network-multitool\",\"k8s:io.kubernetes.pod.namespace\":\"kube-system\"}} NodeSelector:{} Ingress:[{IngressCommonRule:{FromEndpoints:[{\"matchLabels\":{\"any:app\":\"network-multitool\",\"k8s:io.kubernetes.pod.namespace\":\"kube-system\"}}] FromRequires:[] FromCIDR: FromCIDRSet:[] FromEntities:[] aggregatedSelectors:[]} ToPorts:[{Ports:[{Port:80 Protocol:TCP}] TerminatingTLS:<nil> OriginatingTLS:<nil> ServerNames:[] Listener:<nil> Rules:0xc0001cbce0}] ICMPs:[] Authentication:<nil>}] IngressDeny:[] Egress:[] EgressDeny:[] Labels:[k8s:io.cilium.k8s.policy.derived-from=CiliumNetworkPolicy k8s:io.cilium.k8s.policy.name=rule1 k8s:io.cilium.k8s.policy.namespace=kube-system k8s:io.cilium.k8s.policy.uid=89ffe113-74be-44ec-839e-0bc4aad53a2d] Description:Allow HTTP GET /public from env=prod to app=service}]" policyAddRequest=412f9e59-ae09-4fc2-8a03-1974f30dded4 subsys=daemon

...
...
..
level=info msg="Rewrote endpoint BPF program" containerID=829c155460 datapathPolicyRevision=1 desiredPolicyRevision=2 endpointID=3120 identity=48596 ipv4=10.244.1.4 ipv6="fe80::807a:4ff:fe33:2558" k8sPodName=kube-system/nm-mp24m subsys=endpoint
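
The endpoints mentioned in the log can also be inspected directly from the agent. A sketch, assuming the agent CLI (3120 is the endpoint ID that appears in the log above):

# List all endpoints this agent manages, with their identities and policy state
kubectl -n kube-system exec ds/cilium -- cilium endpoint list
# Inspect a single endpoint in detail
kubectl -n kube-system exec ds/cilium -- cilium endpoint get 3120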

Test cases

Below, L3, L4, and L7 network policies are created respectively. To keep the test simple, note that every label selector here targets app: test.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: l3-test
spec:
  endpointSelector:
    matchLabels:
      app: test
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: test
---              
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: l4-test
spec:
  endpointSelector:
    matchLabels:
      app: test
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: test
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
---        
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: l7-test
spec:
  description: "Allow HTTP GET /"
  endpointSelector:
    matchLabels:
      app: test
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: test
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/"

Once all the policies are in place, create the test Pods. Note that when this image starts it launches an Nginx process and then goes into sleep inf, so the tests can be run by exec-ing into the containers. To cover the cross-node case, a DaemonSet is used so that a container runs on both nodes, and its label is also set to app: test, which means the network policies created above apply to these Pods.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: nm
spec:
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: test
          image: runzhliu/network-multitool:latest
          command: ["/bin/bash", "-c", "nginx && sleep inf"]
          securityContext:
            privileged: true

The tests can then be carried out as follows.

L3 test

(screenshot: result of the L3 policy test)
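
A sketch of how the L3 policy can be exercised, with placeholder Pod name and IP (nm-xxxxx and 10.244.1.4 stand in for whatever kubectl get pods -o wide reports):

# From a Pod labeled app=test, any traffic to the other test Pod is allowed by l3-test
kubectl -n kube-system exec nm-xxxxx -- ping -c 3 10.244.1.4
# The same ping from a Pod without the app=test label should be dropped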

L4 test

(screenshot: result of the L4 policy test)
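
A sketch for the L4 case, with the same placeholders as above. Note that if l3-test is still applied it already allows all ports from app=test, so the port restriction is easiest to observe with only l4-test in place:

# Port 80 from an app=test Pod is allowed by l4-test
kubectl -n kube-system exec nm-xxxxx -- curl -s -m 3 http://10.244.1.4/
# With only l4-test applied, a port that is not listed should be dropped (curl times out)
kubectl -n kube-system exec nm-xxxxx -- curl -s -m 3 http://10.244.1.4:8080/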

L7 test

(screenshot: result of the L7 policy test)
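
A sketch for the L7 case, again assuming only l7-test is applied so the HTTP rule is what decides:

# GET / is allowed
kubectl -n kube-system exec nm-xxxxx -- curl -s -m 3 http://10.244.1.4/
# Any other path should be rejected by the proxy with HTTP 403
kubectl -n kube-system exec nm-xxxxx -- curl -s -m 3 -o /dev/null -w '%{http_code}\n' http://10.244.1.4/denied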

Summary

Cilium's features, such as network policy, are fancy, but in some situations swapping out an old, operationally mature CNI in production for a brand-new one is quite risky. cilium-chain offers a relatively harmless way to bring Cilium's network-policy capabilities to an existing Kubernetes cluster.
