K8S Networking Deep Dive 4: Open-Source Network Components

Two problems the underlying K8S network must solve

  1. Work with Kubernetes to give every Docker container on every node an IP address that does not conflict with any other.
  2. Build an overlay network across these IP addresses, so that packets are delivered unmodified to the target container.
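The first requirement, conflict-free per-node addressing, is usually met by carving one cluster-wide range into per-node subnets. A minimal sketch of that idea, assuming an illustrative 172.17.0.0/16 range and hypothetical node names:

```python
# Sketch: carving non-overlapping per-node container subnets out of one
# cluster-wide range, as an overlay network plugin typically does.
# The 172.17.0.0/16 range and the node names are illustrative assumptions.
import ipaddress

cluster_range = ipaddress.ip_network("172.17.0.0/16")
nodes = ["node1", "node2", "node3"]

# Split the /16 into /24s and hand each node its own subnet.
subnets = cluster_range.subnets(new_prefix=24)
allocation = {node: next(subnets) for node in nodes}

for node, subnet in allocation.items():
    print(node, subnet)

# No two nodes share a subnet, so container IPs can never collide.
assert len(set(allocation.values())) == len(nodes)
```

Because each node draws containers only from its own /24, IP uniqueness is guaranteed without any coordination at container-start time.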

Open vSwitch

Open vSwitch can build several kinds of communication tunnels, for example Open vSwitch with GRE/VXLAN. In a Kubernetes scenario, we mainly build L3-to-L3 tunnels. The network architecture is as follows:

[figure: Open vSwitch GRE tunnel network architecture]

The required steps are:

  1. Delete the docker0 bridge created by the Docker daemon, to avoid docker0 address conflicts.

  2. Manually create a Linux bridge and manually configure its IP address.

  3. Create an Open vSwitch bridge, ovs-bridge, and use ovs-vsctl to add a GRE port to it. When adding the port, set the remote IP to the address of the target node. Repeat this for every peer node.

  4. Add ovs-bridge as a network interface to Docker's bridge (docker0, or the bridge created by hand).

  5. Restart the ovs-bridge and the Docker bridge, and add a route for the Docker subnet pointing at the Docker bridge.
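The steps above can be sketched as a command generator. This is a minimal sketch, not an official tool; the bridge names (br0, gre1…) and peer IPs are illustrative assumptions mirroring the manual steps:

```python
# Sketch: generating the per-node command sequence from steps 3-5 above for a
# set of peer nodes. Names (br0, gre<N>) and the IPs are illustrative.
def ovs_setup_commands(peer_ips, ovs_bridge="br0", docker_bridge="docker0"):
    cmds = [f"ovs-vsctl add-br {ovs_bridge}"]
    # One GRE port per peer node, with remote_ip pointing at that peer.
    for i, peer in enumerate(peer_ips, start=1):
        cmds.append(
            f"ovs-vsctl add-port {ovs_bridge} gre{i} -- "
            f"set interface gre{i} type gre option:remote_ip={peer}"
        )
    # Attach the OVS bridge to the Docker bridge, then bring both up.
    cmds.append(f"brctl addif {docker_bridge} {ovs_bridge}")
    cmds.append(f"ip link set dev {ovs_bridge} up")
    cmds.append(f"ip link set dev {docker_bridge} up")
    return cmds

for cmd in ovs_setup_commands(["192.168.18.128", "192.168.18.131"]):
    print(cmd)
```

Note that the GRE port count grows with the number of peers: a full mesh needs one port per remote node on every node.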

Network communication flow

When an application inside a container accesses the address of another container, the packet is sent via the container's default route to the docker0 bridge. The OVS bridge is attached as a port on docker0, so docker0 hands the packet to the OVS bridge, which forwards it through the GRE tunnel to the peer node.

Configuration steps

Install OVS on both nodes.

Install OVS

yum install openvswitch

Disable SELinux and reboot

# vi /etc/selinux/config
SELINUX=disabled

Check OVS status

systemctl status openvswitch

Create the bridge and GRE tunnel

On each node, create the OVS bridge br0, then create a GRE tunnel on the bridge.

# Create the OVS bridge
ovs-vsctl add-br br0
# Create a GRE tunnel to the peer node; remote_ip is the peer's eth0 IP. When configuring the other machine, change remote_ip to this machine's address.
ovs-vsctl add-port br0 gre1 -- set interface gre1 type gre option:remote_ip=192.168.18.128
# Attach br0 to the local docker0 bridge so container traffic enters the tunnel via OVS
brctl addif docker0 br0
# Bring up the br0 and docker0 bridges
ip link set dev br0 up
ip link set dev docker0 up

The docker0 subnets of the two machines (IPs ending in .128 and .131) are 172.17.43.0/24 and 172.17.42.0/24 respectively. Routes for both subnets must go through the local docker0 bridge; one of the /24 subnets is reached through the OVS GRE tunnel on the peer. Therefore, configure a route through the docker0 bridge on each node:

ip route add 172.17.0.0/16 dev docker0
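A quick sanity check that the single aggregate /16 route above actually covers both nodes' docker0 subnets, using the ranges quoted in the text:

```python
# Sketch: verifying that the one aggregate route added above covers both
# nodes' docker0 subnets (ranges taken from the text).
import ipaddress

aggregate = ipaddress.ip_network("172.17.0.0/16")   # route via docker0
node_subnets = [
    ipaddress.ip_network("172.17.43.0/24"),         # .128 machine
    ipaddress.ip_network("172.17.42.0/24"),         # .131 machine
]

for subnet in node_subnets:
    # subnet_of() confirms every container IP matches the /16 route.
    assert subnet.subnet_of(aggregate)
    print(subnet, "is covered by", aggregate)
```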

Flush the iptables rules installed by Docker as well as the default Linux rules; the latter include a rule that rejects ICMP packets through the firewall.

iptables -t nat -F; iptables -F

Direct routing

Network model

[figure: direct-routing network model]

By default, the docker0 IP is not visible to the node network. By setting routes manually, pods on different nodes can reach each other.

Implementation

This can be implemented by deploying a multilayer switch (MLS).

Assume the docker0 bridge hosting POD1 is on subnet 10.1.10.0 and NODE1's address is 192.168.1.128, while the docker0 subnet hosting POD2 is 10.1.20.0 and NODE2's address is 192.168.1.129.

1. On NODE1, add a static route to the docker0 subnet on NODE2:

route add -net 10.1.20.0 netmask 255.255.255.0 gw 192.168.1.129

2. On NODE2, add a static route to the docker0 subnet on NODE1:

route add -net 10.1.10.0 netmask 255.255.255.0 gw 192.168.1.128

3. Verify connectivity: from NODE1, ping the docker0 network on NODE2.

ping 10.1.20.1
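With direct routing, every node needs one static route per remote node. A small sketch that generates exactly the commands from steps 1 and 2, using the node/subnet pairs stated above:

```python
# Sketch: generating the per-node `route add` commands for direct routing.
# The node-IP -> docker0-subnet map is taken from the example in the text.
nodes = {
    "192.168.1.128": "10.1.10.0",   # NODE1 and its docker0 subnet
    "192.168.1.129": "10.1.20.0",   # NODE2 and its docker0 subnet
}

def routes_for(local_ip, topology):
    # On each node, add one route per remote node; gateway = remote node IP.
    return [
        f"route add -net {subnet} netmask 255.255.255.0 gw {node_ip}"
        for node_ip, subnet in topology.items()
        if node_ip != local_ip
    ]

for cmd in routes_for("192.168.1.128", nodes):
    print(cmd)  # commands to run on NODE1
```

For N nodes this means N-1 routes on every node, which is why the text recommends a routing daemon instead of manual routes at larger scale.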

For large clusters, create the Linux bridge by hand to keep the Docker daemon from creating a docker0 with a conflicting IP range, and then use Docker's --bridge option to point Docker at that bridge.

Then run the quagga routing daemon on every node to learn routes automatically.

Calico

How Calico works

Calico creates and manages a flat Layer 3 network, assigning every workload a fully routable IP address. Workloads can communicate without IP encapsulation or NAT, which gives bare-metal performance, simpler troubleshooting, and better interoperability. In environments that require an overlay network, Calico provides IP-in-IP tunneling, or it can be used together with other overlay networks such as flannel.

Calico also supports dynamic configuration of network security rules. Using Calico's simple policy language, you can achieve fine-grained control over communication between containers, VM workloads, and bare-metal hosts.

Calico v3.4, released on 2018-12-10, integrates well with Kubernetes, OpenShift, and OpenStack.

Note: when using Calico with Mesos, DC/OS, and Docker orchestrators, only Calico v2.6 is currently supported.


Calico's IPIP and BGP modes

  • IPIP mode builds tunnels between the nodes' routes, connecting the two networks. When IPIP mode is enabled, Calico creates a virtual network interface named "tunl0" on each node, as shown in the figure below.
  • BGP mode uses the physical machine itself as the virtual router (vRouter) and creates no extra tunnel.

Calico implements a vRouter in the Linux kernel to handle data forwarding. Using the BGP protocol, it broadcasts each node's routing information across the entire Calico network and automatically installs forwarding rules for the routes to other nodes.

[figure: Calico per-node vRouter with tunl0 interfaces]

In small clusters, Calico's BGP mode interconnects all nodes directly; in large clusters, this is done through an additional BGP route reflector.

[figure: Calico topology with a BGP route reflector]

Main Calico components

Calico uses the Linux kernel's native routing and iptables firewall features. All traffic entering or leaving each container, VM, or physical host traverses these kernel rules before being routed to its destination.

  • Felix: the main Calico agent, running on every machine that hosts endpoints.
  • calicoctl: configures advanced policy and networking from the command line.
  • orchestrator plugins: provide tight integration and synchronization with popular cloud orchestration tools.
  • key/value store: stores Calico's policy configuration and network state; currently etcdv3 or the Kubernetes API.
  • calico/node: runs on each host, reads the relevant policy and network configuration from the key/value store, and implements it in the Linux kernel.
  • Dikastes/Envoy: optional Kubernetes sidecars that secure workload-to-workload communication with mutual TLS authentication and add application-layer policy control.

Felix

Felix is a daemon that runs on every machine providing endpoints; in most cases, that means the host nodes running containers or VMs. Felix programs the routes and ACL rules, plus anything else the host requires, to provide the network connectivity needed by the endpoints on that host.

Depending on the orchestration environment, Felix is responsible for the following tasks:

  • Interface management: Felix programs information about interfaces into the kernel so that the kernel can correctly handle the traffic emitted by each endpoint. In particular, it ensures the host responds to ARP requests from each workload and enables IP forwarding on the interfaces it manages. It also watches for interfaces appearing and disappearing, so that its programming for those interfaces is applied correctly.
  • Route programming: Felix writes routes to the endpoints on its host into the Linux kernel FIB (Forwarding Information Base), ensuring that packets destined for those endpoints are forwarded correctly.
  • ACL programming: Felix also programs ACLs into the Linux kernel. These ACLs ensure that only valid traffic can flow between endpoints and that endpoints cannot bypass Calico's security measures.
  • State reporting: Felix provides data about network health. In particular, it reports errors and problems encountered while configuring its host. This data is written to etcd, making it visible to the other components and operators of the network.

Orchestrator Plugin

Each major cloud orchestration platform has its own Calico network plugin (for example, OpenStack and Kubernetes). These plugins bind Calico more tightly to the orchestrator, letting users manage Calico networks just as they manage the orchestrator's built-in networking.

A good example of an orchestrator plugin is the Calico Neutron ML2 driver. It integrates with Neutron's ML2 plugin so that users can configure Calico networks through Neutron API calls, providing seamless integration with Neutron.

The orchestrator plugin is responsible for the following tasks:

  • API translation: every cloud orchestrator inevitably has its own API specification for managing networks. The plugin's main job is to translate these API calls into Calico's data model and store the result in Calico's datastore. Some of this translation is very simple; other parts may be more complex, turning a single high-level operation (for example, live migration) into the series of simpler operations the Calico network expects.
  • Feedback: where needed, the orchestrator plugin feeds status information from the Calico network back to the orchestrator, for example reporting whether Felix is alive and marking certain endpoints as failed when their network configuration fails.

etcd

etcd is a distributed key/value store focused on data consistency. Calico uses etcd for communication between its components and as a consistent datastore, ensuring that Calico can always construct an accurate view of the network.

Depending on the orchestrator plugin, etcd can act either as the primary datastore or as a lightweight mirror of a separate datastore. For example, in an OpenStack deployment the OpenStack database is considered the "source of truth", and etcd mirrors its network configuration for the other Calico components.

etcd is spread throughout the deployment and can be divided into two groups of hosts: the core cluster and the proxies.

For small deployments, the core cluster can be a single-node etcd cluster (usually co-located with the orchestrator plugin). This model is simple but provides no redundancy for etcd. If etcd fails, the orchestrator plugin must rebuild the database; with OpenStack, for example, the plugin must resync state from the OpenStack database into etcd.

In larger deployments, the core cluster can be scaled out following the etcd administration guide.

In addition, every machine running Felix or an orchestrator plugin runs an etcd proxy. This reduces the load on the etcd core cluster and hides the details of the etcd cluster from the host. If an etcd cluster member runs on the same machine as an orchestrator plugin, the etcd proxy on that machine can be omitted.

etcd is responsible for the following tasks:

  • Data storage: etcd stores the Calico network's data in a distributed, consistent, fault-tolerant way (for cluster sizes of at least three etcd nodes). This keeps the Calico network in a known-good state while tolerating the failure or unreachability of individual etcd machines, and the distributed storage improves the components' ability to read from the database.
  • Communication: etcd is also used as the communication service between components. Non-etcd components watch certain keys in the keyspace, so they see any changes as soon as they are made and can respond promptly. This allows state to be committed to the database and then trigger further network configuration based on that state.

BGP Client (BIRD)

Calico deploys a BGP client on every node that runs Felix. The BGP client's role is to read the routing information that Felix programs into the kernel and distribute it around the data center.

The BGP client is responsible for the following task:

  • Route distribution: when Felix inserts routes into the Linux kernel FIB, the BGP client picks them up and distributes them to the other nodes in the cluster.

BGP Route Reflector (BIRD)

For larger deployments, plain BGP can become a limiting factor, because it requires every BGP client to connect to every other BGP client in a mesh topology. The number of connections keeps growing, quickly becomes hard to maintain, and can even overflow the routing tables of some devices.

Calico therefore recommends deploying a BGP route reflector for larger deployments. As on the Internet, such a component acts as a central point for BGP clients to connect to, so they do not need to talk to every other BGP client in the cluster. Multiple route reflectors can be deployed for redundancy. The route reflectors only help manage the BGP network; no endpoint data passes through them.
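The scaling argument above can be put in numbers. A minimal sketch (node and reflector counts are illustrative) comparing full-mesh BGP sessions with reflector-based sessions:

```python
# Sketch: why a full BGP mesh stops scaling. With n nodes, a mesh needs
# n*(n-1)/2 sessions, while route reflectors need roughly one session per
# node per reflector. Node counts here are illustrative.
def mesh_sessions(n):
    return n * (n - 1) // 2

def reflector_sessions(n, reflectors=2):
    # Each client peers with every reflector instead of with each other.
    return n * reflectors

for n in (10, 100, 1000):
    print(n, "nodes:", mesh_sessions(n), "mesh sessions vs",
          reflector_sessions(n), "reflector sessions")
```

At 1000 nodes the mesh needs 499,500 sessions, while two reflectors need only 2000, which is the gap the route reflector design exists to close.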

In Calico, this BGP component is also most commonly BIRD, configured to run as a route reflector rather than as a standard BGP client.

The BGP route reflector is responsible for the following task:

  • Centralized route distribution: when Calico BGP clients advertise routes from their FIBs to the route reflector, the route reflector advertises those routes to the other nodes in the deployment.

What is BIRD?

BIRD began as a school project at the Faculty of Mathematics and Physics of Charles University in Prague; the name is an acronym for BIRD Internet Routing Daemon. It is now developed and supported by CZ.NIC Labs.

The BIRD project aims to develop a fully functional dynamic IP routing daemon, primarily (but not exclusively) for Linux, FreeBSD, and other UNIX-like systems, distributed under the GNU General Public License. For details, see the official site: https://bird.network.cz/

As an open-source network routing daemon project, BIRD was designed to support the following features:

  • both IPv4 and IPv6 protocols
  • multiple routing tables
  • the Border Gateway Protocol (BGPv4)
  • the Routing Information Protocol (RIPv2, RIPng)
  • the Open Shortest Path First protocol (OSPFv2, OSPFv3)
  • the Babel Routing Protocol
  • the Router Advertisements for IPv6 hosts
  • a virtual protocol for exchange of routes between different routing tables on a single host
  • a command-line interface allowing on-line control and inspection of status of the daemon
  • soft reconfiguration (no need to use complex online commands to change the configuration, just edit the configuration file and notify BIRD to re-read it and it will smoothly switch itself to the new configuration, not disturbing routing protocols unless they are affected by the configuration changes)
  • a powerful language for route filtering

Deploying Calico in K8S

  1. Modify the kube-apiserver startup parameter:

    --allow-privileged=true (Calico requires privileged mode)
    
  2. Modify the kubelet startup parameter: --network-plugin=cni

Assume the K8S environment contains two nodes: node1 (192.168.18.3) and node2 (192.168.18.4).

Create the Calico services, mainly calico-node and the calico policy controller. The required K8S resource objects are:

  • configmap: calico-config, holding Calico's configuration parameters
  • secret: calico-etcd-secrets, used for TLS connections to etcd
  • the calico/node container, deployed on every node as a DaemonSet
  • the Calico CNI binary and network configuration, installed on every node (done by the install-cni container)
  • a Deployment named calico/kube-policy-controller, which applies network policy to pods in the K8S cluster
The official Calico installation YAML for Kubernetes is shown below.

calico-etcd.yaml

---
# Source: calico/templates/calico-etcd-secrets.yaml
# The following contains k8s Secrets for use with a TLS enabled etcd cluster.
# For information on populating Secrets, see http://kubernetes.io/docs/user-guide/secrets/
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: calico-etcd-secrets
  namespace: kube-system
data:
  # Populate the following with etcd TLS configuration if desired, but leave blank if
  # not using TLS for etcd.
  # The keys below should be uncommented and the values populated with the base64
  # encoded contents of each file that would be associated with the TLS data.
  # Example command for encoding a file contents: cat <file> | base64 -w 0
  # etcd-key: null
  # etcd-cert: null
  # etcd-ca: null
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Configure this with the location of your etcd cluster.
  # The etcd service address
  etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"
  # If you're using TLS enabled etcd uncomment the following.
  # You must also populate the Secret below with these files.
  etcd_ca: ""   # "/calico-secrets/etcd-ca"
  etcd_cert: "" # "/calico-secrets/etcd-cert"
  etcd_key: ""  # "/calico-secrets/etcd-key"
  # Typha is disabled.
  typha_service_name: "none"
  # Configure the backend to use.
  calico_backend: "bird"

  # Configure the MTU to use
  veth_mtu: "1440"

  # The CNI network configuration to install on each node.  The special
  # values in this config will be automatically populated.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "info",
          "etcd_endpoints": "__ETCD_ENDPOINTS__",
          "etcd_key_file": "__ETCD_KEY_FILE__",
          "etcd_cert_file": "__ETCD_CERT_FILE__",
          "etcd_ca_cert_file": "__ETCD_CA_CERT_FILE__",
          "mtu": __CNI_MTU__,
          "ipam": {
              "type": "calico-ipam"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "__KUBECONFIG_FILEPATH__"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        }
      ]
    }

---
# Source: calico/templates/rbac.yaml

# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
rules:
  # Pods are monitored for changing labels.
  # The node controller monitors Kubernetes nodes.
  # Namespace and serviceaccount labels are used for policy.
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
      - serviceaccounts
    verbs:
      - watch
      - list
  # Watch for changes to Kubernetes NetworkPolicies.
  - apiGroups: ["networking.k8s.io"]
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-kube-controllers
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-kube-controllers
subjects:
- kind: ServiceAccount
  name: calico-kube-controllers
  namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: calico-node
rules:
  # The CNI plugin needs to get pods, nodes, and namespaces.
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - endpoints
      - services
    verbs:
      # Used to discover service IPs for advertisement.
      - watch
      - list
  - apiGroups: [""]
    resources:
      - nodes/status
    verbs:
      # Needed for clearing NodeNetworkUnavailable flag.
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- kind: ServiceAccount
  name: calico-node
  namespace: kube-system

---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        # This, along with the CriticalAddonsOnly toleration below,
        # marks the pod as a critical add-on, ensuring it gets
        # priority scheduling and that its resources are reserved
        # if it ever gets evicted.
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      hostNetwork: true
      tolerations:
        # Make sure calico-node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      priorityClassName: system-node-critical
      initContainers:
        # This container installs the CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: calico/cni:v3.8.0
          command: ["/install-cni.sh"]
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
            # The location of the etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # CNI MTU Config variable
            - name: CNI_MTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Prevents the container from sleeping forever.
            - name: SLEEP
              value: "false"
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
            - mountPath: /calico-secrets
              name: etcd-certs
        # Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
        # to communicate with Felix over the Policy Sync API.
        - name: flexvol-driver
          image: calico/pod2daemon-flexvol:v3.8.0
          volumeMounts:
          - name: flexvol-driver-host
            mountPath: /host/driver
      containers:
        # Runs calico-node container on each Kubernetes node.  This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: calico/node:v3.8.0
          env:
            # The location of the etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Set noderef for node controller.
            - name: CALICO_K8S_NODE_REF
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            - name: CALICO_IPV4POOL_CIDR
              value: "192.168.0.0/16"
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # Disable IPv6 on Kubernetes.
            - name: FELIX_IPV6SUPPORT
              value: "false"
            # Set Felix logging to "info"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9099
              host: localhost
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
              - /bin/calico-node
              - -bird-ready
              - -felix-ready
            periodSeconds: 10
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - mountPath: /calico-secrets
              name: etcd-certs
            - name: policysync
              mountPath: /var/run/nodeagent
      volumes:
        # Used by calico-node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400
        # Used to create per-pod Unix Domain Sockets
        - name: policysync
          hostPath:
            type: DirectoryOrCreate
            path: /var/run/nodeagent
        # Used to install Flex Volume Driver
        - name: flexvol-driver-host
          hostPath:
            type: DirectoryOrCreate
            path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-node
  namespace: kube-system

---
# Source: calico/templates/calico-kube-controllers.yaml

# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  selector:
    matchLabels:
      k8s-app: calico-kube-controllers
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      priorityClassName: system-cluster-critical
      # The controllers must run in the host network namespace so that
      # it isn't governed by policy that would prevent it from working.
      hostNetwork: true
      containers:
        - name: calico-kube-controllers
          image: calico/kube-controllers:v3.8.0
          env:
            # The location of the etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: policy,namespace,serviceaccount,workloadendpoint,node
          volumeMounts:
            # Mount in the etcd TLS secrets.
            - mountPath: /calico-secrets
              name: etcd-certs
          readinessProbe:
            exec:
              command:
              - /usr/bin/check-status
              - -r
      volumes:
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-kube-controllers
  namespace: kube-system
---
# Source: calico/templates/calico-typha.yaml

---
# Source: calico/templates/configure-canal.yaml

---
# Source: calico/templates/kdd-crds.yaml


kubectl apply -f calico-etcd.yaml

Remember to adjust the parameters (such as etcd_endpoints) before applying.
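The edit this note refers to is filling in the etcd_endpoints placeholder in calico-etcd.yaml. A minimal sketch; the endpoint address and port are illustrative assumptions for this cluster:

```python
# Sketch: filling in the etcd_endpoints placeholder from calico-etcd.yaml
# before applying it. The concrete address/port are illustrative.
manifest = 'etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"'

etcd_ip, etcd_port = "192.168.18.3", "2379"
patched = manifest.replace("<ETCD_IP>", etcd_ip).replace("<ETCD_PORT>", etcd_port)

print(patched)
```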

More Calico settings are described in the official documentation.

CoreDNS

Enabling CoreDNS requires adding two parameters to the kubelet:

--cluster-dns=169.169.0.100 (the cluster IP of the DNS service)

--cluster-domain=cluster.local (the domain name configured for the DNS service)
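These two flags must line up with the kube-dns Service defined later in the manifest: --cluster-dns has to equal the Service's clusterIP. A small consistency sketch using the values from this article:

```python
# Sketch: the kubelet DNS flags must match the kube-dns Service clusterIP.
# Values are the ones used in this article.
kubelet_flags = {
    "--cluster-dns": "169.169.0.100",
    "--cluster-domain": "cluster.local",
}
service_cluster_ip = "169.169.0.100"   # spec.clusterIP of the kube-dns Service

# A mismatch here leaves pods with a resolv.conf pointing at nothing.
assert kubelet_flags["--cluster-dns"] == service_cluster_ip
print("kubelet DNS flags:",
      " ".join(f"{k}={v}" for k, v in kubelet_flags.items()))
```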

The CoreDNS YAML to deploy:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }FEDERATIONS
        prometheus :9153
        forward . UPSTREAMNAMESERVER
        cache 30
        loop
        reload
        loadbalance
    }STUBDOMAINS
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        beta.kubernetes.io/os: linux
      containers:
      - name: coredns
        image: coredns/coredns:1.5.0
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
    # This must match the kubelet --cluster-dns parameter
  clusterIP: 169.169.0.100
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
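The Corefile in the ConfigMap above still contains deployment-template placeholders (CLUSTER_DOMAIN, REVERSE_CIDRS, UPSTREAMNAMESERVER, FEDERATIONS, STUBDOMAINS) that must be filled in before use. A minimal sketch of that substitution; the concrete values are illustrative assumptions for this article's setup:

```python
# Sketch: filling in the Corefile template placeholders before deploying.
# Values (cluster.local, /etc/resolv.conf, ...) are illustrative.
template = """.:53 {
    kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {
      pods insecure
      fallthrough in-addr.arpa ip6.arpa
    }FEDERATIONS
    forward . UPSTREAMNAMESERVER
}STUBDOMAINS"""

corefile = (template
            .replace("CLUSTER_DOMAIN", "cluster.local")
            .replace("REVERSE_CIDRS", "in-addr.arpa ip6.arpa")
            .replace("UPSTREAMNAMESERVER", "/etc/resolv.conf")
            # No federations or stub domains in this setup: drop the markers.
            .replace("FEDERATIONS", "")
            .replace("STUBDOMAINS", ""))

print(corefile)
```

CLUSTER_DOMAIN here must match the kubelet --cluster-domain flag described earlier.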

To see the configuration currently in use in an existing cluster, run:

kubectl -n kube-system get configmap coredns -o yaml
