Two problems the underlying Kubernetes network must solve
- Help Kubernetes assign IP addresses to the Docker containers on every node so that no two containers conflict.
- Build an overlay network on top of these IP addresses, through which packets are delivered unmodified to the target container.
Open vSwitch
Open vSwitch can build several kinds of communication tunnels, for example Open vSwitch with GRE/VXLAN. In the Kubernetes scenario we mainly build L3-to-L3 tunnels. The network architecture is as follows.
The steps to complete are:
- Delete the docker0 bridge created by the Docker daemon to avoid docker0 address conflicts.
- Manually create a Linux bridge and configure its IP address by hand.
- Create an Open vSwitch bridge (ovs-bridge) and use ovs-vsctl to add a GRE port to it. When adding the port, set the remote IP to the IP address of the peer node; do this once for every peer node.
- Attach ovs-bridge as a network interface to Docker's bridge (docker0 or the manually created bridge).
- Restart the ovs-bridge and the Docker bridge, and add a route for the Docker subnet pointing at the Docker bridge.
Network communication flow
When an application inside a container accesses the address of another container, the packet is sent to the docker0 bridge via the container's default route. Because the OVS bridge is attached as a port on docker0, docker0 hands the packet to the OVS bridge, and OVS delivers it to the peer node through the GRE tunnel.
Configuration steps
Install OVS on both nodes
yum install openvswitch
Disable SELinux and reboot
# vi /etc/selinux/config
SELINUX=disabled
Check the OVS status
systemctl status openvswitch
Create the bridge and the GRE tunnel
On each node, create the OVS bridge br0, then create a GRE tunnel on the bridge.
# Create the OVS bridge
ovs-vsctl add-br br0
# Create the GRE tunnel to the peer; remote_ip is the IP of the peer's eth0. Note that when configuring the peer side on the other machine, change the IP to this machine's IP.
ovs-vsctl add-port br0 gre1 -- set interface gre1 type gre option:remote_ip=192.168.18.128
# Attach br0 to the local docker0 bridge so that container traffic enters the tunnel via OVS
brctl addif docker0 br0
# Bring up the br0 and docker0 bridges
ip link set dev br0 up
ip link set dev docker0 up
Because the docker0 subnets of the two machines (host IPs ending in 128 and 131) are 172.17.43.0/24 and 172.17.42.0/24 respectively, routes to both subnets must pass through the local docker0 bridge, and one of the /24 subnets is reached over the OVS GRE tunnel to the peer. Therefore, a routing rule through the docker0 bridge must be configured on each node:
ip route add 172.17.0.0/16 dev docker0
Flush the iptables rules installed by Docker as well as the default Linux rules; the latter contain a rule that rejects ICMP packets at the firewall.
iptables -t nat -F; iptables -F
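To verify the setup, a minimal sketch (172.17.43.2 is a hypothetical container IP on the peer's docker0 subnet; adjust to your environment):
# Confirm br0 and its GRE port exist
ovs-vsctl show
# Confirm the routes for the Docker subnets point at docker0
ip route | grep 172.17
# From one node, ping a container running on the other node
ping -c 3 172.17.43.2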
Direct routing
Network model
By default, the docker0 IP addresses are not visible to the node network. By configuring routes manually, Pods on different nodes can reach each other.
Implementation
One way is to deploy a multilayer switch (MLS); another is to add static routes on each node, as below.
Assume the docker0 bridge on which POD1 sits uses the 10.1.10.0 subnet and NODE1's address is 192.168.1.128, while the docker0 on which POD2 sits uses the 10.1.20.0 subnet and NODE2's address is 192.168.1.129.
1. On NODE1, add a static route to the docker0 subnet on NODE2:
route add -net 10.1.20.0 netmask 255.255.255.0 gw 192.168.1.129
2. On NODE2, add a static route to the docker0 subnet on NODE1:
route add -net 10.1.10.0 netmask 255.255.255.0 gw 192.168.1.128
3. Verify connectivity: from NODE1, ping the docker0 network on NODE2:
ping 10.1.20.1
For large clusters, the approach is: manually create a Linux bridge to avoid the IP range conflicts caused by the docker0 bridge that the Docker daemon creates, then use Docker's --bridge option to point Docker at that bridge.
Then run the quagga route-learning software on each node so that the routes no longer have to be maintained by hand.
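If dynamic routing is not available, the static-route approach can at least be scripted; a hedged sketch, where the subnet-to-node mapping is an assumption that simply reuses the two-node example above:
# Run on every node: add a route for every other node's docker0 subnet (requires bash 4+).
declare -A subnets=( [192.168.1.128]=10.1.10.0/24 [192.168.1.129]=10.1.20.0/24 )
self=$(hostname -I | awk '{print $1}')
for node in "${!subnets[@]}"; do
  [ "$node" = "$self" ] && continue
  ip route add "${subnets[$node]}" via "$node"
done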
Calico
How Calico works
Calico creates and manages a flat layer-3 network and assigns every workload a fully routable IP address. Workloads can communicate without IP encapsulation or network address translation, which gives bare-metal performance, simplifies troubleshooting, and improves interoperability. In environments that require an overlay network, Calico offers IP-in-IP tunneling, and it can also be combined with other overlay networks such as flannel.
Calico also provides dynamic configuration of network security rules. With Calico's simple policy language, you get fine-grained control over communication between containers, virtual machine workloads, and bare-metal hosts.
Calico v3.4 was released on 2018-12-10 and integrates well with Kubernetes, OpenShift, and OpenStack.
Note: when using Calico with the Mesos, DC/OS, and Docker orchestrators, only Calico v2.6 is currently supported.
Calico's IPIP and BGP modes
- IPIP mode builds a tunnel between the nodes' routes and connects the two networks over it. When IPIP mode is enabled, Calico creates a virtual network interface named "tunl0" on every node, as shown in the figure below.
- BGP mode uses the physical machine itself as the virtual router (vRouter) and creates no additional tunnel.
Calico implements a vRouter in the Linux kernel to forward traffic. Via the BGP protocol it broadcasts each node's routing information across the whole Calico network and automatically installs the forwarding rules needed to reach other nodes.
In small clusters, Calico's BGP mode connects all nodes directly in a full mesh; in larger clusters this is done through an additional BGP route reflector.
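The mode is controlled per IP pool. A hedged sketch using the Calico v3 API (the pool name and CIDR are assumptions; apply with calicoctl apply -f):
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Always      # "Never" for pure BGP, "CrossSubnet" to tunnel only between subnets
  natOutgoing: true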
Main Calico components
Calico relies on the Linux kernel's native routing and iptables firewall features. All traffic to and from each container, virtual machine, and physical host traverses these kernel rules before being routed to its destination.
- Felix: the main Calico agent; it runs on every machine that hosts endpoint resources and manages them.
- calicoctl: lets advanced policy and networking be configured from the command line.
- orchestrator plugins: provide tight integration and synchronization with the popular cloud orchestration tools.
- key/value store: holds Calico's policy configuration and network state; currently etcd v3 or the Kubernetes API is used.
- calico/node: runs on every host, reads the relevant policy and network configuration from the key/value store, and implements it in the Linux kernel.
- Dikastes/Envoy: optional Kubernetes sidecars that secure workload-to-workload communication with mutual TLS authentication and add application-layer policy controls.
Felix
Felix is a daemon that runs on every machine that provides endpoint resources. In most cases that means it runs on the host nodes that host the containers or VMs. Felix is responsible for programming the routes and ACL rules, and anything else the host requires, to provide the connectivity the endpoints on that host need.
Depending on the orchestration environment, Felix is responsible for the following tasks:
- Interface management. Felix programs some information about interfaces into the kernel so that the kernel handles the traffic emitted by each endpoint correctly. In particular, it ensures that the host answers ARP requests from every workload and enables IP forwarding on the interfaces it manages. It also watches interfaces appear and disappear so that the programming for them is applied at the right time.
- Route programming. Felix programs routes to the endpoints on its host into the Linux kernel FIB (Forwarding Information Base). This ensures packets destined for endpoints on that host are forwarded correctly.
- ACL programming. Felix also programs ACLs into the Linux kernel. These ACLs make sure that only valid traffic can flow between endpoints and that endpoints cannot bypass Calico's security measures.
- State reporting. Felix provides data about the health of the network. In particular, it reports errors and problems that occur while configuring its host. This data is written to etcd, where it is visible to the other components and to operators of the network.
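What Felix programs can be inspected directly on a node; a minimal sketch (the "cali" prefixes are how Calico names its workload interfaces and iptables chains):
# Routes to local workloads point at the per-pod cali* veth interfaces
ip route | grep cali
# Calico's iptables chains are prefixed with "cali-"
iptables-save | grep '^:cali-' | head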
Orchestrator Plugin
Each major cloud orchestration platform has its own Calico network plugin (for example OpenStack and Kubernetes). The purpose of these plugins is to bind Calico more tightly into the orchestrator, letting users manage the Calico network just as they manage the networking built into the orchestrator.
A good example of an orchestrator plugin is the Calico Neutron ML2 driver. It integrates with Neutron's ML2 plugin and lets users configure the Calico network through Neutron API calls, giving seamless integration with Neutron.
The orchestrator plugin is responsible for the following tasks:
- API translation. Every cloud orchestrator inevitably has its own API for managing networks. The orchestrator plugin's main job is to translate those API calls into Calico's data model and store the result in Calico's datastore. Some of this translation is trivial; other parts are more involved, turning a single complex operation (for example, live migration) into the series of simpler operations the Calico network expects.
- Feedback. Where needed, the orchestrator plugin feeds information from the Calico network back to the orchestrator, for example reporting Felix liveness and marking certain endpoints as failed when network setup fails.
etcd
etcd is a distributed key/value store that focuses on data consistency. Calico uses etcd for communication between its components and as a consistent datastore, which ensures Calico can always build an accurate view of the network.
Depending on the orchestrator plugin, etcd is either the primary datastore or a lightweight mirror of a separate datastore. For example, in an OpenStack deployment, the OpenStack database is considered the source of truth, and etcd mirrors the network configuration from it and serves the other Calico components.
The etcd component is spread throughout the deployment. It can be divided into two groups of hosts: the core cluster and the proxies.
For small deployments, the core cluster can be a single-node etcd cluster (usually co-located with the orchestrator plugin). This model is simple but offers no redundancy for etcd; if etcd fails, the orchestrator plugin has to rebuild the database, e.g. for OpenStack the plugin resynchronizes the state from the OpenStack database into etcd.
In larger deployments, the core cluster is scaled according to the etcd administration guide.
In addition, an etcd proxy runs on every machine that runs Felix or an orchestrator plugin. This reduces the load on the etcd core cluster and hides the details of the etcd cluster from the host nodes. Where a machine hosts both an etcd cluster member and the orchestrator plugin, the etcd proxy on that machine can be dropped.
etcd is responsible for the following tasks:
- Data storage. etcd stores the Calico network's data in a distributed, consistent, and fault-tolerant way (for cluster sizes of at least three etcd nodes). This keeps the Calico network in a known-good state while tolerating the failure or unreachability of individual machines running etcd. This distributed storage of Calico network data also improves the components' ability to read from the database.
- Communication. etcd also serves as the communication channel between components. Non-etcd components watch certain points in the key/value space so that they see any changes that are made and can respond to them promptly. This allows state to be committed to the database and then trigger further network configuration based on that state.
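When the etcdv3 datastore is used, that keyspace can be inspected with etcdctl; a hedged sketch (the /calico/ prefix is the conventional one, and the --endpoints/TLS flags are omitted here):
export ETCDCTL_API=3
# List the keys Calico has written
etcdctl get /calico/ --prefix --keys-only | head
# Watch for changes, which is how components are notified of updates
etcdctl watch /calico/ --prefix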
BGP Client (BIRD)
Calico deploys a BGP client on every node that also runs Felix. The role of the BGP client is to read the routing information that Felix programs into the kernel and distribute it around the data center.
The BGP client is responsible for the following task:
- Route distribution. When Felix inserts routes into the Linux kernel FIB, the BGP client picks them up and distributes them to the other nodes in the cluster.
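The resulting BGP sessions can be checked on any node; a hedged example (assumes calicoctl is installed on that node):
# Shows each BGP peer (node-to-node mesh or route reflector) and whether the session is Established
calicoctl node status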
BGP Route Reflector (BIRD)
For larger deployments, plain BGP can become a limiting factor because it requires every BGP client to connect to every other BGP client in a mesh topology. The number of connections keeps growing, quickly becomes hard to maintain, and can even fill up the routing tables of some devices.
Therefore, for larger deployments Calico recommends deploying a BGP route reflector. On the Internet such a component is commonly used as the central point that BGP clients connect to, so that they do not need to talk to every other BGP client in the cluster. Multiple BGP route reflectors can be deployed for redundancy. Route reflectors only help manage the BGP network; no endpoint data flows through them.
In Calico this BGP component is also most commonly BIRD, configured to run as a route reflector rather than as a standard BGP client.
The BGP route reflector is responsible for the following task:
- Centralized route distribution. When a Calico BGP client advertises routes from its FIB to the route reflector, the route reflector advertises those routes on to the other nodes in the cluster.
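A hedged sketch of pointing a cluster at a route reflector with the Calico v3 API (the reflector address 192.168.18.10 and AS number 64512 are placeholders; apply with calicoctl apply -f):
# Turn off the full node-to-node mesh
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
  asNumber: 64512
---
# Peer every node with the route reflector instead
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-to-rr
spec:
  peerIP: 192.168.18.10
  asNumber: 64512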
What is BIRD
BIRD began as a school project at the Faculty of Mathematics and Physics of Charles University in Prague; the name is an acronym of BIRD Internet Routing Daemon. It is currently developed and supported by CZ.NIC Labs.
The BIRD project aims to develop a fully functional dynamic IP routing daemon, primarily (but not exclusively) targeting Linux, FreeBSD, and other UNIX-like systems, distributed under the GNU General Public License. See the official site https://bird.network.cz/ for details.
As an open-source routing daemon, BIRD is designed to support the following features:
- both IPv4 and IPv6 protocols
- multiple routing tables
- the Border Gateway Protocol (BGPv4)
- the Routing Information Protocol (RIPv2, RIPng)
- the Open Shortest Path First protocol (OSPFv2, OSPFv3)
- the Babel Routing Protocol
- the Router Advertisements for IPv6 hosts
- a virtual protocol for exchange of routes between different routing tables on a single host
- a command-line interface allowing on-line control and inspection of status of the daemon
- soft reconfiguration (no need to use complex online commands to change the configuration, just edit the configuration file and notify BIRD to re-read it and it will smoothly switch itself to the new configuration, not disturbing routing protocols unless they are affected by the configuration changes)
- a powerful language for route filtering
Deploying Calico in Kubernetes
- Modify the kube-apiserver startup parameter: --allow-privileged=true (Calico needs privileged mode).
- Modify the kubelet startup parameter: --network-plugin=cni
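A hedged sketch of where these flags end up (the exact unit files are assumptions; adjust to however your cluster services are managed):
# kube-apiserver startup arguments (excerpt)
--allow-privileged=true
# kubelet startup arguments (excerpt; the CNI directories shown are the conventional defaults)
--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin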
Assume the Kubernetes environment contains two nodes: node1 (192.168.18.3) and node2 (192.168.18.4).
Create the Calico services, mainly calico-node and the calico policy controller. The Kubernetes resource objects required are:
- ConfigMap calico-config, holding Calico's configuration parameters
- Secret calico-etcd-secrets, used for TLS connections to etcd
- The calico/node container, deployed on every node as a DaemonSet
- The Calico CNI binaries and network configuration, installed on every node (done by the install-cni container)
- A Deployment named calico/kube-policy-controller, which implements network policies for the Pods in the cluster
The official Calico installation YAML for Kubernetes follows.
calico-etcd.yaml
---
# Source: calico/templates/calico-etcd-secrets.yaml
# The following contains k8s Secrets for use with a TLS enabled etcd cluster.
# For information on populating Secrets, see http://kubernetes.io/docs/user-guide/secrets/
apiVersion: v1
kind: Secret
type: Opaque
metadata:
name: calico-etcd-secrets
namespace: kube-system
data:
# Populate the following with etcd TLS configuration if desired, but leave blank if
# not using TLS for etcd.
# The keys below should be uncommented and the values populated with the base64
# encoded contents of each file that would be associated with the TLS data.
# Example command for encoding a file contents: cat <file> | base64 -w 0
# etcd-key: null
# etcd-cert: null
# etcd-ca: null
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
name: calico-config
namespace: kube-system
data:
# Configure this with the location of your etcd cluster.
# etcd service address
etcd_endpoints: "http://<ETCD_IP>:<ETCD_PORT>"
# If you're using TLS enabled etcd uncomment the following.
# You must also populate the Secret below with these files.
etcd_ca: "" # "/calico-secrets/etcd-ca"
etcd_cert: "" # "/calico-secrets/etcd-cert"
etcd_key: "" # "/calico-secrets/etcd-key"
# Typha is disabled.
typha_service_name: "none"
# Configure the backend to use.
calico_backend: "bird"
# Configure the MTU to use
veth_mtu: "1440"
# The CNI network configuration to install on each node. The special
# values in this config will be automatically populated.
cni_network_config: |-
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"etcd_endpoints": "__ETCD_ENDPOINTS__",
"etcd_key_file": "__ETCD_KEY_FILE__",
"etcd_cert_file": "__ETCD_CERT_FILE__",
"etcd_ca_cert_file": "__ETCD_CA_CERT_FILE__",
"mtu": __CNI_MTU__,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "__KUBECONFIG_FILEPATH__"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
}
]
}
---
# Source: calico/templates/rbac.yaml
# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
rules:
# Pods are monitored for changing labels.
# The node controller monitors Kubernetes nodes.
# Namespace and serviceaccount labels are used for policy.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
- serviceaccounts
verbs:
- watch
- list
# Watch for changes to Kubernetes NetworkPolicies.
- apiGroups: ["networking.k8s.io"]
resources:
- networkpolicies
verbs:
- watch
- list
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-kube-controllers
subjects:
- kind: ServiceAccount
name: calico-kube-controllers
namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-node
rules:
# The CNI plugin needs to get pods, nodes, and namespaces.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
verbs:
- get
- apiGroups: [""]
resources:
- endpoints
- services
verbs:
# Used to discover service IPs for advertisement.
- watch
- list
- apiGroups: [""]
resources:
- nodes/status
verbs:
# Needed for clearing NodeNetworkUnavailable flag.
- patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: calico-node
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-node
subjects:
- kind: ServiceAccount
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
annotations:
# This, along with the CriticalAddonsOnly toleration below,
# marks the pod as a critical add-on, ensuring it gets
# priority scheduling and that its resources are reserved
# if it ever gets evicted.
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
priorityClassName: system-node-critical
initContainers:
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: calico/cni:v3.8.0
command: ["/install-cni.sh"]
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
- mountPath: /calico-secrets
name: etcd-certs
# Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
# to communicate with Felix over the Policy Sync API.
- name: flexvol-driver
image: calico/pod2daemon-flexvol:v3.8.0
volumeMounts:
- name: flexvol-driver-host
mountPath: /host/driver
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: calico/node:v3.8.0
env:
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# Location of the CA certificate for etcd.
- name: ETCD_CA_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_ca
# Location of the client key for etcd.
- name: ETCD_KEY_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_key
# Location of the client certificate for etcd.
- name: ETCD_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_cert
# Set noderef for node controller.
- name: CALICO_K8S_NODE_REF
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
value: "192.168.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
# Set Felix logging to "info"
- name: FELIX_LOGSEVERITYSCREEN
value: "info"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
livenessProbe:
httpGet:
path: /liveness
port: 9099
host: localhost
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
readinessProbe:
exec:
command:
- /bin/calico-node
- -bird-ready
- -felix-ready
periodSeconds: 10
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
- mountPath: /calico-secrets
name: etcd-certs
- name: policysync
mountPath: /var/run/nodeagent
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Mount in the etcd TLS secrets with mode 400.
# See https://kubernetes.io/docs/concepts/configuration/secret/
- name: etcd-certs
secret:
secretName: calico-etcd-secrets
defaultMode: 0400
# Used to create per-pod Unix Domain Sockets
- name: policysync
hostPath:
type: DirectoryOrCreate
path: /var/run/nodeagent
# Used to install Flex Volume Driver
- name: flexvol-driver-host
hostPath:
type: DirectoryOrCreate
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
# The controllers can only have a single active instance.
replicas: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
strategy:
type: Recreate
template:
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
beta.kubernetes.io/os: linux
tolerations:
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: calico-kube-controllers
priorityClassName: system-cluster-critical
# The controllers must run in the host network namespace so that
# it isn't governed by policy that would prevent it from working.
hostNetwork: true
containers:
- name: calico-kube-controllers
image: calico/kube-controllers:v3.8.0
env:
# The location of the etcd cluster.
- name: ETCD_ENDPOINTS
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_endpoints
# Location of the CA certificate for etcd.
- name: ETCD_CA_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_ca
# Location of the client key for etcd.
- name: ETCD_KEY_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_key
# Location of the client certificate for etcd.
- name: ETCD_CERT_FILE
valueFrom:
configMapKeyRef:
name: calico-config
key: etcd_cert
# Choose which controllers to run.
- name: ENABLED_CONTROLLERS
value: policy,namespace,serviceaccount,workloadendpoint,node
volumeMounts:
# Mount in the etcd TLS secrets.
- mountPath: /calico-secrets
name: etcd-certs
readinessProbe:
exec:
command:
- /usr/bin/check-status
- -r
volumes:
# Mount in the etcd TLS secrets with mode 400.
# See https://kubernetes.io/docs/concepts/configuration/secret/
- name: etcd-certs
secret:
secretName: calico-etcd-secrets
defaultMode: 0400
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-kube-controllers
namespace: kube-system
---
# Source: calico/templates/calico-typha.yaml
---
# Source: calico/templates/configure-canal.yaml
---
# Source: calico/templates/kdd-crds.yaml
kubectl apply -f calico-etcd.yaml
Note: modify the parameters (at minimum etcd_endpoints) before applying.
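For example, filling in the etcd address could look like this (the endpoint value is a placeholder for your environment):
sed -i 's#http://<ETCD_IP>:<ETCD_PORT>#http://192.168.18.3:2379#' calico-etcd.yaml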
CoreDNS
Enabling CoreDNS requires two extra kubelet parameters:
--cluster-dns=169.169.0.100 (the IP is the cluster IP of the DNS Service)
--cluster-domain=cluster.local (the domain name configured for the DNS service)
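With these flags set, the kubelet writes the DNS Service IP into every Pod's /etc/resolv.conf; a hedged example of what a Pod in the default namespace then sees (the search path derives from the namespace and --cluster-domain):
nameserver 169.169.0.100
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5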
The CoreDNS YAML to deploy:
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health
ready
kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}FEDERATIONS
prometheus :9153
forward . UPSTREAMNAMESERVER
cache 30
loop
reload
loadbalance
}STUBDOMAINS
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/name: "CoreDNS"
spec:
replicas: 2
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
priorityClassName: system-cluster-critical
serviceAccountName: coredns
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
beta.kubernetes.io/os: linux
containers:
- name: coredns
image: coredns/coredns:1.5.0
imagePullPolicy: IfNotPresent
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
# This must match the kubelet --cluster-dns parameter
clusterIP: 169.169.0.100
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP
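The Corefile in the ConfigMap above still contains the deployment-script placeholders (CLUSTER_DOMAIN, REVERSE_CIDRS, UPSTREAMNAMESERVER, STUBDOMAINS, FEDERATIONS), which must be replaced before applying. A hedged example of a filled-in Corefile for a cluster.local cluster, with the unused STUBDOMAINS/FEDERATIONS placeholders simply removed:
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}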
To see the configuration in use in an existing cluster, run:
kubectl -n kube-system get configmap coredns -o yaml
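For an end-to-end check of name resolution, a hedged smoke test from a throwaway Pod (busybox:1.28 is chosen because nslookup in newer busybox images is known to misbehave):
kubectl run dns-test --image=busybox:1.28 --restart=Never --rm -it -- nslookup kubernetes.default.svc.cluster.local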