Overview
For a distributed platform, log collection and processing is an indispensable capability, and the ELK Stack has become the most popular centralized logging solution. This post goes over some ELK theory, discusses the feasibility of a centralized logging solution for a Kubernetes (K8S) container platform, and finishes with a simple hands-on deployment.
ELK Stack
The ELK Stack consists mainly of the following components:
Elasticsearch: a distributed search and analytics engine built on Apache Lucene, used for near-real-time storage, search, and analysis of large volumes of data. It is highly scalable, highly reliable, and easy to manage, and is often used as the underlying search engine for applications that need complex search capabilities.
Logstash: a data collection engine. It can dynamically collect data from a variety of sources, filter, parse, enrich, and normalize it, and then ship it to a destination of your choice.
Kibana: a data analysis and visualization platform. It is usually paired with Elasticsearch to search and analyze the stored data and present it as charts and dashboards.
Filebeat: a newer member of the ELK family, a lightweight open-source log file shipper developed from the Logstash-Forwarder code base and intended as its replacement (Logstash itself uses too much memory). Install Filebeat on the servers whose logs you want to collect, point it at the log directories or files, and it reads the data and forwards it either to Logstash for parsing or directly to Elasticsearch for centralized storage and analysis.
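As a quick illustration, here is a minimal filebeat.yml sketch; the log path and the Logstash host are placeholders, and the exact input syntax depends on the Filebeat version (6.x+ uses filebeat.inputs, 5.x uses filebeat.prospectors):
filebeat.inputs:
- type: log                          # tail plain log files
  paths:
    - /var/log/myapp/*.log           # hypothetical application log path
output.logstash:
  hosts: ["logstash.example.com:5044"]   # or output.elasticsearch to ship directly to ES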
The ELK architecture you choose depends heavily on your workload. If the data volume is small, the reliability requirements are modest, and losing some data is acceptable, a single-instance ELK deployment is enough, roughly as follows:
The Logstash instances doing the collection can be deployed on multiple machines, or you can use other log shippers such as Filebeat, rsyslog, or Fluentd. In this architecture every component is a potential single point of failure, and there is no buffering, so a sudden spike in data volume can easily take down one of the middle components.
In production you would add some high-availability features on top of this architecture, for example:
The first thing to note here is the message queue added to absorb traffic peaks. After collection, Logstash is (optionally) used again for filtering, format conversion, and other processing, and Elasticsearch is deployed as a cluster (not shown in the figure).
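A rough sketch of the Logstash pipeline in such a layout might look like this (the Kafka broker, topic, Elasticsearch host, and the grok pattern are all placeholders, not taken from a real deployment):
input {
  kafka {
    bootstrap_servers => "kafka.example.com:9092"   # the message queue that absorbs traffic peaks
    topics => ["app-logs"]
  }
}
filter {
  grok {
    # example filter: split each line into a timestamp and the remaining message
    match => { "message" => "%{TIMESTAMP_ISO8601:time} %{GREEDYDATA:msg}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch.example.com:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
  }
}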
Kubernetes Logging Architecture
The Kubernetes documentation covers the theory of log handling in some detail. It can be summarized as follows:
Kubernetes log collection can be considered at three levels: application logs produced by programs inside a pod, node-level system logs, and cluster-level logging solutions.
At the pod level, programs are expected to write to standard output by default, and the logs can then be retrieved with kubectl logs. For node-level logging, the first thing to consider is the log output of containers, which is managed through the Docker log-driver configured for the runtime; other programs can simply write to a log path such as /var/log. Note that at this level you also need a logrotate component to manage the log files. Commonly used log-drivers include json-file, journald, syslog, gelf, and fluentd.
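As a quick illustration of the pod-level case, the pod below (adapted from the counter example in the Kubernetes documentation) just writes to stdout, and its output can then be read with kubectl logs counter:
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    # print a counter line to stdout every second
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']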
For cluster-level logging you need something like the ELK Stack, and you also have to deal with pods being rescheduled across nodes. For the collection part there are three approaches:
Node-level logging agent: collect logs at the node level, usually by running the agent on every node as a DaemonSet. The advantage is low overhead (one agent per node) and no intrusion into the application; the limitation is that it only works when the applications in the containers write all their logs to stdout/stderr.
Sidecar container as the logging agent: run a log-handling container next to the application container in the same pod. There are two variants. The first is a streaming sidecar container, which simply reads the application's log files and re-emits them on its own stdout (see the sketch after this list).
The second variant ships the application logs straight to the logging backend, i.e. every pod runs its own collection agent (such as Logstash or Fluentd). Its advantage is that it can collect logs in many forms (files, sockets, and so on); the drawback is the resource cost, since every pod carries an extra collection container. The streaming sidecar is a reasonable compromise: it can still surface logs written in various forms, yet its overhead is small, because all it does is copy those logs to stdout.
Push logs to the storage backend directly from the application container.
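Below is a minimal sketch of the streaming sidecar pattern mentioned above (adapted from the Kubernetes documentation): the application writes its logs to files on a shared emptyDir volume, and the sidecar tails those files to its own stdout so that a node-level agent can pick them up:
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    # the application writes to a file instead of stdout
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)" >> /var/log/1.log; i=$((i+1)); sleep 1; done']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log
    image: busybox
    # streaming sidecar: re-emit the log file on its own stdout
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}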
EFK in Practice
Now for the hands-on part, based on the EFK addon from the Kubernetes project. My cluster runs Kubernetes 1.7.3 while the manifests on GitHub target 1.8, so the YAML files needed a few changes, mainly to the apiVersion fields.
The overall architecture of this setup: every node runs a Fluentd instance as a DaemonSet to collect that node's logs; the collected data is stored in Elasticsearch and finally visualized with Kibana. This deployment only collects container application logs written to stdout/stderr. Moreover, since Elasticsearch has no authentication and the storage is not persistent, it must not be used in production as-is.
Deploying Elasticsearch
Before deploying Elasticsearch, check the Docker log-driver configuration and set it to json-file (the default may be journald). Fluentd reads the logs under /var/log/containers/, which are symlinks to /var/lib/docker/containers/${CONTAINER_ID}/${CONTAINER_ID}-json.log; if the log-driver is journald, those files do not exist and Fluentd has nothing to read:
vim /etc/sysconfig/docker
OPTIONS='--selinux-enabled --log-driver=json-file --signature-verification=false'
........
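On hosts that configure Docker through /etc/docker/daemon.json rather than the sysconfig OPTIONS line, the equivalent setting is the snippet below (a minimal sketch; restart the Docker daemon after changing it):
{
  "log-driver": "json-file"
}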
Deploy the Elasticsearch StatefulSet: es-statefulset.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---
# Elasticsearch deployment itself
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v5.6.4
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v5.6.4
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v5.6.4
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: registry.cn-qingdao.aliyuncs.com/zhangchen-aisino/elasticsearch:v5.6.4
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: elasticsearch-logging
        emptyDir: {}
      # Elasticsearch requires vm.max_map_count to be at least 262144.
      # If your OS already sets up this number to a higher value, feel free
      # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true
Deploy the Elasticsearch Service: es-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
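To apply the two manifests and confirm that Elasticsearch comes up (plain kubectl usage, same style as the commands later in this post):
kubectl create -f es-statefulset.yaml
kubectl create -f es-service.yaml
kubectl get pods -n kube-system -l k8s-app=elasticsearch-logging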
Deploying Fluentd
Deploy the Fluentd configuration (ConfigMap):
kind: ConfigMap
apiVersion: v1
data:
  containers.input.conf: |-
    <source>
      type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      read_from_head true
      format multi_format
      <pattern>
        format json
        time_key time
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </pattern>
      <pattern>
        format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
        time_format %Y-%m-%dT%H:%M:%S.%N%:z
      </pattern>
    </source>
  system.input.conf: |-
    # Example:
    # 2015-12-21 23:17:22,066 [salt.state       ][INFO    ] Completed state [net.ipv4.ip_forward] at time 23:17:22.066081
    <source>
      type tail
      format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
      time_format %Y-%m-%d %H:%M:%S
      path /var/log/salt/minion
      pos_file /var/log/es-salt.pos
      tag salt
    </source>
    # Example:
    # Dec 21 23:17:22 gke-foo-1-1-4b5cbd14-node-4eoj startupscript: Finished running startup script /var/run/google.startup.script
    <source>
      type tail
      format syslog
      path /var/log/startupscript.log
      pos_file /var/log/es-startupscript.log.pos
      tag startupscript
    </source>
    # Examples:
    # time="2016-02-04T06:51:03.053580605Z" level=info msg="GET /containers/json"
    # time="2016-02-04T07:53:57.505612354Z" level=error msg="HTTP Error" err="No such image: -f" statusCode=404
    <source>
      type tail
      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=($<status_code>\d+))?/
      path /var/log/docker.log
      pos_file /var/log/es-docker.log.pos
      tag docker
    </source>
    # Example:
    # 2016/02/04 06:52:38 filePurge: successfully removed file /var/etcd/data/member/wal/00000000000006d0-00000000010a23d1.wal
    <source>
      type tail
      # Not parsing this, because it doesn't have anything particularly useful to
      # parse out of it (like severities).
      format none
      path /var/log/etcd.log
      pos_file /var/log/es-etcd.log.pos
      tag etcd
    </source>
    # Multi-line parsing is required for all the kube logs because very large log
    # statements, such as those that include entire object bodies, get split into
    # multiple lines by glog.
    # Example:
    # I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: (13.972191ms) 200 [[Go-http-client/1.1] 10.244.1.3:40537]
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kubelet.log
      pos_file /var/log/es-kubelet.log.pos
      tag kubelet
    </source>
    # Example:
    # I1118 21:26:53.975789       6 proxier.go:1096] Port "nodePort for kube-system/default-http-backend:http" (:31429/tcp) was open before and is still needed
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-proxy.log
      pos_file /var/log/es-kube-proxy.log.pos
      tag kube-proxy
    </source>
    # Example:
    # I0204 07:00:19.604280       5 handlers.go:131] GET /api/v1/nodes: (1.624207ms) 200 [[kube-controller-manager/v1.1.3 (linux/amd64) kubernetes/6a81b50] 127.0.0.1:38266]
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-apiserver.log
      pos_file /var/log/es-kube-apiserver.log.pos
      tag kube-apiserver
    </source>
    # Example:
    # I0204 06:55:31.872680       5 servicecontroller.go:277] LB already exists and doesn't need update for service kube-system/kube-ui
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-controller-manager.log
      pos_file /var/log/es-kube-controller-manager.log.pos
      tag kube-controller-manager
    </source>
    # Example:
    # W0204 06:49:18.239674       7 reflector.go:245] pkg/scheduler/factory/factory.go:193: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2578313/2577886]) [2579312]
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-scheduler.log
      pos_file /var/log/es-kube-scheduler.log.pos
      tag kube-scheduler
    </source>
    # Example:
    # I1104 10:36:20.242766       5 rescheduler.go:73] Running Rescheduler
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/rescheduler.log
      pos_file /var/log/es-rescheduler.log.pos
      tag rescheduler
    </source>
    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/glbc.log
      pos_file /var/log/es-glbc.log.pos
      tag glbc
    </source>
    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/cluster-autoscaler.log
      pos_file /var/log/es-cluster-autoscaler.log.pos
      tag cluster-autoscaler
    </source>
    # Logs from systemd-journal for interesting services.
    <source>
      type systemd
      filters [{ "_SYSTEMD_UNIT": "docker.service" }]
      pos_file /var/log/gcp-journald-docker.pos
      read_from_head true
      tag docker
    </source>
    <source>
      type systemd
      filters [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      pos_file /var/log/gcp-journald-kubelet.pos
      read_from_head true
      tag kubelet
    </source>
    <source>
      type systemd
      filters [{ "_SYSTEMD_UNIT": "node-problem-detector.service" }]
      pos_file /var/log/gcp-journald-node-problem-detector.pos
      read_from_head true
      tag node-problem-detector
    </source>
  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      type forward
    </source>
  monitoring.conf: |-
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @type prometheus
    </source>
    <source>
      @type monitor_agent
    </source>
    # input plugin that collects metrics from MonitorAgent
    <source>
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for output plugin
    <source>
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
    # input plugin that collects metrics for in_tail plugin
    <source>
      @type prometheus_tail_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      type kubernetes_metadata
    </filter>
    <match **>
      type elasticsearch
      log_level info
      include_tag_key true
      host elasticsearch-logging
      port 9200
      logstash_format true
      # Set the chunk limits.
      buffer_chunk_limit 2M
      buffer_queue_limit 8
      flush_interval 5s
      # Never wait longer than 5 minutes between retries.
      max_retry_wait 30
      # Disable the limit on the number of retries (retry forever).
      disable_retry_limit
      # Use multiple threads for processing.
      num_threads 2
    </match>
metadata:
  name: fluentd-es-config-v0.1.1
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
Before deploying the Fluentd DaemonSet, label the Kubernetes nodes with beta.kubernetes.io/fluentd-ds-ready: "true", because the DaemonSet selects nodes by this label. Add it with kubectl label (note the key=value syntax):
kubectl label node node1 beta.kubernetes.io/fluentd-ds-ready="true"
Then deploy the Fluentd DaemonSet: fluentd-es-ds.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v2.0.2
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    version: v2.0.2
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v2.0.2
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: fluentd-es
      containers:
      - name: fluentd-es
        image: registry.cn-qingdao.aliyuncs.com/zhangchen-aisino/fluentd-elasticsearch:v2.0.2
        env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: libsystemddir
          mountPath: /host/lib
          readOnly: true
        - name: config-volume
          mountPath: /etc/fluent/config.d
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      # It is needed to copy systemd library to decompress journals
      - name: libsystemddir
        hostPath:
          path: /usr/lib64
      - name: config-volume
        configMap:
          name: fluentd-es-config-v0.1.1
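Once the ConfigMap and the DaemonSet are applied (assuming the ConfigMap above was saved as fluentd-es-configmap.yaml; the file name itself is arbitrary), a fluentd-es pod should be scheduled on every labeled node:
kubectl create -f fluentd-es-configmap.yaml
kubectl create -f fluentd-es-ds.yaml
kubectl get pods -n kube-system -l k8s-app=fluentd-es -o wide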
Deploying Kibana
Deploy the Kibana Deployment: kibana-deployment.yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      containers:
      - name: kibana-logging
        image: docker.elastic.co/kibana/kibana:5.6.4
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch-logging:9200
          - name: SERVER_BASEPATH
            value: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
          - name: XPACK_MONITORING_ENABLED
            value: "false"
          - name: XPACK_SECURITY_ENABLED
            value: "false"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
Deploy the Kibana Service: kibana-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging
Once all the pods are Running, run kubectl proxy:
kubectl proxy --address='172.16.21.250' --port=8086 --accept-hosts='^*$'
Open http://172.16.21.250:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging/app/kibana to reach the Kibana UI; create an index pattern on the Discover page and you will see the log data stored in Elasticsearch.
Issues Encountered
After exposing Kibana through a NodePort I expected to reach it directly via the node port, but it returned a 404. After some searching I roughly understood the cause: if server.basePath is set in the startup parameters, you normally need a reverse proxy in front of Kibana to rewrite the path. After removing the SERVER_BASEPATH environment variable from the Kibana YAML, the NodePort URL works.
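For reference, a minimal NodePort Service sketch for this scenario (the nodePort value is arbitrary; SERVER_BASEPATH has to be removed from the Deployment as described above):
apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 5601
    targetPort: ui
    nodePort: 30601        # hypothetical port in the default 30000-32767 NodePort range
  selector:
    k8s-app: kibana-logging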
Later I tried moving the Elasticsearch data onto Ceph. It took several attempts to get the YAML right; the working file is below:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---
# Elasticsearch deployment itself
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v5.6.4
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v5.6.4
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v5.6.4
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: registry.cn-qingdao.aliyuncs.com/zhangchen-aisino/elasticsearch:v5.6.4
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      # the emptyDir volume is replaced by the volumeClaimTemplates below
      #volumes:
      #- name: elasticsearch-logging
      #  emptyDir: {}
      # Elasticsearch requires vm.max_map_count to be at least 262144.
      # If your OS already sets up this number to a higher value, feel free
      # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: elasticsearch-logging
      annotations:
        volume.beta.kubernetes.io/storage-class: "ceph-web"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
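The volumeClaimTemplates above assume a StorageClass named ceph-web already exists in the cluster. A rough sketch of what an RBD-backed StorageClass could look like is shown below; the monitor address, pool, and secret names are placeholders for your own Ceph cluster, not values from this deployment:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-web
provisioner: kubernetes.io/rbd
parameters:
  monitors: 10.0.0.1:6789              # hypothetical Ceph monitor address
  adminId: admin
  adminSecretName: ceph-admin-secret
  adminSecretNamespace: kube-system
  pool: kube                           # hypothetical RBD pool
  userId: kube
  userSecretName: ceph-user-secret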