Operations Layer
7.1 Dashboard configuration
Install the official Kubernetes dashboard:
# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml -O dashboard.yaml
# cat dashboard.yaml | grep image:
image: kubernetesui/dashboard:v2.7.0
image: kubernetesui/metrics-scraper:v1.0.8
Transfer the images to the private registry and update the image addresses in the manifest accordingly:
# cat dashboard.yaml | grep image:
image: harbor.demo.com/k8s/dashboard:v2.7.0
image: harbor.demo.com/k8s/metrics-scraper:v1.0.8
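A minimal sketch of the transfer, assuming docker is already logged in to harbor.demo.com and a k8s project exists there:
# docker pull kubernetesui/dashboard:v2.7.0
# docker tag kubernetesui/dashboard:v2.7.0 harbor.demo.com/k8s/dashboard:v2.7.0
# docker push harbor.demo.com/k8s/dashboard:v2.7.0
# docker pull kubernetesui/metrics-scraper:v1.0.8
# docker tag kubernetesui/metrics-scraper:v1.0.8 harbor.demo.com/k8s/metrics-scraper:v1.0.8
# docker push harbor.demo.com/k8s/metrics-scraper:v1.0.8
# sed -i 's#image: kubernetesui/#image: harbor.demo.com/k8s/#' dashboard.yaml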
Configure the Service to use a LoadBalancer:
# vi dashboard.yaml
...
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  type: LoadBalancer
...
# kubectl apply -f dashboard.yaml
# kubectl get all -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-d97df5556-vvv9w 1/1 Running 0 16s
pod/kubernetes-dashboard-6694866798-pcttp 1/1 Running 0 16s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.10.153.173 <none> 8000/TCP 17s
service/kubernetes-dashboard LoadBalancer 10.10.186.6 192.168.3.182 443:31107/TCP 18s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 17s
deployment.apps/kubernetes-dashboard 1/1 1 1 17s
NAME DESIRED CURRENT READY AGE
replicaset.apps/dashboard-metrics-scraper-d97df5556 1 1 1 17s
replicaset.apps/kubernetes-dashboard-6694866798 1 1 1 17s
Add an admin user:
# cat admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
# kubectl apply -f admin-user.yaml
# kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6IjlUNmROZTZZSEJ4WEJIell2OG5IQS1oTGVLYjJWRU9QRlhzUFBmdlVONU0ifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjg2NTQ3MTE3LCJpYXQiOjE2ODY1NDM1MTcsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiNjk1MWFlODktODYwMi00NzAzLTk3NzYtMmNhNmU0OTJlZjQ2In19LCJuYmYiOjE2ODY1NDM1MTcsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.j9XHrznphuwv56hcSGRlcOxvzuGGbKEdPZB1r5jc84kNICp2sTwXvr71d6wdYtzGxjODZ81kTqVqRQUcUKi0Uh8OWjxWcspNJIWk0y6_Eub823YWzkusktb7NdqCb6BYIyX79V4iFUQaVjp9BlEXSZ6vnuJhwvEonumDrIo0JtUF8PT1ZV3319kajFTZMWza-QHRMFOjGC74YleMd-7gDA-aimoxjPQIVfIWF2PhssLj38Ci-KZddxOE1yE42QFOmPozOzCT348ZEJEO1lhDC4trnK2TTU8jb1sM7RyPKuvyY0fbimqNi6iGL-aqCaQT6_nWDvxkVycapJ3KAwz2Zw
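Tokens created this way are short-lived. If a longer validity is needed, a duration can be requested explicitly (kubectl v1.24+; the API server may cap the maximum):
# kubectl -n kubernetes-dashboard create token admin-user --duration=24h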
The dashboard domain resolves to the Service's external LoadBalancer IP:
# nslookup dashboard.demo.com
Server: 192.168.3.250
Address: 192.168.3.250#53
Name: dashboard.demo.com
Address: 192.168.3.182
7.2 Rancher
https://github.com/rancher/rancher
Common Kubernetes management platforms include OpenShift and Rancher; this article uses Rancher as the example. Rancher can manage multiple Kubernetes clusters at once.
Rancher release channels:
- latest: the current latest release (v2.7.3 at the time of writing)
- stable: the current stable release (v2.6.12 at the time of writing)
7.2.1 Installing the Rancher node
Install Docker:
yum -y install yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum -y install docker-ce docker-ce-cli containerd.io
systemctl enable docker containerd
systemctl start docker containerd
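Since the Rancher image will be stored in harbor.demo.com, the Docker daemon on this node must also trust the registry's CA; a sketch, assuming the Harbor CA file is at hand as ca.crt:
# mkdir -p /etc/docker/certs.d/harbor.demo.com
# cp ca.crt /etc/docker/certs.d/harbor.demo.com/ca.crt
# docker login harbor.demo.com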
Pull the Rancher image (it is large; pushing it to the private registry after pulling is recommended):
# docker pull rancher/rancher:latest
Create the Rancher data directory:
mkdir -p /opt/rancher
7.2.1.1 Certificates for the Rancher web UI
Copy the domain certificates into /opt/rancher/ca and rename them as follows:
ca.pem      -> cacerts.pem
web-key.pem -> key.pem
web.pem     -> cert.pem
# tree /opt/rancher/ca
/opt/rancher/ca
├── cacerts.pem
├── cert.pem
└── key.pem
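A quick sanity check that cert.pem and key.pem belong together (for an RSA key pair; the two digests must match):
# openssl x509 -noout -modulus -in /opt/rancher/ca/cert.pem | openssl md5
# openssl rsa -noout -modulus -in /opt/rancher/ca/key.pem | openssl md5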
7.2.1.2 Private registry configuration
# cat /opt/rancher/registries.yaml
mirrors:
  harbor.demo.com:
    endpoint:
      - "harbor.demo.com"
configs:
  "harbor.demo.com":
    auth:
      username: admin
      password: 12qwaszx+pp
    tls:
      ca_file: /opt/harbor/ca.crt
      cert_file: /opt/harbor/harbor.demo.com.cert
      key_file: /opt/harbor/harbor.demo.com.key
The /opt/harbor/ path above refers to a directory inside the Rancher container at runtime.
Certificates needed for accessing the private registry:
# tree /opt/rancher/ca_harbor/
/opt/rancher/ca_harbor/
├── ca.crt
├── harbor.demo.com.cert
└── harbor.demo.com.key
When starting Rancher, the host directory /opt/rancher/ca_harbor/ must be mounted to /opt/harbor/ inside the container (the path referenced in registries.yaml above).
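Before starting Rancher, it is worth verifying that the credentials and CA in registries.yaml actually work against Harbor; a sketch using the registry v2 API:
# curl --cacert /opt/rancher/ca_harbor/ca.crt -u 'admin:12qwaszx+pp' https://harbor.demo.com/v2/_catalog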
7.2.1.3 Install the Rancher node
# docker run -d -it -p 80:80 -p 443:443 --name rancher --privileged=true --restart=unless-stopped \
-v /opt/rancher/k8s:/var/lib/rancher \
-v /opt/rancher/ca:/etc/rancher/ssl \
-e SSL_CERT_DIR="/etc/rancher/ssl" \
-e CATTLE_SYSTEM_DEFAULT_REGISTRY=harbor.demo.com \
-v /opt/rancher/registries.yaml:/etc/rancher/k3s/registries.yaml \
-v /opt/rancher/ca_harbor:/opt/harbor \
rancher/rancher:latest
Watch the startup logs:
# docker logs rancher -f
7.2.1.4 Access the Rancher web UI
# nslookup rancher.demo.com
Server: 192.168.3.250
Address: 192.168.3.250#53
Name: rancher.demo.com
Address: 10.2.20.151
Retrieve the default admin user's password:
# docker exec -it rancher kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{"\n"}}'
If the password is forgotten, reset it:
# docker exec -it rancher reset-password
7.2.2 Importing an external Kubernetes cluster
In the Rancher UI, create a cluster of type "import existing"; Rancher then generates a registration manifest, which is fetched on the target cluster:
# curl --insecure -sfL https://rancher.demo.com/v3/import/rndjzbgwn78v6v6dx28dlngn7r7qrlwv4b949c47567ltjz7g76tqn_c-m-68r9m4vz.yaml -o rancher.yaml
# cat rancher.yaml | grep image:
image: rancher/rancher-agent:v2.7.3
Change it to the private registry address:
# cat rancher.yaml | grep image:
image: harbor.demo.com/rancher/rancher-agent:v2.7.3
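The rewrite can be done in place, assuming rancher/rancher-agent:v2.7.3 has already been pushed to the private registry under the same path:
# sed -i 's#image: rancher/#image: harbor.demo.com/rancher/#' rancher.yaml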
Install:
# kubectl apply -f rancher.yaml
Check:
# kubectl -n cattle-system get all
NAME READY STATUS RESTARTS AGE
pod/cattle-cluster-agent-5cb7bb7b9b-kc5fn 0/1 ContainerCreating 0 27s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cattle-cluster-agent ClusterIP 10.10.104.246 <none> 80/TCP,443/TCP 27s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cattle-cluster-agent 0/1 1 0 28s
NAME DESIRED CURRENT READY AGE
replicaset.apps/cattle-cluster-agent-5cb7bb7b9b 1 1 0 28s
View a specific pod:
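For example, to see why the agent pod shown above is still in ContainerCreating (pod name taken from the output above):
# kubectl -n cattle-system describe pod cattle-cluster-agent-5cb7bb7b9b-kc5fn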
7.2.3 Creating a new Kubernetes cluster
Creating a brand-new cluster with Rancher is straightforward: Rancher generates the appropriate command, which you simply run on each target node.
7.3 Prometheus/Grafana
https://github.com/prometheus/prometheus/
Prometheus is an open-source systems monitoring and alerting toolkit and one of the most popular monitoring stacks in use today. It is a CNCF-hosted project, is commonly deployed alongside Kubernetes, and its performance scales to clusters on the order of tens of thousands of machines. Its notable features:
- A multi-dimensional data model (time series identified by a metric name and a set of key/value labels) with efficient storage
- A flexible query language (PromQL)
- Pull-based metric collection over HTTP
- Push support via an intermediary gateway (Pushgateway)
- A rich ecosystem of exporters for collecting metrics
- Tight integration with Grafana, which provides the data visualization layer
Grafana is an open-source system for visualizing large volumes of measurement data. It is powerful and visually polished: you can build custom dashboards in which you configure what data to display and how, and a large number of third-party visualization plugins are available.
Prometheus and Grafana can be installed in several ways (from source, binaries, container images), all fairly simple.
This article installs Prometheus/Grafana on Kubernetes via kube-prometheus.
Official installation docs: https://prometheus-operator.dev/docs/prologue/quick-start/
Compatibility matrix: https://github.com/prometheus-operator/kube-prometheus#compatibility
Official GitHub repository: https://github.com/prometheus-operator/kube-prometheus
7.3.1 Installing kube-prometheus
Download:
# git clone https://github.com/prometheus-operator/kube-prometheus.git
# cd kube-prometheus/manifests
List the required images, push them to the private registry, and change the image addresses in the manifests to point at the private registry:
# find ./ -type f | xargs grep image:
Note that the prometheus-config-reloader image is passed as a command-line flag rather than an image: field, so it must be changed as well:
# cat prometheusOperator-deployment.yaml | grep prometheus-config-reloader
- --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.65.2
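Assuming all images were pushed to the private registry with their original paths preserved (so quay.io/foo/bar becomes harbor.demo.com/foo/bar; the source registries vary by kube-prometheus version), the manifests can be rewritten in bulk; a sketch:
# find ./ -name '*.yaml' | xargs sed -i \
    -e 's#quay.io/#harbor.demo.com/#g' \
    -e 's#registry.k8s.io/#harbor.demo.com/#g' \
    -e 's#image: grafana/#image: harbor.demo.com/grafana/#g'
This also rewrites the --prometheus-config-reloader flag above, since sed operates on the raw manifest text.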
Modify prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml:
# cat prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: metrics-adapter
    app.kubernetes.io/name: prometheus-adapter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.10.0
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
  #namespace: monitoring
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - services
  - endpoints
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
Note:
The resource defined here was already created when the metrics reader was installed; at this point the configuration only needs to be applied as an update (hence kubectl apply below, rather than create).
Modify prometheus-clusterRole.yaml
The changes below follow the Prometheus configuration shipped with Istio:
# cat prometheus-clusterRole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.44.0
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - nodes/metrics
  - services
  - endpoints
  - pods
  - ingresses
  - configmaps
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  - "networking.k8s.io"
  resources:
  - ingresses/status
  - ingresses
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
Note:
With the stock ClusterRole shipped by kube-prometheus, ServiceMonitor objects created later would not be recognized by Prometheus.
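For reference, a minimal ServiceMonitor of the kind affected by this change; the app name, target namespace, and port name here are hypothetical:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo-app            # hypothetical application
  namespace: monitoring
  labels:
    app: demo-app
spec:
  selector:
    matchLabels:
      app: demo-app         # must match the labels on the target Service
  namespaceSelector:
    matchNames:
    - default
  endpoints:
  - port: http-metrics      # a named port on the target Service
    interval: 30s
    path: /metrics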
Install:
# kubectl create -f setup/
# kubectl apply -f ./prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
# rm -f prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
# kubectl create -f ./
Check the Services:
# kubectl -n monitoring get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-main ClusterIP 10.10.251.51 <none> 9093/TCP,8080/TCP 7m18s
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 6m51s
blackbox-exporter ClusterIP 10.10.195.115 <none> 9115/TCP,19115/TCP 7m17s
grafana ClusterIP 10.10.121.183 <none> 3000/TCP 7m13s
kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 7m12s
node-exporter ClusterIP None <none> 9100/TCP 7m10s
prometheus-k8s ClusterIP 10.10.230.211 <none> 9090/TCP,8080/TCP 7m9s
prometheus-operated ClusterIP None <none> 9090/TCP 6m48s
prometheus-operator ClusterIP None <none> 8443/TCP 7m8s
DNS records:
prometheus.demo.com 192.168.3.180
grafana.demo.com 192.168.3.180
alert.demo.com 192.168.3.180
Expose the web UIs via Ingress:
# cat open-ui.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-prometheus
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: prometheus.demo.com
    http:
      paths:
      - backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-grafana
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: grafana.demo.com
    http:
      paths:
      - backend:
          service:
            name: grafana
            port:
              number: 3000
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-alert
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: alert.demo.com
    http:
      paths:
      - backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
        path: /
        pathType: Prefix
# kubectl apply -f open-ui.yaml
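Once the three hostnames resolve to the ingress controller (192.168.3.180), a quick reachability check (expect a 2xx or 3xx status code):
# curl -s -o /dev/null -w '%{http_code}\n' http://prometheus.demo.com/
# curl -s -o /dev/null -w '%{http_code}\n' http://grafana.demo.com/
# curl -s -o /dev/null -w '%{http_code}\n' http://alert.demo.com/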
Visit http://grafana.demo.com (default username admin, password admin).
kube-prometheus preconfigures a number of dashboards. Additional templates can be found at https://grafana.com/grafana/dashboards, e.g. template 14518.