Original content: https://gitee.com/dev-99cloud/training-kubernetes , with additions made on top of it.

Lesson 08: Advanced

8.1 Monitoring, Logging, and Troubleshooting

  • Monitoring: Grafana / Prometheus / AlertManager
  • Logging: Elasticsearch / Fluentd (or Logstash) / Kibana
  • Troubleshooting:
    • How do you capture packets on a pod's network interface?

      # Find out which node the pod is running on
      kubectl describe pod <pod> -n <namespace>
      
      # Get the PID of the container
      docker inspect -f '{{.State.Pid}}' <container>
      
      # Enter the container's network namespace
      nsenter --target <PID> -n
      
      # Capture with tcpdump on the eth0 interface
      tcpdump -i eth0 tcp and port 80 -vvv
      
      # Or capture and write the packets to a file
      tcpdump -i eth0 -w ./out.cap
      
      # Leave the nsenter shell (don't forget this step!)
      exit
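
      The steps above can be collapsed once the container is known (a sketch assuming Docker is the container runtime; on containerd clusters, get the PID from crictl inspect instead):

      # Resolve the PID and run tcpdump directly inside the pod's network namespace
      PID=$(docker inspect -f '{{.State.Pid}}' <container>)
      nsenter --target "$PID" --net tcpdump -i eth0 -w ./out.cap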
      
    • Debug tools

      apiVersion: v1
      kind: Pod
      metadata:
        name: demo-pod
        labels:
          app: demo-pod
      spec:
        containers:
          - name: nginx
            image: nginx
            ports:
              - containerPort: 80
            env:
              - name: DEMO_GREETING
                value: "Hello from the environment"
              - name: DEMO_FAREWELL
                value: "Such a sweet sorrow"
          - name: busybox
            image: busybox
            args:
              - sleep
              - "1000000"
      
      root@ckatest001:~# kubectl exec -it demo-pod -c busybox -- /bin/sh
      / # ping 192.168.209.193
      PING 192.168.209.193 (192.168.209.193): 56 data bytes
      64 bytes from 192.168.209.193: seq=0 ttl=63 time=0.099 ms
      64 bytes from 192.168.209.193: seq=1 ttl=63 time=0.093 ms
      64 bytes from 192.168.209.193: seq=2 ttl=63 time=0.089 ms
      
      root@ckalab001:~# tcpdump -i eth0 udp
      12:34:10.972395 IP 45.77.183.254.vultr.com.42125 > 45.32.33.135.vultr.com.4789: VXLAN, flags [I] (0x08), vni 4096
      IP 192.168.208.4 > 192.168.209.193: ICMP echo request, id 3072, seq 0, length 64
      12:34:10.972753 IP 45.32.33.135.vultr.com.41062 > 45.77.183.254.vultr.com.4789: VXLAN, flags [I] (0x08), vni 4096
      IP 192.168.209.193 > 192.168.208.4: ICMP echo reply, id 3072, seq 0, length 64
      12:34:11.972537 IP 45.77.183.254.vultr.com.42125 > 45.32.33.135.vultr.com.4789: VXLAN, flags [I] (0x08), vni 4096
      IP 192.168.208.4 > 192.168.209.193: ICMP echo request, id 3072, seq 1, length 64
      

      Alibaba Cloud is a special case: it replaces the calico / flannel VXLAN overlay with a routing scheme, so routes have to be added to the VPC, one per node, sending each node's pod CIDR to that node's IP. The vendor KB is quite evasive; in plain terms it means VXLAN is not supported as-is: the "VXLAN" mode is reimplemented on top of routing underneath, which is why the VPC routes are required.
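
      Conceptually, the required VPC route table looks like the following, one entry per node (the pod CIDRs and node IPs are taken from the ip r outputs in this section, purely for illustration):

      # destination (node's pod CIDR)      next hop (node IP)
      # 192.168.208.0/26               ->  172.31.43.145
      # 192.168.209.192/26             ->  172.31.43.146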

      A normal VXLAN setup:

      root@ckalab001:~# ip r
      default via 45.77.182.1 dev ens3 proto dhcp src 45.77.183.254 metric 100
      192.168.208.1 dev cali65a032ad3e5 scope link
      192.168.209.192/26 via 192.168.209.192 dev vxlan.calico onlink
      
      root@ckalab001:~# ip a
      2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
          link/ether 56:00:02:d8:35:5a brd ff:ff:ff:ff:ff:ff
          inet 45.77.183.254/23 brd 45.77.183.255 scope global dynamic ens3
            valid_lft 73639sec preferred_lft 73639sec
          inet6 fe80::5400:2ff:fed8:355a/64 scope link
            valid_lft forever preferred_lft forever
      4: cali65a032ad3e5@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UP group default
          link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
          inet6 fe80::ecee:eeff:feee:eeee/64 scope link
            valid_lft forever preferred_lft forever
      9: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UNKNOWN group default
          link/ether 66:f1:80:3e:ea:c6 brd ff:ff:ff:ff:ff:ff
          inet 192.168.208.0/32 brd 192.168.208.0 scope global vxlan.calico
            valid_lft forever preferred_lft forever
          inet6 fe80::64f1:80ff:fe3e:eac6/64 scope link
            valid_lft forever preferred_lft forever
      

      Alibaba Cloud's modified VXLAN: 192.168.209.192/26 via 172.31.43.146 dev eth0 proto 80 onlink, where route protocol 80 is IGRP, the Interior Gateway Routing Protocol. As a result, when pods on different nodes ping each other, tcpdump on the uplink captures nothing when filtering for TCP, but does capture the packets when filtering for ICMP.

      root@ckalab001:~# ip r
      default via 172.31.47.253 dev eth0 proto dhcp src 172.31.43.145 metric 100
      192.168.208.1 dev cali77bffbebec8 scope link
      192.168.209.192/26 via 172.31.43.146 dev eth0 proto 80 onlink
      
      root@ckalab001:~# ip a
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
          link/ether 00:16:3e:08:2e:5f brd ff:ff:ff:ff:ff:ff
          inet 172.31.43.145/20 brd 172.31.47.255 scope global dynamic eth0
            valid_lft 315356000sec preferred_lft 315356000sec
          inet6 fe80::216:3eff:fe08:2e5f/64 scope link
            valid_lft forever preferred_lft forever
      4: cali77bffbebec8@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UP group default
          link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0
          inet6 fe80::ecee:eeff:feee:eeee/64 scope link
            valid_lft forever preferred_lft forever
      8: vxlan.calico: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc noqueue state UNKNOWN group default
          link/ether 66:f1:80:3e:ea:c6 brd ff:ff:ff:ff:ff:ff
          inet 192.168.208.0/32 brd 192.168.208.0 scope global vxlan.calico
            valid_lft forever preferred_lft forever
          inet6 fe80::64f1:80ff:fe3e:eac6/64 scope link
            valid_lft forever preferred_lft forever
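
      A quick way to tell which mode a node uses is to capture on its uplink: with real VXLAN, inter-node pod traffic shows up as UDP port 4789; with the routing scheme it shows up as the inner protocol itself (the interface name below is an assumption, adjust to your node):

      # true VXLAN: pod-to-pod traffic is encapsulated in UDP/4789
      tcpdump -ni eth0 udp port 4789
      # routing scheme (e.g. Alibaba Cloud): pings appear as plain ICMP
      tcpdump -ni eth0 icmp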
      

8.2 What are HPA / CA / VPA?

  • How to understand HPA (Horizontal Pod Autoscaler) / CA (Cluster Autoscaler) / VPA (Vertical Pod Autoscaler)?
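
    For reference, the kubectl autoscale command used at the end of this lesson is equivalent to an HPA manifest like this one (a sketch against the autoscaling/v2 API, available since Kubernetes 1.23; older clusters need autoscaling/v2beta2):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: php-apache
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: php-apache
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 50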

  • Configure the metrics server

    mkdir metrics
    cd metrics
    # Download all the yaml files from the corresponding GitHub repository
    # for file in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml ; do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$file;done
    # The latest version may fail its health checks; pick the metrics server version matching your own kubernetes version
    # v0.3.6 for Kubernetes v1.20.1, downloaded from a mirror in China
    for file in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml ; do wget https://gitee.com/dev-99cloud/training-kubernetes/raw/master/src/amd-lab/metrics-server/$file;done
    kubectl apply -f .
    

    Modify metrics-server-deployment.yaml:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: metrics-server
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: metrics-server-config
      namespace: kube-system
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: EnsureExists
    data:
      NannyConfiguration: |-
        apiVersion: nannyconfig/v1alpha1
        kind: NannyConfiguration
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: metrics-server-v0.3.6
      namespace: kube-system
      labels:
        k8s-app: metrics-server
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
        version: v0.3.6
    spec:
      selector:
        matchLabels:
          k8s-app: metrics-server
          version: v0.3.6
      template:
        metadata:
          name: metrics-server
          labels:
            k8s-app: metrics-server
            version: v0.3.6
        spec:
          securityContext:
            seccompProfile:
              type: RuntimeDefault
          priorityClassName: system-cluster-critical
          serviceAccountName: metrics-server
          nodeSelector:
            kubernetes.io/os: linux
          containers:
          - name: metrics-server
            # image: k8s.gcr.io/metrics-server-amd64:v0.3.6
            image: opsdockerimage/metrics-server-amd64:v0.3.6
            command:
            - /metrics-server
            - --metric-resolution=30s
            # These are needed for GKE, which doesn't support secure communication yet.
            # Remove these lines for non-GKE clusters, and when GKE supports token-based auth.
            # - --kubelet-port=10255
            # - --deprecated-kubelet-completely-insecure=true
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP
            # - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
            ports:
            - containerPort: 443
              name: https
              protocol: TCP
          - name: metrics-server-nanny
            # image: k8s.gcr.io/addon-resizer:1.8.11
            image: opsdockerimage/addon-resizer:1.8.11
            resources:
              limits:
                cpu: 100m
                memory: 300Mi
              requests:
                cpu: 5m
                memory: 50Mi
            env:
              - name: MY_POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: MY_POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
            volumeMounts:
            - name: metrics-server-config-volume
              mountPath: /etc/config
            command:
              - /pod_nanny
              - --config-dir=/etc/config
              # - --cpu={{ base_metrics_server_cpu }}
              - --extra-cpu=0.5m
              # - --memory={{ base_metrics_server_memory }}
              # - --extra-memory={{ metrics_server_memory_per_node }}Mi
              - --threshold=5
              - --deployment=metrics-server-v0.3.6
              - --container=metrics-server
              - --poll-period=300000
              - --estimator=exponential
              # Specifies the smallest cluster (defined in number of nodes)
              # resources will be scaled to.
              # - --minClusterSize={{ metrics_server_min_cluster_size }}
              - --minClusterSize=2
              # Use kube-apiserver metrics to avoid periodically listing nodes.
              - --use-metrics=true
          volumes:
            - name: metrics-server-config-volume
              configMap:
                name: metrics-server-config
          tolerations:
            - key: "CriticalAddonsOnly"
              operator: "Exists"
    

    Modify resource-reader.yaml:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:metrics-server
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    rules:
    - apiGroups:
      - ""
      resources:
      - pods
      - nodes
      # add nodes/stats
      - nodes/stats
      - namespaces
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "apps"
      resources:
      - deployments
      verbs:
      - get
      - list
      - update
      - watch
    - nonResourceURLs:
      - /metrics
      verbs:
      - get
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: system:metrics-server
      labels:
        kubernetes.io/cluster-service: "true"
        addonmanager.kubernetes.io/mode: Reconcile
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:metrics-server
    subjects:
    - kind: ServiceAccount
      name: metrics-server
      namespace: kube-system
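
    After editing both files, re-apply them and wait for the rollout to finish (run from the same metrics directory as before):

    kubectl apply -f .
    kubectl -n kube-system rollout status deployment/metrics-server-v0.3.6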
    
  • Check that metrics are available

    # Check that the pods are running
    kubectl get pods -n kube-system
    # List api-versions; metrics.k8s.io/v1beta1 should now be present
    kubectl api-versions
    # Show node metrics
    kubectl top nodes
    NAME                      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    izuf6g226c4titrnrwds2tz   129m         6%     1500Mi          42%
    # Show pod metrics
    kubectl top pods
    NAME                                 CPU(cores)   MEMORY(bytes)
    myapp-backend-pod-58b7f5cf77-krmzh   0m           1Mi
    myapp-backend-pod-58b7f5cf77-vqlgl   0m           1Mi
    myapp-backend-pod-58b7f5cf77-z7j7z   0m           1Mi
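
    You can also inspect the APIService object that metrics-server registers (v1beta1.metrics.k8s.io is its default name):

    kubectl get apiservice v1beta1.metrics.k8s.io
    # AVAILABLE should be True; if not, describe it to see which condition fails
    kubectl describe apiservice v1beta1.metrics.k8s.io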
    
    • If metrics are unavailable and the error reads unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary, try regenerating the kubelet certificates:

      mkdir certs; cd certs
      cp /etc/kubernetes/pki/ca.crt ca.pem
      cp /etc/kubernetes/pki/ca.key ca-key.pem
      

      Create the file kubelet-csr.json:

      {
        "CN": "kubernetes",
        "hosts": [
          "127.0.0.1",
          "<node_name>",
          "kubernetes",
          "kubernetes.default",
          "kubernetes.default.svc",
          "kubernetes.default.svc.cluster",
          "kubernetes.default.svc.cluster.local"
        ],
        "key": {
          "algo": "rsa",
          "size": 2048
        },
        "names": [{
          "C": "US",
          "ST": "NY",
          "L": "City",
          "O": "Org",
          "OU": "Unit"
        }]
      }
      

      Create the file ca-config.json:

      {
        "signing": {
          "default": {
            "expiry": "8760h"
          },
          "profiles": {
            "kubernetes": {
              "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
              ],
              "expiry": "8760h"
            }
          }
        }
      }
      

      Update the certificates:

      cfssl gencert -ca=ca.pem -ca-key=ca-key.pem --config=ca-config.json -profile=kubernetes kubelet-csr.json | cfssljson -bare kubelet
      scp kubelet.pem <nodeip>:/var/lib/kubelet/pki/kubelet.crt
      scp kubelet-key.pem <nodeip>:/var/lib/kubelet/pki/kubelet.key
      systemctl restart kubelet
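
      To confirm the kubelet on the node now serves the new certificate, inspect it over TLS (10250 is the kubelet's default secure port):

      openssl s_client -connect <nodeip>:10250 </dev/null 2>/dev/null | openssl x509 -noout -subject -dates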
      
  • Build a custom docker image

    FROM php:5-apache
    COPY index.php /var/www/html/index.php
    RUN chmod a+rx index.php
    

    index.php contains the following:

    <?php
      $x = 0.0001;
      for ($i = 0; $i <= 1000000; $i++) {
          $x += sqrt($x);
      }
    ?>
    
  • Configure a Deployment to run the image and expose the service

    php-apache.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: php-apache
    spec:
      selector:
        matchLabels:
          run: php-apache
      replicas: 1
      template:
        metadata:
          labels:
            run: php-apache
        spec:
          containers:
          - name: php-apache
            #image: k8s.gcr.io/hpa-example
            image: 0layfolk0/hpa-example
            ports:
            - containerPort: 80
            resources:
              limits:
                cpu: 50m
              requests:
                cpu: 20m
    
    ---
    
    apiVersion: v1
    kind: Service
    metadata:
      name: php-apache
      labels:
        run: php-apache
    spec:
      ports:
      - port: 80
      selector:
        run: php-apache
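
    Apply the manifest and make sure both objects exist before creating the HPA (the names come from the manifest above):

    kubectl apply -f php-apache.yaml
    kubectl get deployment,service php-apache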
    
  • Create an HPA

    # Replicas and CPU load take time to change during this experiment; expect a few minutes, not instant updates
    # Create an HPA controlling the Deployment from the previous step, keeping replicas between 1 and 10 at 50% average CPU utilization
    kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
    horizontalpodautoscaler.autoscaling/php-apache autoscaled
    # Check the autoscaler status
    kubectl get hpa
    # In a new terminal, start a container to generate load
    kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
    # Back in the original terminal
    # Watch the CPU load rise
    kubectl get hpa
    # Watch the deployment's replica count grow
    kubectl get deployment php-apache
    # Ctrl+C the load generator in the new terminal, then check again: CPU utilization should drop to 0 and replicas fall back to 1
    kubectl get hpa
    kubectl get deployment php-apache
    

8.3 What is Federation?

  • Kubernetes Federation vs ManageIQ

8.4 How does K8S handle stateful services?
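
  • Primarily through StatefulSet: stable pod identities and ordered rollout, a headless Service for per-pod DNS, and volumeClaimTemplates for one PVC per replica. A minimal sketch (the image and storage size are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      clusterIP: None        # headless: gives each pod a stable DNS name
      selector:
        app: web
      ports:
      - port: 80
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web
    spec:
      serviceName: web       # must reference the headless Service
      replicas: 2
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: nginx
            image: nginx
            volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:  # one PersistentVolumeClaim per replica
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi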

8.5 Other References

  • K8S testing
  • Set up an OpenStack test environment
    • Launch an Ubuntu 18.04 / 20.04 bare-metal server on Vultr

    • Install DevStack

    • Download cloud images

      wget http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-2009.qcow2
      wget https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
      wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img
      
    • Upload the cloud images

      openstack image create --public --disk-format qcow2 --file focal-server-cloudimg-amd64.img ubuntu-20.04
      openstack image create --public --disk-format qcow2 --file bionic-server-cloudimg-amd64.img ubuntu-18.04
      openstack image create --public --disk-format qcow2 --file CentOS-7-x86_64-GenericCloud-2009.qcow2 centos-7.8
      
    • Enable ipip

  • Set up a k3s environment
  • KubeEdge Demo