CNI Network Plugins

1. Flannel

Flannel is one of the earliest and most classic container network plugins (CNI) in the Kubernetes ecosystem, developed by the CoreOS team. Its core goal is to give all Pods in a cluster a single, unified layer-3 (IP-layer) virtual network, so that Pods on different nodes can talk to each other directly by IP address. Flannel supports several modes; the two most commonly used are VXLAN and host-gw.

1.1. VXLAN Mode

How it works:

  • VXLAN mode builds a large virtual layer-2 network; the underlying physical network may be connected at layer 2 or layer 3.
  • VXLAN is Flannel's default and recommended mode. Installed with the default configuration, Flannel allocates each physical node a /24 subnet, and the Pods on that node draw their addresses from it; every node gets a different subnet (for example, node 1 gets 10.244.0.0/24 and node 2 gets 10.244.1.0/24). Flannel also creates two virtual network devices on each node: cni0 and flannel.1.
  • cni0 is a bridge device, similar to docker0; every Pod on the node is attached to cni0 through a veth pair.
# Inside the container: find the ifindex of the host-side veth peer of eth0
cat /sys/class/net/eth0/iflink  # suppose the result is 61

# Then on the host, run:
ip link show # a veth device with index 61 will show up
Bash
  • flannel.1 is a VXLAN-type device that acts as the VTEP, performing VXLAN encapsulation and decapsulation. The .1 suffix means the VNI used for encapsulation is 1, i.e., in Flannel's VXLAN mode all Pods live in a single layer-2 network, as the device inspection below shows.
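
A quick way to confirm the VTEP role and the VNI is to inspect the device (a sketch; the local IP shown is an example, and 8472 is the Linux VXLAN UDP port Flannel uses by default):

# Show VXLAN details of the flannel.1 device
ip -d link show flannel.1
# 4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 ...
#     vxlan id 1 local 192.168.1.11 dev eth0 srcport 0 0 dstport 8472 nolearning ...
Bash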

Intra-node communication:

Pod-to-Pod traffic within a node is handled entirely by the cni0 bridge and involves no VXLAN encapsulation or decapsulation. For example, with Node1's subnet being 10.244.0.0/24, PodA (10.244.0.20) and PodB (10.244.0.21) reach each other through the cni0 bridge.

Cross-node communication:

  • Sender side: PodA runs ping 10.244.1.21; the ICMP packet passes through the cni0 bridge and is handed to the flannel.1 device. Since flannel.1 is a VXLAN-type device responsible for encapsulation and decapsulation, on the sender it wraps the original L2 frame into a VXLAN UDP packet and sends it out through the physical NIC eth0.
  • Receiver side: Node2 receives the UDP packet, recognizes it as a VXLAN packet, and hands it to flannel.1 for decapsulation. Based on the destination IP of the inner packet, the original frame is delivered to PodB via the cni0 bridge. The capture sketch below shows this exchange on the wire.
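
To watch the encapsulation happen, capture the VXLAN UDP traffic on the physical NIC while the ping is running (a sketch; eth0 and port 8472, the Linux VXLAN port Flannel uses by default, are assumptions about the environment):

# On Node1: outer UDP packets carrying the inner ICMP between the Pods
tcpdump -i eth0 -nn udp port 8472
Bash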

flanneld learns every node's subnet from etcd and configures routes on each node accordingly: subnets belonging to other nodes are routed to flannel.1, while the local subnet is routed to the cni0 bridge. Whenever node information changes, flanneld updates the routes in step.

Summary of Flannel's VXLAN Mode

The heart of Flannel's VXLAN mode is three tables, maintained by the flanneld process, which fetches the Pod network information of all nodes from etcd and writes it into them (an illustrative set of entries follows the list):

  • Route table (route -n):
    • The local subnet is handed to cni0.
    • Other subnets are handed to flannel.1.
  • ARP table (ip neigh):
    • Maps the gateway IP of a target subnet to the MAC address of that node's flannel.1 device.
  • FDB table (bridge fdb show):
    • Maps a remote flannel.1 MAC address to the IP of the physical host it lives on.
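
A sketch of what the three entries might look like on Node1 for the remote subnet 10.244.1.0/24 (all IP and MAC addresses below are made-up examples; the PERMANENT/permanent flags mark entries written statically by flanneld rather than learned):

# 1. Route table: the remote subnet is sent to flannel.1
ip route show | grep flannel.1
# 10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink

# 2. ARP table: the remote gateway IP maps to the remote flannel.1 MAC
ip neigh show dev flannel.1
# 10.244.1.0 lladdr 9a:3e:aa:f9:12:7b PERMANENT

# 3. FDB table: that MAC maps to Node2's physical IP (the VXLAN outer destination)
bridge fdb show dev flannel.1
# 9a:3e:aa:f9:12:7b dst 192.168.1.12 self permanent
Bash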

Pros and cons of VXLAN mode:

  • Pros: the physical network can be layer 2 or layer 3, so the mode is highly portable.
  • Cons: every packet pays encapsulation and decapsulation overhead, so forwarding efficiency is lower, which becomes noticeable at large cluster scale.

1.2. host-gw Mode

host-gw mode forwards traffic purely via the route table; its core in one sentence: turn every physical node into a router.

How it works:

  • The core of host-gw is that the flanneld daemon configures route entries on every physical node so Pods can communicate across hosts; every node acts as a "gateway" for container traffic, which is exactly what "host-gw" means.
  • The Flannel subnet and host information is stored in etcd; flanneld only needs to WATCH for data changes and update the route table in real time.
  • host-gw mode strictly requires that the cluster hosts be on the same LAN (layer-2 reachable). A typical route entry is shown below.
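
For example, the route that flanneld would install on Node1 for Node2's Pod subnet looks like this (node IPs are examples); the next hop is simply Node2's own address, which must be directly reachable at layer 2:

10.244.1.0/24 via 192.168.1.12 dev eth0
Bash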

To use host-gw mode, edit the ConfigMap kube-flannel-cfg to change Backend.Type from vxlan to host-gw, then restart all kube-flannel Pods, as sketched below.
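
Assuming the manifest from the deployment example below (which labels the Pods with app: flannel), the switch can be sketched as:

# Change "Type": "vxlan" to "Type": "host-gw" in net-conf.json
kubectl -n kube-flannel edit configmap kube-flannel-cfg

# Recreate the DaemonSet Pods so flanneld picks up the new backend
kubectl -n kube-flannel delete pod -l app=flannel
Bash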

Communication flow in host-gw mode:

The packet leaves the Pod through its veth pair onto the host. The host's route table forwards it with the destination node's own IP as the next hop, directly across the local layer-2 network; the destination node then delivers it into the target Pod via its local route. No encapsulation is involved at any step.

Pros and cons of host-gw mode:

  • Pros: a shorter forwarding path with no encapsulation, so forwarding is more efficient, which pays off at large cluster scale.
  • Cons:
    • The physical network must be a layer-2 network.
    • Some cloud platforms restrict adding route entries to cloud hosts, which makes this mode unusable there.

1.3. Deployment Example

Kubernetes control-plane configuration:

The Kubernetes control plane must be set up to allocate Pod addresses: pass the startup flag --cluster-cidr=10.244.0.0/16 to kube-controller-manager and kube-proxy. This Pod IP CIDR pool must match Flannel's configuration, as sketched below.
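
For a kubeadm-style control plane, this can be sketched as flags in the static Pod manifests (--allocate-node-cidrs is needed so kube-controller-manager assigns each node a podCIDR, which flanneld consumes via --kube-subnet-mgr):

# kube-controller-manager flags (excerpt)
--allocate-node-cidrs=true
--cluster-cidr=10.244.0.0/16

# kube-proxy flag (or clusterCIDR in its config file)
--cluster-cidr=10.244.0.0/16
Bash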

Install Flannel:

In a Kubernetes cluster, Flannel runs as a DaemonSet. A reference manifest (kube-flannel.yaml) follows.

# kube-flannel.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
# net-conf.json configures the Pod IP CIDR pool and sets the Flannel backend mode
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-ds
  namespace: kube-flannel
spec:
  selector:
    matchLabels:
      app: flannel
      k8s-app: flannel
  template:
    metadata:
      labels:
        app: flannel
        k8s-app: flannel
        tier: node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - args:
        - --ip-masq
        - --kube-subnet-mgr
        command:
        - /opt/bin/flanneld
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        image: docker.io/flannel/flannel:v0.23.0
        imagePullPolicy: IfNotPresent
        name: kube-flannel
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
        volumeMounts:
        - mountPath: /run/flannel
          name: run
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
        - mountPath: /run/xtables.lock
          name: xtables-lock
      hostNetwork: true
      initContainers:
      - args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        command:
        - cp
        image: docker.io/flannel/flannel-cni-plugin:v1.2.0
        imagePullPolicy: IfNotPresent
        name: install-cni-plugin
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-plugin
      - args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        command:
        - cp
        image: docker.io/flannel/flannel:v0.23.0
        imagePullPolicy: IfNotPresent
        name: install-cni
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
      priorityClassName: system-node-critical
      serviceAccountName: flannel
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /run/flannel
        name: run
      - hostPath:
          path: /opt/cni/bin
        name: cni-plugin
      - hostPath:
          path: /etc/cni/net.d
        name: cni
      - configMap:
          name: kube-flannel-cfg
        name: flannel-cfg
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
YAML
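
Applying the manifest and verifying the rollout (a minimal sketch):

kubectl apply -f kube-flannel.yaml

# One kube-flannel Pod per node should reach Running
kubectl -n kube-flannel get pods -o wide

# Nodes become Ready once the CNI config is installed under /etc/cni/net.d
kubectl get nodes
Bash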

2. Calico

2.1. BGP Mode

2.1.1. Overview

Calico's BGP mode is almost identical to Flannel's host-gw mode. It likewise adds route rules of the following form on each host:

<destination Pod CIDR> via <gateway IP> dev eth0

# The gateway IP is the IP of the host where the destination container runs.
Bash

Unlike Flannel, which maintains routing information via etcd and flanneld, Calico uses BGP to distribute routing information across the whole cluster automatically.

On every Node, Calico uses the Linux kernel to implement an efficient vRouter responsible for data forwarding. Each vRouter advertises the routes of the containers running on its Node to the whole Calico network over BGP and automatically installs forwarding rules toward the other Nodes. Calico ensures that all traffic between containers is interconnected purely via IP routing.

2.1.2. The Three Main Components of BGP Mode

  • Calico's CNI plugin: the component that integrates Calico with Kubernetes.
  • Felix: a daemon (deployed as a DaemonSet in Kubernetes) running as an agent on each host; it installs route entries on the host (i.e., writes to the Linux kernel's FIB forwarding table) and maintains the network devices Calico needs.
  • BIRD: the BGP client; it reads the routes that Felix has programmed into the kernel and distributes them to the other nodes in the cluster.

Besides the way routing information is maintained, another difference between Calico and Flannel's host-gw mode is that Calico creates no bridge device on the host. Calico works as follows:

  • Calico's CNI plugin sets up a veth pair device for each container and places one end on the host.
    • Since Calico does not use CNI's bridge mode, its CNI plugin must also install a host route for each container's veth device so that inbound IP packets can be delivered. For example, the rule for Container 4 on host Node 2 looks like: 10.233.2.3 dev cali5863f3 scope link
  • IP packets sent by a container appear on the host via the veth pair; the host forwards them to the correct gateway based on the next-hop IP in its route table.
  • The core "next hop" routes are maintained by Calico's Felix process, while the routing information itself arrives over BGP via the BGP client, i.e., the BIRD component (see the sketch after this list).
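
Putting it together, the route table on a Calico node in BGP mode might look like this (addresses and interface names are illustrative; proto bird marks routes learned over BGP):

ip route show
# Local workloads: one /32 scope-link route per container veth, installed by the CNI plugin
# 10.233.2.3 dev cali5863f3 scope link
# 10.233.2.4 dev cali8d91ab scope link
# Remote workloads: per-node subnets learned from BIRD, next hop is the peer node itself
# 10.233.1.0/24 via 192.168.1.11 dev eth0 proto bird
Bash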

Calico effectively treats all nodes in the cluster as edge routers that together form a fully connected network, exchanging routes with one another over BGP. These nodes are called BGP Peers.

2.1.3. BGP Route Reflector

By default, Calico's network runs in a mode called "Node-to-Node Mesh": the BGP client on each host must talk to the BGP clients on every other node to exchange routes. As the number of nodes N grows, the number of connections grows on the order of N², putting enormous pressure on the cluster network. Larger clusters therefore use the Route Reflector mode.

In this mode, Calico designates one or a few dedicated nodes that establish BGP sessions with all nodes and thereby learn the global routes. All other nodes only need to exchange routes with these few reflectors to obtain the routing information of the entire cluster, as sketched below.
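
A minimal sketch of a Route Reflector setup with calicoctl (the node name, the label, and the cluster ID are example values; BGPPeer and BGPConfiguration are standard Calico v3 resources):

# 1. Give the chosen node a route-reflector cluster ID and a marker label
calicoctl patch node k8s-node-01 \
  -p '{"spec": {"bgp": {"routeReflectorClusterID": "224.0.0.1"}}}'
kubectl label node k8s-node-01 route-reflector=true

# 2. Peer every node with the reflectors, then turn off the default full mesh
calicoctl apply -f - <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-to-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'
---
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false
EOF
Bash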

2.2. IPIP Tunnel Mode

Calico's BGP mode, like Flannel's host-gw, requires layer-2 reachability. In real deployments, however, the cluster network is often split across different subnets, i.e., the physical network is only layer-3 reachable; in that case Calico's IPIP mode must be enabled.

In IPIP mode, Calico uses the tunl0 device, an IP-in-IP tunnel device that takes over outgoing Pod IP packets. tunl0 encapsulates each of them inside a host IP packet, so the new packet can be forwarded by the routers between the hosts.

Like VXLAN, this is in essence protocol encapsulation: the underlying physical network can be layer 2 or layer 3. It is slightly more efficient than VXLAN, though operationally more involved, so a general understanding is enough. The sketch below shows what IPIP looks like on a node.
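
Roughly (output abbreviated; addresses are examples, while tunl0 and proto bird are what Calico actually programs):

# The IP-in-IP tunnel device created by Calico
ip -d link show tunl0
# 5: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 ... ipip any remote any local any ...

# Remote Pod subnets are routed into the tunnel instead of straight out eth0
ip route show | grep tunl0
# 10.244.1.0/24 via 192.168.2.12 dev tunl0 proto bird onlink
Bash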

2.3. Deployment Example

Deploying Calico in Kubernetes mainly involves calico-node and the calico policy controller. The resource objects and configuration required are:

  • ConfigMap calico-config: contains the configuration parameters Calico needs.
  • The calico/node container running on every Node, deployed as a DaemonSet.
  • The Calico CNI binaries installed on every Node together with the network configuration (done by the install-cni container).
  • A Deployment named "calico/kube-policy-controller" that implements the NetworkPolicy objects defined for Pods in the Kubernetes cluster.

Download the YAML manifest from the official repository:

curl https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/calico.yaml -O
Bash

The main parts of the configuration are described below:

  • ConfigMap configuration
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Typha is disabled.
  # An explicit signal that this deployment does not include the Typha service: Calico components
  # will bypass Typha and talk directly to etcd (or to the Kubernetes API server, when Kubernetes
  # is used as the datastore).
  # Typha is a lightweight proxy in the Calico project; its main purpose is to reduce the number
  # of connections to the backend datastore.
  typha_service_name: "none"
  # Configure the backend to use.
  calico_backend: "bird"

  # Configure the MTU to use for workload interfaces and tunnels.
  # By default, MTU is auto-detected, and explicitly setting this field should not be required.
  # You can override auto-detection by providing a non-zero value.
  veth_mtu: "0"

  # The CNI network configuration to install on each node. The special
  # values in this config will be automatically populated.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "info",
          "log_file_path": "/var/log/calico/cni/cni.log",
          "datastore_type": "kubernetes",
          "nodename": "__KUBERNETES_NODE_NAME__",
          "mtu": __CNI_MTU__,
          "ipam": {
              "type": "calico-ipam"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "__KUBECONFIG_FILEPATH__"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        }
      ]
    }
YAML
  • calico-node
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      hostNetwork: true
      tolerations:
        # Make sure calico-node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      priorityClassName: system-node-critical
      initContainers:
        # This container performs upgrade from host-local IPAM to calico-ipam.
        # It can be deleted if this is a fresh installation, or if you have already
        # upgraded to use calico-ipam.
        - name: upgrade-ipam
          image: docker.io/calico/cni:v3.30.3
          imagePullPolicy: IfNotPresent
          command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
          envFrom:
          - configMapRef:
              # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
              name: kubernetes-services-endpoint
              optional: true
          env:
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
          volumeMounts:
            - mountPath: /var/lib/cni/networks
              name: host-local-net-dir
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
          securityContext:
            privileged: true
        # This container installs the CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: docker.io/calico/cni:v3.30.3
          imagePullPolicy: IfNotPresent
          command: ["/opt/cni/bin/install"]
          envFrom:
          - configMapRef:
              # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
              name: kubernetes-services-endpoint
              optional: true
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
            # Set the hostname based on the k8s node name.
            - name: KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # CNI MTU Config variable
            - name: CNI_MTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Prevents the container from sleeping forever.
            - name: SLEEP
              value: "false"
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
          securityContext:
            privileged: true
        # This init container mounts the necessary filesystems needed by the BPF data plane
        # i.e. bpf at /sys/fs/bpf and cgroup2 at /run/calico/cgroup. Calico-node initialisation is executed
        # in best effort fashion, i.e. no failure for errors, to not disrupt pod creation in iptable mode.
        - name: "mount-bpffs"
          image: docker.io/calico/node:v3.30.3
          imagePullPolicy: IfNotPresent
          command: ["calico-node", "-init", "-best-effort"]
          volumeMounts:
            - mountPath: /sys/fs
              name: sys-fs
              # Bidirectional is required to ensure that the new mount we make at /sys/fs/bpf propagates to the host
              # so that it outlives the init container.
              mountPropagation: Bidirectional
            - mountPath: /var/run/calico
              name: var-run-calico
              # Bidirectional is required to ensure that the new mount we make at /run/calico/cgroup propagates to the host
              # so that it outlives the init container.
              mountPropagation: Bidirectional
            # Mount /proc/ from host which usually is an init program at /nodeproc. It's needed by mountns binary,
            # executed by calico-node, to mount root cgroup2 fs at /run/calico/cgroup to attach CTLB programs correctly.
            - mountPath: /nodeproc
              name: nodeproc
              readOnly: true
          securityContext:
            privileged: true
      containers:
        # Runs calico-node container on each Kubernetes node. This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: docker.io/calico/node:v3.30.3
          imagePullPolicy: IfNotPresent
          envFrom:
          - configMapRef:
              # Allow KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT to be overridden for eBPF mode.
              name: kubernetes-services-endpoint
              optional: true
          env:
            # Use Kubernetes API as the backing datastore.
            - name: DATASTORE_TYPE
              value: "kubernetes"
            # Wait for the datastore.
            - name: WAIT_FOR_DATASTORE
              value: "true"
            # Set based on the k8s node name.
            - name: NODENAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Modified: set to "Never" to disable IPIP (pure BGP mode)
            - name: CALICO_IPV4POOL_IPIP
              value: "Never"
            # Enable or Disable VXLAN on the default IP pool.
            - name: CALICO_IPV4POOL_VXLAN
              value: "Never"
            # Enable or Disable VXLAN on the default IPv6 IP pool.
            - name: CALICO_IPV6POOL_VXLAN
              value: "Never"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Set MTU for the VXLAN tunnel device.
            - name: FELIX_VXLANMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Set MTU for the Wireguard tunnel device.
            - name: FELIX_WIREGUARDMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # Disable IPv6 on Kubernetes.
            - name: FELIX_IPV6SUPPORT
              value: "false"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          lifecycle:
            preStop:
              exec:
                command:
                - /bin/calico-node
                - -shutdown
          livenessProbe:
            exec:
              command:
              - /bin/calico-node
              - -felix-live
              - -bird-live
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /bin/calico-node
              - -felix-ready
              - -bird-ready
            periodSeconds: 10
            timeoutSeconds: 10
          volumeMounts:
            # For maintaining CNI plugin API credentials.
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
              readOnly: false
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - name: policysync
              mountPath: /var/run/nodeagent
            # For eBPF mode, we need to be able to mount the BPF filesystem at /sys/fs/bpf so we mount in the
            # parent directory.
            - name: bpffs
              mountPath: /sys/fs/bpf
            - name: cni-log-dir
              mountPath: /var/log/calico/cni
              readOnly: true
      volumes:
        # Used by calico-node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
            type: DirectoryOrCreate
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
            type: DirectoryOrCreate
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        - name: sys-fs
          hostPath:
            path: /sys/fs/
            type: DirectoryOrCreate
        - name: bpffs
          hostPath:
            path: /sys/fs/bpf
            type: Directory
        # mount /proc at /nodeproc to be used by mount-bpffs initContainer to mount root cgroup2 fs.
        - name: nodeproc
          hostPath:
            path: /proc
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
            type: DirectoryOrCreate
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Used to access CNI logs.
        - name: cni-log-dir
          hostPath:
            path: /var/log/calico/cni
        # Mount in the directory for host-local IPAM allocations. This is
        # used when upgrading from host-local to calico-ipam, and can be removed
        # if not using the upgrade-ipam init container.
        - name: host-local-net-dir
          hostPath:
            path: /var/lib/cni/networks
        # Used to create per-pod Unix Domain Sockets
        - name: policysync
          hostPath:
            type: DirectoryOrCreate
            path: /var/run/nodeagent
YAML
  • calico-policy-controller
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  selector:
    matchLabels:
      k8s-app: calico-kube-controllers
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
    spec:
      nodeSelector:
        kubernetes.io/os: linux
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      priorityClassName: system-cluster-critical
      containers:
        - name: calico-kube-controllers
          image: docker.io/calico/kube-controllers:v3.30.3
          imagePullPolicy: IfNotPresent
          env:
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: node,loadbalancer
            - name: DATASTORE_TYPE
              value: kubernetes
          livenessProbe:
            exec:
              command:
              - /usr/bin/check-status
              - -l
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /usr/bin/check-status
              - -r
            periodSeconds: 10
          securityContext:
            runAsNonRoot: true
YAML
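
Applying and verifying (a minimal sketch):

kubectl apply -f calico.yaml

# calico-node should be Running on every node
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide

# plus a single calico-kube-controllers replica
kubectl -n kube-system get deployment calico-kube-controllers
Bash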

2.4. Deployment Issues Encountered

The containerd runtime was configured with an HTTP proxy, but the proxy could not forward traffic to cluster IPs. Pod creation failed with:

Events:
  Type     Reason                  Age               From               Message
  ----     ------                  ----              ----               -------
  Normal   Scheduled               86s               default-scheduler  Successfully assigned kube-system/calico-kube-controllers-5c7545f596-lrx25 to k8s-master-01
  Warning  FailedCreatePodSandBox  67s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "9842e0a5f676046bb23fd6dfd96118d228ed3fd0e6c807fa2c57784f71c5cef5": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": net/http: TLS handshake timeout
  Warning  FailedCreatePodSandBox  46s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "77722de246e94cc8fef8ff4861f3c377549042702dccfeed5e1e12b03a895484": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": net/http: TLS handshake timeout
  Warning  FailedCreatePodSandBox  25s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "1ba87a3bd73285cd51a9bcc3ea643ac4406477ed3c9a399d89330c4ee1d42370": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": net/http: TLS handshake timeout
  Warning  FailedCreatePodSandBox  4s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f21b8420c7ef18e0d1406bb71f88d6f6f3000cc72167d69a7d25420b177b48d1": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": net/http: TLS handshake timeout
  Normal   SandboxChanged          3s (x4 over 66s)  kubelet            Pod sandbox changed, it will be killed and re-created.
Bash

Solution: remove the proxy, or make the Pod and cluster IP ranges bypass it.

# /etc/systemd/system/containerd.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://hypervisor-01:20172"
Environment="HTTPS_PROXY=http://hypervisor-01:20172"
# Exclude internal and cluster addresses from the proxy
Environment="NO_PROXY=10.0.0.0/8,192.168.0.0/16,172.16.0.0/12,127.0.0.1,localhost,.svc,.cluster.local"
Bash
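
After editing the drop-in, reload systemd and restart containerd so the new environment takes effect:

systemctl daemon-reload
systemctl restart containerd

# Confirm the environment containerd now runs with
systemctl show --property=Environment containerd
Bash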