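Prerequisites

Installing Docker

Official documentation: https://docs.docker.com/get-docker/
Related tutorial: https://yeasy.gitbook.io/docker_practice/install/mirror

Disabling swap

Kubernetes requires swap to be disabled. If it is left on, initialization fails with an error.

Run the following on every host: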

sudo swapoff -a
# Edit /etc/fstab and comment out the swap line so the change survives a reboot
sudo vi /etc/fstab
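
The manual edit of /etc/fstab can also be scripted. The following is a minimal sketch, assuming GNU sed and an fstab whose swap entries carry swap as a whitespace-delimited field. The function name comment_out_swap is made up for this example.

```shell
# Comment out active swap entries in an fstab-style file.
# $1: path of the file to edit; the original is kept as "$1.bak".
# Assumes GNU sed (-i with a suffix, -E for extended regexes).
comment_out_swap() {
  sed -E -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Example on a real host (needs root, so the sed command is run directly):
#   sudo sed -E -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab
#   swapon --show    # prints nothing once no swap is active
```

Lines that are already commented out are left alone, so running it twice does no harm.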

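Disabling the firewall and SELinux

On Ubuntu, ufw status shows the firewall state. On Ubuntu 20.04 both are off by default, so nothing needs to be changed.

Hostname and hosts configuration

This step is optional, but it is recommended because it makes the machines easier to identify and manage.

Set the hostname on each host respectively: k8s-master, k8s-node01, k8s-node02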

sudo vim /etc/hostname

hosts configuration

sudo vim /etc/hosts
# Add the following entries
192.168.152.100 k8s-master
192.168.152.101 k8s-node01
192.168.152.102 k8s-node02
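
The same entries can be added without an editor. A minimal sketch, assuming the IP addresses and hostnames listed above; the function name add_host_entry is hypothetical, and it only appends an entry when the hostname is not already present.

```shell
# Append "IP NAME" to a hosts-style file unless NAME already has an entry.
# $1: file, $2: IP address, $3: hostname
add_host_entry() {
  if ! grep -Eq "[[:space:]]$3([[:space:]]|\$)" "$1"; then
    printf '%s %s\n' "$2" "$3" >> "$1"
  fi
}

# Example from a root shell on a real host:
#   hostnamectl set-hostname k8s-master    # alternative to editing /etc/hostname
#   add_host_entry /etc/hosts 192.168.152.100 k8s-master
#   add_host_entry /etc/hosts 192.168.152.101 k8s-node01
#   add_host_entry /etc/hosts 192.168.152.102 k8s-node02
```

Shell functions are not visible to sudo, which is why the example assumes a root shell.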

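Installing the components

Changing the default cgroup driver

To avoid a series of errors during initialization, check that Docker and the kubelet use the same cgroup driver. If they differ, the kubelet cannot start and initialization fails. Depending on the version, Docker may use cgroupfs while the kubelet defaults to systemd, so Docker's driver has to be changed.

Check Docker's driver: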

sudo docker info|grep Driver
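
The grep above also matches the storage and logging drivers. A small sketch that isolates the cgroup driver, assuming the "Cgroup Driver:" label that docker info prints; the function name is hypothetical.

```shell
# Read `docker info` output on stdin and print only the cgroup driver name.
cgroup_driver_from_info() {
  awk -F': ' '/Cgroup Driver:/ { print $2 }'
}

# Example on a real host:
#   sudo docker info 2>/dev/null | cgroup_driver_from_info
```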

Change Docker's driver

# Create or edit the file
sudo vim /etc/docker/daemon.json
# Add the following content
{
"exec-opts": ["native.cgroupdriver=systemd"]
}

Restart Docker

sudo systemctl restart docker.service
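
The driver change can also be made without an interactive editor. This is a hedged sketch: the function name write_cgroup_driver_config is invented for this example, and it deliberately refuses to overwrite an existing file, because a real daemon.json may already hold other settings that would have to be merged by hand.

```shell
# Write a daemon.json that sets the systemd cgroup driver.
# $1: target path; fails without touching the file if it already exists.
write_cgroup_driver_config() {
  target=$1
  if [ -e "$target" ]; then
    echo "refusing to overwrite existing $target" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Example from a root shell on a real host:
#   write_cgroup_driver_config /etc/docker/daemon.json
#   systemctl restart docker.service
#   docker info | grep 'Cgroup Driver'
```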

Updating apt

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl

Adding the Kubernetes repository

Outside mainland China: download the Google Cloud public signing key:

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"

Inside mainland China: the Aliyun mirror can be used instead:

curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
sudo tee /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

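Installation

Pin the versions to avoid incompatibilities. For example, a kubelet at version 1.7.0 is fully compatible with an API server at version 1.8.0, but not the other way round.

  • kubeadm: the command used to initialize the cluster.
  • kubelet: runs on every node in the cluster and starts Pods and containers.
  • kubectl: the command-line tool used to talk to the cluster.

sudo apt-get update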
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
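
apt-mark hold freezes whatever version was installed, but the install command takes the newest package the repository offers. To get one specific version on every node, the version can be passed explicitly. A sketch, assuming the package version string 1.22.3-00 (inferred from the v1.22.3 shown in this guide's node listings, not stated by it); apt-cache madison kubeadm lists the versions a repository actually provides. The helper name is hypothetical.

```shell
# Print apt package specs that pin all three components to one version.
# $1: package version string, e.g. 1.22.3-00 (an assumed example value)
pinned_packages() {
  printf 'kubelet=%s kubeadm=%s kubectl=%s\n' "$1" "$1" "$1"
}

# Example on a real host:
#   apt-cache madison kubeadm
#   sudo apt-get install -y $(pinned_packages 1.22.3-00)
#   sudo apt-mark hold kubelet kubeadm kubectl
```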

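Initializing the master node

This step only needs to be run on the master.

sudo kubeadm init

When initialization completes, the output ends like this: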

[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.152.100:6443 --token tojkw9.v4nageqhftd7v2vc \
--discovery-token-ca-cert-hash sha256:6a5b372144d6cc2a12f8e41853554549f1290e665381f48fdd92bbf92de7b884

Depending on which user you are, run one of the following.

root user

export KUBECONFIG=/etc/kubernetes/admin.conf

Regular user

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Finally, run the join command printed by init above on each node machine.

Network setup

curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml

After installation, kubectl get node shows the node status. A change from NotReady to Ready means the network is working; this can take a few minutes.

# Before the network plugin is installed
ubuntu@k8s-master:~$ kubectl get node
NAME         STATUS     ROLES                  AGE   VERSION
k8s-master   NotReady   control-plane,master   80m   v1.22.3
# After the network plugin is installed
ubuntu@k8s-master:~$ kubectl get node
NAME         STATUS   ROLES                  AGE   VERSION
k8s-master   Ready    control-plane,master   83m   v1.22.3
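
Instead of re-running the command by hand, the check can be scripted. A sketch that reads kubectl get node output and lists the nodes whose STATUS column is not exactly Ready; the function name is made up for this example.

```shell
# Read `kubectl get node` output on stdin; print names of nodes not yet Ready.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example on the master:
#   kubectl get node | not_ready_nodes
#   kubectl wait --for=condition=Ready node --all --timeout=300s
```

An empty result means every listed node reports Ready.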

Common errors

Problem 1

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.

Cause: the kubelet cannot start. Use journalctl -xe to see the startup error.

journalctl -xe
# The log shows that Docker and the kubelet use different cgroup drivers
kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""

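Solution: Kubernetes recommends the systemd driver, so change Docker's driver. Edit /etc/docker/daemon.json (create it if it does not exist) and add the following option:

# Create or edit the file
sudo vim /etc/docker/daemon.json
# Add the following content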
{
"exec-opts": ["native.cgroupdriver=systemd"]
}

Restart Docker and the kubelet

# Restart Docker
sudo systemctl restart docker.service
# Reload the systemd unit files
sudo systemctl daemon-reload
# Restart the kubelet
sudo systemctl restart kubelet.service
# Check that the kubelet service is healthy again
sudo systemctl status kubelet.service
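
To confirm that both sides agree after the restart, the kubelet's configured driver can be read from its configuration file. A sketch, assuming the file kubeadm writes (/var/lib/kubelet/config.yaml) contains a top-level cgroupDriver key; if the key is absent the function prints nothing. The function name is hypothetical.

```shell
# Read a kubelet configuration file on stdin and print its cgroupDriver value.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" { print $2 }'
}

# Example on a real host:
#   sudo cat /var/lib/kubelet/config.yaml | kubelet_cgroup_driver
#   sudo docker info | grep 'Cgroup Driver'    # should name the same driver
```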

Problem 2

error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Cause: these files were generated by a previous initialization. Delete them before initializing again.

rm -fr /etc/kubernetes/manifests/*

Problem 3

error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-10250]: Port 10250 is in use

Solution: reset the configuration

sudo kubeadm reset
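
Before resetting, it can help to see which process is holding the port. A sketch that filters ss -ltnp output for a listening port, assuming the usual column layout in which the fourth field is the local address and port; the function name is hypothetical.

```shell
# Read `ss -ltnp` output on stdin; print the lines listening on the given port.
# $1: port number
listeners_on_port() {
  awk -v port="$1" '$4 ~ (":" port "$")'
}

# Example on a real host:
#   sudo ss -ltnp | listeners_on_port 10250
```

If the line names kubelet, a previous initialization is still running and a reset is the appropriate fix.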

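Joining worker nodes

Run this on every node machine, after kubelet, kubeadm, and kubectl have been installed on all of them. Join with the command printed at the end of the master initialization, and remember to run it as root.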

kubeadm join 192.168.152.100:6443 --token tojkw9.v4nageqhftd7v2vc \
--discovery-token-ca-cert-hash sha256:6a5b372144d6cc2a12f8e41853554549f1290e665381f48fdd92bbf92de7b884
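
The token and hash are easy to truncate when copying from a terminal. A sketch of a sanity check, assuming the usual kubeadm formats: a token of 6 and 16 lowercase alphanumeric characters separated by a dot, and a sha256: prefix followed by 64 hex digits. The function name is made up for this example.

```shell
# Succeed only if the token and CA cert hash have the expected shape.
# $1: bootstrap token, $2: discovery token CA cert hash
valid_join_args() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' &&
  printf '%s\n' "$2" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Example:
#   valid_join_args "$TOKEN" "$HASH" || echo "join arguments look truncated"
# If the token has expired, a fresh join command can be printed on the master:
#   sudo kubeadm token create --print-join-command
```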

When it finishes, the following output is printed:

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

On the master node, run kubectl get node to see all the nodes that have joined.

ubuntu@k8s-master:~$ kubectl get node
NAME         STATUS   ROLES                  AGE     VERSION
k8s-master   Ready    control-plane,master   16h     v1.22.3
k8s-node01   Ready    <none>                 24m     v1.22.3
k8s-node02   Ready    <none>                 6m54s   v1.22.3

Common problems

Problem 1

[root@k8snode1 kubernetes]# kubectl get pod
The connection to the server localhost:8080 was refused - did you specify the right host or port?

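Cause: kubectl on this node has no admin credentials configured, so it falls back to localhost:8080. It needs the kubernetes-admin kubeconfig to reach the cluster.

Solution: copy /etc/kubernetes/admin.conf from the master node to the same directory on the worker node, then set the environment variable: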

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
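
Appending with >> adds a duplicate line every time the command is repeated. A sketch of an idempotent variant; the function name append_once is hypothetical.

```shell
# Append a line to a file only if that exact line is not already present.
# $1: file, $2: line
append_once() {
  touch "$1"
  grep -Fxq -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example:
#   append_once ~/.bash_profile 'export KUBECONFIG=/etc/kubernetes/admin.conf'
#   source ~/.bash_profile
```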

Problem 2

[ERROR CRI]: container runtime is not running: output: time="2020-09-24T11:49:16Z" level=fatal msg="getting status of runtime failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"

Solution:

rm /etc/containerd/config.toml
systemctl restart containerd
kubeadm init
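
Removing the file makes containerd fall back to its built-in defaults. One common cause of this error (an assumption here, the guide does not state it) is a packaged config.toml that lists cri under disabled_plugins. A sketch that checks for that line before anything is removed; the function name is hypothetical.

```shell
# Succeed if a containerd config file disables the CRI plugin.
# $1: path to config.toml
cri_disabled() {
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$1"
}

# Example on a real host:
#   cri_disabled /etc/containerd/config.toml && echo "CRI plugin is disabled"
#   sudo mv /etc/containerd/config.toml /etc/containerd/config.toml.bak    # keeps a backup instead of rm
#   sudo systemctl restart containerd
```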