I'm currently doing an internship to finish my bachelor's degree, and my project is building a CI/CD pipeline, so I'm fairly new to DevOps and would really appreciate it if someone could spare some time to help me. After setting up 3 Ubuntu server instances on VMware using a NAT network, here are the commands I ran to set up a Kubernetes cluster with 1 master node and 2 worker nodes using kubeadm:
sudo apt install docker.io -y
sudo swapoff -a
sudo nano /etc/fstab
# commented out the swap line here
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
sudo systemctl enable --now kubelet
# then, on my master node, I ran kubeadm init, did the kubeconfig export thing, and joined the worker nodes using the command the master gave me
sudo kubeadm init
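For completeness, the "export config thing" after kubeadm init was along these lines (a sketch of the standard kubeadm post-init steps; the token and hash below are placeholders that your own kubeadm init output prints):

```shell
# On the master: make kubectl talk to the new cluster
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# On each worker: paste the join command printed by kubeadm init, e.g.
# sudo kubeadm join 192.168.149.141:6443 --token <token> \
#     --discovery-token-ca-cert-hash sha256:<hash>
```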
Everything worked fine for the first 10 minutes: I joined the nodes, installed Calico, and so on. Then, whenever I try to run "kubectl get nodes", I get this error:
The connection to the server 192.168.149.141:6443 was refused - did you specify the right host or port?
I tried restarting the docker and kubelet services, which fixed it for another 10 minutes or so before it broke again. Another thing I noticed is that some pods keep restarting, showing CrashLoopBackOff. It's not a performance issue: after I installed metrics, CPU and RAM usage was below 40% on all nodes.
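This is how I pulled the kubelet and API server logs on the master, in case it helps (a sketch; the container ID is a placeholder taken from the crictl ps output):

```shell
# kubelet runs as a systemd unit, so its log is in the journal
sudo journalctl -u kubelet --since "15 min ago" --no-pager | tail -n 100

# the API server runs as a static pod; find its container and dump its log
sudo crictl ps -a | grep kube-apiserver
sudo crictl logs <container-id>   # <container-id> from the line above
```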
Here is the kubelet log:

And my kube-apiserver.yaml:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.149.141:6443
  creationTimestamp: null
  labels:
    component: kube-apiserver
    tier: control-plane
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=192.168.149.141
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: registry.k8s.io/kube-apiserver:v1.29.3
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 192.168.149.141
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: kube-apiserver
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: 192.168.149.141
        path: /readyz
        port: 6443
        scheme: HTTPS
      periodSeconds: 1
      timeoutSeconds: 15
    resources:
      requests:
        cpu: 250m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 192.168.149.141
        path: /livez
        port: 6443
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}
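(Side note: the livez/readyz endpoints from the probes in that manifest can also be checked by hand with curl; -k skips TLS verification, since the API server's serving cert isn't in the system trust store.)

```shell
# Same checks the kubelet's probes perform against the static pod
curl -k https://192.168.149.141:6443/livez?verbose
curl -k https://192.168.149.141:6443/readyz?verbose
```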
I'm happy to provide more information.
2 Answers
After more research and checking, it turned out that Kubernetes removed support for Docker as a container runtime, so I re-initialized my cluster with CRI-O and everything works fine. I followed this blog for the installation:
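For anyone following along, the migration looks roughly like this (a sketch assuming CRI-O v1.29 from the pkgs.k8s.io addons repo, matched to the kubelet version; check the blog or the official docs for your versions):

```shell
# Add the CRI-O apt repo and install it
curl -fsSL https://pkgs.k8s.io/addons:/cri-o:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/cri-o-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/cri-o-apt-keyring.gpg] https://pkgs.k8s.io/addons:/cri-o:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/cri-o.list
sudo apt-get update
sudo apt-get install -y cri-o
sudo systemctl enable --now crio

# Tear down the old cluster and re-init against the CRI-O socket
sudo kubeadm reset -f
sudo kubeadm init --cri-socket=unix:///var/run/crio/crio.sock
```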
You can try the following steps to narrow down which component is causing the problem:

- Log in to the node (192.168.149.141) and try to connect to the API server from localhost:6443. This should help you verify whether the API server itself is broken or whether it is network-related.
- Start an nc -v -l 50000 process on the node (192.168.149.141), connect to it from another node via nc -v 192.168.149.141 50000, and try sending messages to see whether the remote node receives them. If it is a network-related problem, this will help you determine whether it is caused by the NAT (VMware) or by routing.
- While you try to connect, check the API server's logs to see whether they offer any useful information.
If you are impatient, you can try Rancher (or RKE) to deploy the cluster instead of kubeadm; it is more beginner-friendly.
Hi, thanks for trying to help me. After reading several articles I found the problem was the Docker runtime, since Kubernetes has deprecated it. I tried cri-dockerd but it didn't help, so I migrated to CRI-O and everything seems to be back to normal. Thanks a lot.