Installation Troubleshooting

This article helps you solve problems encountered during installation.

If you run into problems during installation, first confirm that the underlying Kubernetes cluster is healthy. Check the nodes, kubelet, kube-apiserver, kube-controller-manager, kube-scheduler, coredns, and the network plug-in (such as calico or flannel). Because Kubernetes can be deployed in many ways, a component may run as a container in one cluster and directly as a binary in another, so the troubleshooting method differs accordingly. For details, please refer to Kubernetes cluster troubleshooting.
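As a first pass, the node check can be scripted. The sketch below assumes kubectl is already configured for the cluster and that the system components live in the kube-system namespace, which is common but not universal. The helper name not_ready_nodes is hypothetical; it only parses the default column layout of kubectl get nodes (NAME, STATUS, ...).

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the name of every node whose STATUS column is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are listed too, on purpose.
not_ready_nodes() {
  awk 'NF >= 2 && $2 != "Ready" { print $1 }'
}

# Typical use on a machine with cluster access:
#   kubectl get nodes --no-headers | not_ready_nodes
#   kubectl get pods -n kube-system -o wide   # coredns, network plug-in, etc. (assumed namespace)
#   systemctl status kubelet                  # only when kubelet runs as a systemd service
```

An empty result from not_ready_nodes means every node reports Ready; it says nothing about the control-plane components, which still need to be checked separately.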

The kubectl commands below use the default namespace rbd-system. If you installed into a custom namespace, substitute it accordingly.

Cannot Select Gateway Node

The gateway needs ports 80, 443, 6060, 7070, 8443, 10254, 18080, and 18081, so these ports must be free on any node where the gateway is installed. A node on which any of them is already in use is not recognized as a gateway candidate and cannot be searched for or selected. Either move the applications that occupy these ports to other nodes, or reconfigure them to use different ports.
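To see which of these ports are already taken on a candidate node, one option is to filter the listener table. This is a minimal sketch, assuming the ss utility from iproute2 is available on the node; the helper name busy_gateway_ports is hypothetical and expects the default `ss -lnt` column layout, where the fourth column is the local address and port.

```shell
# Ports the gateway needs, as listed in this article.
GATEWAY_PORTS="80 443 6060 7070 8443 10254 18080 18081"

# Hypothetical helper: reads `ss -lnt` output on stdin and prints each
# gateway port that already has a TCP listener (once per port).
busy_gateway_ports() {
  awk -v ports="$GATEWAY_PORTS" '
    BEGIN { n = split(ports, p, " "); for (i = 1; i <= n; i++) want[p[i]] = 1 }
    $1 == "LISTEN" {
      k = split($4, a, ":")        # port is the part after the last colon
      port = a[k]
      if ((port in want) && !(port in seen)) { seen[port] = 1; print port }
    }'
}

# Typical use, run on the candidate gateway node:
#   ss -lnt | busy_gateway_ports
#   ss -lntp | grep ':80 '    # as root, also shows which process owns a port
```

If the helper prints nothing, none of the listed ports has a TCP listener on that node.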

10001 Error

The 10001 error indicates that the service that assigns the default domain name is temporarily unavailable. In that case, turn off Auto assign domain name and enter a custom domain name.

Processing Image Stuck

When image processing appears stuck, the cause may be an error during processing, a network problem, or a docker configuration problem that prevents images from being pulled. Check the operator log to find out which one it is:

kubectl logs -f kato-operator-0 -c operator -n rbd-system
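For a long log, a rough filter can help decide which of the cases described in this section applies. The sketch below is an assumption-heavy illustration: the helper name triage_operator_log is hypothetical, and it assumes that failures contain the word "error" in some letter case, which the operator's actual log format may or may not follow. The "docker pull" and "docker push" markers are the ones this article mentions.

```shell
# Hypothetical helper: reads operator log text on stdin and prints one of
#   error       at least one line mentions "error" (assumed failure marker)
#   processing  no error, but "docker pull" / "docker push" lines are present
#   unknown     neither pattern was found
triage_operator_log() {
  log=$(cat)
  if printf '%s\n' "$log" | grep -qi 'error'; then
    echo "error"
  elif printf '%s\n' "$log" | grep -qE 'docker (pull|push)'; then
    echo "processing"
  else
    echo "unknown"
  fi
}

# Typical use (without -f, so the command terminates):
#   kubectl logs kato-operator-0 -c operator -n rbd-system --tail=200 | triage_operator_log
```

Treat the result as a hint only and read the matching log lines before acting on it.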

If the log contains an error, the processing itself failed. Run the following commands to restart processing of the installation package:

kubectl get katopackage katopackage -o yaml -n rbd-system > rbd-pkg.yaml
kubectl delete -f rbd-pkg.yaml
kubectl create -f rbd-pkg.yaml

If there is no error and the log contains lines such as docker pull xxx or docker push xxx, the image is still being processed, but the network or the docker configuration is making image pulls slow. In that case:

  • Check the network.
  • Check the docker configuration and, if necessary, add a registry mirror such as Alibaba Cloud's image accelerator.
  • Once images can be pulled normally, run the following commands to restart processing of the installation package:
    kubectl get katopackage katopackage -o yaml -n rbd-system > rbd-pkg.yaml
    kubectl delete -f rbd-pkg.yaml
    kubectl create -f rbd-pkg.yaml

NFS Cannot Be Mounted, Leaving Kato Components in FailedMount

While the Kato components are being created, a component stays in the NotReady status with the reason FailedMount and a message like the following:

MountVolume.SetUp failed for volume "pvc-c600f712-e11e-48da-835e-29bca056f253": mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/595230c2-6354-4fd3-8122-0a20b4a1c223/volumes/kubernetes.io~nfs/pvc-c600f712-e11e-48da-835e-29bca056f253 --scope -- mount -t nfs 10.97.58.170:/export/pvc-c600f712-e11e-48da-835e-29bca056f253 /var/lib/kubelet/pods/595230c2-6354-4fd3-8122-0a20b4a1c223/volumes/kubernetes.io~nfs/pvc-c600f712-e11e-48da-835e-29bca056f253
Output: Running scope as unit: run-r599b201f5404416cbead03d30b6029be.scope
mount: wrong fs type, bad option, bad superblock on 10.97.58.170:/export/pvc-c600f712-e11e-48da-835e-29bca056f253,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)
       In some cases useful info is found in syslog - try
       dmesg | tail or so.

In particular, mount failed: exit status 32 together with the hint about a missing /sbin/mount.<type> helper program usually means that the machine lacks the NFS client. Install it on every machine with the command that matches its operating system:

# CentOS
yum install -y nfs-utils

# Ubuntu or Debian
apt install -y nfs-common
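To confirm that the client is present after installation, you can look for the mount.nfs helper that the error message refers to. This is a minimal sketch, assuming the packages above place the helper in /sbin or /usr/sbin; the function name find_mount_helper is hypothetical.

```shell
# Hypothetical helper: prints the path of the mount.<type> helper if it is
# found in one of the given directories (default: /sbin and /usr/sbin) and
# returns non-zero when it is missing.
find_mount_helper() {
  fstype=$1
  shift
  [ "$#" -gt 0 ] || set -- /sbin /usr/sbin
  for dir in "$@"; do
    if [ -x "$dir/mount.$fstype" ]; then
      echo "$dir/mount.$fstype"
      return 0
    fi
  done
  return 1
}

if helper=$(find_mount_helper nfs); then
  echo "NFS client helper found: $helper"
else
  echo "mount.nfs not found; install nfs-utils or nfs-common"
fi
```

Run the check on every node that may schedule a Kato component, since the mount is performed by the node where the pod lands.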