[ubuntu] Installing Kubernetes 1.28


Installation verified on Azure and VMware.

https://domdom.tistory.com/591


Prerequisites

These steps are needed only when working on VMware or VirtualBox. They are not required on Azure or similar platforms.

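1. Open firewall ports

# Allow the ports below. SSH (22) is allowed before ufw is enabled so that a remote session is not cut off.
sudo ufw allow 22/tcp
sudo ufw enable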
sudo ufw allow 8080/tcp
sudo ufw allow 6443/tcp
sudo ufw status

2. Allow access by IP

sudo ufw allow from <master ip>
sudo ufw allow from <node1 ip>
sudo ufw allow from <node2 ip>
  • memo
    sudo ufw allow from 192.168.123.120
    sudo ufw allow from 192.168.123.121
    sudo ufw allow from 192.168.123.122
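With more nodes, the per-IP rules above can be generated in a loop. The sketch below only prints the commands rather than running them. `print_ufw_rules` is a hypothetical helper name, and the addresses are the ones from the memo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ufw): print one "ufw allow from" rule per IP.
# It only prints the commands so they can be reviewed before being run.
print_ufw_rules() {
  local ip
  for ip in "$@"; do
    printf 'sudo ufw allow from %s\n' "$ip"
  done
}

# Addresses taken from the memo above; replace them with your own node IPs.
print_ufw_rules 192.168.123.120 192.168.123.121 192.168.123.122
```

After reviewing the printed lines, piping them to `sh` would apply the rules.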
    

[ubuntu] Installing Docker

1-1. Docker automatic install script

wget -qO- https://get.docker.com/ | sudo sh

1-2. If the automatic install script fails

  1. Install the prerequisite packages
    # Update the apt package index and install packages to allow apt to use a repository over HTTPS:
    sudo apt-get update
    sudo apt-get install -y ca-certificates curl gnupg
    
  2. Add Docker's GPG key
    # Add Docker’s official GPG key:
    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    sudo chmod a+r /etc/apt/keyrings/docker.gpg
    
  3. Register the repository
    # Use the following command to set up the repository:
    echo \
    "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
    "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    
  4. Update the apt package index
    sudo apt-get update
    
  5. Install docker-ce, docker-ce-cli, and containerd.io (Docker Engine)
    sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
    

2. Enable Docker

sudo systemctl enable docker
sudo systemctl start docker
sudo systemctl enable containerd
sudo systemctl start containerd

3. Test running a Docker container

sudo docker run --rm hello-world

4. Change Docker's cgroup driver from cgroupfs to systemd

#sudo mkdir /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

# Restart
sudo systemctl enable docker
sudo systemctl daemon-reload
sudo systemctl restart docker  

# Check the changed cgroup driver
sudo docker info | grep -i cgroup
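The grep above leaves the comparison to the reader. As a minimal sketch, the driver value can be extracted and checked in a script. `cgroup_driver_from_info` is a hypothetical helper, and the sample text only imitates the format of `docker info` output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `docker info` text on stdin and print the cgroup driver.
cgroup_driver_from_info() {
  awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}

# Sample lines in the format `docker info` prints (values are illustrative).
sample_info=' Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2'

driver="$(printf '%s\n' "$sample_info" | cgroup_driver_from_info)"
if [ "$driver" = "systemd" ]; then
  echo "cgroup driver is systemd"
else
  echo "cgroup driver is '$driver', expected systemd" >&2
fi
```

On a real host the input would come from `sudo docker info 2>/dev/null | cgroup_driver_from_info`.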

5. Disable swap memory [requires root privileges]

# Disable swap
swapoff -a
echo 0 > /proc/sys/vm/swappiness
sed -e '/swap/ s/^#*/#/' -i /etc/fstab
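Because the sed command edits /etc/fstab in place, it can help to try the same expression on sample text first. The sketch below wraps the expression in a hypothetical helper, `comment_swap_lines`, and runs it on made-up fstab content.

```shell
#!/usr/bin/env bash
# Same sed expression as above, wrapped so it can be tried on sample text first.
comment_swap_lines() {
  sed -e '/swap/ s/^#*/#/'
}

# Illustrative fstab content (not from a real machine).
sample_fstab='UUID=1111-2222 / ext4 defaults 0 1
/swap.img none swap sw 0 0'

printf '%s\n' "$sample_fstab" | comment_swap_lines
```

Only lines containing "swap" are commented out. A line that is already commented keeps a single leading #, so running the command twice does no harm.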

Installing Kubernetes

1. Install kubeadm, kubelet, and kubectl

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

sudo apt-get update

If step 1 fails as shown below (the legacy apt.kubernetes.io repository is no longer served)

  • 404 Not Found
master@master:~$ sudo apt-get update
Hit:1 http://azure.archive.ubuntu.com/ubuntu focal InRelease
Hit:2 http://azure.archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:3 http://azure.archive.ubuntu.com/ubuntu focal-backports InRelease
Hit:4 http://azure.archive.ubuntu.com/ubuntu focal-security InRelease
Hit:5 https://download.docker.com/linux/ubuntu focal InRelease
Ign:6 https://packages.cloud.google.com/apt kubernetes-xenial InRelease
Err:7 https://packages.cloud.google.com/apt kubernetes-xenial Release
  404  Not Found [IP: 142.250.206.206 443]
Reading package lists... Done
E: The repository 'https://apt.kubernetes.io kubernetes-xenial Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
  • Run the following steps
sudo su -
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
apt-get update
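The v1.29 segment in the two URLs above selects which Kubernetes minor release apt will see, which is presumably why the 2025 logs further down report v1.29.x. A small sketch, assuming the same URL layout applies to other minor versions. `k8s_key_url` and `k8s_repo_line` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: build the pkgs.k8s.io key URL and apt source line
# for a given minor version such as v1.28 or v1.29.
k8s_key_url() {
  printf 'https://pkgs.k8s.io/core:/stable:/%s/deb/Release.key\n' "$1"
}

k8s_repo_line() {
  printf 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/%s/deb/ /\n' "$1"
}

minor="v1.29"   # value used in the steps above; change it to target another minor release
k8s_key_url "$minor"
k8s_repo_line "$minor"
```

The printed lines match the key URL and source line used in the steps above, so the same two commands can be reused with a different `minor` value.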

[2-1]. Install version 1.18

sudo apt install -y kubelet=1.18.15-00 kubeadm=1.18.15-00 kubectl=1.18.15-00

[2-2]. Install version 1.19 [flannel_test]

sudo apt-get install -y kubelet=1.19.15-00 kubeadm=1.19.15-00 kubectl=1.19.15-00

[2-3]. Install the latest version

sudo apt-get install -y kubelet kubeadm kubectl

3. Pin the package versions so they are not updated automatically

sudo apt-mark hold kubelet kubeadm kubectl

4. [MASTER] Set up the master node and worker nodes

Next, set up the master node and the worker nodes.

The kubeadm init command initializes and starts the master node.

The options needed to set up the master node are as follows.

Option                          Description
--pod-network-cidr              Sets the Pod network range.
--apiserver-advertise-address   Sets the address the master node's API server advertises.

To use Flannel, one of the Kubernetes Pod network add-ons,
pass 10.244.0.0/16 to the --pod-network-cidr option.
Flannel is a network plugin that lets Pods on different nodes communicate with each other,
and its default network range is 10.244.0.0/16.
For --apiserver-advertise-address, use the eth0 IP shown by the ifconfig command.

# A join token is issued after the command runs
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address={master node IP}
  • memo
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.4
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.123.120
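The init command can be assembled from the interface address instead of typing the IP by hand. This is a sketch under assumptions: it uses the `ip` tool rather than ifconfig (which may be missing unless net-tools is installed), the interface is named eth0, and `first_ipv4` and `build_init_cmd` are hypothetical helpers. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take `ip -4 -o addr show dev <iface>` output on stdin
# and print the first IPv4 address without its prefix length.
first_ipv4() {
  awk '{ split($4, parts, "/"); print parts[1]; exit }'
}

# Hypothetical helper: print (not run) the kubeadm init command for an IP.
build_init_cmd() {
  local ip="$1" cidr="${2:-10.244.0.0/16}"   # Flannel's default range
  printf 'sudo kubeadm init --pod-network-cidr=%s --apiserver-advertise-address=%s\n' "$cidr" "$ip"
}

# Illustrative line in the one-line format printed by `ip -o`.
sample='2: eth0    inet 10.0.0.4/24 brd 10.0.0.255 scope global eth0'
build_init_cmd "$(printf '%s\n' "$sample" | first_ipv4)"
```

On a real master node the sample line would be replaced by `ip -4 -o addr show dev eth0 | first_ipv4`.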
    

Bridge-network kernel module load error

master@AKS-MasterNode-vm:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.5
I0806 12:52:17.597677    5540 version.go:256] remote version is much newer: v1.33.3; falling back to: stable-1.29
[init] Using Kubernetes version: v1.29.15
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

The error [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist occurs when the bridge-network kernel module (br_netfilter) is not loaded. Without the module, the net.bridge.bridge-nf-call-iptables parameter is not available, so kubeadm's preflight check fails. The parameter controls whether bridged traffic is processed by iptables rules, which Kubernetes Pods need in order to communicate correctly.

  • Solution
      # 1. Load the bridge-network module
      sudo modprobe br_netfilter
        
      # 2. Set the sysctl parameters temporarily
      sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
      sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1
        
      # 3. Create the settings file
      sudo nano /etc/sysctl.d/99-kubernetes-cri.conf
        
      # 4. Add the following to the new file
      net.bridge.bridge-nf-call-iptables  = 1
      net.bridge.bridge-nf-call-ip6tables = 1
      net.ipv4.ip_forward                 = 1
      ## Press Ctrl + O to save, press Enter, then press Ctrl + X to exit the editor
        
      # 5. Apply the changes: load the settings from the new file into the system.
      sudo sysctl --system
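Steps 3 and 4 can also be done without an interactive editor. A minimal sketch, where `write_k8s_sysctl_conf` is a hypothetical helper that writes the same three parameters to whatever path it is given. The demonstration writes to a temporary file so nothing under /etc is touched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the three kernel parameters to the given file.
write_k8s_sysctl_conf() {
  cat > "$1" <<'EOF'
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
}

# Demonstration against a temporary file so nothing under /etc is touched.
conf="$(mktemp)"
write_k8s_sysctl_conf "$conf"
cat "$conf"
```

On a real node the target would be /etc/sysctl.d/99-kubernetes-cri.conf, written as root and followed by sudo sysctl --system. Note that modprobe alone does not survive a reboot. Listing br_netfilter in a file under /etc/modules-load.d/ is the usual way to load it at boot.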
    

Troubleshooting log

master@AKS-MasterNode-vm:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.5
I0806 12:52:17.597677    5540 version.go:256] remote version is much newer: v1.33.3; falling back to: stable-1.29
[init] Using Kubernetes version: v1.29.15
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
master@AKS-MasterNode-vm:~$ ^C
master@AKS-MasterNode-vm:~$ sudo modprobe br_netfilter
master@AKS-MasterNode-vm:~$ sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-iptables = 1
master@AKS-MasterNode-vm:~$ sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-ip6tables = 1
master@AKS-MasterNode-vm:~$ vi /etc/sysctl.d/
10-bufferbloat.conf       10-ipv6-privacy.conf      10-magic-sysrq.conf       10-network-security.conf  10-zeropage.conf          99-cloudimg-udp.conf      README.sysctl
10-console-messages.conf  10-kernel-hardening.conf  10-map-count.conf         10-ptrace.conf            99-cloudimg-ipv6.conf     99-sysctl.conf
master@AKS-MasterNode-vm:~$ sudo nano /etc/sysctl.d/99-kubernetes-cri.conf
master@AKS-MasterNode-vm:~$ sudo sysctl --system
* Applying /usr/lib/sysctl.d/10-apparmor.conf ...
* Applying /etc/sysctl.d/10-bufferbloat.conf ...
* Applying /etc/sysctl.d/10-console-messages.conf ...
* Applying /etc/sysctl.d/10-ipv6-privacy.conf ...
* Applying /etc/sysctl.d/10-kernel-hardening.conf ...
* Applying /etc/sysctl.d/10-magic-sysrq.conf ...
* Applying /etc/sysctl.d/10-map-count.conf ...
* Applying /etc/sysctl.d/10-network-security.conf ...
* Applying /etc/sysctl.d/10-ptrace.conf ...
* Applying /etc/sysctl.d/10-zeropage.conf ...
* Applying /usr/lib/sysctl.d/50-pid-max.conf ...
* Applying /etc/sysctl.d/99-cloudimg-ipv6.conf ...
* Applying /etc/sysctl.d/99-cloudimg-udp.conf ...
* Applying /etc/sysctl.d/99-kubernetes-cri.conf ...
* Applying /usr/lib/sysctl.d/99-protect-links.conf ...
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.conf ...
kernel.apparmor_restrict_unprivileged_userns = 1
net.core.default_qdisc = fq_codel
kernel.printk = 4 4 1 7
net.ipv6.conf.all.use_tempaddr = 2
net.ipv6.conf.default.use_tempaddr = 2
kernel.kptr_restrict = 1
kernel.sysrq = 176
vm.max_map_count = 1048576
net.ipv4.conf.default.rp_filter = 2
net.ipv4.conf.all.rp_filter = 2
kernel.yama.ptrace_scope = 1
vm.mmap_min_addr = 65536
kernel.pid_max = 4194304
net.ipv6.conf.all.use_tempaddr = 0
net.ipv6.conf.default.use_tempaddr = 0
net.core.rmem_max = 1048576
net.core.rmem_default = 1048576
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
fs.protected_fifos = 1
fs.protected_hardlinks = 1
fs.protected_regular = 2
fs.protected_symlinks = 1
master@AKS-MasterNode-vm:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.5
I0806 12:59:37.257433    5730 version.go:256] remote version is much newer: v1.33.3; falling back to: stable-1.29
[init] Using Kubernetes version: v1.29.15
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0806 12:59:57.669823    5730 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.8" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [aks-masternode-vm kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.5]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [aks-masternode-vm localhost] and IPs [10.0.0.5 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [aks-masternode-vm localhost] and IPs [10.0.0.5 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 8.003554 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node aks-masternode-vm as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node aks-masternode-vm as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 3181wt.bmkup7hpmvz7mvpo
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.5:6443 --token 3181wt.bmkup7hpmvz7mvpo \
        --discovery-token-ca-cert-hash sha256:1508fb54b4adbec06f3967fac1c0d35d7ec991a85300f5005d2e576f85a0f62b

[ERROR CRI]: container runtime is not running error

master@master:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.0.0.4
[init] Using Kubernetes version: v1.28.3
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time="2023-11-01T15:17:07Z" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
  • Solution

Delete /etc/containerd/config.toml and restart the containerd service.

sudo rm /etc/containerd/config.toml
sudo systemctl restart containerd

1. Check config.toml on worker node1 and node2

sudo nano /etc/containerd/config.toml

2. Remove "cri" and save

  • See the GNU nano 4.8 screen below

    In disabled_plugins = ["cri"], delete cri, then press Ctrl+O to save and Ctrl+X to exit

#   Copyright 2018-2022 Docker Inc.

#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at

#       http://www.apache.org/licenses/LICENSE-2.0

#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.

disabled_plugins = []

#root = "/var/lib/containerd"
#state = "/run/containerd"
#subreaper = true
#oom_score = 0

#[grpc]
#  address = "/run/containerd/containerd.sock"
#  uid = 0
#  gid = 0

#[debug]
#  address = "/run/containerd/debug.sock"
#  uid = 0
#  gid = 0
#  level = "info"
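The edit in step 2 can also be made non-interactively with sed. A minimal sketch, assuming the line reads exactly disabled_plugins = ["cri"] as in the default file. `enable_cri_plugin` is a hypothetical helper that works on stdin so it can be tried on sample text first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop "cri" from the disabled_plugins line of a
# containerd config read on stdin, leaving every other line unchanged.
enable_cri_plugin() {
  sed -e 's/^disabled_plugins = \["cri"\]/disabled_plugins = []/'
}

# Illustrative input mirroring the two kinds of lines in the file above.
printf '%s\n' 'disabled_plugins = ["cri"]' '#root = "/var/lib/containerd"' | enable_cri_plugin
```

To apply it to the real file, the same expression would be passed to sudo sed -i with /etc/containerd/config.toml as the target, followed by a containerd restart.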

3. Restart the container runtime

sudo systemctl restart containerd

4. Rerun the cluster join

root@node1:~# sudo systemctl restart containerd
root@node1:~# kubeadm join 10.0.0.4:6443 --token jvelw6.y3jp89wc1s3zkf52         --discovery-token-ca-cert-hash sha256:e42cb2a4262e8106142b3dba64713d13b9757aa648775e9a841e42b3432589a2
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

5. [MASTER] Copy the token

cat > token.txt
sudo kubeadm join 10.0.0.4:6443 --token fz8v03.ouh96b4fenr45xbw \
        --discovery-token-ca-cert-hash sha256:9ffebbe31308a9698a357f28349a194d7e1ebf03efdd8003e5502d6f6767cacf

sudo kubeadm join 192.168.123.120:6443 --token 76r7q8.kj3508ymnonx63zd \
        --discovery-token-ca-cert-hash sha256:cea33b0f64bd4c561a8e09a2bbe6f18c411b0178837541ab83f2506bc12e8fa6
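Instead of pasting by hand, the join command can be pulled out of saved kubeadm init output. A minimal sketch, assuming the output was saved to a file (for example with tee) and that the join command spans two lines as in the log above. `extract_join_cmd` is a hypothetical helper, and the token and hash in the sample are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kubeadm init output on stdin and print the
# two-line "kubeadm join" command as a single line.
extract_join_cmd() {
  awk '
    /^kubeadm join/ { found = 1 }
    found {
      sub(/\\$/, "")                 # drop the line-continuation backslash
      gsub(/^[ \t]+|[ \t]+$/, "")    # trim surrounding whitespace
      printf "%s%s", sep, $0
      sep = " "
      if (++n == 2) exit
    }
    END { if (found) print "" }
  '
}

# Placeholder token and hash; real values come from your own kubeadm init run.
sample_output='Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.0.5:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:<hash>'

printf '%s\n' "$sample_output" | extract_join_cmd
```

With the init output saved as, say, init.log, running extract_join_cmd < init.log > token.txt would produce the same file as the manual paste.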

6. [MASTER] Set up permissions for using the cluster [done for both the regular user and the admin]

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

7. [MASTER] Deploy the Flannel Pod network, which enables Pod-to-Pod communication, to the cluster

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

[ERROR] 2025-08-06

master@AKS-MasterNode-vm:~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
error: error validating "https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml": error validating data: failed to download openapi: Get "https://10.0.0.5:6443/openapi/v2?timeout=32s": dial tcp 10.0.0.5:6443: connect: connection refused; if you choose to ignore these errors, turn validation off with --validate=false
  • Summary of kubeadm init failure causes

    | Error type | Description |
    | --- | --- |
    | *.yaml already exists | Config files from a previous cluster remain (/etc/kubernetes/manifests/...) |
    | Port 10250 is in use | Another process is already using the port |
    | /var/lib/etcd is not empty | The etcd data directory remains |

This happens when kubeadm init was run before, or failed partway, and leftover configuration remains. Fully reset in the following order and then try again.

  1. Reset with kubeadm reset
    sudo kubeadm reset -f
    
  2. Delete the leftover configuration and directories
    sudo rm -rf /etc/kubernetes/manifests/*
    sudo rm -rf /var/lib/etcd
    
  3. Rerun kubeadm init
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16
    
  4. Configure kubectl
    mkdir -p $HOME/.kube
    sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
  5. Apply the Flannel CNI
    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    
  6. Final check
    kubectl get nodes
    

    If STATUS shows Ready, the node is healthy.

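8. Test access after configuring the master server's port

https://<master node public ip>:6443

# On connecting, a page like the one below can be seen (this particular message is returned when the request is sent as plain http:// to the HTTPS port)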
Client sent an HTTP request to an HTTPS server.

9. [NODE1, NODE2] Join the worker nodes

Run the copied join command on each node.

  • If the following error occurs
    node1@node1:~$ kubeadm join 10.0.0.4:6443 --token fz8v03.ouh
    >         --discovery-token-ca-cert-hash sha256:9ffebbe31308ebf03efdd8003e5502d6f6767cacf
    [preflight] Running pre-flight checks
    error execution phase preflight: [preflight] Some fatal erro
            [ERROR IsPrivilegedUser]: user is not running as roo
    [preflight] If you know what you are doing, you can make a cignore-preflight-errors=...`
    To see the stack trace of this error execute with --v=5 or h
    
  • This happens because kubeadm was not run with root privileges
  • Run it as follows
    sudo kubeadm join 10.0.0.4:6443 --token fz8v03.ouh96b4fenr45xbw --discovery-token-ca-cert-hash sha256:9ffebbe31308a9698a357f28349a194d7e1ebf03efdd8003e5502d6f6767cacf
    
    sudo kubeadm join 192.168.123.120:6443 --token 76r7q8.kj3508ymnonx63zd --discovery-token-ca-cert-hash sha256:cea33b0f64bd4c561a8e09a2bbe6f18c411b0178837541ab83f2506bc12e8fa6 --node-name new-node-name
    sudo kubeadm join 192.168.123.120:6443 --token ion4b3.32c033cpyif8su15 --discovery-token-ca-cert-hash sha256:75af9c3c175620235728ed1c3ad4bffc32fc0c042ceefab264e18cddf15ffce8
    
  • If the token has expired and needs to be reissued
    1. List the issued tokens
        kubeadm token list
      
    2. Delete the existing token
        kubeadm token delete <TOKEN>
      
    3. Issue a new token
        kubeadm token create --print-join-command
      
  • If the join itself fails
    node1@node1:~$ sudo kubeadm join 192.168.123.120:6443 --token xgb8m3.h7hbxmpt6zpnq1pq --discovery-token-ca-cert-hash sha256:be0caf1a69e12f7742294b3d07bfa2d1c8fe5ab8c1cd3c105442b0010a78c057
    [preflight] Running pre-flight checks
    error execution phase preflight: [preflight] Some fatal errors occurred:
            [ERROR CRI]: container runtime is not running: output: time="2023-11-05T22:43:00+09:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
    , error: exit status 1
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
    To see the stack trace of this error execute with --v=5 or higher
    
    1. [WORKERNODE] Edit config.toml with vi
        sudo vi /etc/containerd/config.toml
      
    2. [WORKERNODE] Comment out the disabled_plugins line and save
        # disabled_plugins = ["cri"]
      
    3. [WORKERNODE] Restart containerd
        sudo systemctl restart containerd
      
    4. [WORKERNODE] Join again
        sudo kubeadm join 192.168.123.120:6443 --token xgb8m3.h7hbxmpt6zpnq1pq --discovery-token-ca-cert-hash sha256:be0caf1a69e12f7742294b3d07bfa2d1c8fe5ab8c1cd3c105442b0010a78c057
      
  • The node can then be seen joining successfully.
    node1@node1:~$ sudo kubeadm join 10.0.0.4:6443 --token fz8v0
    >         --discovery-token-ca-cert-hash sha256:9ffebbe31308ebf03efdd8003e5502d6f6767cacf
    [preflight] Running pre-flight checks
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kube
    cm kubeadm-config -o yaml'
    
    This node has joined the cluster:
    
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.
    
    Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
    

Verify the installation

master@master:~$ kubectl get pod --all-namespaces
NAMESPACE      NAME                             READY   STATUS    RESTARTS   AGE
kube-flannel   kube-flannel-ds-gm8tj            1/1     Running   0          83s
kube-flannel   kube-flannel-ds-w9j26            1/1     Running   0          86s
kube-flannel   kube-flannel-ds-zwzp4            1/1     Running   0          14m
kube-system    coredns-5dd5756b68-2wht8         1/1     Running   0          18m
kube-system    coredns-5dd5756b68-89ns4         1/1     Running   0          18m
kube-system    etcd-master                      1/1     Running   0          18m
kube-system    kube-apiserver-master            1/1     Running   0          18m
kube-system    kube-controller-manager-master   1/1     Running   0          18m
kube-system    kube-proxy-2qv8m                 1/1     Running   0          86s
kube-system    kube-proxy-nwfnn                 1/1     Running   0          18m
kube-system    kube-proxy-xvg7w                 1/1     Running   0          83s
kube-system    kube-scheduler-master            1/1     Running   0          18m
master@master:~$ kubectl get nodes
NAME     STATUS   ROLES           AGE     VERSION
master   Ready    control-plane   19m     v1.28.2
node1    Ready    <none>          2m36s   v1.28.2
node2    Ready    <none>          2m33s   v1.28.2
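The Ready check above can be scripted so that it returns a pass or fail status. A minimal sketch, where `all_nodes_ready` is a hypothetical helper that reads `kubectl get nodes` output on stdin. The sample text is the output shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get nodes` output on stdin and succeed
# only if at least one node is listed and every STATUS is exactly "Ready".
all_nodes_ready() {
  awk 'NR > 1 { seen = 1; if ($2 != "Ready") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

sample='NAME     STATUS   ROLES           AGE     VERSION
master   Ready    control-plane   19m     v1.28.2
node1    Ready    <none>          2m36s   v1.28.2
node2    Ready    <none>          2m33s   v1.28.2'

if printf '%s\n' "$sample" | all_nodes_ready; then
  echo "all nodes Ready"
else
  echo "some nodes are not Ready" >&2
fi
```

On the master node the input would come from `kubectl get nodes | all_nodes_ready`.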
