- Create a K8s master and worker node on GCE with kubeadm
- Create a bash CNI
- Deploy Pods and test connectivity
- Configure host, container and external networking
- Kubernetes network model:
- Create a master VM.
gcloud compute instances create k8s-master \
    --zone us-west2-b \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --can-ip-forward
- Create a worker VM.
gcloud compute instances create k8s-worker \
    --zone us-west2-b \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --can-ip-forward
- Open two cloud-shell terminals, SSH into k8s-master in one and k8s-worker in the other.
gcloud compute ssh "k8s-master"
gcloud compute ssh "k8s-worker"
- On both master and worker nodes, install prerequisite packages.
sudo apt-get update
sudo apt-get install -y docker.io apt-transport-https curl jq nmap iproute2
- On both master and worker nodes, install kubeadm, kubelet, and kubectl.
sudo su
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat > /etc/apt/sources.list.d/kubernetes.list <<EOF
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update && apt-get install -y kubeadm kubelet kubectl
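Optionally, hold the installed versions so a routine apt upgrade does not move the cluster components mid-lab; this is a common practice and not part of the original steps (run while still root):
apt-mark hold kubeadm kubelet kubectl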
- On the master node, start the cluster.
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU
Record the command provided in the output. It will be used to join additional nodes to the cluster.
The command to join the cluster should look something like this:
kubeadm join 10.168.0.6:6443 --token abcdef.1234567890123456789 --discovery-token-ca-cert-hash sha256:12345
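If the join command is lost or the token expires, an equivalent one can be regenerated on the master at any time; this is standard kubeadm behaviour rather than something specific to this lab:
sudo kubeadm token create --print-join-command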
- On the worker node, join the worker to the cluster. Run the command from the previous master terminal output.
- On the master node, initialize kubectl.
exit # Assuming still root, this must be run as a regular user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- On the master node, get the Pod subnets.
kubectl describe node k8s-master | grep PodCIDR
PodCIDR: 10.244.0.0/24
kubectl describe node k8s-worker | grep PodCIDR
PodCIDR: 10.244.1.0/24
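The same values can be read for all nodes at once with a jsonpath query; this is only a convenience, not part of the original steps:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'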
- On both master and worker nodes, create /etc/cni/net.d/10-bash-cni-plugin.conf.
sudo mkdir -p /etc/cni/net.d/
sudo touch /etc/cni/net.d/10-bash-cni-plugin.conf
Master:
{
    "cniVersion": "0.3.1",
    "name": "mynet",
    "type": "bash-cni",
    "network": "10.244.0.0/16",
    "subnet": "10.244.0.0/24"
}
Worker:
{
    "cniVersion": "0.3.1",
    "name": "mynet",
    "type": "bash-cni",
    "network": "10.244.0.0/16",
    "subnet": "10.244.1.0/24"
}
- On both master and worker nodes, create the bridge.
Master:
sudo brctl addbr cni0
sudo ip link set cni0 up
sudo ip addr add 10.244.0.1/24 dev cni0
Worker:
sudo brctl addbr cni0
sudo ip link set cni0 up
sudo ip addr add 10.244.1.1/24 dev cni0
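Optionally confirm the bridge is up and carries the gateway address before continuing (not part of the original steps):
ip addr show cni0
brctl show cni0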
- On both master and worker nodes, check the generated routes.
Master:
ip route | grep cni0
10.244.0.0/24 dev cni0 proto kernel scope link src 10.244.0.1
Worker:
ip route | grep cni0
10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1
- On both master and worker nodes, create /opt/cni/bin/bash-cni as root and give it executable permissions (sudo chmod +x). Populate the file with the script below.
sudo touch /opt/cni/bin/bash-cni
sudo chmod +x /opt/cni/bin/bash-cni
#!/bin/bash -e

if [[ ${DEBUG} -gt 0 ]]; then set -x; fi

exec 3>&1 # make stdout available as fd 3 for the result
exec &>> /var/log/bash-cni-plugin.log

IP_STORE=/tmp/reserved_ips # all reserved ips will be stored there

echo "CNI command: $CNI_COMMAND"

stdin=`cat /dev/stdin`
echo "stdin: $stdin"

# Return the first IP in the subnet that is not already reserved and record it in $IP_STORE.
allocate_ip(){
    for ip in "${all_ips[@]}"
    do
        reserved=false
        for reserved_ip in "${reserved_ips[@]}"
        do
            if [ "$ip" = "$reserved_ip" ]; then
                reserved=true
                break
            fi
        done
        if [ "$reserved" = false ] ; then
            echo "$ip" >> $IP_STORE
            echo "$ip"
            return
        fi
    done
}

case $CNI_COMMAND in
ADD)
    network=$(echo "$stdin" | jq -r ".network")
    subnet=$(echo "$stdin" | jq -r ".subnet")
    subnet_mask_size=$(echo $subnet | awk -F "/" '{print $2}')

    # Enumerate every address in the subnet and pick a free one for the container.
    all_ips=$(nmap -sL $subnet | grep "Nmap scan report" | awk '{print $NF}')
    all_ips=(${all_ips[@]})
    skip_ip=${all_ips[0]}
    gw_ip=${all_ips[1]}
    reserved_ips=$(cat $IP_STORE 2> /dev/null || printf "$skip_ip\n$gw_ip\n") # reserving 10.244.0.0 and 10.244.0.1
    reserved_ips=(${reserved_ips[@]})
    printf '%s\n' "${reserved_ips[@]}" > $IP_STORE
    container_ip=$(allocate_ip)

    # Make the container's network namespace visible to "ip netns".
    mkdir -p /var/run/netns/
    ln -sfT $CNI_NETNS /var/run/netns/$CNI_CONTAINERID

    # Create a veth pair, attach the host end to the cni0 bridge and move the other end into the container.
    rand=$(tr -dc 'A-F0-9' < /dev/urandom | head -c4)
    host_if_name="veth$rand"
    ip link add $CNI_IFNAME type veth peer name $host_if_name

    ip link set $host_if_name up
    ip link set $host_if_name master cni0

    ip link set $CNI_IFNAME netns $CNI_CONTAINERID
    ip netns exec $CNI_CONTAINERID ip link set $CNI_IFNAME up
    ip netns exec $CNI_CONTAINERID ip addr add $container_ip/$subnet_mask_size dev $CNI_IFNAME
    ip netns exec $CNI_CONTAINERID ip route add default via $gw_ip dev $CNI_IFNAME

    # Report the result back to the runtime on fd 3.
    mac=$(ip netns exec $CNI_CONTAINERID ip link show eth0 | awk '/ether/ {print $2}')
    echo "{
      \"cniVersion\": \"0.3.1\",
      \"interfaces\": [
          {
              \"name\": \"eth0\",
              \"mac\": \"$mac\",
              \"sandbox\": \"$CNI_NETNS\"
          }
      ],
      \"ips\": [
          {
              \"version\": \"4\",
              \"address\": \"$container_ip/$subnet_mask_size\",
              \"gateway\": \"$gw_ip\",
              \"interface\": 0
          }
      ]
    }" >&3
;;
DEL)
    # Release the container's IP so it can be reused.
    ip=$(ip netns exec $CNI_CONTAINERID ip addr show eth0 | awk '/inet / {print $2}' | sed s%/.*%% || echo "")
    if [ ! -z "$ip" ]
    then
        sed -i "/$ip/d" $IP_STORE
    fi
;;
GET)
    echo "GET not supported"
    exit 1
;;
VERSION)
    echo '{
      "cniVersion": "0.3.1",
      "supportedVersions": [ "0.3.0", "0.3.1", "0.4.0" ]
    }' >&3
;;
*)
    echo "Unknown CNI command: $CNI_COMMAND"
    exit 1
;;
esac
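The plugin can also be exercised by hand to see what the kubelet will trigger for each Pod. The sketch below is illustrative only: the namespace name testns and container ID test1 are made up, it assumes the master's subnet and that no host interface is already named eth0, and the final DEL releases the reserved IP again.
sudo ip netns add testns
echo '{"name": "mynet", "network": "10.244.0.0/16", "subnet": "10.244.0.0/24"}' | \
  sudo CNI_COMMAND=ADD CNI_CONTAINERID=test1 CNI_NETNS=/var/run/netns/testns CNI_IFNAME=eth0 /opt/cni/bin/bash-cni
sudo cat /var/log/bash-cni-plugin.log # plugin debug output; the ADD result JSON was printed to the terminal
echo '{}' | sudo CNI_COMMAND=DEL CNI_CONTAINERID=test1 CNI_NETNS=/var/run/netns/testns CNI_IFNAME=eth0 /opt/cni/bin/bash-cni
sudo ip netns del testns && sudo rm -f /var/run/netns/test1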
- On the master node, check the node status.
kubectl get node
- Untaint the master node (to allow the scheduler to schedule pods everywhere).
kubectl taint nodes k8s-master node-role.kubernetes.io/master-
The following steps are performed on the master node (where kubectl is configured).
- Add a sample Deployment.
kubectl apply -f https://raw.githubusercontent.com/s-matyukevich/bash-cni-plugin/master/01_gcp/test-deployment.yml
- Get the Pod IPs.
kubectl describe pod | grep IP
IP: 10.244.0.4
IP: 10.244.1.3
IP: 10.244.0.6
IP: 10.244.1.2
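An equivalent view that also shows which node each Pod is scheduled on (optional, not part of the original steps):
kubectl get pods -o wide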
- Exec into one of the Pods.
kubectl exec -it bash-master bash
- Verify network connectivity.
ping 10.168.0.6 # Can ping own host (e.g. k8s-master)
ping 10.168.0.7 # Can’t ping different host (e.g. k8s-worker)
ping 10.244.0.5 # Can ping own container
ping 10.244.0.6 # Can’t ping a different container on the same host
ping 10.244.1.3 # Can’t ping a container on a different host
ping 108.177.121.113 # Can’t ping any external address
- On the worker node, examine the forwarding rules.
sudo iptables -S FORWARD
-P FORWARD DROP
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
- On both master and worker nodes, allow forwarding of traffic to and from the Pod network (the default FORWARD policy is DROP).
sudo iptables -t filter -A FORWARD -s 10.244.0.0/16 -j ACCEPT
sudo iptables -t filter -A FORWARD -d 10.244.0.0/16 -j ACCEPT
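Optionally confirm the two new ACCEPT rules were appended to the chain (not in the original steps):
sudo iptables -S FORWARD | tail -n 2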
- On both master and worker nodes, add masquerade rules so Pod traffic leaving the node is NATed to the node's IP.
Master:
sudo iptables -t nat -A POSTROUTING -s 10.244.0.0/24 ! -o cni0 -j MASQUERADE
Worker:
sudo iptables -t nat -A POSTROUTING -s 10.244.1.0/24 ! -o cni0 -j MASQUERADE
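Optionally confirm the masquerade rule is present (not in the original steps):
sudo iptables -t nat -S POSTROUTING | grep MASQUERADE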
- In a regular cloud-shell window, configure GCP routes so each node's Pod subnet is routed to that instance.
gcloud compute routes create k8s-master --destination-range 10.244.0.0/24 --network default --next-hop-instance-zone us-west2-b --next-hop-instance k8s-master
gcloud compute routes create k8s-worker --destination-range 10.244.1.0/24 --network default --next-hop-instance-zone us-west2-b --next-hop-instance k8s-worker
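Optionally list the new routes to confirm they were created (not in the original steps):
gcloud compute routes list --filter="name~k8s"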
- Retest network connectivity; all pings should now be successful (see the previous exercise).
- Resulting solution:
Check our blog post Kubernetes Networking: How to Write Your Own CNI Plug-in with Bash for more details.
- Delete GCP routes.
gcloud compute routes delete k8s-master k8s-worker --quiet
- Delete the k8s VMs.
gcloud compute instances delete k8s-master k8s-worker --quiet
Next: Services