I recently stood up a Kubernetes cluster in my home lab and am very happy with the results. Here's what I did.
My home lab is composed not of Raspberry Pi nodes but of old, abandoned laptops, which gives me surprising power overall. I have 4 laptops, set up as 1 master and 3 workers.
The single master has:
- 8 GB RAM
- 100 GB available disk
- 4 CPU cores + hyperthreading
The workers have:
- 16 GB RAM
- between 100 GB and 500 GB available disk
- 4 CPU cores + hyperthreading
The available disk is all formatted as ext4 - as it turns out, I was able to use that space for my persistent volumes without reformatting or partitioning.
Let's get to the steps you need to follow.
Choose an OS, install kubeadm, and configure as usual
I chose the current Ubuntu Server LTS (20.04) for my setup.
There is a series of steps common to almost any Kubernetes (or other cluster compute) installation, and I am not going to go deeply into them here. You should set up passwordless sudo and follow the prerequisites kubeadm needs in order to work, as described in this k8s.io link:
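As a rough sketch of those prerequisites on Ubuntu 20.04 - double-check the kubeadm docs for your exact release - each node needs swap disabled and bridged traffic made visible to iptables:
# Turn off swap now and keep it off across reboots
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
# Let bridged pod traffic pass through iptables
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system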
One aspect I want to point out is setting the cgroup driver for use by kubeadm. If you choose to use Docker as your container runtime, you should set it to use the systemd cgroup driver rather than cgroupfs, as explained here.
In /etc/docker/daemon.json:
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
Then reload systemd and restart Docker
sudo systemctl daemon-reload
sudo systemctl restart docker
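You can confirm the change took effect - docker info should now report systemd as the cgroup driver:
docker info | grep -i "cgroup driver"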
Set up the master node (aka control plane)
The one thing you should keep in mind prior to initializing the master node: decide which pod networking system you will use, and make sure you prepare your kubeadm init parameters to suit that system.
In my case, I chose Calico with the default pod CIDR.
You can find the requirements here:
Run kubeadm init with the pod CIDR set for Calico
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
Ensure that kubeadm detected the systemd cgroup driver - you will see it in the command output.
If everything goes well, the output will include a join command that you must save in order to join worker nodes to this master.
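The same output also reminds you to copy the admin kubeconfig so that kubectl works for your regular user on the master; the standard steps are:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config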
Install pod networking
Until this step is done, kubectl get nodes will show all nodes as "NotReady".
I have chosen Calico
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
Additional instructions can be found here:
Give it some time to start up, then test that nodes are ready with kubectl get nodes
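To watch the rollout, check the Calico pods alongside the node status; the k8s-app=calico-node label used below is what the stock manifest applies, so adjust if yours differs:
kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl get nodes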
Install the worker nodes
Use the join command given at the end of the master node's kubeadm init command output.
On each worker node, repeat the join command - it will look similar to the below
sudo kubeadm join <master IP>:6443 --token <token string> \
--discovery-token-ca-cert-hash sha256:<long hexadecimal string>
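If you've misplaced that output, or the token has expired (they last 24 hours by default), you can print a fresh join command on the master:
sudo kubeadm token create --print-join-command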
Install Metrics Server
This is the very least you need, prior to getting Prometheus or something similar working. It will enable kubectl top nodes and kubectl top pods.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
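Give the deployment a minute, then verify. One caveat worth flagging as an assumption: on home lab nodes with self-signed kubelet certificates, metrics-server often needs the --kubelet-insecure-tls argument added to its Deployment before kubectl top returns data.
kubectl -n kube-system get deployment metrics-server
kubectl top nodes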
Install package managers
You already have the ability to install any application you want just using the kubectl command. For complex software applications, I like to additionally have both Helm and KUDO. Helm manages packaged applications as charts, while KUDO builds on what is known as the Operator pattern in Kubernetes.
Helm
- Follow the instructions to install the Helm client: Helm | Installing Helm
- Add the default repository
helm repo add stable https://kubernetes-charts.storage.googleapis.com/
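A quick way to exercise the new repository - and a preview of the MySQL install I mention at the end - is something like the following; the stable/mysql chart and the mysql release name are illustrative choices, not requirements:
helm install mysql stable/mysql
# List releases to confirm
helm list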
KUDO
- Follow the instructions to install the kubectl-kudo client: Getting Started | KUDO
- Install the server side components
kubectl kudo init
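Once the server side is initialized, installing an operator is a one-liner. As an example (and a preview of the Kafka install mentioned at the end), the operator names below are what the KUDO operators repository used at the time - verify them against the current KUDO docs:
# Kafka's KUDO operator expects a ZooKeeper instance to exist first
kubectl kudo install zookeeper
kubectl kudo install kafka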
TODO: Investigate Kustomize - this is an alternative to KUDO, another declarative approach.
Set up a storage solution
You will want to have more flexibility than provided by Kubernetes' default storage types like hostPath and local. With more than one node, those options are brittle and limiting.
While you have many persistent storage options, I was taken with Rancher Labs' recent contribution to the storage fray - a new OSS project called Longhorn. I'm delighted with how easy it was to install, as well as its ease of use and nice UI.
One great thing about it is that it can just use directory paths on already-formatted disks. So I can easily mount partitions, or even use directory paths on the root filesystem if I want, without having to set up raw partitions. This is ideal for a non-production home lab situation. Longhorn creates replicas of each volume created from a PVC, making it robust in the face of failing nodes and power outages.
Simple kubectl way to install
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml
One thing I did to make things simpler was to set the longhorn storage class as the default on my cluster. Make sure the relevant annotation on the storage class is set to true:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
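Assuming the storage class created by the manifest is named longhorn, a one-line patch sets that annotation:
kubectl patch storageclass longhorn -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'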
Now when you or one of your managed packages creates a PersistentVolumeClaim, Longhorn will generate the volume from the disk you have allocated for its use - with automatic replication, monitoring, and options for backup and restore!
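For illustration, a minimal PersistentVolumeClaim that Longhorn will satisfy looks like this - the name and size are arbitrary, and with the default storage class set you could omit storageClassName entirely:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi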
Implement an Ingress Controller
In a home lab environment this is definitely optional. Proxying with kubectl proxy may work well for you, or even several sessions running kubectl port-forward. But I wanted something a little closer to the load balancer resources provided by the cloud platforms.
The obvious choice for bare metal is MetalLB, so that's what I put in for accessing applications on the cluster. This step can easily be delayed until after you decide you have too many applications installed to manage with port forwards. Once implemented, you just need to switch the relevant Kubernetes Service resources' type from e.g. ClusterIP to LoadBalancer and you will have both a NodePort and a load balancer IP added!
- ref Lab Guide - Kubernetes Load Balancer and Ingress with MetalLB
- ref MetalLB, bare metal load-balancer for Kubernetes
# use new namespace metallb-system
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
# On first install only
kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
- Use simple Layer 2 allocation with a pool of reserved IPs
- Here's the resource pool I implemented:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.230-192.168.1.250
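To see MetalLB hand out one of those addresses, expose anything with a Service of type LoadBalancer - this nginx example is purely illustrative:
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer
# EXTERNAL-IP should come from the 192.168.1.230-250 pool
kubectl get service nginx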
Go crazy
After getting all this done, I've installed the things I wanted to get running, such as MySQL via Helm, Jupyter+Spark via a custom Helm chart, Folding@Home via kubectl, Kafka via KUDO, and other applications.
Many interesting projects are now filling up my TODO list - but this is a great start. I hope you found it useful!