June 3, 2020

Set up Kubernetes in your Home Lab


I recently stood up a Kubernetes cluster in my home lab and am very happy with the results. Here's what I did.

My homelab is composed not of Raspberry Pi nodes but of old, abandoned laptops. This gives me some surprising power overall. I have 4 laptops, which I have set up as 1 master and 3 workers.

The single master has:

  • 8 GB RAM

  • 100GB available disk

  • 4 CPU cores + hyperthreading

The workers have:

  • 16GB RAM

  • between 100GB and 500GB available disk

  • 4 CPU cores + hyperthreading

The available disk is all formatted as ext4 - as it turns out I was able to use that space for my persistent volumes without reformatting or partitioning.

Let's get to the steps you need to follow.

Choose an OS, Install kubeadm and configure as usual

I chose current Ubuntu Server LTS (20.04) for my setup.

There is a series of steps common to almost any Kubernetes (or other cluster compute) installation. I am not going to go deeply into these here. You should set up passwordless sudo, and follow the prerequisites for kubeadm as described in this k8s.io link:
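In broad strokes, that host preparation usually looks like the sketch below (based on the kubeadm prerequisite steps - verify against the current docs for your release):

```shell
# Disable swap - the kubelet refuses to run with swap enabled
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

# Load br_netfilter and let iptables see bridged traffic
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system
```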

One aspect I want to point out is setting the cgroup driver used by kubeadm. If you choose Docker as your container runtime, you should set it to use the systemd cgroup driver rather than cgroupfs, as explained here.

In /etc/docker/daemon.json

  {
    "exec-opts": ["native.cgroupdriver=systemd"],
    "log-driver": "json-file",
    "log-opts": {
      "max-size": "100m"
    },
    "storage-driver": "overlay2"
  }

Then reload

  sudo systemctl daemon-reload
  sudo systemctl restart docker
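To confirm Docker picked up the change, check the cgroup driver it reports:

```shell
# Should report: Cgroup Driver: systemd
docker info 2>/dev/null | grep -i "cgroup driver"
```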

Set up the master node (aka control plane)

The one thing you should keep in mind prior to initializing the master node: decide which pod networking system you will use, and make sure you prepare your kubeadm init parameters to suit that system.

In my case, I chose Calico with the default Pod CIDR.

You can find the requirements here:

Run kubeadm init with the Pod CIDR set for Calico

  sudo kubeadm init --pod-network-cidr=192.168.0.0/16

Ensure that kubeadm detected the systemd cgroup driver - you will see it in the command output.

If everything goes well, you'll get a command listed that you must save in order to join worker nodes to this master.
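That same output also shows how to point kubectl at the new cluster as a regular user:

```shell
# Copy the admin kubeconfig into your home directory (as printed by kubeadm init)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```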

Install pod networking

Until this step is done, kubectl get nodes will show all nodes as "NotReady".

I have chosen Calico:

  curl https://docs.projectcalico.org/manifests/calico.yaml -O
  kubectl apply -f calico.yaml

Additional instructions can be found here:

Give it some time to start up, then confirm the nodes are ready with kubectl get nodes

Install the worker nodes

Use the join command given at the end of the master node's kubeadm init command output.

On each worker node repeat the join command - similar to the below

  sudo kubeadm join <master IP>:6443 --token <token string> \
      --discovery-token-ca-cert-hash sha256:<long hexadecimal string>
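If you lose that command (or the token expires - the default lifetime is 24 hours), you can regenerate it on the master at any time:

```shell
sudo kubeadm token create --print-join-command
```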

Install Metrics Server

This is the very least you need, prior to getting Prometheus or something similar working.

This will enable kubectl top nodes/pods

  kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
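On a homelab cluster the kubelets typically serve self-signed certificates, so metrics-server may fail to scrape them. A common workaround (a cluster-specific tweak, not part of the official manifest) is to add the --kubelet-insecure-tls flag:

```shell
# Append --kubelet-insecure-tls to the metrics-server container args
# (assumes the container already defines an args list, as v0.3.6 does)
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

# After a minute or two this should return per-node CPU/memory usage
kubectl top nodes
```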

Install package managers

You already have the ability to install any application you want using just the kubectl command. For complex software applications, I like to additionally have both Helm and KUDO: Helm manages packaged charts, while KUDO implements what is known as the Operator pattern in Kubernetes.

Helm

  • Follow the instructions to install the Helm client: Helm | Installing Helm

  • Add the default repository

      helm repo add stable https://kubernetes-charts.storage.googleapis.com/
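With the repository added, installing a chart is a one-liner. For example, with Helm 3 syntax (the release name my-mysql is arbitrary):

```shell
helm repo update
helm install my-mysql stable/mysql
```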

KUDO

  • Follow the instructions to install the kubectl-kudo client: Getting Started | KUDO

  • Install the server side components

      kubectl kudo init
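Once the server side is initialized, operators can be installed by name from the KUDO repository. For example (Kafka depends on ZooKeeper, so install that first):

```shell
kubectl kudo install zookeeper
kubectl kudo install kafka
# List the operator instances now running
kubectl kudo get instances
```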

TODO Investigate Kustomize

  • Another declarative approach - an alternative to Helm and KUDO

Set up a storage solution

You will want to have more flexibility than provided by Kubernetes default storage types like hostPath and local. With more than one node, those options are brittle and limiting.

While you have many persistent storage options, I was taken with Rancher Labs' recent contribution to the storage fray - a new OSS project called Longhorn. I'm delighted with how easy it was to install, as well as its ease of use and nice UI.

One great thing about it is that it can just use directory paths of already-formatted disk. So I am easily able to mount partitions and even use directory paths mounted on root if I want - and not have to set up raw partitions. This is just ideal for a non-production home lab situation. Longhorn creates replicas of each volume created from a PVC, making it robust in the face of failing nodes and power outages.

Simple Kubectl way to install

  kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml

One thing I did to make things simpler is to set the Longhorn storage class as the default on my cluster. Make sure the relevant annotation on the storage class is set to true:

    annotations:
      storageclass.kubernetes.io/is-default-class: "true"
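You can set that annotation without editing the resource by hand - a patch does the same thing (assuming the class is named longhorn, which is what the manifest creates):

```shell
kubectl patch storageclass longhorn -p \
  '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

# Confirm: the class should show "(default)" next to its name
kubectl get storageclass
```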

Now when you or one of your managed packages creates a PersistentVolumeClaim, Longhorn will generate the volume from the disk you have allocated for its use - with automatic replication, monitoring and options for backup and restore!
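From then on, a plain PersistentVolumeClaim is all an application needs - no storageClassName required once Longhorn is the default. A minimal example (the name and size here are arbitrary):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```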

Implement an Ingress Controller

In a home lab environment this is definitely optional. Proxying with kubectl proxy may work well for you, or even several sessions running kubectl port-forward. But I wanted something a little closer to the load balancer resources provided by the cloud platforms.
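For comparison, the port-forward approach looks like this - fine for one or two services, tedious beyond that (the service name and ports are illustrative):

```shell
# Forward local port 8080 to port 80 of a Service named my-app
kubectl port-forward svc/my-app 8080:80
```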

The obvious choice for bare metal is MetalLB, so that's what I put in for accessing applications on the cluster. This step can easily be delayed until you decide you have too many applications installed to manage with port forwards. Once implemented, you just need to switch the relevant Kubernetes Service resources from e.g. type NodePort to type LoadBalancer, and you will get both a NodePort and a load balancer IP!

  # use new namespace metallb-system
  kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
  kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
  # On first install only
  kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

  • Use simple Layer 2 allocation with a pool of reserved IPs

  • Here's the resource pool I implemented

      apiVersion: v1
      kind: ConfigMap
      metadata:
        namespace: metallb-system
        name: config
      data:
        config: |
          address-pools:
          - name: default
            protocol: layer2
            addresses:
            - 192.168.1.230-192.168.1.250
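Save that ConfigMap to a file and apply it; from then on, any Service of type LoadBalancer gets an address from the pool. For example (the file and deployment names are illustrative):

```shell
kubectl apply -f metallb-config.yaml

# Expose an existing deployment through MetalLB
kubectl expose deployment my-app --type=LoadBalancer --port=80
# EXTERNAL-IP will come from the 192.168.1.230-250 pool
kubectl get svc my-app
```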

Go crazy

After getting all this done, I've installed things I wanted to get running such as MySQL via Helm, Jupyter+Spark via custom Helm chart, Folding@Home via kubectl, Kafka via KUDO and other applications.

Many interesting projects are now filling up my TODO list - but this is a great start. I hope you found it useful!

© Greg Grubbs 2008-2020
