June 19, 2020

Putting up a Rancher Kubernetes Cluster on Bare Metal

When it comes time to set up a multi-node cluster on bare metal servers, it doesn't get easier than Rancher's RKE. Here's a quick guide for standing up your first development cluster.

Set up your servers

Do the usual prep for cluster software: create the same user on every node, authorize the same SSH key for that user, and give the user passwordless sudo. Then install Docker on each node.
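
Something along these lines on each node should cover it - a rough sketch assuming Ubuntu and a hypothetical user named rke; adjust the user, distro specifics, and key paths to your environment:

  # Example node prep - "rke" is just a placeholder user name
  sudo useradd -m -s /bin/bash rke
  sudo mkdir -p /home/rke/.ssh
  sudo cp ~/.ssh/authorized_keys /home/rke/.ssh/authorized_keys
  sudo chown -R rke:rke /home/rke/.ssh
  echo 'rke ALL=(ALL) NOPASSWD:ALL' | sudo tee /etc/sudoers.d/rke
  # After Docker is installed, the SSH user needs to be in the docker group
  sudo usermod -aG docker rke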

Download the Command Line tool

Grab the rke CLI for your OS from the project's GitHub releases page.

Set up the initial RKE cluster

Have a look at some starter configs and adjust as necessary.
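
As a starting point, a minimal cluster.yaml looks something like this - the addresses and SSH user are placeholders for your own nodes:

  nodes:
    - address: 172.16.27.11
      user: rke
      role: [controlplane, etcd, worker]
    - address: 172.16.27.12
      user: rke
      role: [worker]
    - address: 172.16.27.13
      user: rke
      role: [worker]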

Once you have the config to your liking, run

  # Bring up the cluster, then confirm the nodes registered
  rke up --config cluster.yaml --ssh-agent-auth
  KUBECONFIG=kube_config_cluster.yaml kubectl get nodes

You get to skip setting up pod networking with RKE - the Canal CNI is installed and configured by default.

Set up Load Balancing

On our bare metal cluster, we'll use MetalLB - be sure to check the MetalLB releases page to get URLs for the current version.

  # Create the metallb-system namespace, then install MetalLB
  kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
  kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
  # On first install only
  kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"
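
Before handing MetalLB an address pool, it's worth confirming its pods came up:

  kubectl get pods -n metallb-system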

Give MetalLB a pool of IP addresses

  • Here I'm using a pool from the cluster's network

      apiVersion: v1
      kind: ConfigMap
      metadata:
        namespace: metallb-system
        name: config
      data:
        config: |
          address-pools:
          - name: default
            protocol: layer2
            addresses:
            - 172.16.27.230-172.16.27.250      
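
Save that ConfigMap to a file and apply it - the filename here is just an example:

  kubectl apply -f metallb-config.yaml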

Install Rancher on the cluster

We're going to install a cluster manager … on the cluster being managed.

There are several options for getting a Web UI overview of either a single cluster or multiple clusters. These will usually offer the ability to display resource usage, view and edit running resources, and create new resources. Some allow higher level options like setting workloads to run on multiple clusters, deploying secrets and config maps across clusters, etc.

A great choice for this is Rancher (not RKE or K3s, which are Kubernetes distributions offered by Rancher Labs). All you have to do to get started is follow the guide at Rancher Docs: Manual Quick Start.

Rancher gives us a great UI for managing this cluster and any other clusters we may want to manage in the future (RKE or not).

  kubectl create ns cattle-system
  helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
  helm install rancher rancher-stable/rancher \
       --namespace cattle-system \
       --set hostname=rancher.local
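
Rancher takes a few minutes to come up; something like this waits for the rollout to finish (assuming the Deployment is named rancher, as it is with a default chart install):

  kubectl -n cattle-system rollout status deploy/rancher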

I find that the Helm chart rancher-2.4.4 exposes port 80 but not port 443, so I needed to expose that port before I could reach the Rancher UI in a browser.

I edited the rancher Service and added the SSL port under spec.ports:

  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
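
If you'd rather not edit interactively, a JSON patch along these lines should add the same port - a sketch, so verify the Service name in your install first:

  kubectl -n cattle-system patch svc rancher --type json \
    -p '[{"op":"add","path":"/spec/ports/-","value":{"name":"https","port":443,"protocol":"TCP","targetPort":443}}]'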

If you should ever need to reset the Rancher admin user password:

  kubectl exec -n cattle-system \
    $(kubectl get pods -n cattle-system -o json | jq -r '.items[] | select(.spec.containers[].name=="rancher") | .metadata.name') -- reset-password

After all that hoohaw, I should mention that you can also simply run Rancher from a Docker container - and use that to manage multiple clusters.

  # Run Rancher on all network interfaces
  docker run --name rancher -d --restart=unless-stopped -p 0.0.0.0:80:80 -p 0.0.0.0:443:443 rancher/rancher
  # Reset password when needed
  # docker exec -ti <container_id> reset-password

Set up an ingress controller

I used Traefik, installed using the handy Rancher catalog.

Set parameters:

Service Type: NodePort
SSL: True
Force HTTP to HTTPS
Enable Dashboard - domain traefik.local

Once deployed, enable load balancer access to the Traefik UI (or create an Ingress, of course).

  kubectl -n traefik edit svc traefik-dashboard

  • Change to LoadBalancer
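
Or do the same thing as a one-liner patch instead of an interactive edit (namespace and Service name as above):

  kubectl -n traefik patch svc traefik-dashboard -p '{"spec":{"type":"LoadBalancer"}}'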

Here's an example of creating an Ingress - this one exposes the Longhorn UI (Longhorn is installed in the storage section below):

  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    namespace: longhorn-system
    labels:
      app: longhorn-frontend
    name: longhorn-ui
  spec:
    rules:
    - host: longhorn-frontend.example.com
      http:
        paths:
        - backend:
            serviceName: longhorn-frontend
            servicePort: http

Establish a storage solution

Longhorn

Longhorn is an open source project created by Rancher Labs.

I love this solution because it's so easy to use in my homelab setting: just point it at any unused, already-formatted disk on each of your nodes. Longhorn will manage all that space as a pool and automatically create replicas of any persistent volumes you create.

Install prerequisite: open-iscsi

  sudo apt-get -y install open-iscsi

Install Longhorn

Since we have Rancher installed now, you can use the app catalog feature to do this. Simply create a project - I called mine Storage. Then add a newly created longhorn-system namespace to that project. Then select the Longhorn app from the catalog and install!

Alternatively you can install using kubectl:

  kubectl apply -f  https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml
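
Either way, you can watch the components come up with:

  kubectl -n longhorn-system get pods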

If you want easy access to the Longhorn UI, change the longhorn-frontend Service to either NodePort or LoadBalancer. For the latter, you will need a load balancer implementation such as MetalLB (set up above).
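
For example, switching the frontend over to a LoadBalancer Service could look like this (assuming the default longhorn-system namespace):

  kubectl -n longhorn-system patch svc longhorn-frontend -p '{"spec":{"type":"LoadBalancer"}}'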

Optionally make one storage class the default

Add the annotation to the desired StorageClass resource:

  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  • Check with kubectl get sc

      NAME                 PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
      longhorn (default)   driver.longhorn.io   Delete          Immediate           true                   21m
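
If you prefer a one-liner, a patch like this sets the same annotation - assuming the class is named longhorn, as above:

  kubectl patch storageclass longhorn -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'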
    

Tear down your cluster when the time comes

  rke remove --config cluster.yaml --ssh-agent-auth

Clean up after removing any distribution

Some components may need manual removal

  sudo rm -rf /var/lib/longhorn /etc/cni/net.d/ /etc/kubernetes /data/longhorn/*

Optionally, ensure that all the Docker containers are retired:

  docker stop `docker ps -a -q`
  docker rm `docker ps -a -q`
