As you may know, I self-host several services such as Matrix (federated messaging), Mastodon (federated microblogging), a CalDAV/CardDAV server, etc.

I have several reasons for self-hosting these services:

  • Own your data instead of storing it on the servers of Big Tech companies that violate your privacy.
  • Be a part of the ‘fediverse’ or Matrix federation, instead of registering on centralized servers.
  • Learn something about DevOps and keep your systems online.

In this series of blog posts, I will explain how I configured my home servers as a Nomad cluster, with Consul as a DNS resolver for the cluster nodes and services.

Architecture

Currently, I have the following hardware at my disposal:

  • A Raspberry Pi (master node)
  • Odroid HC2s (worker nodes)

In a perfect world, the cluster would have redundancy in terms of worker nodes and master nodes, and even run multiple instances of each service to avoid any downtime. However, I don’t have the budget or the space available to accomplish this. Because of this restriction, I will use the Raspberry Pi as the master node and the Odroid HC2s as worker nodes. This way, I have redundancy in terms of worker nodes, but no backup in case the master node goes down. Fortunately, additional master and worker nodes can always be added later on without causing any downtime.

Cluster architecture

Configuring a worker node

Install OS

Most SBCs use a microSD card as the disk for the OS. It’s advised to use at least a UHS-1 class 10 microSD card. Use GParted or GNOME Disks to format the microSD card as EXT4 with an MS-DOS partition table:

GParted for formatting the microSD card

I picked Armbian as the OS for the worker nodes since it provides great support for Odroid devices and comes configured out of the box with zRAM and a tmpfs filesystem for logging. This greatly improves the performance of the SBC and reduces wear on the microSD card.

Note: Armbian can be installed on a disk through armbian-config; however, I keep the OS on the microSD card to make backups easier.

Download the latest Armbian release for your SBC. In my case, I downloaded Armbian 20.04 (focal) with Linux kernel 5.4 for the Odroid HC2.

Armbian download page for the Odroid HC2

Flash Armbian using dd or GNOME Disks. I used GNOME Disks to make this process easy when setting up multiple worker nodes.

  1. Click on your microSD card in GNOME Disks
  2. Under the 3-dots button, you can click on ‘Restore image’
  3. Select the Armbian image and click on ‘Restore’

Grab a cup of coffee; this can take some time :smile:

GNOME Disks for flashing an Armbian image onto the microSD card
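
If you prefer the command line, dd works as well. A minimal sketch, where <ARMBIAN IMAGE>.img is the extracted image and /dev/sdX is a placeholder for your microSD card device (double-check it first, dd overwrites without asking):

lsblk  # Identify the device name of the microSD card (assumed to be /dev/sdX below)
sudo dd if=<ARMBIAN IMAGE>.img of=/dev/sdX bs=4M conv=fsync status=progress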

Log in over SSH

Properly eject your microSD card and put it into the SBC. Let it boot for some time and try to log in over SSH:

ssh root@<IP>

If you don’t know the IP of the SBC, you can use arp-scan:

sudo arp-scan --localnet
Starting arp-scan 1.9.7 with 256 hosts (https://github.com/royhills/arp-scan)
<IP>            <MAC>                   <Ethernet interface name of the device>

Armbian will greet you and ask you to configure a new default USER and passwords. Once configured, log out as root and copy your machine’s SSH key to the SBC:

ssh-copy-id <USER>@<IP>

Try to log in again as your user; you should not get an SSH password prompt if your SSH key is unlocked:

ssh <USER>@<IP>

Install the UFW firewall

Most Linux distributions do not have a firewall installed and enabled by default. I am a big fan of UFW (Uncomplicated Firewall) because it is so easy to configure :smile:

sudo apt install ufw  # Install UFW from the repositories
sudo ufw allow ssh  # Allow SSH access
sudo ufw enable  # Enable firewall
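
Note that UFW now blocks everything except SSH, so the ports used by Consul and Nomad have to be opened as well at some point. A sketch using the default ports (adjust these if you change them in the configs further down):

sudo ufw allow 8300:8302/tcp  # Consul server RPC and Serf gossip
sudo ufw allow 8301:8302/udp  # Consul Serf gossip over UDP
sudo ufw allow 8500/tcp  # Consul HTTP API and UI
sudo ufw allow 8600/tcp  # Consul DNS over TCP
sudo ufw allow 8600/udp  # Consul DNS over UDP
sudo ufw allow 4646:4648/tcp  # Nomad HTTP API, RPC and Serf
sudo ufw allow 4648/udp  # Nomad Serf gossip over UDP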

Mount SSD on boot

The Odroid HC2 has a USB3 <-> SATA converter which I use for accessing an SSD. This SSD needs to be mounted during boot; we can use /etc/fstab for this purpose.

Create a mounting point for your drive:

sudo mkdir /data  # Create the mount point
sudo groupadd data  # Create a group for accessing this drive
sudo usermod -aG data <USER>  # Add the users who need access to the group
sudo chown -R :data /data  # Apply the group change on the mount point

Find the UUID of your drive:

sudo blkid

Write down your UUID and edit /etc/fstab:

sudo vim /etc/fstab

Add the following line to the file:

UUID=<UUID>    /data   auto    nosuid,noatime,nodiratime,nodev,nofail  0   0
  • UUID: The UUID of your drive
  • /data: Location to mount the drive
  • auto: Determine the filesystem automatically
  • nosuid: Ignore setuid/setgid bits on this filesystem for security reasons
  • nodev: Do not interpret device files on this filesystem
  • noatime and nodiratime: Do not update access timestamps, reducing wear on the SSD
  • nofail: Do not fail the boot when the drive cannot be mounted

Test the changes before rebooting:

sudo mount -a
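
If mount -a returns without errors, the SSD should now be visible at the mount point:

findmnt /data  # Shows the source device and filesystem type of the mount
df -h /data  # Shows the size and free space of the mounted SSD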

Run armbian-config

Armbian provides a configuration utility like raspi-config on the Raspberry Pi. Launch it: sudo armbian-config

  • Set the CPU frequency governor to ondemand
  • Switch DTB to HC1/HC2 instead of XU4
  • Disable root and password login through SSH
  • Upgrade firmware to the latest version
  • Configure a static IP; you will need to reconnect over SSH afterwards
  • Configure hostname

Further hardening of SSH

Limit SSH logins to certain users only, such as your default user. If a user like nomad or consul is compromised, it cannot be used to log in over SSH.

sudo vim /etc/ssh/sshd_config
# Add the following at the end of the file:
AllowUsers <USER>

# Check that password login and root access are disabled:
PasswordAuthentication no
PermitRootLogin no
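
# Validate the configuration before restarting, so a typo does not lock you out
# (sshd -t only checks the config file and prints nothing when it is valid)
sudo sshd -t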

# Restart SSH
sudo systemctl restart sshd

Armbian configuration utility

Install cluster software

The cluster is operated by Nomad and Consul, which together form an alternative to Kubernetes.

I picked this setup over Kubernetes because:

  1. Easy to install: just download a binary from the download page and run it on each node. The same binary is used for master and worker nodes.
  2. Fewer components than Kubernetes, which makes maintenance easier.
  3. Nomad can not only run Docker containers as jobs, but also Java applications or even simple scripts.

Consul

Consul is responsible for resolving FQDNs of services and nodes. Consul provides a DNS service on port 8600 and a UI on port 8500. It doesn’t matter on which node you access the UI; the nodes act as a cluster.

Create a user to run Consul and become that user:

sudo useradd consul -m --shell=/bin/bash
sudo su consul

Download Consul for the Odroid HC2:

wget https://releases.hashicorp.com/consul/1.8.4/consul_1.8.4_linux_armhfv6.zip
unzip consul_1.8.4_linux_armhfv6.zip
rm consul_1.8.4_linux_armhfv6.zip

Verify the consul binary:

./consul -v
Consul v1.8.4
Revision 12b16df32

Create a config file /home/consul/config.json for a worker node:

{
    "client_addr": "<IP>",
    "datacenter": "<DATACENTER>",
    "data_dir": "<STORAGE LOCATION>",
    "domain": "consul",
    "dns_config": {
        "enable_truncate": true,
        "only_passing": true
    },
    "enable_syslog": true,
    "encrypt": "<ENCRYPTION KEY>",
    "leave_on_terminate": true,
    "log_level": "INFO",
    "rejoin_after_leave": true,
    "ui": true,
    "start_join": [
        "<IP CONSUL MASTER NODE>"
    ]
}
  • IP: IP address of the node
  • DATACENTER: Name of the datacenter to join
  • STORAGE LOCATION: Location where Consul may write its data
  • ENCRYPTION KEY: A symmetric key used by the Consul agents to encrypt their traffic. You have to generate one (see the sketch after this list).
  • IP CONSUL MASTER NODE: The IP address of the Consul master node. The node will join the Consul cluster by contacting the master node.
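
The gossip encryption key can be generated with Consul itself; run this once and reuse the same value on every node in the cluster:

./consul keygen  # Prints a base64-encoded key that can be used as the "encrypt" value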

Now that Consul is ready to go, we can install Consul as a systemd service by creating a new unit file:

sudo vim /etc/systemd/system/consul.service

Add the following content, where <IP> is the IP address of the node and <CONFIG FILE> is the path to the Consul config file.

[Unit]
Description=Consul cluster leader
Documentation=https://consul.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/home/consul/consul agent -bind <IP> -config-file <CONFIG FILE>
KillMode=process
KillSignal=SIGINT
LimitNOFILE=infinity
LimitNPROC=infinity
Restart=on-failure
RestartSec=10
# Disable rate limiting for restarts
StartLimitIntervalSec=0
TasksMax=infinity

[Install]
WantedBy=multi-user.target

Enable and start the service as your default USER:

sudo systemctl enable consul
sudo systemctl start consul
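
Once the service is running, you can check that the node joined the cluster and that DNS resolution works. A quick sanity check, assuming client_addr is set to the node’s IP as in the config above (dig comes from the dnsutils package):

/home/consul/consul members -http-addr=http://<IP>:8500  # Lists all nodes that joined the datacenter
dig @<IP> -p 8600 consul.service.consul  # Resolves the consul service through Consul DNS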

Nomad

Nomad can run Docker containers, Java VMs and scripts as cluster jobs. It monitors jobs, assigns them to workers and registers everything with Consul. With the Consul integration enabled, services are registered automatically and no extra configuration is needed to access them.

First, as your default USER, create a user to run Nomad and become that user:

sudo useradd nomad -m --shell=/bin/bash
sudo su nomad

The installation of Nomad is almost the same as for Consul: download the binary and verify it:

wget https://releases.hashicorp.com/nomad/0.12.4/nomad_0.12.4_linux_arm.zip
unzip nomad_0.12.4_linux_arm.zip
rm nomad_0.12.4_linux_arm.zip
./nomad -v
Nomad v0.12.4 (8efaee4ba5e9727ab323aaba2ac91c2d7b572d84)

Create a Nomad config /home/nomad/config.hcl and add the following content, where <IP> is the IP address of the node, <STORAGE LOCATION> is a location where Nomad may write to, <DATACENTER NAME> is the name of the datacenter and <NOMAD MASTER NODE> is the IP address of the Nomad master node.

# Increase log verbosity
log_level = "INFO"

# Setup data dir
data_dir = "<STORAGE LOCATION>"

# Datacenter to join
datacenter = "<DATACENTER NAME>"

# Enable the client
client {
    enabled = true

    # This can be changed to nomad.service.consul if DNS forwarding is enabled 
    # with Consul.
    servers = ["<NOMAD MASTER NODE>:4647"]
}

# Disable the server
server {
    enabled = false
}

# Prometheus configuration
telemetry {
    collection_interval = "5s"
    disable_hostname = true
    prometheus_metrics = true
    publish_allocation_metrics = true
    publish_node_metrics = true
}

# Consul configuration
consul {
    address             = "<IP>:8500"
}

Now that Nomad is ready to go, we can install Nomad as a systemd service by creating a new unit file:

sudo vim /etc/systemd/system/nomad.service

Add the following content, with <CONFIG> set to the path of the Nomad config file.

[Unit]
Description=Nomad cluster leader
Documentation=https://nomadproject.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=nomad
Group=nomad
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/home/nomad/nomad agent -config <CONFIG>
KillMode=process
KillSignal=SIGINT
LimitNOFILE=infinity
LimitNPROC=infinity
Restart=on-failure
RestartSec=10
# Disable rate limiting for restarts
StartLimitIntervalSec=0
TasksMax=infinity

[Install]
WantedBy=multi-user.target

Enable and start the service as your default USER:

sudo systemctl enable nomad
sudo systemctl start nomad
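
After a few seconds, the worker should register itself with the master node. A quick check, assuming the HTTP API is reachable on the default port 4646:

/home/nomad/nomad node status  # The worker node should be listed with status 'ready'
sudo systemctl status nomad  # Inspect the service itself if the node does not show up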