# Homelab Kubernetes Pipeline

This repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
Argo CD.

## Architecture

The lab is intentionally small but production-shaped:

- a Debian amd64 host runs the kubeadm control plane and local deployment tools
- a Raspberry Pi arm64 node runs selected workloads
- a provisioning layer can PXE boot Debian 13 arm64 VMs for Pimox worker
  templates
- OpenTofu owns the bootstrap layers for cluster, platform, apps, and edge
- Argo CD continuously reconciles Kubernetes manifests from this repo
- a local registry stores the website and demos images built for the worker
  architecture
- an OCI jump box provides the public edge path back into the homelab over
  Tailscale

Run `./lab.sh up` and `./lab.sh nuke` only from the Debian homelab server. The
script intentionally refuses to run from non-Debian machines so a laptop cannot
accidentally modify the cluster.

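
The host guard described above could look roughly like the sketch below. It is
an illustration only: the function names `is_debian_host` and
`require_debian_host` are hypothetical, and `lab.sh` may detect the host in a
different way. The sketch assumes the standard `ID` field in `/etc/os-release`.

```shell
#!/usr/bin/env bash
# Hypothetical guard; the real check in lab.sh may differ.

# Return 0 when the given os-release file identifies the host as Debian.
# The path is a parameter so the check can be exercised against a fixture.
is_debian_host() {
  local os_release="${1:-/etc/os-release}"
  [ -r "$os_release" ] || return 1
  local id
  # Source the file inside the command substitution subshell so the
  # caller's environment is left untouched.
  id="$(. "$os_release" && printf '%s' "${ID:-}")"
  [ "$id" = "debian" ]
}

# Refuse to continue on anything that is not Debian.
require_debian_host() {
  if ! is_debian_host "$@"; then
    echo "refusing to run: this command must run on the Debian homelab server" >&2
    return 1
  fi
}
```

A script would call `require_debian_host || exit 1` before any command that
mutates the cluster.
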
## Flow

1. `bootstrap/provisioning`
   - prepares a Debian server as a PXE and preseed service for arm64 VMs
   - serves Debian 13 arm64 netboot assets through TFTP and HTTP
   - creates a golden image install path with Kubernetes, containerd,
     qemu-guest-agent, cloud-init, and storage client packages ready
   - stays out of `./lab.sh up` so VM template creation remains manual

2. `bootstrap/cluster`
   - creates the kubeadm control plane on the Debian amd64 node
   - joins worker nodes such as Raspberry Pi arm64 nodes
   - configures a Calico-compatible pod CIDR
   - configures containerd to pull from the in-cluster NodePort registry
   - creates retained host directories under `/var/openebs/local`

3. `bootstrap/platform`
   - installs a minimal Calico deployment through the Tigera operator
   - installs OpenEBS
   - creates `openebs-hostpath-retain`
   - installs Argo CD
   - registers the private GitOps repo without storing the SSH private key in
     Terraform state

4. `bootstrap/apps`
   - registers Argo CD Applications from the `applications` map
   - default apps are `container-registry`, `gitea`, `website-production`, and
     `demos-static`

5. `bootstrap/edge`
   - connects to the OCI jump box
   - uploads nginx, HAProxy, Varnish, and Squid configs
   - obtains and renews Let's Encrypt certificates for the configured hostname
   - runs the edge cache/proxy chain with Docker Compose

## Prerequisites

On the Debian host:

- OpenTofu
- Docker with Buildx
- kubeadm, kubelet, kubectl, and containerd
- SSH access to worker nodes
- SSH access to the OCI edge host
- enough persistent storage for `/var/openebs/local` and `/var/lib/docker`

The default kubeconfig path is `/home/jv/.kube/config`. Override it with
`KUBECONFIG_PATH` or `TF_VAR_kubeconfig_path` when needed.

## Deploying

From the Debian server:

```bash
cd ~/my-homelab-configs
./lab.sh up
```

The script applies the OpenTofu stacks in order, refreshes Argo CD apps, waits
for the local registry, builds the website and demos images when their source
changed, pushes them to the registry, recreates pods only after a new image is
built, and then applies the edge stack.

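
The stack ordering can be pictured as a simple loop. The sketch below is an
assumption about how such a sequence might be written, not the code in
`lab.sh`; `apply_stacks` and its `runner` argument are hypothetical. Passing
`echo` as the runner prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of applying stacks in a fixed order.

# Usage: apply_stacks <runner> <stack>...
# The runner is prefixed to each command: `echo` for a dry run,
# `env` to actually execute tofu.
apply_stacks() {
  local runner="$1"
  shift
  local stack
  for stack in "$@"; do
    # Stop at the first failing stack so later layers never run
    # against a half-applied earlier layer.
    "$runner" tofu -chdir="bootstrap/${stack}" apply -auto-approve || return 1
  done
}

# Dry run of the layers that precede the image builds:
#   apply_stacks echo cluster platform apps
# The edge stack is applied last, after images are built and pushed:
#   apply_stacks echo edge
```

`bootstrap/provisioning` is deliberately absent from the list because it stays
out of `./lab.sh up`.
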
The website and demos images default to `linux/arm64` because both deployments
are pinned to the Raspberry Pi worker. Override with `WEBSITE_IMAGE_PLATFORMS`
or `DEMOS_IMAGE_PLATFORMS` only if node placement changes.

Build metadata is written under `.lab/` so repeat runs can skip the website
or demos image build when the source hash, platform, image reference, and
registry manifest still match.

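
A minimal sketch of the skip logic follows, assuming a single stamp file per
image. The function names and the stamp file layout are hypothetical, and the
registry manifest comparison mentioned above is left out for brevity; the real
metadata format under `.lab/` may differ.

```shell
#!/usr/bin/env bash
# Hypothetical build-skip helpers; not the actual lab.sh implementation.

# Hash every file under a source directory into one stable digest.
source_hash() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    # Sort file names so the digest does not depend on directory order.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum | sha256sum | cut -d' ' -f1
  )
}

# Build fingerprint: source digest plus platform and image reference.
build_fingerprint() {
  local dir="$1" platform="$2" image="$3"
  printf '%s %s %s' "$(source_hash "$dir")" "$platform" "$image"
}

# Return 0 when a build is needed, 1 when the recorded stamp still matches.
needs_build() {
  local dir="$1" platform="$2" image="$3" stamp_file="$4"
  [ ! -f "$stamp_file" ] && return 0
  [ "$(cat "$stamp_file")" != "$(build_fingerprint "$dir" "$platform" "$image")" ]
}

# Record the fingerprint after a successful build and push.
record_build() {
  local dir="$1" platform="$2" image="$3" stamp_file="$4"
  mkdir -p "$(dirname "$stamp_file")"
  build_fingerprint "$dir" "$platform" "$image" > "$stamp_file"
}
```

A caller might run `needs_build apps/website linux/arm64 php-website:latest .lab/website.stamp`
and only invoke the image build when it returns success.
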
## Validation

Useful checks after a rebuild:

```bash
export KUBECONFIG=/home/jv/.kube/config

kubectl get nodes
kubectl -n argocd get applications
kubectl -n container-registry get pods
kubectl -n gitea-system get pods
kubectl -n website-production get pods -o wide
kubectl -n demos-static get pods -o wide

docker info --format '{{.DockerRootDir}}'
df -h / /var/openebs/local /var/lib/docker
```

Reach the website through the configured public hostname, not the raw OCI IP
address, because the Let's Encrypt certificate is issued for the hostname.

## Adding Nodes

For Pimox on Orange Pi 5 Plus, use `bootstrap/provisioning` to create a Debian
13 arm64 golden image first. The layer serves PXE, preseed, and guest-prep
assets from the Debian homelab server, then the installed VM can be sealed and
converted to a Pimox template. Details are in `bootstrap/provisioning/README.md`.

Add entries to `bootstrap/cluster/variables.tf` or a `.tfvars` file:

```hcl
worker_nodes = {
  raspberrypi = {
    host         = "192.168.100.89"
    user         = "jv"
    node_name    = "raspberry"
    ssh_key_path = "/home/jv/.ssh/id_ed25519"
  }
}
```

Stateful apps currently pin retained local PVs to the `debian` node. Move or
duplicate those PV manifests when you want storage on another node.

The website and demos NodePorts are reachable from the OCI jump box through the
Raspberry Pi Tailscale interface. `bootstrap/cluster` installs a persistent
`homelab-tailscale-nodeport.service` on the configured worker to restore the
route, rp_filter settings, and iptables rules after reboot. Override the
defaults through `tailscale_nodeport_access` when the jump-box IP, Pi Tailscale
IP, pod CIDR, primary NodePort, or pod target port changes. Add any additional
public NodePorts to `tailscale_nodeport_extra_ports`:

```hcl
tailscale_nodeport_access = {
  enabled           = true
  worker_key        = "raspberrypi"
  peer_ip           = "100.118.255.19"
  node_tailscale_ip = "100.77.80.72"
  pod_cidr          = "10.244.0.0/16"
  node_port         = 30080
  target_port       = 80
}

tailscale_nodeport_extra_ports = [30081]
```

For `./lab.sh nuke`, set `WORKER_SSH_TARGETS` to a space-separated list of
remote SSH targets when more worker nodes exist. Set it to an empty string for a
single-node rebuild.

## Adding Platform Tools

Add Helm releases through `bootstrap/platform`'s `extra_helm_releases` map.

## Edge Services

The OCI jump box runs the public edge path:

```text
nginx -> HAProxy -> Varnish/Squid -> Raspberry Pi Tailscale NodePort
```

The `bootstrap/edge` stack renders configs from `bootstrap/edge/templates` and
deploys them to `/opt/homelab-edge` on the OCI host. Defaults are in
`bootstrap/edge/variables.tf`; override them through `TF_VAR_*` or a `.tfvars`
file when the public host, SSH key, server name, backend Tailscale IP, or
NodePort changes.

Use the configured `server_name` in the browser, for example
`https://lab2025.duckdns.org`. A raw OCI IP address will still show a browser
certificate warning because the trusted certificate is issued for the hostname.

The edge stack uses HTTP-01 validation, so public DNS for `server_name` must
point to the OCI public IP and inbound TCP 80 and 443 must be open before
`./lab.sh up` runs. Set `TF_VAR_letsencrypt_email` to receive expiry notices,
or leave it empty to register without an email. Set
`TF_VAR_enable_letsencrypt=false` to keep using the temporary local certificate.

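
As a small illustration, the two Let's Encrypt settings can be exported in the
shell before running `./lab.sh up`. The email address below is a placeholder,
not a value from this repo.

```shell
# Contact address for Let's Encrypt expiry notices (placeholder value).
export TF_VAR_letsencrypt_email='admin@example.com'

# Uncomment to skip Let's Encrypt and keep the temporary local certificate,
# for example while DNS or the firewall rules are not ready yet.
# export TF_VAR_enable_letsencrypt=false
```
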
## Adding Apps

Add Kubernetes manifests under `apps/<name>` and register them in
`bootstrap/apps`'s `applications` map. Argo CD will own sync, pruning, and
self-healing for the app.

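
For orientation, the sketch below prints the kind of Argo CD `Application`
that sync, pruning, and self-healing usually correspond to. It is an
assumption for illustration: `bootstrap/apps` generates its Applications
through OpenTofu and may set different fields, and the function name, the
`default` project, the `main` revision, and the arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper that renders an Argo CD Application manifest.

# Usage: render_application <name> <namespace> <repo-url>
render_application() {
  local name="$1" namespace="$2" repo_url="$3"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ${repo_url}
    targetRevision: main
    path: apps/${name}
  destination:
    server: https://kubernetes.default.svc
    namespace: ${namespace}
  syncPolicy:
    automated:
      prune: true      # delete resources removed from the repo
      selfHeal: true   # revert manual drift in the cluster
EOF
}
```

Running `render_application my-app my-app '<repo-url>'` prints a manifest that
can be compared against what `kubectl -n argocd get application my-app -o yaml`
reports.
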
## Storage

OpenEBS provides the platform storage provisioner. Stateful homelab apps use
retained local PV paths such as `/var/openebs/local/gitea` and
`/var/openebs/local/registry`; these paths are intentionally outside kubeadm
reset paths so data can survive cluster destroy/create cycles. Those critical
volumes are declared explicitly as retained local PVs so a rebuilt cluster binds
back to the same host paths instead of creating fresh directories.

For the current lab, `/var/openebs/local` and `/var/lib/docker` are expected to
live on larger storage than the root filesystem. This keeps retained PVs,
container layers, Buildx state, and image caches from filling `/`.

## Gitea

Gitea is deployed from `apps/gitea`, stores data in the retained local PV at
`/var/openebs/local/gitea`, and is exposed through the public edge path at
`https://lab2025.duckdns.org/git/`. HTTP clone and push traffic goes through the
same path. The NodePort remains available inside the lab at port `30300`.

`./lab.sh up` applies the Gitea manifests directly before creating Argo CD
Applications. This keeps the Git service bootstrap-safe if the GitOps repo is
later moved into in-cluster Gitea.

After the repo exists in Gitea, Argo CD can be pointed at the internal service
URL so it no longer depends on the old external Git server:

```bash
export TF_VAR_gitops_repo_url='http://gitea.gitea-system.svc.cluster.local:3000/jv/my-homelab-configs.git'
tofu -chdir=bootstrap/platform apply -auto-approve
tofu -chdir=bootstrap/apps apply -auto-approve
```

## Gitea Backups

`./lab.sh up` installs a Debian-host systemd timer named
`homelab-gitea-backup.timer`. The timer runs daily, executes `gitea dump` inside
the Gitea pod, copies the dump out of Kubernetes, and stores it under
`/var/backups/homelab/gitea` on the Debian server. The default retention is 30
days.

Run a manual backup from the Debian server with:

```bash
./lab.sh backup-gitea
```

Useful checks:

```bash
systemctl list-timers homelab-gitea-backup.timer
sudo systemctl start homelab-gitea-backup.service
sudo ls -lh /var/backups/homelab/gitea
```

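
The 30-day retention could be enforced with a `find` sweep like the sketch
below. This is an assumption about the approach, not the timer's actual
script: `prune_old_backups` is a hypothetical name, and the sketch assumes the
dumps are stored as `.zip` files directly inside the backup directory. The
dump and copy steps are not shown.

```shell
#!/usr/bin/env bash
# Hypothetical retention helper; the backup service may prune differently.

# Usage: prune_old_backups [backup-dir] [retention-days]
prune_old_backups() {
  local backup_dir="${1:-/var/backups/homelab/gitea}"
  local retention_days="${2:-30}"
  # Nothing to prune when the directory has not been created yet.
  [ -d "$backup_dir" ] || return 0
  # -mtime +N matches files last modified more than N full days ago.
  find "$backup_dir" -maxdepth 1 -type f -name '*.zip' \
    -mtime +"$retention_days" -delete
}
```
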
## Gitea Actions

This repo includes a Gitea Actions workflow at
`.gitea/workflows/homelab-main.yml`. It runs only on pushes to `main` and targets
a repository-scoped Debian host runner with the label `homelab-debian`.

The workflow validates shell syntax, Kubernetes manifests, and all OpenTofu
stacks before deployment. It automatically stops when high-impact files under
`bootstrap/cluster`, `bootstrap/platform`, `bootstrap/edge`, `lab.sh`, or
`.gitea/workflows` change; those changes still require a manual Debian run.
Lower-risk app changes proceed to `./lab.sh up` after validation passes.

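
The high-impact gate can be thought of as a path classifier over the changed
files. The sketch below is illustrative and not taken from the workflow; the
function names are hypothetical, and it assumes the changed paths arrive one
per line on standard input, for example from `git diff --name-only`.

```shell
#!/usr/bin/env bash
# Hypothetical change gate; the workflow's own check may differ.

# Return 0 when a repo-relative path belongs to a high-impact area.
is_high_impact() {
  case "$1" in
    bootstrap/cluster/*|bootstrap/platform/*|bootstrap/edge/*) return 0 ;;
    lab.sh|.gitea/workflows/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read changed paths from stdin. Return 1 on the first high-impact path
# so the caller can stop before deployment.
changes_allow_auto_deploy() {
  local path
  while IFS= read -r path; do
    if is_high_impact "$path"; then
      echo "manual Debian run required: ${path}" >&2
      return 1
    fi
  done
  return 0
}
```

A workflow step could pipe the list of changed files into
`changes_allow_auto_deploy` and skip the deploy job when it fails.
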
Enable Actions for the repository in Gitea, then create a repository-level runner
token from:

```text
https://lab2025.duckdns.org/git/jv/my-homelab-configs/settings/actions/runners
```

Register and start the Debian runner from the Debian server:

```bash
cd ~/my-homelab-configs
GITEA_RUNNER_REGISTRATION_TOKEN='<repo-runner-token>' ./lab.sh install-gitea-runner
```

The runner is installed as `homelab-gitea-runner.service`, runs as user `jv`, and
uses a host label instead of a Docker job container because deployment needs the
Debian host's Docker, OpenTofu, kubeconfig, SSH keys, and local state.

The deployment job is non-interactive. User `jv` must be able to run `sudo -n
true` on the Debian host or the workflow will fail before deployment.

Useful checks:

```bash
systemctl status homelab-gitea-runner.service
journalctl -u homelab-gitea-runner.service -n 100 --no-pager
```

## Destructive Rebuilds

`./lab.sh nuke` resets kubeadm, containerd runtime state, CNI files, Calico
links, iptables rules, local OpenTofu state, and configured worker nodes. It does
not delete retained data under `/var/openebs/local`.

For multi-node labs, set `WORKER_SSH_TARGETS` to a space-separated list of SSH
targets. For a single-node rebuild, set it to an empty string.

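
A space-separated target list is typically consumed with a loop like the one
below. This is a sketch under assumptions: `for_each_worker` is a hypothetical
helper, the second target name is a placeholder, and treating an unset
variable like an empty one is a choice made here, not documented behavior of
`lab.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical iteration over WORKER_SSH_TARGETS.

# Usage: for_each_worker <command> [args...]
# Runs the command once per target, appending the target as the last argument.
for_each_worker() {
  local target
  local -a targets
  # Split on whitespace; an empty or unset variable yields no targets.
  read -r -a targets <<< "${WORKER_SSH_TARGETS:-}"
  for target in "${targets[@]}"; do
    "$@" "$target" || return 1
  done
}

# Dry run with two targets:
#   WORKER_SSH_TARGETS='jv@192.168.100.89 jv@worker2' for_each_worker echo ssh
# Single-node rebuild, no workers touched:
#   WORKER_SSH_TARGETS='' for_each_worker echo ssh
```
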
## Website App

The website is a PHP app under `apps/website`. It includes a home page, CV page,
blog page, and demos page, plus a lightweight translation flow backed by Ollama.
Static language files live in `apps/website/lang`; unsupported browser languages
can be translated by the client and saved through `save_lang.php` as runtime
JSON data on the website PVC.

The CV page has two client-side presentation modes:

- `Elegant`: dark, minimal, terminal-inspired styling with a square profile
  image and light green console text.
- `Fancy`: centered circular profile image, cursive orbit text, and a
  cursor-following portrait rotation effect.

The Demos page is a catalog in the PHP website. The actual demo applications are
served from a separate `demos-static` artifact under `apps/demos-static` and are
published through the `demos-static` Argo CD application. Public traffic reaches
them through the edge path at `/demo-apps/`.

`./lab.sh up` builds and pushes two independent images:

- `php-website:latest` from `apps/website`
- `demos-static:latest` from `apps/demos-static`

The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
private, browser-only image compression and conversion using native Canvas APIs.
Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled
to WebAssembly with a TypeScript UI so the codec work stays fast and still
avoids backend uploads.

The demos are designed to be local-first so the current cluster can serve them
from the Raspberry Pi worker without turning either pod into an application
server. The website pod serves the portfolio shell and the `demos-static` pod
serves static demo bundles; CPU-heavy work runs in the visitor's browser. With
the current deployments pinned to the Raspberry Pi, avoid bundling large ML
models, server-side WebSocket probes, or backend video transcoders into either
image. If those demos become production-grade, lazy load model assets in the
browser or move backend workers to a larger node, such as VMs on the Orange Pi 5
Plus.

Current demo inventory:

- Client-side media cruncher: image conversion/compression with Canvas; future
  Rust/Wasm codec path for video.
- Internet quality visualizer: live Canvas graph for latency, jitter, and
  stability using same-origin browser probes; a dedicated WebSocket echo endpoint
  would be the production version.
- Local log and JSON toolbelt: JSON formatting, JWT decoding, URL parsing, and
  local text-log filtering.
- Architecture simulator: click-driven load, crash, and auto-scale simulation.
- Offline traveler converter: PWA shell with timezone, currency, and GB/GiB
  conversions.
- Privacy-first redactor: local image redaction prototype; future
  onnxruntime-web plus quantized YOLO or face model path.
- Local sentiment sandbox: lightweight local sentiment, keyword, and summary
  prototype; future Transformers.js/ONNX path.
- Model drift simulator: visual MLOps playground for spikes, corrupted inputs,
  and retraining.

The Kubernetes deployment uses `apps/website/web-app.yaml`. Keep the image
reference there aligned with `TF_VAR_registry_endpoint`, because `lab.sh` derives
the registry endpoint from that manifest.

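
One way to check that alignment is to extract the registry part of the first
`image:` line and compare it with the variable. The sketch below is an
illustration, not the parsing `lab.sh` performs: both function names are
hypothetical, and it assumes an unquoted image reference of the form
`<host>:<port>/<name>:<tag>`.

```shell
#!/usr/bin/env bash
# Hypothetical alignment check between a manifest and TF_VAR_registry_endpoint.

# Print the registry part (everything before the first slash) of the first
# `image:` value found in the manifest.
registry_endpoint_from_manifest() {
  local manifest="$1"
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*image:[[:space:]]*\([^/[:space:]]*\)\/.*$/\1/p' \
    "$manifest" | head -n 1
}

# Return 0 when the manifest and TF_VAR_registry_endpoint agree.
registry_endpoint_aligned() {
  local manifest="$1" derived
  derived="$(registry_endpoint_from_manifest "$manifest")"
  if [ -z "$derived" ] || [ "$derived" != "${TF_VAR_registry_endpoint:-}" ]; then
    echo "registry mismatch: manifest='${derived}' TF_VAR_registry_endpoint='${TF_VAR_registry_endpoint:-}'" >&2
    return 1
  fi
}
```

For example, `registry_endpoint_aligned apps/website/web-app.yaml` could run as
a pre-flight step before building images.
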
Keep the `.terraform.lock.hcl` files committed. They pin provider selections and
make bootstrap behavior reproducible across nodes and rebuilds.