# Homelab Kubernetes Pipeline

This repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
Argo CD.

## Architecture

The lab is intentionally small but production-shaped:

- a Debian amd64 host runs the kubeadm control plane and local deployment tools
- a Raspberry Pi arm64 node runs selected workloads
- a provisioning layer can PXE boot Debian 13 arm64 VMs for Pimox worker
  templates
- OpenTofu owns the bootstrap layers for cluster, platform, apps, and edge
- Argo CD continuously reconciles Kubernetes manifests from this repo
- a local registry stores the website and demos images built for the worker
  architecture
- an OCI jump box provides the public edge path back into the homelab over
  Tailscale

Run `./lab.sh up` and `./lab.sh nuke` only from the Debian homelab server. The
script intentionally refuses to run from non-Debian machines so a laptop cannot
accidentally modify the cluster.
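
The host guard described above could be sketched as follows. This is a minimal illustration, not the actual check in `lab.sh`: the function names `is_debian_host` and `require_debian_host` are hypothetical, and the sketch assumes the host is identified through the `ID` field of `/etc/os-release`.

```bash
#!/usr/bin/env bash
# Hypothetical guard; the real check inside lab.sh may differ.

# Return 0 when the given os-release file identifies the host as Debian.
is_debian_host() {
  local os_release="${1:-/etc/os-release}"
  [ -r "$os_release" ] || return 1
  # os-release holds shell-compatible KEY=value assignments. Source it in a
  # subshell so its variables do not leak into the caller.
  ( . "$os_release" && [ "${ID:-}" = "debian" ] )
}

# Print an error and return 1 unless the host is Debian.
require_debian_host() {
  if ! is_debian_host "$@"; then
    echo "refusing to run: this host is not Debian" >&2
    return 1
  fi
}
```

A script would call `require_debian_host || exit 1` before touching the cluster. Matching on `ID` only means derivatives that set `ID_LIKE=debian` are rejected too, which fits the stated intent of keeping laptops away from the cluster.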

## Flow

1. `bootstrap/provisioning`
   - prepares a Debian server as a PXE and preseed service for arm64 VMs
   - serves Debian 13 arm64 netboot assets through TFTP and HTTP
   - creates a golden image install path with Kubernetes, containerd,
     qemu-guest-agent, cloud-init, and storage client packages ready
   - stays out of `./lab.sh up` so VM template creation remains manual

2. `bootstrap/cluster`
   - creates the kubeadm control plane on the Debian amd64 node
   - joins worker nodes such as Raspberry Pi arm64 nodes
   - configures a Calico-compatible pod CIDR
   - configures containerd to pull from the in-cluster NodePort registry
   - creates retained host directories under `/var/openebs/local`

3. `bootstrap/platform`
   - installs a minimal Calico deployment through the Tigera operator
   - installs OpenEBS
   - creates `openebs-hostpath-retain`
   - installs Argo CD
   - registers the private GitOps repo without storing the SSH private key in
     OpenTofu state

4. `bootstrap/apps`
   - registers Argo CD Applications from the `applications` map
   - default apps are `container-registry`, `gitea`, `website-production`, and
     `demos-static`

5. `bootstrap/edge`
   - connects to the OCI jump box
   - uploads nginx, HAProxy, Varnish, and Squid configs
   - obtains and renews Let's Encrypt certificates for the configured hostname
   - runs the edge cache/proxy chain with Docker Compose

## Prerequisites

On the Debian host:

- OpenTofu
- Docker with Buildx
- kubeadm, kubelet, kubectl, and containerd
- SSH access to worker nodes
- SSH access to the OCI edge host
- enough persistent storage for `/var/openebs/local` and `/var/lib/docker`

The default kubeconfig path is `/home/jv/.kube/config`. Override it with
`KUBECONFIG_PATH` or `TF_VAR_kubeconfig_path` when needed.
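
One way the override could resolve is sketched below. The helper name `resolve_kubeconfig_path` is hypothetical, and the precedence shown (`TF_VAR_kubeconfig_path` first, then `KUBECONFIG_PATH`, then the default) is an assumption; check `lab.sh` for the order it actually uses.

```bash
# Hypothetical helper; the precedence between the two variables is assumed.
resolve_kubeconfig_path() {
  local default_path="/home/jv/.kube/config"
  if [ -n "${TF_VAR_kubeconfig_path:-}" ]; then
    printf '%s\n' "$TF_VAR_kubeconfig_path"
  elif [ -n "${KUBECONFIG_PATH:-}" ]; then
    printf '%s\n' "$KUBECONFIG_PATH"
  else
    printf '%s\n' "$default_path"
  fi
}
```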

## Deploying

From the Debian server:

```bash
cd ~/my-homelab-configs
./lab.sh up
```

The script applies the OpenTofu stacks in order, refreshes Argo CD apps, waits
for the local registry, builds the website and demos images when their source
changed, pushes them to the registry, recreates pods only after a new image is
built, and then applies the edge stack.
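
As a rough picture of that ordering, the sketch below prints the OpenTofu commands a run might issue without executing them. The function name `print_up_plan` is hypothetical, and the stack order is inferred from the Flow section rather than read from `lab.sh`.

```bash
# Hypothetical dry-run helper: prints commands only, executes nothing.
print_up_plan() {
  local stack
  # Core stacks first, in the order listed under Flow.
  for stack in cluster platform apps; do
    echo "tofu -chdir=bootstrap/$stack apply -auto-approve"
  done
  # Image work happens between the apps stack and the edge stack.
  echo "# refresh Argo CD apps, wait for the registry, build and push changed images"
  echo "tofu -chdir=bootstrap/edge apply -auto-approve"
}
```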

The website and demos images default to `linux/arm64` because both deployments
are pinned to the Raspberry Pi worker. Override with `WEBSITE_IMAGE_PLATFORMS`
or `DEMOS_IMAGE_PLATFORMS` only if node placement changes.

Build metadata is written under `.lab/` so repeat runs can skip the website
or demos image build when the source hash, platform, image reference, and
registry manifest still match.
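
The skip decision could look roughly like the sketch below. The helper names, the fingerprint recipe, and the state file layout are all assumptions for illustration. The sketch also leaves out the registry manifest comparison, which would need a query against the registry.

```bash
# Hypothetical helpers; lab.sh may compute and store this differently.

# Print a stable SHA-256 over every file under a source directory.
source_hash() {
  local src_dir="$1"
  ( cd "$src_dir" &&
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 sha256sum |
      sha256sum | cut -d' ' -f1 )
}

# Combine the inputs that decide whether an image must be rebuilt.
build_fingerprint() {
  local src_dir="$1" platform="$2" image_ref="$3"
  printf '%s\n%s\n%s\n' "$(source_hash "$src_dir")" "$platform" "$image_ref" |
    sha256sum | cut -d' ' -f1
}

# Return 0 (build needed) when the stored fingerprint is missing or differs.
needs_build() {
  local state_file="$1" fingerprint="$2"
  [ ! -f "$state_file" ] || [ "$(cat "$state_file")" != "$fingerprint" ]
}
```

A caller would compute the fingerprint, build only when `needs_build` succeeds, and write the new fingerprint to a file under `.lab/` after a successful push. Sorting the file list with `LC_ALL=C` keeps the hash independent of locale and directory traversal order.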

## Validation

Useful checks after a rebuild:

```bash
export KUBECONFIG=/home/jv/.kube/config

kubectl get nodes
kubectl -n argocd get applications
kubectl -n container-registry get pods
kubectl -n gitea-system get pods
kubectl -n website-production get pods -o wide
kubectl -n demos-static get pods -o wide

docker info --format '{{.DockerRootDir}}'
df -h / /var/openebs/local /var/lib/docker
```

The website should be reached through the configured public hostname, not the raw
OCI IP address, because the Let's Encrypt certificate is issued for the
hostname.

## Adding Nodes

For Pimox on Orange Pi 5 Plus, use `bootstrap/provisioning` to create a Debian
13 arm64 golden image first. The layer serves PXE, preseed, and guest-prep
assets from the Debian homelab server, then the installed VM can be sealed and
converted to a Pimox template. Details are in `bootstrap/provisioning/README.md`.

Add entries to the `worker_nodes` map in `bootstrap/cluster/variables.tf` or a
`.tfvars` file:

```hcl
worker_nodes = {
  raspberrypi = {
    host         = "192.168.100.89"
    user         = "jv"
    node_name    = "raspberry"
    ssh_key_path = "/home/jv/.ssh/id_ed25519"
  }
}
```

Stateful apps currently pin retained local PVs to the `debian` node. Move or
duplicate those PV manifests when you want storage on another node.

The website and demos NodePorts are reachable from the OCI jump box through the
Raspberry Pi Tailscale interface. `bootstrap/cluster` installs a persistent
`homelab-tailscale-nodeport.service` on the configured worker to restore the
route, rp_filter settings, and iptables rules after reboot. Override the
defaults through `tailscale_nodeport_access` when the jump-box IP, Pi Tailscale
IP, pod CIDR, primary NodePort, or pod target port changes. Add any additional
public NodePorts to `tailscale_nodeport_extra_ports`:

```hcl
tailscale_nodeport_access = {
  enabled           = true
  worker_key        = "raspberrypi"
  peer_ip           = "100.118.255.19"
  node_tailscale_ip = "100.77.80.72"
  pod_cidr          = "10.244.0.0/16"
  node_port         = 30080
  target_port       = 80
}

tailscale_nodeport_extra_ports = [30081]
```

For `./lab.sh nuke`, set `WORKER_SSH_TARGETS` to a space-separated list of
remote SSH targets when more worker nodes exist. Set it to an empty string for a
single-node rebuild.
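
A space-separated list like this is typically consumed with plain word splitting. The sketch below shows one way to do it; `list_worker_targets` is a hypothetical helper and not necessarily how `lab.sh` reads the variable.

```bash
# Hypothetical helper: print one SSH target per line.
# An empty list prints nothing, which models a single-node rebuild.
list_worker_targets() {
  local targets="${1-${WORKER_SSH_TARGETS-}}"
  local target
  (
    # Disable globbing so a stray '*' in the list is never expanded.
    set -f
    # The unquoted expansion is deliberate: it splits the list on whitespace.
    for target in $targets; do
      printf '%s\n' "$target"
    done
  )
}
```

For example, `list_worker_targets "jv@192.168.100.89 jv@worker2"` prints two lines, where `jv@worker2` is a made-up target, and `list_worker_targets ""` prints nothing.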

## Adding Platform Tools

Add Helm releases through `bootstrap/platform`'s `extra_helm_releases` map.

## Edge Services

The OCI jump box runs the public edge path:

```text
nginx -> HAProxy -> Varnish/Squid -> Raspberry Pi Tailscale NodePort
```

The `bootstrap/edge` stack renders configs from `bootstrap/edge/templates` and
deploys them to `/opt/homelab-edge` on the OCI host. Defaults are in
`bootstrap/edge/variables.tf`; override them through `TF_VAR_*` or a `.tfvars`
file when the public host, SSH key, server name, backend Tailscale IP, or
NodePort changes.

Use the configured `server_name` in the browser, for example
`https://lab2025.duckdns.org`. A raw OCI IP address will still show a browser
certificate warning because the trusted certificate is issued for the hostname.

The edge stack uses HTTP-01 validation, so public DNS for `server_name` must
point to the OCI public IP and inbound TCP 80 and 443 must be open before
`./lab.sh up` runs. Set `TF_VAR_letsencrypt_email` to receive expiry notices,
or leave it empty to register without an email. Set
`TF_VAR_enable_letsencrypt=false` to keep using the temporary local certificate.
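
The DNS precondition can be checked before a run with something like the sketch below. Both function names are hypothetical. The lookup uses the local resolver through `getent ahostsv4`, so an `/etc/hosts` entry or split DNS could give a different answer than public DNS. The port 80 and 443 check is left out because it only means something when run from outside the OCI network.

```bash
# Hypothetical preflight helpers; not part of the edge stack itself.

# Return 0 when the expected IP appears among the addresses read from stdin
# (one address per line). Matching is exact, so 1.2.3.4 never matches 11.2.3.4.
resolves_to() {
  local expected_ip="$1"
  grep -Fxq -- "$expected_ip"
}

# Check that server_name resolves to the OCI public IP.
check_http01_dns() {
  local server_name="$1" expected_ip="$2"
  getent ahostsv4 "$server_name" | awk '{print $1}' | sort -u |
    resolves_to "$expected_ip"
}
```

A call such as `check_http01_dns lab2025.duckdns.org 203.0.113.10` returns non-zero when the record points elsewhere; the IP here is a documentation placeholder, not the real OCI address.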

## Adding Apps

Add Kubernetes manifests under `apps/<name>` and register them in
`bootstrap/apps`'s `applications` map. Argo CD will own sync, pruning, and
self-healing for the app.

## Storage

OpenEBS provides the platform storage provisioner. Stateful homelab apps use
retained local PV paths such as `/var/openebs/local/gitea` and
`/var/openebs/local/registry`; these paths are intentionally outside kubeadm
reset paths so data can survive cluster destroy/create cycles. Those critical
volumes are declared explicitly as retained local PVs so a rebuilt cluster binds
back to the same host paths instead of creating fresh directories.

For the current lab, `/var/openebs/local` and `/var/lib/docker` are expected to
live on larger storage than the root filesystem. This keeps retained PVs,
container layers, Buildx state, and image caches from filling `/`.

## Gitea

Gitea is deployed from `apps/gitea`, stores data in the retained local PV at
`/var/openebs/local/gitea`, and is exposed through the public edge path at
`https://lab2025.duckdns.org/git/`. HTTP clone and push traffic goes through the
same path. The NodePort remains available inside the lab at port `30300`.

`./lab.sh up` applies the Gitea manifests directly before creating Argo CD
Applications. This keeps the Git service bootstrap-safe if the GitOps repo is
later moved into in-cluster Gitea.

After the repo exists in Gitea, Argo CD can be pointed at the internal service
URL so it no longer depends on the old external Git server:

```bash
export TF_VAR_gitops_repo_url='http://gitea.gitea-system.svc.cluster.local:3000/jv/my-homelab-configs.git'
tofu -chdir=bootstrap/platform apply -auto-approve
tofu -chdir=bootstrap/apps apply -auto-approve
```

## Gitea Backups

`./lab.sh up` installs a Debian-host systemd timer named
`homelab-gitea-backup.timer`. The timer runs daily, executes `gitea dump` inside
the Gitea pod, copies the dump out of Kubernetes, and stores it under
`/var/backups/homelab/gitea` on the Debian server. The default retention is 30
days.
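
Retention of this kind is commonly enforced with `find -mtime`. The sketch below is an assumption about the approach, not the backup script itself, and `prune_old_backups` is a hypothetical name.

```bash
# Hypothetical pruning step; the real backup job may differ.
# Delete regular files in backup_dir last modified more than
# retention_days days ago (default 30).
prune_old_backups() {
  local backup_dir="$1" retention_days="${2:-30}"
  find "$backup_dir" -maxdepth 1 -type f -mtime "+$retention_days" -delete
}
```

`-mtime +30` counts whole 24-hour periods, so a file is removed once it is at least 31 days old. The sketch does not filter by file name, so it assumes the directory holds only dumps.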

Run a manual backup from the Debian server with:

```bash
./lab.sh backup-gitea
```

Useful checks:

```bash
systemctl list-timers homelab-gitea-backup.timer
sudo systemctl start homelab-gitea-backup.service
sudo ls -lh /var/backups/homelab/gitea
```

## Gitea Actions

This repo includes a Gitea Actions workflow at
`.gitea/workflows/homelab-main.yml`. It runs only on pushes to `main` and targets
a repository-scoped Debian host runner with the label `homelab-debian`.

The workflow validates shell syntax, Kubernetes manifests, and all OpenTofu
stacks before deployment. It automatically stops when high-impact files under
`bootstrap/cluster`, `bootstrap/platform`, `bootstrap/edge`, `lab.sh`, or
`.gitea/workflows` change; those changes still require a manual Debian run.
Lower-risk app changes proceed to `./lab.sh up` after validation passes.
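
The gate could be expressed as a path filter like the sketch below. The function names are hypothetical and the workflow file may implement the rule differently. Only the path list comes from the description above.

```bash
# Hypothetical gate; the actual workflow logic may differ.

# Return 0 when a changed path belongs to the high-impact set.
is_high_impact() {
  case "$1" in
    bootstrap/cluster/*|bootstrap/platform/*|bootstrap/edge/*|lab.sh|.gitea/workflows/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Read changed paths on stdin, one per line (for example from
# `git diff --name-only`), and return 0 only when none is high impact.
safe_to_auto_deploy() {
  local path
  while IFS= read -r path; do
    if is_high_impact "$path"; then
      echo "high-impact change: $path" >&2
      return 1
    fi
  done
  return 0
}
```

In a `case` pattern `*` also matches `/`, so nested files such as `bootstrap/edge/templates/nginx.conf` are caught without listing each directory level; that file name is only an example.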

Enable Actions for the repository in Gitea, then create a repository-level runner
token from:

```text
https://lab2025.duckdns.org/git/jv/my-homelab-configs/settings/actions/runners
```

Register and start the Debian runner from the Debian server:

```bash
cd ~/my-homelab-configs
GITEA_RUNNER_REGISTRATION_TOKEN='<repo-runner-token>' ./lab.sh install-gitea-runner
```

The runner is installed as `homelab-gitea-runner.service`, runs as user `jv`, and
uses a host label instead of a Docker job container because deployment needs the
Debian host's Docker, OpenTofu, kubeconfig, SSH keys, and local state.

The deployment job is non-interactive. User `jv` must have passwordless sudo on
the Debian host: if `sudo -n true` fails, the workflow stops before deployment.

Useful checks:

```bash
systemctl status homelab-gitea-runner.service
journalctl -u homelab-gitea-runner.service -n 100 --no-pager
```

## Destructive Rebuilds

`./lab.sh nuke` resets kubeadm, containerd runtime state, CNI files, Calico
links, iptables rules, local OpenTofu state, and configured worker nodes. It does
not delete retained data under `/var/openebs/local`.

For multi-node labs, set `WORKER_SSH_TARGETS` to a space-separated list of SSH
targets. For a single-node rebuild, set it to an empty string.

## Website App

The website is a PHP app under `apps/website`. It includes a home page, CV page,
blog page, and demos page, plus a lightweight translation flow backed by Ollama.
Static language files live in `apps/website/lang`; unsupported browser languages
can be translated by the client and saved through `save_lang.php` as runtime
JSON data on the website PVC.

The CV page has two client-side presentation modes:

- `Elegant`: dark, minimal, terminal-inspired styling with a square profile
  image and light green console text.
- `Fancy`: centered circular profile image, cursive orbit text, and a
  cursor-following portrait rotation effect.

The Demos page is a catalog in the PHP website. The actual demo applications are
served from a separate `demos-static` artifact under `apps/demos-static` and are
published through the `demos-static` Argo CD application. Public traffic reaches
them through the edge path at `/demo-apps/`.

`./lab.sh up` builds and pushes two independent images:

- `php-website:latest` from `apps/website`
- `demos-static:latest` from `apps/demos-static`

The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
private, browser-only image compression and conversion using native Canvas APIs.
Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled
to WebAssembly with a TypeScript UI so the codec work stays fast and still
avoids backend uploads.

The demos are designed to be local-first so the current cluster can serve them
from the Raspberry Pi worker without turning either pod into an application
server. The website pod serves the portfolio shell and the `demos-static` pod
serves static demo bundles; CPU-heavy work runs in the visitor's browser. With
the current deployments pinned to the Raspberry Pi, avoid bundling large ML
models, server-side WebSocket probes, or backend video transcoders into either
image. If those demos become production-grade, lazy load model assets in the
browser or move backend workers to a larger node, such as VMs on the Orange Pi 5
Plus.

Current demo inventory:

- Client-side media cruncher: image conversion/compression with Canvas; future
  Rust/Wasm codec path for video.
- Internet quality visualizer: live Canvas graph for latency, jitter, and
  stability using same-origin browser probes; a dedicated WebSocket echo endpoint
  would be the production version.
- Local log and JSON toolbelt: JSON formatting, JWT decoding, URL parsing, and
  local text-log filtering.
- Architecture simulator: click-driven load, crash, and auto-scale simulation.
- Offline traveler converter: PWA shell with timezone, currency, and GB/GiB
  conversions.
- Privacy-first redactor: local image redaction prototype; future
  onnxruntime-web plus quantized YOLO or face model path.
- Local sentiment sandbox: lightweight local sentiment, keyword, and summary
  prototype; future Transformers.js/ONNX path.
- Model drift simulator: visual MLOps playground for spikes, corrupted inputs,
  and retraining.

The Kubernetes deployment uses `apps/website/web-app.yaml`. Keep the image
reference there aligned with `TF_VAR_registry_endpoint`, because `lab.sh` derives
the registry endpoint from that manifest.
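
Deriving the endpoint from the manifest might look like the sketch below. It is an illustration under assumptions: the function name is hypothetical, the image reference is unquoted, and it carries an explicit `host:port/` registry prefix. The parsing in `lab.sh` may differ.

```bash
# Hypothetical parser: print the registry endpoint (the part before the first
# slash) of the first image reference in a manifest.
registry_endpoint_from_manifest() {
  awk '
    # Match "image: <ref>" as well as the list form "- image: <ref>".
    $1 == "image:" || ($1 == "-" && $2 == "image:") {
      ref = $NF
      if (index(ref, "/") == 0) next   # no registry prefix in this reference
      sub(/\/.*/, "", ref)             # keep everything before the first slash
      print ref
      exit
    }
  ' "$1"
}
```

Comparing the output with `TF_VAR_registry_endpoint` before a run gives an early warning when the manifest and the variable have drifted apart.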

Keep the `.terraform.lock.hcl` files committed. They pin provider selections and
make bootstrap behavior reproducible across nodes and rebuilds.