From b76075e0fce3869a2cb164e7ca318d850e21fb23 Mon Sep 17 00:00:00 2001
From: juvdiaz
Date: Fri, 29 May 2026 15:14:51 -0600
Subject: [PATCH] Update READMEs

---
 README.md                        | 46 +++++++++++++++++++++++---------
 apps/mlops-platform/README.md    | 30 ++++++++++++++++++++-
 bootstrap/provisioning/README.md | 11 +++++++-
 infra/gitea/README.md            | 19 +++++++++++++
 4 files changed, 91 insertions(+), 15 deletions(-)

diff --git a/README.md b/README.md
index b940229..e4f704a 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Homelab Kubernetes Pipeline
 
-This cool repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
+This repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
 Argo CD.
 
 ## Architecture
@@ -59,6 +59,8 @@ accidentally modify the cluster.
 
 4. `bootstrap/apps`
    - registers Argo CD Applications from the `applications` map
+   - passes the website image produced by the build step into Argo CD as a
+     Kustomize image override
    - default apps are `container-registry`, `website-production`, and
      `demos-static`
 
@@ -109,12 +111,11 @@ clones on `nvme_thin_pool` by default, checks that the Pimox bridge already
 exists, refuses `local` as worker clone storage, and refuses to edit Orange Pi
 host networking.
 
-`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to `1` because the first Pimox worker
-slot was created manually. With the default `LAB_PIMOX_WORKER_COUNT=1`, the
-pipeline keeps the template current and leaves VMID `9010` alone. Set
-`LAB_PIMOX_SKIP_WORKER_INDEXES=''` if you want the pipeline to own the first
-slot, or set `LAB_PIMOX_WORKER_COUNT=2` to manage the second slot while still
-skipping the first.
+`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to an empty value, so the pipeline owns
+worker index `1` and VMID `9010` when `LAB_PIMOX_WORKER_COUNT=1`. Set
+`LAB_PIMOX_SKIP_WORKER_INDEXES=1` if an existing manually created first worker
+must be left untouched, or set `LAB_PIMOX_WORKER_COUNT=2` to manage both VMID
+`9010` and VMID `9011`.
 
 OpenWrt firewall VM automation is available as a standalone command because it
 attaches to both WAN and LAN bridges. Run `./lab.sh openwrt` after `vmbr1`
@@ -186,10 +187,13 @@ That path preserves external Raspberry Pi Gitea, rebuilds the Pimox template
 with 2 cores and 4 GiB memory, replaces two Pimox worker VMs with 2 cores and 4
 GiB memory, and joins those workers to the Kubernetes cluster. CPU affinity is
 disabled by default because the Bullseye-pinned Pimox `qm` does not support it.
-The Raspberry Pi worker is excluded by default while it hosts external Gitea.
+The Raspberry Pi is still included as a Kubernetes worker by default; `nuke`
+does not clean it unless you explicitly add it to `WORKER_SSH_TARGETS`, so the
+external Gitea Docker service survives cluster rebuilds.
 
-To opt the Raspberry Pi back into the Kubernetes cluster, set
-`LAB_INCLUDE_RASPBERRY_WORKER=true` or add entries to
+To exclude the Raspberry Pi from the Kubernetes cluster, set
+`LAB_INCLUDE_RASPBERRY_WORKER=false`. To manage workers manually instead, add
+entries to
 `bootstrap/cluster/variables.tf` or a `.tfvars` file:
 
 ```hcl
@@ -390,6 +394,12 @@ Argo CD consumes that Debian mirror through the default `gitops_repo_url`. Gitea
 Actions pushes the `main` commit into the mirror before running the selected
 deploy command.
 
+The platform bootstrap registers the Argo CD repository secret and the SSH host
+key for the Debian GitOps mirror. If Argo CD reports
+`knownhosts: key is unknown` after the Debian host was rebuilt or its SSH host
+key changed, refresh `argocd-ssh-known-hosts-cm` in the `argocd` namespace,
+restart `argocd-repo-server`, and hard-refresh the affected Application.
+
 Deploy or refresh the external Gitea container from the Debian host with:
 
 ```bash
@@ -510,9 +520,13 @@ cleaned unless you explicitly include it.
 
 The website is a PHP app under `apps/website`. It includes a home page, CV page,
 blog page, and demos page, plus a lightweight translation flow backed by Ollama.
-Static language files live in `apps/website/lang`; unsupported browser languages
-can be translated by the client and saved through `save_lang.php` as runtime
-JSON data on the website PVC.
+Static language files live in `apps/website/lang`; `en.php` and `nah.php` are
+curated source files, with the Nahuatl home page intentionally biased toward as
+many Nahuatl words as possible while keeping technical terms understandable.
+Unsupported browser languages use the same-origin `/translate.php` endpoint,
+which calls Ollama server-side through `OLLAMA_HOST` and `OLLAMA_MODEL`; the
+browser never calls the private Ollama IP directly. Generated runtime language
+JSON is saved through `save_lang.php` on the website PVC.
 
 The CV page has two client-side presentation modes:
 
@@ -539,6 +553,12 @@ During bootstrap, `lab.sh` hashes `apps/website`, builds
 Kustomize. This keeps the GitOps source generic while the deployed image remains
 immutable.
 
+After `./lab.sh apps`, the live deployment image should be a content-hash tag,
+for example `192.168.100.68:30500/php-website:src-...`. If it still shows
+`php-website:latest`, Argo CD has not rendered the current Application source.
+Check the `website-production` Application source, sync status, and repository
+access before restarting pods.
+
 The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
 private, browser-only image compression and conversion using native Canvas APIs.
 Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled
diff --git a/apps/mlops-platform/README.md b/apps/mlops-platform/README.md
index 37a416f..7eea04e 100644
--- a/apps/mlops-platform/README.md
+++ b/apps/mlops-platform/README.md
@@ -1,6 +1,13 @@
 # MLOps Platform Demo
 
-Production-shaped inference demo for the portfolio site. The model is intentionally small: logistic regression coefficients trained with scikit-learn and exported to JSON so the runtime stays light enough for the homelab.
+Production-shaped inference demo for the portfolio site. The model is
+intentionally small: logistic regression coefficients trained with scikit-learn
+and exported to JSON so the runtime stays light enough for the homelab.
+
+This directory contains the FastAPI service and model artifacts. It is not yet
+registered as a Kubernetes application in `bootstrap/apps`; the public website
+currently links to the reserved static placeholder under
+`apps/demos-static/public/mlops-platform/`.
 
 ## Endpoints
 
@@ -13,3 +20,24 @@ Production-shaped inference demo for the portfolio site. The model is intentiona
 - `MODEL_VERSION=v1`, `MODEL_TRACK=blue` is the stable route.
 - `MODEL_VERSION=v2`, `MODEL_TRACK=green` is the canary route.
 - Kubernetes service selectors choose the active track, so rollback is a service selector change instead of an image rebuild.
+
+## Local Smoke Test
+
+```bash
+docker build -t mlops-platform:local apps/mlops-platform
+docker run --rm -p 8080:8080 mlops-platform:local
+```
+
+In another shell:
+
+```bash
+curl -fsS http://127.0.0.1:8080/healthz
+curl -fsS http://127.0.0.1:8080/predict \
+  -H 'Content-Type: application/json' \
+  -d '{"latency_ms":120,"error_rate":0.01,"cpu_utilization":0.55,"memory_utilization":0.62,"queue_depth":8}'
+curl -fsS http://127.0.0.1:8080/metrics
+```
+
+The next deployment step is to add Kubernetes manifests or a Kustomize app with
+blue and green Deployments, a Service selector for the active track, resource
+requests and limits, and Prometheus scraping.
diff --git a/bootstrap/provisioning/README.md b/bootstrap/provisioning/README.md
index a90f327..6cef40c 100644
--- a/bootstrap/provisioning/README.md
+++ b/bootstrap/provisioning/README.md
@@ -115,7 +115,15 @@ discovery. New workers are full clones created with
 land on the NVMe thin pool. Set `LAB_PIMOX_WORKER_REPLACE_EXISTING=true` to
 destroy and recreate existing worker VMs from the current template.
 The pipeline refuses `LAB_PIMOX_WORKER_STORAGE=local` so only the template VM lives on local
-storage. Useful overrides:
+storage.
+
+Worker indexes are stable: index `1` maps to VMID `9010`,
+`pimox-worker-01`, and worker key `pimox01`; index `2` maps to VMID `9011`,
+`pimox-worker-02`, and worker key `pimox02`. `LAB_PIMOX_SKIP_WORKER_INDEXES`
+defaults to empty, so the pipeline owns index `1` unless you set
+`LAB_PIMOX_SKIP_WORKER_INDEXES=1`.
+
+Useful overrides:
 
 ```bash
 ./lab.sh rebuild-cluster
@@ -123,6 +131,7 @@ LAB_PIMOX_PIPELINE=false ./lab.sh up
 LAB_PIMOX_TEMPLATE_REPLACE_EXISTING=true ./lab.sh up
 LAB_PIMOX_WORKER_COUNT=0 ./lab.sh up
 LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
+LAB_PIMOX_SKIP_WORKER_INDEXES=1 LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
 LAB_PIMOX_WORKER_BASE_VMID=9020 ./lab.sh up
 LAB_PIMOX_WORKER_STORAGE=nvme_thin_pool ./lab.sh up
 LAB_PIMOX_WORKER_REPLACE_EXISTING=true ./lab.sh up
diff --git a/infra/gitea/README.md b/infra/gitea/README.md
index dad409a..8c19a09 100644
--- a/infra/gitea/README.md
+++ b/infra/gitea/README.md
@@ -20,6 +20,25 @@ Kubernetes consumes Git from the Debian bare GitOps mirror at
 `/home/jv/git-server/my-homelab-configs.git`. Gitea is the human-facing Git
 service and remains available when the cluster is destroyed.
 
+`./lab.sh bootstrap-gitea-repo` creates or validates the public Gitea repository,
+adds the Debian host deploy key when needed, and points the Debian checkout's
+`gitea` remote at:
+
+```text
+ssh://git@192.168.100.89:32222/jv/my-homelab-configs.git
+```
+
+Argo CD does not read from the Raspberry Pi Gitea SSH port. It reads from the
+Debian bare GitOps mirror through `gitops_repo_url`, normally:
+
+```text
+ssh://jv@192.168.100.68/home/jv/git-server/my-homelab-configs.git
+```
+
+The platform bootstrap registers that repo secret and updates
+`argocd-ssh-known-hosts-cm`. If Argo CD reports `knownhosts: key is unknown`,
+refresh the Debian host key in that ConfigMap and restart `argocd-repo-server`.
+
 Backups are installed on the Debian host by `lab.sh deploy-gitea` and
 `lab.sh backup-gitea`. The timer runs `gitea dump` inside the Raspberry Pi
 container, copies the archive to Debian, and stores it under