Update READMEs
Homelab Main / deploy (push): failing after 6s

juvdiaz 2026-05-29 15:14:51 -06:00
parent a2fab4fe5f
commit b76075e0fc
4 changed files with 91 additions and 15 deletions

@@ -1,6 +1,6 @@
 # Homelab Kubernetes Pipeline
-This cool repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
+This repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
 Argo CD.
 ## Architecture
@@ -59,6 +59,8 @@ accidentally modify the cluster.
 4. `bootstrap/apps`
 - registers Argo CD Applications from the `applications` map
+- passes the website image produced by the build step into Argo CD as a
+Kustomize image override
 - default apps are `container-registry`, `website-production`, and
 `demos-static`
@@ -109,12 +111,11 @@ clones on `nvme_thin_pool` by default, checks that the Pimox bridge already
 exists, refuses `local` as worker clone storage, and refuses to edit Orange Pi
 host networking.
-`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to `1` because the first Pimox worker
-slot was created manually. With the default `LAB_PIMOX_WORKER_COUNT=1`, the
-pipeline keeps the template current and leaves VMID `9010` alone. Set
-`LAB_PIMOX_SKIP_WORKER_INDEXES=''` if you want the pipeline to own the first
-slot, or set `LAB_PIMOX_WORKER_COUNT=2` to manage the second slot while still
-skipping the first.
+`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to an empty value, so the pipeline owns
+worker index `1` and VMID `9010` when `LAB_PIMOX_WORKER_COUNT=1`. Set
+`LAB_PIMOX_SKIP_WORKER_INDEXES=1` if an existing manually created first worker
+must be left untouched, or set `LAB_PIMOX_WORKER_COUNT=2` to manage both VMID
+`9010` and VMID `9011`.
 OpenWrt firewall VM automation is available as a standalone command because it
 attaches to both WAN and LAN bridges. Run `./lab.sh openwrt` after `vmbr1`
@@ -186,10 +187,13 @@ That path preserves external Raspberry Pi Gitea, rebuilds the Pimox template
 with 2 cores and 4 GiB memory, replaces two Pimox worker VMs with 2 cores and
 4 GiB memory, and joins those workers to the Kubernetes cluster. CPU affinity is
 disabled by default because the Bullseye-pinned Pimox `qm` does not support it.
-The Raspberry Pi worker is excluded by default while it hosts external Gitea.
+The Raspberry Pi is still included as a Kubernetes worker by default; `nuke`
+does not clean it unless you explicitly add it to `WORKER_SSH_TARGETS`, so the
+external Gitea Docker service survives cluster rebuilds.
-To opt the Raspberry Pi back into the Kubernetes cluster, set
-`LAB_INCLUDE_RASPBERRY_WORKER=true` or add entries to
+To exclude the Raspberry Pi from the Kubernetes cluster, set
+`LAB_INCLUDE_RASPBERRY_WORKER=false`. To manage workers manually instead, add
+entries to
 `bootstrap/cluster/variables.tf` or a `.tfvars` file:
 ```hcl
@@ -390,6 +394,12 @@ Argo CD consumes that Debian mirror through the default `gitops_repo_url`.
 Gitea Actions pushes the `main` commit into the mirror before running the
 selected deploy command.
+The platform bootstrap registers the Argo CD repository secret and the SSH host
+key for the Debian GitOps mirror. If Argo CD reports
+`knownhosts: key is unknown` after the Debian host was rebuilt or its SSH host
+key changed, refresh `argocd-ssh-known-hosts-cm` in the `argocd` namespace,
+restart `argocd-repo-server`, and hard-refresh the affected Application.
 Deploy or refresh the external Gitea container from the Debian host with:
 ```bash
@@ -510,9 +520,13 @@ cleaned unless you explicitly include it.
 The website is a PHP app under `apps/website`. It includes a home page, CV page,
 blog page, and demos page, plus a lightweight translation flow backed by Ollama.
-Static language files live in `apps/website/lang`; unsupported browser languages
-can be translated by the client and saved through `save_lang.php` as runtime
-JSON data on the website PVC.
+Static language files live in `apps/website/lang`; `en.php` and `nah.php` are
+curated source files, with the Nahuatl home page intentionally biased toward as
+many Nahuatl words as possible while keeping technical terms understandable.
+Unsupported browser languages use the same-origin `/translate.php` endpoint,
+which calls Ollama server-side through `OLLAMA_HOST` and `OLLAMA_MODEL`; the
+browser never calls the private Ollama IP directly. Generated runtime language
+JSON is saved through `save_lang.php` on the website PVC.
 The CV page has two client-side presentation modes:
@@ -539,6 +553,12 @@ During bootstrap, `lab.sh` hashes `apps/website`, builds
 Kustomize. This keeps the GitOps source generic while the deployed image remains
 immutable.
+After `./lab.sh apps`, the live deployment image should be a content-hash tag,
+for example `192.168.100.68:30500/php-website:src-...`. If it still shows
+`php-website:latest`, Argo CD has not rendered the current Application source.
+Check the `website-production` Application source, sync status, and repository
+access before restarting pods.
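
The image check added in this hunk could be scripted roughly as follows. This is a minimal sketch, not code from the repo: `has_content_hash_tag` is a hypothetical helper, and the Deployment name and namespace in the usage comments are placeholders. Only the `src-` tag prefix and the `website-production` Application name come from the README text.

```bash
# Hypothetical helper: succeeds when an image reference carries a
# content-hash tag (a tag beginning with "src-"), fails otherwise.
has_content_hash_tag() {
  # Strip everything up to the last "/" so a registry port such as
  # "host:30500" is not mistaken for the tag separator.
  name_and_tag="${1##*/}"
  case "$name_and_tag" in
    *:src-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (assumes kubectl access; <namespace> and <deployment> are placeholders):
#   image="$(kubectl -n <namespace> get deployment <deployment> \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')"
#   has_content_hash_tag "$image" || echo "mutable tag still live: $image"
#   kubectl -n argocd get application website-production \
#     -o jsonpath='{.status.sync.status}'
```

Checking the tag before touching pods matches the order the README recommends: a `latest` tag points at Argo CD's rendered source, so restarting pods would not change the image.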
 The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
 private, browser-only image compression and conversion using native Canvas APIs.
 Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled

@@ -1,6 +1,13 @@
 # MLOps Platform Demo
-Production-shaped inference demo for the portfolio site. The model is intentionally small: logistic regression coefficients trained with scikit-learn and exported to JSON so the runtime stays light enough for the homelab.
+Production-shaped inference demo for the portfolio site. The model is
+intentionally small: logistic regression coefficients trained with scikit-learn
+and exported to JSON so the runtime stays light enough for the homelab.
+This directory contains the FastAPI service and model artifacts. It is not yet
+registered as a Kubernetes application in `bootstrap/apps`; the public website
+currently links to the reserved static placeholder under
+`apps/demos-static/public/mlops-platform/`.
 ## Endpoints
@@ -13,3 +20,24 @@ Production-shaped inference demo for the portfolio site. The model is intentiona
 - `MODEL_VERSION=v1`, `MODEL_TRACK=blue` is the stable route.
 - `MODEL_VERSION=v2`, `MODEL_TRACK=green` is the canary route.
 - Kubernetes service selectors choose the active track, so rollback is a service selector change instead of an image rebuild.
+## Local Smoke Test
+```bash
+docker build -t mlops-platform:local apps/mlops-platform
+docker run --rm -p 8080:8080 mlops-platform:local
+```
+In another shell:
+```bash
+curl -fsS http://127.0.0.1:8080/healthz
+curl -fsS http://127.0.0.1:8080/predict \
+-H 'Content-Type: application/json' \
+-d '{"latency_ms":120,"error_rate":0.01,"cpu_utilization":0.55,"memory_utilization":0.62,"queue_depth":8}'
+curl -fsS http://127.0.0.1:8080/metrics
+```
+The next deployment step is to add Kubernetes manifests or a Kustomize app with
+blue and green Deployments, a Service selector for the active track, resource
+requests and limits, and Prometheus scraping.
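
Once such manifests exist, the selector-based rollback mentioned in the list above might look like the sketch below. Everything specific here is an assumption: the Service name `mlops-platform` and the `track` label key are hypothetical, since the manifests are not in the repo yet. The sketch only shows that switching tracks is a one-field merge patch.

```bash
# Hypothetical helper: prints a merge patch that points a Service
# selector at the given track. The "track" label key is an assumption.
selector_patch() {
  case "$1" in
    blue|green)
      printf '{"spec":{"selector":{"track":"%s"}}}\n' "$1"
      ;;
    *)
      echo "unknown track: $1" >&2
      return 1
      ;;
  esac
}

# Usage (Service name is a placeholder for whatever the manifests define):
#   kubectl patch service mlops-platform --type merge -p "$(selector_patch blue)"
```

A merge patch sets only the `track` key and leaves any other selector labels in place, which is why rollback needs no image rebuild.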

@@ -115,7 +115,15 @@ discovery. New workers are full clones created with
 land on the NVMe thin pool. Set `LAB_PIMOX_WORKER_REPLACE_EXISTING=true` to
 destroy and recreate existing worker VMs from the current template. The pipeline
 refuses `LAB_PIMOX_WORKER_STORAGE=local` so only the template VM lives on local
-storage. Useful overrides:
+storage.
+Worker indexes are stable: index `1` maps to VMID `9010`,
+`pimox-worker-01`, and worker key `pimox01`; index `2` maps to VMID `9011`,
+`pimox-worker-02`, and worker key `pimox02`. `LAB_PIMOX_SKIP_WORKER_INDEXES`
+defaults to empty, so the pipeline owns index `1` unless you set
+`LAB_PIMOX_SKIP_WORKER_INDEXES=1`.
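
The index mapping described in this hunk can be written as a small sketch. The function names are hypothetical and this is not the code in `lab.sh`. Two things are assumed: that the VMID is the base VMID plus the index minus one (consistent with `1` mapping to `9010` and `2` to `9011`), and that the skip list is comma- or space-separated.

```bash
# Hypothetical helpers mirroring the documented index mapping.
# Assumption: VMID = base + index - 1, with 9010 as the default base.
worker_vmid() {
  base="${LAB_PIMOX_WORKER_BASE_VMID:-9010}"
  echo $(( base + $1 - 1 ))
}

# VM name and worker key use a zero-padded, two-digit index.
worker_name() { printf 'pimox-worker-%02d\n' "$1"; }
worker_key() { printf 'pimox%02d\n' "$1"; }

# Succeeds when the index is listed in LAB_PIMOX_SKIP_WORKER_INDEXES.
# Assumption: the list is comma- or space-separated.
worker_is_skipped() {
  for skipped in $(printf '%s' "${LAB_PIMOX_SKIP_WORKER_INDEXES:-}" | tr ',' ' '); do
    [ "$skipped" = "$1" ] && return 0
  done
  return 1
}
```

With the default empty skip list, `worker_is_skipped 1` fails, which corresponds to the pipeline owning index `1`.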
+Useful overrides:
 ```bash
 ./lab.sh rebuild-cluster
@@ -123,6 +131,7 @@ LAB_PIMOX_PIPELINE=false ./lab.sh up
 LAB_PIMOX_TEMPLATE_REPLACE_EXISTING=true ./lab.sh up
 LAB_PIMOX_WORKER_COUNT=0 ./lab.sh up
 LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
+LAB_PIMOX_SKIP_WORKER_INDEXES=1 LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
 LAB_PIMOX_WORKER_BASE_VMID=9020 ./lab.sh up
 LAB_PIMOX_WORKER_STORAGE=nvme_thin_pool ./lab.sh up
 LAB_PIMOX_WORKER_REPLACE_EXISTING=true ./lab.sh up

@@ -20,6 +20,25 @@ Kubernetes consumes Git from the Debian bare GitOps mirror at
 `/home/jv/git-server/my-homelab-configs.git`. Gitea is the human-facing Git
 service and remains available when the cluster is destroyed.
+`./lab.sh bootstrap-gitea-repo` creates or validates the public Gitea repository,
+adds the Debian host deploy key when needed, and points the Debian checkout's
+`gitea` remote at:
+```text
+ssh://git@192.168.100.89:32222/jv/my-homelab-configs.git
+```
+Argo CD does not read from the Raspberry Pi Gitea SSH port. It reads from the
+Debian bare GitOps mirror through `gitops_repo_url`, normally:
+```text
+ssh://jv@192.168.100.68/home/jv/git-server/my-homelab-configs.git
+```
+The platform bootstrap registers that repo secret and updates
+`argocd-ssh-known-hosts-cm`. If Argo CD reports `knownhosts: key is unknown`,
+refresh the Debian host key in that ConfigMap and restart `argocd-repo-server`.
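
One possible manual version of that refresh is sketched below, for cases where rerunning the platform bootstrap is not an option. `replace_host_keys` is a hypothetical helper, not part of `lab.sh`. It assumes plain (unhashed) known_hosts entries on the default SSH port, and the usage comments assume the ConfigMap stores its entries under the `ssh_known_hosts` key, which is Argo CD's standard layout.

```bash
# Hypothetical helper: prints the known_hosts content in file $2 with
# every entry for host $1 removed, followed by the fresh keys in file $3.
replace_host_keys() {
  awk -v host="$1" '
    {
      # The first field may list several comma-separated host names.
      n = split($1, names, ",")
      for (i = 1; i <= n; i++) if (names[i] == host) next
      print
    }' "$2"
  cat "$3"
}

# Usage (assumes kubectl access to the cluster):
#   kubectl -n argocd get configmap argocd-ssh-known-hosts-cm \
#     -o jsonpath='{.data.ssh_known_hosts}' > current
#   ssh-keyscan 192.168.100.68 > fresh
#   replace_host_keys 192.168.100.68 current fresh > merged
#   kubectl -n argocd create configmap argocd-ssh-known-hosts-cm \
#     --from-file=ssh_known_hosts=merged --dry-run=client -o yaml |
#     kubectl apply -f -
#   kubectl -n argocd rollout restart deployment argocd-repo-server
```

`ssh-keyscan` trusts whatever answers on the network, so compare the scanned fingerprint with the one shown on the Debian host before applying the merged file.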
 Backups are installed on the Debian host by `lab.sh deploy-gitea` and
 `lab.sh backup-gitea`. The timer runs `gitea dump` inside the Raspberry Pi
 container, copies the archive to Debian, and stores it under