Update READMEs
Homelab Main / deploy (push) Failing after 6s
parent
a2fab4fe5f
commit
b76075e0fc
46
README.md
|
|
@ -1,6 +1,6 @@
|
||||||
# Homelab Kubernetes Pipeline
|
# Homelab Kubernetes Pipeline
|
||||||
|
|
||||||
This cool repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
|
This repo bootstraps a hybrid kubeadm cluster and then hands app delivery to
|
||||||
Argo CD.
|
Argo CD.
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
@ -59,6 +59,8 @@ accidentally modify the cluster.
|
||||||
|
|
||||||
4. `bootstrap/apps`
|
4. `bootstrap/apps`
|
||||||
- registers Argo CD Applications from the `applications` map
|
- registers Argo CD Applications from the `applications` map
|
||||||
|
- passes the website image produced by the build step into Argo CD as a
|
||||||
|
Kustomize image override
|
||||||
- default apps are `container-registry`, `website-production`, and
|
- default apps are `container-registry`, `website-production`, and
|
||||||
`demos-static`
|
`demos-static`
|
||||||
|
|
||||||
|
|
@ -109,12 +111,11 @@ clones on `nvme_thin_pool` by default, checks that the Pimox bridge already
|
||||||
exists, refuses `local` as worker clone storage, and refuses to edit Orange Pi
|
exists, refuses `local` as worker clone storage, and refuses to edit Orange Pi
|
||||||
host networking.
|
host networking.
|
||||||
|
|
||||||
`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to `1` because the first Pimox worker
|
`LAB_PIMOX_SKIP_WORKER_INDEXES` defaults to an empty value, so the pipeline owns
|
||||||
slot was created manually. With the default `LAB_PIMOX_WORKER_COUNT=1`, the
|
worker index `1` and VMID `9010` when `LAB_PIMOX_WORKER_COUNT=1`. Set
|
||||||
pipeline keeps the template current and leaves VMID `9010` alone. Set
|
`LAB_PIMOX_SKIP_WORKER_INDEXES=1` if an existing manually created first worker
|
||||||
`LAB_PIMOX_SKIP_WORKER_INDEXES=''` if you want the pipeline to own the first
|
must be left untouched, or set `LAB_PIMOX_WORKER_COUNT=2` to manage both VMID
|
||||||
slot, or set `LAB_PIMOX_WORKER_COUNT=2` to manage the second slot while still
|
`9010` and VMID `9011`.
|
||||||
skipping the first.
|
|
||||||
|
|
||||||
OpenWrt firewall VM automation is available as a standalone command because it
|
OpenWrt firewall VM automation is available as a standalone command because it
|
||||||
attaches to both WAN and LAN bridges. Run `./lab.sh openwrt` after `vmbr1`
|
attaches to both WAN and LAN bridges. Run `./lab.sh openwrt` after `vmbr1`
|
||||||
|
|
@ -186,10 +187,13 @@ That path preserves external Raspberry Pi Gitea, rebuilds the Pimox template
|
||||||
with 2 cores and 4 GiB memory, replaces two Pimox worker VMs with 2 cores and
|
with 2 cores and 4 GiB memory, replaces two Pimox worker VMs with 2 cores and
|
||||||
4 GiB memory, and joins those workers to the Kubernetes cluster. CPU affinity is
|
4 GiB memory, and joins those workers to the Kubernetes cluster. CPU affinity is
|
||||||
disabled by default because the Bullseye-pinned Pimox `qm` does not support it.
|
disabled by default because the Bullseye-pinned Pimox `qm` does not support it.
|
||||||
The Raspberry Pi worker is excluded by default while it hosts external Gitea.
|
The Raspberry Pi is still included as a Kubernetes worker by default; `nuke`
|
||||||
|
does not clean it unless you explicitly add it to `WORKER_SSH_TARGETS`, so the
|
||||||
|
external Gitea Docker service survives cluster rebuilds.
|
||||||
|
|
||||||
To opt the Raspberry Pi back into the Kubernetes cluster, set
|
To exclude the Raspberry Pi from the Kubernetes cluster, set
|
||||||
`LAB_INCLUDE_RASPBERRY_WORKER=true` or add entries to
|
`LAB_INCLUDE_RASPBERRY_WORKER=false`. To manage workers manually instead, add
|
||||||
|
entries to
|
||||||
`bootstrap/cluster/variables.tf` or a `.tfvars` file:
|
`bootstrap/cluster/variables.tf` or a `.tfvars` file:
|
||||||
|
|
||||||
```hcl
|
```hcl
|
||||||
|
|
@ -390,6 +394,12 @@ Argo CD consumes that Debian mirror through the default `gitops_repo_url`.
|
||||||
Gitea Actions pushes the `main` commit into the mirror before running the
|
Gitea Actions pushes the `main` commit into the mirror before running the
|
||||||
selected deploy command.
|
selected deploy command.
|
||||||
|
|
||||||
|
The platform bootstrap registers the Argo CD repository secret and the SSH host
|
||||||
|
key for the Debian GitOps mirror. If Argo CD reports
|
||||||
|
`knownhosts: key is unknown` after the Debian host was rebuilt or its SSH host
|
||||||
|
key changed, refresh `argocd-ssh-known-hosts-cm` in the `argocd` namespace,
|
||||||
|
restart `argocd-repo-server`, and hard-refresh the affected Application.
|
||||||
|
|
||||||
Deploy or refresh the external Gitea container from the Debian host with:
|
Deploy or refresh the external Gitea container from the Debian host with:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
@ -510,9 +520,13 @@ cleaned unless you explicitly include it.
|
||||||
|
|
||||||
The website is a PHP app under `apps/website`. It includes a home page, CV page,
|
The website is a PHP app under `apps/website`. It includes a home page, CV page,
|
||||||
blog page, and demos page, plus a lightweight translation flow backed by Ollama.
|
blog page, and demos page, plus a lightweight translation flow backed by Ollama.
|
||||||
Static language files live in `apps/website/lang`; unsupported browser languages
|
Static language files live in `apps/website/lang`; `en.php` and `nah.php` are
|
||||||
can be translated by the client and saved through `save_lang.php` as runtime
|
curated source files, with the Nahuatl home page intentionally biased toward as
|
||||||
JSON data on the website PVC.
|
many Nahuatl words as possible while keeping technical terms understandable.
|
||||||
|
Unsupported browser languages use the same-origin `/translate.php` endpoint,
|
||||||
|
which calls Ollama server-side through `OLLAMA_HOST` and `OLLAMA_MODEL`; the
|
||||||
|
browser never calls the private Ollama IP directly. Generated runtime language
|
||||||
|
JSON is saved through `save_lang.php` on the website PVC.
|
||||||
|
|
||||||
The CV page has two client-side presentation modes:
|
The CV page has two client-side presentation modes:
|
||||||
|
|
||||||
|
|
@ -539,6 +553,12 @@ During bootstrap, `lab.sh` hashes `apps/website`, builds
|
||||||
Kustomize. This keeps the GitOps source generic while the deployed image remains
|
Kustomize. This keeps the GitOps source generic while the deployed image remains
|
||||||
immutable.
|
immutable.
|
||||||
|
|
||||||
|
After `./lab.sh apps`, the live deployment image should be a content-hash tag,
|
||||||
|
for example `192.168.100.68:30500/php-website:src-...`. If it still shows
|
||||||
|
`php-website:latest`, Argo CD has not rendered the current Application source.
|
||||||
|
Check the `website-production` Application source, sync status, and repository
|
||||||
|
access before restarting pods.
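
The check described above can be sketched as a small shell helper that classifies an image reference by its tag. This is an illustration, not part of `lab.sh`: the function name `image_tag_kind` is hypothetical, and it assumes content-hash tags always start with `src-`, as in the example above.

```bash
# Sketch only: classify a container image reference by its tag.
# Assumes content-hash tags start with "src-" (taken from the example above).
image_tag_kind() {
  case "$1" in
    *:src-*)  echo "content-hash" ;;  # immutable tag written by the build step
    *:latest) echo "latest" ;;        # Argo CD has not rendered the override yet
    *)        echo "other" ;;         # untagged or unexpected tag
  esac
}

# Example with the live image (deployment name and namespace are placeholders):
#   image="$(kubectl get deployment <name> -n <namespace> \
#     -o jsonpath='{.spec.template.spec.containers[0].image}')"
#   image_tag_kind "$image"
```

If the helper reports `latest`, the Application source and sync status are the first things to inspect, as noted above.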
|
||||||
|
|
||||||
The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
|
The first demo, `The Client-Side Media Cruncher (Wasm + TS)`, currently performs
|
||||||
private, browser-only image compression and conversion using native Canvas APIs.
|
private, browser-only image compression and conversion using native Canvas APIs.
|
||||||
Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled
|
Heavier video conversion, such as MP4 to WebM, should use a Rust core compiled
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,13 @@
|
||||||
# MLOps Platform Demo
|
# MLOps Platform Demo
|
||||||
|
|
||||||
Production-shaped inference demo for the portfolio site. The model is intentionally small: logistic regression coefficients trained with scikit-learn and exported to JSON so the runtime stays light enough for the homelab.
|
Production-shaped inference demo for the portfolio site. The model is
|
||||||
|
intentionally small: logistic regression coefficients trained with scikit-learn
|
||||||
|
and exported to JSON so the runtime stays light enough for the homelab.
|
||||||
|
|
||||||
|
This directory contains the FastAPI service and model artifacts. It is not yet
|
||||||
|
registered as a Kubernetes application in `bootstrap/apps`; the public website
|
||||||
|
currently links to the reserved static placeholder under
|
||||||
|
`apps/demos-static/public/mlops-platform/`.
|
||||||
|
|
||||||
## Endpoints
|
## Endpoints
|
||||||
|
|
||||||
|
|
@ -13,3 +20,24 @@ Production-shaped inference demo for the portfolio site. The model is intentiona
|
||||||
- `MODEL_VERSION=v1`, `MODEL_TRACK=blue` is the stable route.
|
- `MODEL_VERSION=v1`, `MODEL_TRACK=blue` is the stable route.
|
||||||
- `MODEL_VERSION=v2`, `MODEL_TRACK=green` is the canary route.
|
- `MODEL_VERSION=v2`, `MODEL_TRACK=green` is the canary route.
|
||||||
- Kubernetes service selectors choose the active track, so rollback is a service selector change instead of an image rebuild.
|
- Kubernetes service selectors choose the active track, so rollback is a service selector change instead of an image rebuild.
|
||||||
|
|
||||||
|
## Local Smoke Test
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker build -t mlops-platform:local apps/mlops-platform
|
||||||
|
docker run --rm -p 8080:8080 mlops-platform:local
|
||||||
|
```
|
||||||
|
|
||||||
|
In another shell:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -fsS http://127.0.0.1:8080/healthz
|
||||||
|
curl -fsS http://127.0.0.1:8080/predict \
|
||||||
|
-H 'Content-Type: application/json' \
|
||||||
|
-d '{"latency_ms":120,"error_rate":0.01,"cpu_utilization":0.55,"memory_utilization":0.62,"queue_depth":8}'
|
||||||
|
curl -fsS http://127.0.0.1:8080/metrics
|
||||||
|
```
|
||||||
|
|
||||||
|
The next deployment step is to add Kubernetes manifests or a Kustomize app with
|
||||||
|
blue and green Deployments, a Service selector for the active track, resource
|
||||||
|
requests and limits, and Prometheus scraping.
|
||||||
|
|
|
||||||
|
|
@ -115,7 +115,15 @@ discovery. New workers are full clones created with
|
||||||
land on the NVMe thin pool. Set `LAB_PIMOX_WORKER_REPLACE_EXISTING=true` to
|
land on the NVMe thin pool. Set `LAB_PIMOX_WORKER_REPLACE_EXISTING=true` to
|
||||||
destroy and recreate existing worker VMs from the current template. The pipeline
|
destroy and recreate existing worker VMs from the current template. The pipeline
|
||||||
refuses `LAB_PIMOX_WORKER_STORAGE=local` so only the template VM lives on local
|
refuses `LAB_PIMOX_WORKER_STORAGE=local` so only the template VM lives on local
|
||||||
storage. Useful overrides:
|
storage.
|
||||||
|
|
||||||
|
Worker indexes are stable: index `1` maps to VMID `9010`,
|
||||||
|
`pimox-worker-01`, and worker key `pimox01`; index `2` maps to VMID `9011`,
|
||||||
|
`pimox-worker-02`, and worker key `pimox02`. `LAB_PIMOX_SKIP_WORKER_INDEXES`
|
||||||
|
defaults to empty, so the pipeline owns index `1` unless you set
|
||||||
|
`LAB_PIMOX_SKIP_WORKER_INDEXES=1`.
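
The index mapping described above can be sketched as a few shell helpers. This is an illustration rather than the code in `lab.sh`: the function names are hypothetical, the base VMID of `9010` is inferred from index `1` mapping to VMID `9010`, and the comma-separated skip list is an assumption about the format of `LAB_PIMOX_SKIP_WORKER_INDEXES`.

```bash
# Sketch only: mirrors the documented index mapping, not the real pipeline code.
# Assumed default base VMID, inferred from index 1 -> VMID 9010.
BASE_VMID="${LAB_PIMOX_WORKER_BASE_VMID:-9010}"

worker_vmid() { echo $(( BASE_VMID + $1 - 1 )); }      # 1 -> 9010, 2 -> 9011
worker_name() { printf 'pimox-worker-%02d\n' "$1"; }   # 1 -> pimox-worker-01
worker_key()  { printf 'pimox%02d\n' "$1"; }           # 1 -> pimox01

# Print the indexes the pipeline would own for a worker count ($1) and a
# skip list ($2). The comma-separated skip format is an assumption.
managed_indexes() {
  _count="$1"
  _skip=",${2:-},"
  _i=1
  while [ "$_i" -le "$_count" ]; do
    case "$_skip" in
      *",$_i,"*) ;;           # index is in the skip list: leave it alone
      *) echo "$_i" ;;        # index is owned by the pipeline
    esac
    _i=$(( _i + 1 ))
  done
}
```

Under these assumptions, `managed_indexes 2 1` prints only `2`, and `managed_indexes 2 ''` prints `1` and `2`.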
|
||||||
|
|
||||||
|
Useful overrides:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
./lab.sh rebuild-cluster
|
./lab.sh rebuild-cluster
|
||||||
|
|
@ -123,6 +131,7 @@ LAB_PIMOX_PIPELINE=false ./lab.sh up
|
||||||
LAB_PIMOX_TEMPLATE_REPLACE_EXISTING=true ./lab.sh up
|
LAB_PIMOX_TEMPLATE_REPLACE_EXISTING=true ./lab.sh up
|
||||||
LAB_PIMOX_WORKER_COUNT=0 ./lab.sh up
|
LAB_PIMOX_WORKER_COUNT=0 ./lab.sh up
|
||||||
LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
|
LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
|
||||||
|
LAB_PIMOX_SKIP_WORKER_INDEXES=1 LAB_PIMOX_WORKER_COUNT=2 ./lab.sh up
|
||||||
LAB_PIMOX_WORKER_BASE_VMID=9020 ./lab.sh up
|
LAB_PIMOX_WORKER_BASE_VMID=9020 ./lab.sh up
|
||||||
LAB_PIMOX_WORKER_STORAGE=nvme_thin_pool ./lab.sh up
|
LAB_PIMOX_WORKER_STORAGE=nvme_thin_pool ./lab.sh up
|
||||||
LAB_PIMOX_WORKER_REPLACE_EXISTING=true ./lab.sh up
|
LAB_PIMOX_WORKER_REPLACE_EXISTING=true ./lab.sh up
|
||||||
|
|
|
||||||
|
|
@ -20,6 +20,25 @@ Kubernetes consumes Git from the Debian bare GitOps mirror at
|
||||||
`/home/jv/git-server/my-homelab-configs.git`. Gitea is the human-facing Git
|
`/home/jv/git-server/my-homelab-configs.git`. Gitea is the human-facing Git
|
||||||
service and remains available when the cluster is destroyed.
|
service and remains available when the cluster is destroyed.
|
||||||
|
|
||||||
|
`./lab.sh bootstrap-gitea-repo` creates or validates the public Gitea repository,
|
||||||
|
adds the Debian host deploy key when needed, and points the Debian checkout's
|
||||||
|
`gitea` remote at:
|
||||||
|
|
||||||
|
```text
|
||||||
|
ssh://git@192.168.100.89:32222/jv/my-homelab-configs.git
|
||||||
|
```
|
||||||
|
|
||||||
|
Argo CD does not read from the Raspberry Pi Gitea SSH port. It reads from the
|
||||||
|
Debian bare GitOps mirror through `gitops_repo_url`, normally:
|
||||||
|
|
||||||
|
```text
|
||||||
|
ssh://jv@192.168.100.68/home/jv/git-server/my-homelab-configs.git
|
||||||
|
```
|
||||||
|
|
||||||
|
The platform bootstrap registers that repo secret and updates
|
||||||
|
`argocd-ssh-known-hosts-cm`. If Argo CD reports `knownhosts: key is unknown`,
|
||||||
|
refresh the Debian host key in that ConfigMap and restart `argocd-repo-server`.
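
Before refreshing the ConfigMap, it can help to confirm whether the host key Argo CD holds still matches what the Debian host presents. The helper below is a minimal sketch of that comparison; the function name `known_hosts_contains` is hypothetical, and the commented commands assume an `ed25519` host key and the standard `ssh_known_hosts` data key of `argocd-ssh-known-hosts-cm`.

```bash
# Sketch only: return 0 when the exact known_hosts line ($2) exists in file ($1).
known_hosts_contains() {
  grep -Fxq -- "$2" "$1"
}

# Example usage against the cluster (key type ed25519 is an assumption):
#   kubectl -n argocd get configmap argocd-ssh-known-hosts-cm \
#     -o jsonpath='{.data.ssh_known_hosts}' > /tmp/argocd_known_hosts
#   current="$(ssh-keyscan -t ed25519 192.168.100.68 2>/dev/null)"
#   known_hosts_contains /tmp/argocd_known_hosts "$current" \
#     || echo "host key missing or changed: refresh the ConfigMap"
```

A mismatch here points at the ConfigMap rather than at the repository secret or the mirror itself.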
|
||||||
|
|
||||||
Backups are installed on the Debian host by `lab.sh deploy-gitea` and
|
Backups are installed on the Debian host by `lab.sh deploy-gitea` and
|
||||||
`lab.sh backup-gitea`. The timer runs `gitea dump` inside the Raspberry Pi
|
`lab.sh backup-gitea`. The timer runs `gitea dump` inside the Raspberry Pi
|
||||||
container, copies the archive to Debian, and stores it under
|
container, copies the archive to Debian, and stores it under
|
||||||
|
|
|
||||||