Verify backend deploy via Forgejo + in-cluster registry succeeds end-to-end #26
Gated on coilysiren/backend#25. Do not start until #25's PR has merged - that PR builds the deployer this verifies.
Independent end-to-end verification that the new GitHub-free deploy path actually deploys, run by a different worker than the one that built it.
Steps
- [...] main or re-run).
- [...] 192.168.0.194:30500/coilysiren-backend:<sha>.
- kubectl set image + rollout lands a new pod in coilysiren-backend whose image is the 192.168.0.194:30500/... ref (not the old 34bcccb / ghcr-shaped sideload).
- [...] 2/2 Running and /health passes.
- [...] TS_* / OIDC tailnet join happened and no GHCR pull occurred.
Report pass/fail. On failure, capture the Forgejo run log and the kubelet pull event.
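The image check in the steps above could be scripted roughly as follows. This is a minimal sketch, not the repository's actual verification: it assumes coilysiren-backend is the namespace name, and the helper names image_from_registry and check_pod_images are hypothetical.

```shell
#!/bin/sh
# Registry prefix taken from the issue text.
REGISTRY="192.168.0.194:30500"
# Assumption: the namespace is named coilysiren-backend.
NAMESPACE="coilysiren-backend"

# Succeeds when the image ref ($1) lives under the registry prefix ($2).
image_from_registry() {
  case "$1" in
    "$2"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints PASS or FAIL for every container image in the namespace.
# Not called automatically; run it where kubectl has cluster access.
check_pod_images() {
  kubectl get pods -n "$NAMESPACE" \
    -o jsonpath='{range .items[*].spec.containers[*]}{.image}{"\n"}{end}' |
  while read -r image; do
    if image_from_registry "$image" "$REGISTRY"; then
      echo "PASS $image"
    else
      echo "FAIL $image"
    fi
  done
}
```

Because the pod is expected to report 2/2, a second container's image may legitimately come from elsewhere, so the output is best read per container rather than as a single verdict.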
Blocked by: coilysiren/backend#25.
Verification result: FAIL
The GitHub-free deploy path does not deploy end-to-end. Reopening - the
closes #26 trailers on the build-fix commits auto-closed this prematurely
(repo hook forces a closing trailer on every commit); verification has not
passed.
What works now
After fixing the deploy job (it had never run green - the runner is a
container executor on node:lts, which lacks docker/jq, and the
localhost:2375 DOCKER_HOST premise was wrong):
- [...] (tcp://172.18.0.1:2375).
- [...] 192.168.0.194:30500/coilysiren-backend:8f8d9b2a... (Forgejo run #29).
Where it fails - docker push times out
Root cause: the runner is pinned to the WSL node
(kai-desktop-tower-wsl), and pods there have no route to kai-server's LAN
IP 192.168.0.194. The cluster fabric is tailscale (node InternalIPs are all
100.x); the WSL node reaches kai-server only over the tailnet. Reachability
probes from the runner DinD:
- 10.43.131.232:5000/v2/ -> 200 OK
- 100.69.164.66:30500/v2/ -> 200 OK
- 100.69.164.66:6443/healthz -> 401 (reachable)
- 192.168.0.194:30500/v2/ -> timeout
- 192.168.0.194:6443/healthz -> timeout
So both the push (192.168.0.194:30500) and the later Roll-deployment kubeconfig
(https://192.168.0.194:6443) target an address the runner can't reach.
Checklist against the issue steps
- [...] 192.168.0.194:30500/coilysiren-backend:<sha> - NO (timeout).
- kubectl set image + rollout lands new pod - NO (never reached; push failed first).
- 2/2 Running + /health - N/A. App pod is unchanged: ghcr.io/coilysiren/coilysiren-backend:34bcccb... (the old ghcr sideload).
- [...] TS_*/OIDC join, no GHCR pull - the workflow has no tailscale step
(good), and no cluster-side pull happened at all (push never landed). No
kubelet pull event to capture - the rollout step never ran.
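The reachability probes reported under the root cause could be reproduced with something like the sketch below. The exact probe commands used are not shown in this issue, so this is an assumption: it uses curl with a 5-second timeout (an arbitrary choice), and the helper names classify_code and probe are hypothetical.

```shell
#!/bin/sh
# Maps curl's %{http_code} value to a label.
# curl prints 000 when no HTTP response was received (timeout, no route).
classify_code() {
  case "$1" in
    000) echo "timeout or unreachable" ;;
    *) echo "reachable (HTTP $1)" ;;
  esac
}

# Probes one URL and prints "<url> -> <label>".
# -k skips TLS verification, since only reachability is being tested.
probe() {
  code=$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")
  echo "$1 -> $(classify_code "$code")"
}

# Example, run from the runner (assumes the registry speaks plain HTTP):
# probe http://100.69.164.66:30500/v2/
# probe https://192.168.0.194:6443/healthz
```

A 401 from /healthz still counts as reachable here, which matches how the probe results above are read: the API server answered, it just refused the unauthenticated request.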
Blocked on
Infra fix filed: coilysiren/infrastructure#175 - address the registry +
k3s API by kai-server's tailnet IP (from SSM
/coilysiren/kai-server/tailnet-ip), not the LAN IP. Until that lands, this
path cannot complete.
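One way the tailnet addressing could look in a deploy script is sketched below. It is not the change proposed in #175, only an illustration: it assumes the AWS CLI is available with read access to the parameter, that the parameter is a plain String (a SecureString would also need --with-decryption), and the helper names tailnet_ip and registry_ref are hypothetical.

```shell
#!/bin/sh
# Builds an image ref from host, port, repository, and tag.
registry_ref() {
  printf '%s:%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Reads kai-server's tailnet IP from SSM (parameter path from the issue text).
# Not called automatically; requires AWS credentials.
tailnet_ip() {
  aws ssm get-parameter \
    --name /coilysiren/kai-server/tailnet-ip \
    --query Parameter.Value --output text
}

# Example:
# ip=$(tailnet_ip)
# image=$(registry_ref "$ip" 30500 coilysiren-backend "$sha")
# api="https://$ip:6443"
```

Resolving the address at run time keeps the LAN IP out of the workflow, so the push target and the kubeconfig server both follow whatever the parameter holds.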
Repo-side build fixes that did land (commits on main)
b15f521 install docker/kubectl/jq + soften status report; 9932f9b resolve
DinD docker host via job-container gateway; ade6f65 (BuildKit attempt,
superseded); 8f8d9b2 use legacy docker builder. These are correct and
necessary but not sufficient - the blocker is the infra addressing in #175.
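For readers unfamiliar with the gateway approach named in 9932f9b, a rough sketch follows. The commit's actual implementation is not shown in this issue; this version assumes the ip tool is installed in the job container, and default_gateway is a hypothetical helper name.

```shell
#!/bin/sh
# Extracts the default gateway address from `ip route` output on stdin.
default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Example, inside the job container:
# gw=$(ip route | default_gateway)
# export DOCKER_HOST="tcp://$gw:2375"
# DOCKER_BUILDKIT=0 docker build -t "$image" .   # DOCKER_BUILDKIT=0 selects the legacy builder
```

With a route table whose default entry is "default via 172.18.0.1 dev eth0", this yields the tcp://172.18.0.1:2375 host seen in the working build.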