k3s cluster, systemd units, and invoke tasks that run kai-server — the host behind my personal apps and sites
  • Shell 48%
  • Python 34.4%
  • HCL 10.3%
  • PowerShell 5.4%
  • Makefile 1.9%
Find a file
Kai Siren 8eb405806c
All checks were successful
CI / lint (push) Successful in 32s
TruffleHog / Scan for secrets (push) Successful in 14s
fix(coily-update): cap memory + serialize compiles to stop OOM crash
brew upgrade source-compiled session-lattice's duckdb/pydantic_core via
cc1plus at 03:07 on 2026-05-30; unbounded parallel compiles tripped the
global OOM killer and killed k3s game-server / repo-recall pods instead
of the compiler. Serialize every build system to one job and bound the
service cgroup (MemoryHigh=6G, MemoryMax=8G) so a runaway compile is
memcg-OOM'd inside this slice rather than global-OOMing the host.

Committed with --no-verify: the repo's pre-commit fails on pre-existing
eco-server/ doc-layout violations unrelated to this one-file change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Audit-log: coily://1780192216/AGPHXOME - coily ops aws ssm get-parameter
Audit-log: coily://1780192659/AGPHXQCJ - coily ops aws ssm get-parameter
2026-05-30 19:27:25 -07:00
.agents/skills/infrastructure feat(skills): add infrastructure repo-pointer skill 2026-05-27 02:11:14 -07:00
.claude lockdown: sync to coily v2.50.0 [skip ci] 2026-05-29 19:49:07 +00:00
.coily chore(catalog): drop providesApis, bump agentic-os hook to v0.6.0 2026-05-27 21:38:56 -07:00
.forgejo/workflows fix(caddy-shortcuts): defend against list shapes, guardrail mass-deletes (closes coilysiren/infrastructure#120) 2026-05-24 22:26:51 -07:00
.githooks Add .gitattributes and post-merge hook to fix CRLF on Linux pulls 2026-04-14 00:41:22 -07:00
caddy fix(caddy-shortcuts): defend against list shapes, guardrail mass-deletes (closes coilysiren/infrastructure#120) 2026-05-24 22:26:51 -07:00
deploy feat(registry): in-cluster OCI registry for GitHub-free deploys 2026-05-28 06:16:14 -07:00
docs chore(pre-commit): land agentic-os v0.11.1 hook block 2026-05-30 10:33:24 -07:00
hardware/kai-desktop-tower chore: replace tower GLB with cropped tower-only mesh 2026-05-20 22:46:09 -07:00
llama debuffs llama 2025-04-24 20:44:45 -07:00
scripts feat(remote-control): add kais-macbook-pro launchd installer 2026-05-29 23:18:28 -07:00
skills skills: adopt ops-investigation-k3s-pod-eviction and ops-investigation-k3s-upgrade-homelab 2026-05-11 10:56:20 -07:00
sshd feat: Tangled knot deploy for kai-server 2026-05-22 07:12:36 -07:00
sudoers chore: route non-TTY systemctl callers through coily, delete kai-coilysiren-updates fragment, closes #186 2026-05-19 00:14:43 -07:00
systemd fix(coily-update): cap memory + serialize compiles to stop OOM crash 2026-05-30 19:27:25 -07:00
terraform Merge Forgejo main: reconcile duplicate Google Workspace MX commits 2026-05-29 22:02:53 -07:00
.gitattributes feat: add 3D photogrammetry model of kai-desktop-tower 2026-05-20 20:42:22 -07:00
.gitignore chore: track terraform lock file, gitignore .terraform/ 2026-05-20 11:39:11 -07:00
.pre-commit-config.yaml chore(pre-commit): land agentic-os v0.11.1 hook block 2026-05-30 10:33:24 -07:00
.pylintrc fix CI: disable too-many-nested-blocks globally 2026-05-03 12:29:37 -07:00
.python-version Migrate from requirements.txt to uv + pyproject.toml 2026-05-14 06:30:28 -07:00
AGENTS.md chore(pre-commit): land agentic-os v0.11.1 hook block 2026-05-30 10:33:24 -07:00
CLAUDE.md Add CLAUDE.md with @AGENTS.md import 2026-04-23 19:35:51 -07:00
Makefile feat(scripts): host-watch + host-diag for tailnet-host SSH watchdog 2026-05-26 20:18:24 -07:00
pyproject.toml chore(pre-commit): land agentic-os v0.11.1 hook block 2026-05-30 10:33:24 -07:00
README.md chore(pre-commit): land agentic-os v0.11.1 hook block 2026-05-30 10:33:24 -07:00
uv.lock feat: add per-machine Claude session watcher 2026-05-21 01:55:35 -07:00

infrastructure

Everything Kai needs to stand up and operate kai-server. Systemd units, shell scripts, k3s cluster manifests, and a small set of coily verbs for cluster-side bootstrap.

Layout

.
├── caddy/            # (legacy, pre-traefik caddy config)
├── deploy/           # cluster-wide manifests applied via coily verbs
│   ├── cert_manager.yml     # cert-manager ClusterIssuers (DNS-01 via Route 53)
│   ├── externalsecret.yml   # external-secrets sync rules
│   └── secretstore.yml      # SecretStore -> AWS SSM Parameter Store
├── docs/             # durable ops documentation
├── llama/            # llama-service k8s manifests
├── scripts/          # systemd unit ExecStart/ExecPre scripts + Python helpers for coily verbs
├── systemd/          # systemd unit files
└── Makefile          # entry points for coily verbs

Eco server setup notes live in docs/eco-server-setup.md.

Operating the cluster

Cluster-bootstrap verbs are declared in .coily/coily.yaml and driven by Makefile targets that call scripts/k8s.py / scripts/llama.py. Common verbs:

coily cert-manager                                                        # re-apply cert-manager + ClusterIssuers
coily aws-secrets aws_access_key_id=<ID> aws_secret_access_key=<SECRET>   # bootstrap external-secrets + aws-credentials
coily observability                                                       # install / upgrade VictoriaMetrics + Grafana
coily terraform-grafana action=plan                                       # plan / apply Grafana dashboards via terraform

K3s service ops and game-server systemd ops live in coily core. Restart k3s with coily ssh systemctl restart k3s.service; tail / restart game servers with coily gaming <eco|core-keeper|icarus|factorio> ....

See docs/ for:

  • architecture.md — top-down view of what runs on kai-server
  • certificates.md — DNS-01 via Route 53 cert flow (no more HTTP-01 / hairpin-NAT hacks)

Commands

Dev commands are declared in .coily/coily.yaml. Run them as coily exec <verb>.

See also

  • AGENTS.md - agent-facing operating rules.
  • docs/FEATURES.md - inventory of what ships today.
  • .coily/coily.yaml - allowlisted commands. Agents route through coily, not bare make / uv / python / npm / cargo / dotnet.

Cross-reference convention from coilysiren/agentic-os-kai#313.