Agent fleet: enable tailnet-attached pods on the Mac VM and WSL tower nodes #6

Open
opened 2026-05-23 20:54:25 +00:00 by coilysiren · 0 comments
Owner

Originally filed by @coilysiren on 2026-05-22T20:33:28Z - https://github.com/coilysiren/infrastructure/issues/285

Context

The agent pod fleet (#283) runs Kai's Claude / Codex / OpenAI agents as DaemonSets gated on the node label agent-host=true. At rollout only kai-server carries that label.
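
The label gate described above can be sketched as a DaemonSet whose pod template carries a nodeSelector. This is only an illustration, not the manifest from #283: the name agent-fleet, the app label, and the image are hypothetical placeholders, and only the agent-host=true selector comes from this issue.

```python
import json


def agent_daemonset(name="agent-fleet", image="example.invalid/agent:latest"):
    """Build a DaemonSet manifest that schedules only onto agent-host=true nodes.

    `name` and `image` are hypothetical placeholders, not values from the real fleet.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # The gate: nodes without this label get no agent pod.
                    "nodeSelector": {"agent-host": "true"},
                    "containers": [{"name": "agent", "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(agent_daemonset(), indent=2))
```

With a selector like this, adding or removing the node label is the only switch needed to bring a node in or out of the fleet.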

A tailnet-attached pod needs kernel-mode tailscale, which needs /dev/net/tun. The two WSL2-backed nodes - kai-macbook-pro-vm and kai-desktop-tower-wsl - don't expose that device to containerd, so they can't host an SSH-able agent pod today. deploy/repo-recall.yml pins to kai-server for the same reason.
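
A quick way to see whether a given environment exposes the device is to check that /dev/net/tun exists and is a character device. The helper below is a hypothetical sketch, meant to be run inside a container on the node in question. A True result shows only that the device node is present, not that tailscale can actually open it.

```python
import os
import stat


def has_tun_device(path="/dev/net/tun"):
    """Return True if `path` exists and is a character device."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, or no permission to stat it.
        return False
    return stat.S_ISCHR(mode)


if __name__ == "__main__":
    print("tun device present:", has_tun_device())
```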

Two parts

  1. Decide whether agents belong on these nodes at all. Both nodes back machines Kai uses interactively (a MacBook VM, a Windows desktop tower). An always-on agent pod is real CPU, RAM, and token cost on a machine she is actively working on. Running the fleet only on kai-server may be the right permanent answer.
  2. If yes, enable it. Expose /dev/net/tun to containerd on the WSL2 kernel, confirm kernel-mode tailscale comes up, label the node agent-host=true, and verify that a pod runs healthy and is SSH-able.
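
The labelling step in part 2 can be sketched as follows. The helper names are hypothetical. The commands they build use the standard kubectl label syntax, where a trailing dash on the key removes the label. Nothing here is taken from the repository's actual tooling.

```python
import shlex
from typing import List

AGENT_LABEL_KEY = "agent-host"


def label_node_command(node: str) -> List[str]:
    """Command that opts a node into the agent fleet."""
    return ["kubectl", "label", "node", node, AGENT_LABEL_KEY + "=true"]


def unlabel_node_command(node: str) -> List[str]:
    """Command that opts a node back out (trailing '-' removes the label)."""
    return ["kubectl", "label", "node", node, AGENT_LABEL_KEY + "-"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for node in ("kai-macbook-pro-vm", "kai-desktop-tower-wsl"):
        print(shlex.join(label_node_command(node)))
```

Keeping the opt-out command next to the opt-in one makes it cheap to reverse the decision from part 1 if an always-on pod proves too costly on an interactive machine.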

Done when

  • The decision is recorded on this issue.
  • If enabling: at least one non-kai-server node runs an agent pod that Kai can ssh into.
coilysiren added the P4 label and removed the P3 label 2026-05-31 07:00:56 +00:00
Reference
coilyco-flight-deck/infrastructure#6