tailscale-policy: terraform module for tagOwners + per-host + per-service tags #134

Open
opened 2026-05-26 17:39:38 +00:00 by coilysiren · 0 comments

Problem

Tailnet identity currently has a single dimension, "which Tailscale node", which doesn't answer "which physical box hosts this?" Physical machines (kai-server, kai-desktop-tower, kai-windows-laptop, kais-macbook-pro) and k8s sidecar nodes (one per service in terraform/tailscale-devices/services.yaml) all sit on the same tailnet with no structured tagging beyond tag:k8s.

Goal

Two layers of tags, both IaC-owned:

  1. Per-physical-machine tags. tag:physical, plus a per-host tag (tag:kai-server, tag:kai-desktop-tower, tag:kai-windows-laptop, tag:kais-macbook-pro).
  2. k8s sidecar tags. Each service-key in services.yaml gets tag:k8s + tag:<service> + tag:host-<host> (current state assumes tag:host-kai-server for everything; expand the map when a second k3s host shows up).
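A minimal sketch of the two tag layers, written in Python only to make the naming scheme concrete. The function names are hypothetical, and the real tag computation may end up in main.tf instead.

```python
from __future__ import annotations

# The four physical hosts named in this issue.
PHYSICAL_HOSTS = [
    "kai-server",
    "kai-desktop-tower",
    "kai-windows-laptop",
    "kais-macbook-pro",
]


def physical_tags(host: str) -> list[str]:
    """Layer 1: every physical machine gets tag:physical plus its own host tag."""
    return ["tag:physical", f"tag:{host}"]


def sidecar_tags(service: str, host: str = "kai-server") -> list[str]:
    """Layer 2: every k8s sidecar gets tag:k8s, a service tag, and a host tag.

    The default host mirrors the current assumption that everything runs
    on kai-server.
    """
    return ["tag:k8s", f"tag:{service}", f"tag:host-{host}"]
```

For example, `sidecar_tags("eco-server")` would yield `tag:k8s`, `tag:eco-server`, and `tag:host-kai-server`; the service name here is illustrative and not taken from services.yaml.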

Design

New module: terraform/tailscale-policy/.

  • tailscale_acl.policy - owns the full tailnet policy file (tagOwners + ACL rules + grants). One resource, JSON body assembled from tags.yaml + the existing rules already in the web console.
  • tailscale_device_tags.physical - for_each over the four physical devices, looked up by MagicDNS name via data.tailscale_device. Each gets tag:physical + per-host tag.
  • New python wrapper scripts/k8s/terraform_tailscale_policy.py, new coily verb terraform-tailscale-policy.
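One way the policy body could be assembled, sketched in Python under stated assumptions: the existing policy has already been parsed into a dict, the new tags come from tags.yaml as a plain list, and every new tag is owned by `autogroup:admin`. The owner value and the `build_policy` name are assumptions, not something this issue specifies.

```python
from __future__ import annotations

import json


def build_policy(existing: dict, tags: list[str], owner: str = "autogroup:admin") -> str:
    """Merge new tagOwners entries into an existing policy and return JSON.

    Existing tagOwners entries are left untouched; only missing tags are added.
    All other sections (acls, grants, ...) are passed through unchanged.
    """
    policy = dict(existing)  # shallow copy so the caller's dict is not mutated
    tag_owners = dict(existing.get("tagOwners", {}))
    for tag in tags:
        tag_owners.setdefault(tag, [owner])
    policy["tagOwners"] = tag_owners
    # sort_keys keeps the rendered body stable between runs, which helps
    # keep terraform plan output quiet.
    return json.dumps(policy, indent=2, sort_keys=True)
```

Using `setdefault` means a tag that already has owners in the console keeps them, which matters for reaching a no-op first plan.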

Extend terraform/tailscale-devices/:

  • Convert services.yaml from a flat list to a map service -> [tags...] (or compute tags inside main.tf from the host map).
  • Each tailscale_tailnet_key.service gets the per-service tag list, not bare tag:k8s.
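The flat-list-to-map conversion might look like the following sketch. It assumes services.yaml has already been loaded into a list of service names, and that an optional `host_for` override map (hypothetical) records any service that does not run on kai-server.

```python
from __future__ import annotations


def services_to_tag_map(
    services: list[str],
    host_for: dict[str, str] | None = None,
    default_host: str = "kai-server",
) -> dict[str, list[str]]:
    """Turn a flat list of service names into a service -> [tags...] map."""
    host_for = host_for or {}
    tag_map: dict[str, list[str]] = {}
    for service in services:
        host = host_for.get(service, default_host)
        tag_map[service] = ["tag:k8s", f"tag:{service}", f"tag:host-{host}"]
    return tag_map
```

Keeping the host lookup as a separate map means a second k3s host only requires new `host_for` entries, not a rewrite of every service.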

Bootstrap sequence

  1. Add new module + script + coily verb. Don't apply yet.
  2. Dump current ACL via coily exec dump-tailscale-acl (also new). Round-trip into terraform/tailscale-policy/main.tf as the starting policy body.
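  3. terraform import tailscale_acl.policy <tailnet>. (Confirm the expected import ID against the provider docs; their example passes the literal ID acl and notes that the ID doesn't matter.)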
  4. Iterate on terraform plan until it reports an empty diff against current state.
  5. Add the new tagOwners + host tags. terraform plan shows only the additive diff. Apply.
  6. Switch tailscale-devices to per-service tags. Existing auth keys keep working until rotation; new keys carry the new tag set.
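Step 5 expects an additive-only diff. A rough pre-flight check for that could be sketched as below, assuming both the dumped policy and the new policy are available as parsed dicts. `is_additive` is a hypothetical helper, not part of any existing script, and it does not replace reading the actual terraform plan output.

```python
from __future__ import annotations


def is_additive(old: dict, new: dict) -> bool:
    """Return True if `new` only adds to `old`.

    Every top-level section other than tagOwners must be unchanged, and every
    existing tagOwners entry must keep the same owners. New tagOwners entries
    and new top-level sections are allowed.
    """
    for key, old_value in old.items():
        if key == "tagOwners":
            continue
        if new.get(key) != old_value:
            return False
    old_owners = old.get("tagOwners", {})
    new_owners = new.get("tagOwners", {})
    return all(new_owners.get(tag) == owners for tag, owners in old_owners.items())
```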

Gotchas

  • The tailscale_acl resource owns the whole policy, so the first apply has to be a no-op against current state. Don't skip the import step.
  • tailscale_device_tags overwrites the full tag list per device. Make sure the for_each map enumerates the full desired set per host.
  • Reassigning tags via terraform requires the device to currently be authed by a user, not a tagged auth key. The four physicals qualify.
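Because tailscale_device_tags replaces the whole tag list, it may be worth checking which tags an apply would silently drop. A small sketch, assuming the device's current tags have been fetched some other way; `dropped_tags` and the `tag:legacy` example are hypothetical.

```python
from __future__ import annotations


def dropped_tags(current: list[str], desired: list[str]) -> list[str]:
    """List tags present on the device today that the desired set would remove."""
    return sorted(set(current) - set(desired))
```

For instance, a device currently carrying `tag:physical` and `tag:legacy` with a desired set of `tag:physical` and `tag:kai-server` would report `tag:legacy` as dropped.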

Out of scope

  • Posture attributes (hardware UUID etc.). Tags answer the question; posture is a later layer if we want hardware identity that survives OS reinstall.
  • Migrating game-server auth-key flows in coilysiren/eco-server, coilysiren/factorio-server, etc. Those keep consuming SSM /coilysiren/<service>/ts-authkey, just with richer tags.

Filed by Claude.

coilysiren added the P4 label and removed the P3 label 2026-05-31 07:00:38 +00:00
Reference
coilyco-flight-deck/infrastructure#134