Session<->repo join: match_type='file' from Read/Edit/Write tool calls (gated, after gh-ref ships) #57

Open
opened 2026-05-23 20:55:28 +00:00 by coilysiren · 0 comments
Owner

Originally filed by @coilysiren on 2026-05-05T08:42:47Z - https://github.com/coilysiren/repo-recall/issues/57

Extend session_repos.match_type with 'file'. Extract file paths from session JSONL tool_use blocks and write a join row when the path resolves under a discovered repo's worktree.

Scope

  • Parse tool_use blocks for known tool names: Read, Edit, Write, Glob, Grep. Pull the path argument.
  • For each path, canonicalize it and check whether any discovered repo path is an ancestor of it (the is_ancestor_or_equal helper in src/join.rs already handles this).
  • Write match_type='file' rows. Don't replace 'cwd' rows - they coexist.
  • Behind a feature gate (env var or Cargo feature) until the parser proves stable.
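
A minimal sketch of the join step described in the list above, under stated assumptions. The `ToolCall` and `JoinRow` types, the `file_join_rows` function, and the local `is_ancestor_or_equal` stand-in are all hypothetical: the issue does not show the real helper's signature in src/join.rs or the session_repos schema beyond `match_type`. Paths are assumed to be canonicalized already. Picking the innermost repo for nested worktrees and de-duplicating rows are illustrative choices, not something the issue specifies.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical, already-parsed view of one tool_use block.
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub name: String,
    /// The path argument, if the call carried one (assumed canonical here).
    pub path: Option<PathBuf>,
}

/// Hypothetical join row; only `match_type` is named in the issue.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct JoinRow {
    pub session_id: String,
    pub repo_path: PathBuf,
    pub match_type: &'static str,
}

/// Tool names listed under Scope; anything else is skipped silently.
const KNOWN_TOOLS: [&str; 5] = ["Read", "Edit", "Write", "Glob", "Grep"];

/// Stand-in for the real helper in src/join.rs (signature assumed).
/// `Path::starts_with` compares whole components, so `/src/ab` is not
/// treated as living under `/src/a`.
fn is_ancestor_or_equal(ancestor: &Path, path: &Path) -> bool {
    path.starts_with(ancestor)
}

/// Produce `match_type = "file"` rows for one session. Existing 'cwd' rows
/// are not touched; these rows are meant to sit alongside them.
pub fn file_join_rows(session_id: &str, calls: &[ToolCall], repos: &[PathBuf]) -> Vec<JoinRow> {
    let mut rows: Vec<JoinRow> = Vec::new();
    for call in calls {
        if !KNOWN_TOOLS.contains(&call.name.as_str()) {
            continue; // unknown tool: skip silently
        }
        let Some(path) = call.path.as_deref() else {
            continue; // known tool without a path argument
        };
        // Assumption: when repos nest, attribute the path to the innermost one.
        let best = repos
            .iter()
            .filter(|r| is_ancestor_or_equal(r.as_path(), path))
            .max_by_key(|r| r.components().count());
        if let Some(repo) = best {
            let row = JoinRow {
                session_id: session_id.to_string(),
                repo_path: repo.clone(),
                match_type: "file",
            };
            // Assumption: one row per (session, repo) pair is enough.
            if !rows.contains(&row) {
                rows.push(row);
            }
        }
    }
    rows
}
```

Calling `file_join_rows` with the session id, the parsed tool calls, and the discovered repo roots returns rows that could be written next to the existing 'cwd' rows. In real use the paths would first go through canonicalization (for example `std::fs::canonicalize`, which requires the path to exist on disk).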

Preconditions

Do not start until the gh-ref sibling issue has shipped and the dashboard demonstrably gets richer from it. If gh-ref joins don't widen counts meaningfully on Kai's data, file-path joins won't either - both depend on the same premise (sessions doing work outside their cwd).

Risk

Claude Code's JSONL tool-call schema is stable today but isn't a public contract. Mitigations:

  • Extract only from documented tool names. Skip unknowns silently.
  • Tolerate malformed lines per the existing sessions.rs convention.
  • Feature gate so the extractor can be turned off after a schema break without a release.
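
The mitigations above could look roughly like the following sketch. Everything named here is an assumption for illustration: the variable name `REPO_RECALL_FILE_JOIN`, the accepted values, and the functions `gate_from_value`, `file_join_enabled`, and `extract_tolerant` do not come from the issue, and the existing sessions.rs convention for malformed lines is not shown there, so "drop the line and continue" is a guess at its spirit.

```rust
use std::env;

/// Hypothetical variable name; the issue only says "env var or Cargo feature".
const GATE_VAR: &str = "REPO_RECALL_FILE_JOIN";

/// Pure decision function, kept separate so it can be tested without
/// mutating the process environment. Off unless explicitly switched on.
pub fn gate_from_value(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true") | Some("on")
    )
}

/// Reads the gate from the environment at runtime.
pub fn file_join_enabled() -> bool {
    gate_from_value(env::var(GATE_VAR).ok().as_deref())
}

/// Runs a fallible per-line extractor over session lines.
/// - `Err(_)`: malformed line, dropped without aborting the scan.
/// - `Ok(None)`: well-formed line with nothing to extract (for example an
///   unknown tool name), also dropped silently.
/// - `Ok(Some(t))`: kept.
/// When the gate is off, nothing is extracted at all.
pub fn extract_tolerant<T, E>(
    enabled: bool,
    lines: &[&str],
    parse: impl Fn(&str) -> Result<Option<T>, E>,
) -> Vec<T> {
    if !enabled {
        return Vec::new();
    }
    lines
        .iter()
        .filter_map(|&line| parse(line).ok().flatten())
        .collect()
}
```

An environment variable is read at runtime, so it fits the "without a release" goal directly; a Cargo feature is resolved at compile time and would need a rebuild to flip.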

Out of scope

  • New MCP tools or HTML routes.
  • Cross-session changed-file overlap (rejected per #38 deep-dive: violates 'data sources are independent tables' convention).
  • Branch-name matching.

Split out of #38.

coilysiren added the P4 label and removed the P3 label on 2026-05-31 07:01:11 +00:00
Reference: coilyco-flight-deck/repo-recall#57