Loop 2 layer 2.5: move snippet-frequency mining into repo-recall + Luca #1
Originally filed by @coilysiren on 2026-05-20T12:54:38Z - https://github.com/coilysiren/voice-flow-learning-loop/issues/17
Problem
#16 landed a standalone Python script (scripts/snippet-frequency.py) that walks ~/.claude/projects/**/*.jsonl directly, applies filters, and counts word n-grams. The script works, but it duplicates capability that already exists in Kai's stack.
Building it standalone fragments the substrate: every consumer that wants snippet candidates has to re-implement the JSONL walk, the harness-wrapper stripping, and the subagent-prompt heuristics. Putting it in repo-recall (for the indexing primitive) and Luca (for the natural-language surface) reuses Kai's existing infrastructure and earns its place in the Luca-asker fleet.
This effectively collapses layers 2 and 3 of the original README plan: layer 2 was the standalone script, layer 3 was the Luca dispatch route wrapping that script. The right design is layer 2 = the repo-recall primitive + the Luca tool.
Scope
snippet_candidates tool (or close-named) that consumes the repo-recall primitive, counts n-grams above a length floor, collapses substring overlaps, and returns ranked candidates. The natural-language surface is whatever Luca already does (dispatch route, MCP tool, ask phrasing).
Done when
docs/snippets-mining-runs/ produced via Luca, with shape comparable to the layer-2 run already committed.
scripts/snippet-frequency.py is either deleted or marked as a reference implementation in its own header.
Why now, before layer 3
The remaining layers (4-8) are all natural-language-shaped and operate on candidate lists - length-weighted ranking, session-aware weighting, phrase-shape extractors, trigger-name proposals via LLM, conflict dedup. Every one of those is a Luca tool that takes "the candidates" as input. If the candidate-generation step lives in a per-repo Python script, every later layer has to either reimplement it or shell out. Moving it into Luca up front means layers 4-8 are pure Luca compositions, not script wrappers.
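To make "the candidates" concrete, here is a minimal sketch of the candidate-generation step described under Scope: count word n-grams above a length floor, collapse substring overlaps, and return a ranked list. It is an illustration under stated assumptions, not the repo-recall primitive or the Luca tool. The function names, the parameters (n_min, n_max, min_count), and the tie-breaking rule are all hypothetical, and the input is assumed to be already-filtered message strings (the JSONL walk, harness-wrapper stripping, and subagent-prompt heuristics are out of scope here).

```python
from collections import Counter


def word_ngrams(text, n_min=3, n_max=6):
    """Yield every word n-gram of text with n_min <= n <= n_max words."""
    words = text.lower().split()
    for n in range(n_min, n_max + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def snippet_candidates(messages, n_min=3, n_max=6, min_count=2):
    """Return (ngram, count) pairs, ranked by count, then by length.

    Assumption: n_min plays the role of the "length floor", measured in words.
    """
    counts = Counter()
    for message in messages:
        counts.update(word_ngrams(message, n_min, n_max))

    # Keep only n-grams that repeat often enough to be worth a snippet.
    frequent = {gram: c for gram, c in counts.items() if c >= min_count}

    # Collapse substring overlaps: visit longest n-grams first and drop a
    # shorter one when a kept longer n-gram contains it (on word boundaries)
    # and occurs at least as often.
    kept = []
    for gram in sorted(frequent, key=lambda g: len(g.split()), reverse=True):
        padded = f" {gram} "
        covered = any(
            padded in f" {longer} " and frequent[longer] >= frequent[gram]
            for longer in kept
        )
        if not covered:
            kept.append(gram)

    # Rank by count (descending), then longer text first, then alphabetically.
    return sorted(
        ((gram, frequent[gram]) for gram in kept),
        key=lambda pair: (-pair[1], -len(pair[0]), pair[0]),
    )


if __name__ == "__main__":
    messages = [
        "please run the full test suite now",
        "can you run the full test suite again",
        "run the full test suite",
    ]
    for gram, count in snippet_candidates(messages):
        print(count, gram)
```

Later layers would then take this list of (ngram, count) pairs as their input, which is the composition the paragraph above argues for.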
Depends on: #16 (delivered the reference implementation that defines what the Luca tool needs to produce).
Unblocks: layers 4-8 (renumbered as Luca-composition layers in a followup).