Ingest entire markdown documentation files verbatim #3

Closed
opened 2026-05-23 20:55:21 +00:00 by coilysiren · 1 comment
Owner

Originally filed by @coilysiren on 2026-05-22T06:46:20Z - https://github.com/coilysiren/repo-recall/issues/250

Idea

repo-recall should ingest every markdown documentation file in a repo verbatim. The whole file, every markdown file, read end to end. No summarization, no meta-analysis. Eat the whole thing.

Why now

Kai is doing a documentation rewrite that consolidates all docs into one strict, predictable structure (named sections, flat docs/*.md, size caps, validators). That strictness is exactly what makes verbatim markdown ingestion tractable. The docs are short, structured, and bounded, so storing them whole is cheap and high-signal.

Contrast with code

Existing repo-recall issues cover code meta-analysis, where the code body does not need to be stored verbatim. Documentation is the opposite case. Docs are the human-authored intent of the repo, and they are small enough to keep whole. Pull markdown verbatim, keep code as meta-analysis.

Proposed work

  • Walk every *.md in an ingested repo and store the full file text.
  • Make that text searchable alongside session transcripts.
  • Optional gating: only ingest once docs live in the consolidated structure, so repo-recall is not indexing scattered drafts. Decide whether to gate on a marker or just ingest all markdown and let structure improve over time.
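The first two bullets can be sketched as follows. This is a minimal illustration, not repo-recall's actual implementation: the SQLite storage, the `docs` table, and the function names `ingest_markdown` and `search_docs` are all assumptions made for the example. Searching alongside session transcripts and the optional gating are left out.

```python
import sqlite3
from pathlib import Path


def ingest_markdown(repo_root: str, conn: sqlite3.Connection) -> int:
    """Store the full text of every *.md under repo_root; return the file count.

    Hypothetical helper: the `docs` table and its schema are assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (path TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    root = Path(repo_root)
    count = 0
    for md in sorted(root.rglob("*.md")):
        rel = md.relative_to(root)
        # Skip anything inside .git and anything that is not a regular file.
        if ".git" in rel.parts or not md.is_file():
            continue
        # Whole file, no summarization. Invalid UTF-8 bytes are replaced here,
        # which is an assumption; a stricter ingest could store raw bytes.
        body = md.read_text(encoding="utf-8", errors="replace")
        conn.execute(
            "INSERT OR REPLACE INTO docs (path, body) VALUES (?, ?)",
            (rel.as_posix(), body),
        )
        count += 1
    conn.commit()
    return count


def search_docs(conn: sqlite3.Connection, term: str) -> list[str]:
    """Return paths of stored docs whose body contains term (case-insensitive)."""
    rows = conn.execute(
        "SELECT path FROM docs WHERE instr(lower(body), lower(?)) > 0 ORDER BY path",
        (term,),
    )
    return [row[0] for row in rows]
```

A plain substring match keeps the sketch dependency-free. A real index would more likely use full-text search, and the gating question from the last bullet could be handled by checking for a marker before calling `ingest_markdown`.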
Author
Owner

Merged into #5 in the 2026-05-29 backlog burn-down. Verbatim markdown ingest overlaps the docs-ingest issue. Reopen if it should stand alone.
Reference
coilyco-flight-deck/repo-recall#3