Flaky test: mcp_smoke wait_for_initial_scan races on macOS #33

Open
opened 2026-05-23 20:55:25 +00:00 by coilysiren · 0 comments

Originally filed by @coilysiren on 2026-05-15T07:43:29Z - https://github.com/coilysiren/repo-recall/issues/175

Symptom

tests/mcp_smoke.rs::refresh_runs_and_bumps_scan_version and tests/mcp_smoke.rs::dashboard_returns_structured_payload are flaky when run locally on macOS. Both panic in wait_for_initial_scan when the 20s deadline expires before the spawned binary's initial background scan bumps scan_version past 0.

Run A (failed)

test refresh_runs_and_bumps_scan_version ... FAILED
test dashboard_returns_structured_payload ... FAILED

failures:
---- refresh_runs_and_bumps_scan_version stdout ----
thread 'refresh_runs_and_bumps_scan_version' panicked at tests/mcp_smoke.rs:290:13:
initial scan never bumped scan_version: {"scan_version":0,"last_scan":null,...}

test result: FAILED. 3 passed; 2 failed; 0 ignored; 0 measured; finished in 21.87s

Run B (passed, no code change between A and B)

test refresh_runs_and_bumps_scan_version ... ok
test dashboard_returns_structured_payload ... ok

test result: ok. 5 passed; 0 failed; 0 ignored; 0 measured; finished in 39.74s

Run C (failed again) - same panic shape as Run A.

Candidate causes

  • wait_for_initial_scan polls every 50ms with a 20s wall-clock deadline (tests/mcp_smoke.rs:274). The comment says "~10s for the scan body plus a tantivy commit on a fresh index" - on a busy machine the scan body plus a commit on a freshly built index can plausibly drift well past that estimate.
  • The startup path now performs a github_client.fetch_user().await HTTP call (bounded to 3s) for the auth probe. Worst-case adds 3s to startup, which combined with a slow tantivy commit could push the test past the 20s deadline.
  • Both failing tests run sequentially in the same cargo test invocation. They each spawn their own repo-recall binary - resource pressure (file descriptors, redb lock contention via $TMPDIR siblings) could compound.
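For reference, the polling shape described in the first bullet can be sketched as below. This is a minimal illustration, not the actual body of wait_for_initial_scan: the helper name poll_until and its signature are hypothetical, and the probe closure stands in for whatever call reads scan_version from the spawned binary.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `probe` every `interval` until it yields
/// `Some(_)` or `deadline` has elapsed on the wall clock.
fn poll_until<T>(
    deadline: Duration,
    interval: Duration,
    mut probe: impl FnMut() -> Option<T>,
) -> Result<T, String> {
    let start = Instant::now();
    loop {
        // Probe first so an already-satisfied condition returns immediately.
        if let Some(value) = probe() {
            return Ok(value);
        }
        // Wall-clock check: time spent inside `probe` counts against the deadline.
        if start.elapsed() >= deadline {
            return Err(format!("condition not met within {:?}", deadline));
        }
        sleep(interval);
    }
}
```

With a 50ms interval and a 20s deadline, anything that delays the first scan_version bump (a slow commit, the 3s auth probe, a second binary competing for resources) eats directly into the same fixed budget, which is why the causes above can compound.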

Suggested fix

Either bump the wait_for_initial_scan deadline to 30-45s, or add a "scan complete" stdout marker the test can wait for instead of polling scan_version. CI on Linux passes consistently per recent green runs on main, so the flake is platform/load-shaped, not a bug in the scan path itself.
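A rough sketch of the marker-based option, under stated assumptions: the binary does not emit a "scan complete" line today, so the marker text is hypothetical, as is the helper name wait_for_marker. A reader thread does the blocking line reads and forwards them over a channel, so the test can still enforce a deadline.

```rust
use std::io::{BufRead, BufReader, Read};
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: waits until a line containing `marker` appears on
/// `stream`, or until `deadline` elapses or the stream closes.
fn wait_for_marker<R>(stream: R, marker: &str, deadline: Duration) -> Result<String, String>
where
    R: Read + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<String>();
    // Reader thread: stops on EOF, on a read error, or once the receiver is gone.
    thread::spawn(move || {
        for line in BufReader::new(stream).lines() {
            match line {
                Ok(text) => {
                    if tx.send(text).is_err() {
                        break;
                    }
                }
                Err(_) => break,
            }
        }
    });

    let start = Instant::now();
    loop {
        // Time left in the overall budget; zero once the deadline has passed.
        let remaining = deadline
            .checked_sub(start.elapsed())
            .unwrap_or(Duration::ZERO);
        match rx.recv_timeout(remaining) {
            Ok(line) if line.contains(marker) => return Ok(line),
            Ok(_) => continue,
            Err(mpsc::RecvTimeoutError::Timeout) => {
                return Err(format!("no {:?} line within {:?}", marker, deadline));
            }
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                return Err(format!("stream closed before {:?} appeared", marker));
            }
        }
    }
}
```

In a test, the stream would be the child's piped output taken from std::process::Child (for example child.stdout.take()). If stdout already carries protocol traffic for the smoke tests, emitting the marker on stderr instead may be safer; that is an assumption about the setup, not something this issue confirms.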

Where

  • tests/mcp_smoke.rs:270-294
  • src/main.rs - new auth probe at startup (#173 step 1, #174)

Discovered while landing #174.
coilysiren added P2 and removed P1 labels 2026-05-31 07:01:15 +00:00