fix: align gbrain extract --dry-run and doctor guidance with actual commands by vinsew · Pull Request #397 · garrytan/gbrain

vinsew · 2026-04-24T10:52:12Z

Summary

Two small alignment fixes discovered while upgrading an existing brain from v0.14.2 to v0.20.4:

gbrain doctor still told users to run the retired command pair gbrain link-extract && gbrain timeline-extract. The graph_coverage warning message now points at the current canonical command, gbrain extract all --source db.
gbrain extract --dry-run over-reported by counting every extracted candidate as a net-new row, even when the DB would reject it via ON CONFLICT DO NOTHING. The dry-run now caches existing outgoing links and timeline rows per source slug and filters candidates against that cache, so dry-run row counts match what a real run would insert.

The link/timeline dedup keys used in dry-run now also carry origin_page_id / origin_slug so frontmatter-derived edges from different origins don't collapse.
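
The dry-run filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the LinkRow shape, the linkKey segment order, and the loadExisting callback are all hypothetical stand-ins for whatever the extractor actually uses.

```typescript
// Hypothetical row shape; the real gbrain types may differ.
interface LinkRow {
  fromSlug: string;
  toSlug: string;
  linkType: string;
  linkSource: string;
  originSlug: string | null;
}

// Hypothetical loader: returns the links already stored for one source slug.
type LoadExisting = (fromSlug: string) => LinkRow[];

// Dedup key. The segment order is an assumption made for this sketch.
// originSlug is included so edges from different origins stay distinct.
function linkKey(row: LinkRow): string {
  return [row.fromSlug, row.toSlug, row.linkType, row.linkSource, row.originSlug ?? ""].join("::");
}

// Count the candidates a real run would insert. Rows already stored, and
// repeats within the same batch, are skipped, which mirrors what
// ON CONFLICT DO NOTHING does on the database side.
function countNetNew(candidates: LinkRow[], loadExisting: LoadExisting): number {
  const cache = new Map<string, Set<string>>(); // one key set per source slug
  let netNew = 0;
  for (const candidate of candidates) {
    let keys = cache.get(candidate.fromSlug);
    if (!keys) {
      keys = new Set(loadExisting(candidate.fromSlug).map(linkKey));
      cache.set(candidate.fromSlug, keys);
    }
    const key = linkKey(candidate);
    if (keys.has(key)) continue; // the DB would reject this row
    keys.add(key);
    netNew++;
  }
  return netNew;
}
```

Caching per source slug means the existing rows for a page are loaded once, however many candidates that page produces.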

Test plan

New test/extract-db.test.ts case: dry-run output after a prior real-run reports zero net-new links (before this change it reported 1-to-1 with candidates).
test/doctor.test.ts regression guard asserts doctor.ts source contains gbrain extract all --source db and does NOT contain gbrain link-extract && gbrain timeline-extract.
bun test passes the new cases on my setup.
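
For illustration, the doctor regression guard amounts to a string check like the one below. This is a sketch only: guidanceIsCurrent is a hypothetical helper, and the actual test reads the doctor.ts source rather than taking a string argument.

```typescript
// The two command strings come from the PR description.
const CURRENT_COMMAND = "gbrain extract all --source db";
const RETIRED_COMMAND = "gbrain link-extract && gbrain timeline-extract";

// Hypothetical helper: true when the given source text points at the current
// command and no longer mentions the retired pair.
function guidanceIsCurrent(source: string): boolean {
  return source.includes(CURRENT_COMMAND) && !source.includes(RETIRED_COMMAND);
}
```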

🤖 Generated with Claude Code

vinsew · 2026-05-12T10:50:36Z

Closing — superseded by #914.

This PR can't be cherry-picked onto current master. v0.32.8's multi-source threading (#860) reshaped both extractLinksFromDB and extractTimelineFromDB:

Candidate dedup keys now carry source ids (6 segments: from_source_id::from_slug::to_source_id::to_slug::link_type::link_source).
engine.getLinks() does not return f.source_id / t.source_id, and the Link type has no source-id fields, so my original 5-segment key compare would false-positive on cross-source rows.

#914 implements the same intent on the multi-source shape: per-extractor inline SQL that returns source ids alongside the link / timeline rows, cached per from-page, byte-for-byte key parity with the candidate side. 4 new tests in test/extract-db.test.ts pin both halves of the contract (zero net new after a real run, AND newly-added candidates still surface).
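
To illustrate why key parity matters here, the sketch below compares the 6-segment key against a key that ignores source ids. The SourcedLink shape, the string-typed source ids, and sourceBlindKey are assumptions made for the example; sourceBlindKey is not the original 5-segment key from this PR, only a stand-in for any key that lacks the source-id segments.

```typescript
// Hypothetical row shape carrying source ids on both ends of the link.
interface SourcedLink {
  fromSourceId: string;
  fromSlug: string;
  toSourceId: string;
  toSlug: string;
  linkType: string;
  linkSource: string;
}

// 6 segments, in the order quoted above:
// from_source_id::from_slug::to_source_id::to_slug::link_type::link_source
function sixSegmentKey(l: SourcedLink): string {
  return [l.fromSourceId, l.fromSlug, l.toSourceId, l.toSlug, l.linkType, l.linkSource].join("::");
}

// Illustrative key that ignores source ids entirely.
function sourceBlindKey(l: SourcedLink): string {
  return [l.fromSlug, l.toSlug, l.linkType, l.linkSource].join("::");
}

// Same slugs and link type, but the rows live in two different sources.
const storedRow: SourcedLink = {
  fromSourceId: "notes",
  fromSlug: "people/alice",
  toSourceId: "notes",
  toSlug: "companies/acme",
  linkType: "works_at",
  linkSource: "markdown",
};
const crossSourceCandidate: SourcedLink = { ...storedRow, fromSourceId: "work", toSourceId: "work" };

// The source-blind keys collide, so the candidate would be wrongly treated as
// already present. The 6-segment keys keep the two rows apart.
const blindCollision = sourceBlindKey(storedRow) === sourceBlindKey(crossSourceCandidate);
const sixSegmentCollision = sixSegmentKey(storedRow) === sixSegmentKey(crossSourceCandidate);
```

With a source-blind key, a dry-run would under-report cross-source rows as duplicates, which is the false positive the closing comment describes.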

Thanks for the patience on this one.

vinsew force-pushed the fix/extract-dry-run-doctor-guidance branch from 7d1389a to 33c02bf Compare April 27, 2026 09:17

fix: align gbrain extract dry-run and doctor guidance

f581c6f

vinsew force-pushed the fix/extract-dry-run-doctor-guidance branch from 33c02bf to f581c6f Compare May 11, 2026 08:30

vinsew mentioned this pull request May 12, 2026

fix(extract): dry-run reports net-new rows, not raw candidates (closes #397) #914

Open

5 tasks

vinsew closed this May 12, 2026

vinsew wants to merge 1 commit into garrytan:master from vinsew:fix/extract-dry-run-doctor-guidance
