[Dream Cycle 2026-06-08] memory: ADR-147 multi-signal retrieval (semantic+BM25+entity)#2317

Draft
ruvnet wants to merge 1 commit into main from dream/2026-06-08-memory

Conversation

ruvnet (Owner) commented Jun 8, 2026

Tonight's Rotation

| Field | Value |
| --- | --- |
| SLOT | 3: memory DEEP, plugins + automation SCAN |
| Session commit | d065b15927c6ba7318623e8af123e7980e4c6681 |
| Date | 2026-06-08 |
| Issue | #2316 |

What's in this PR

ADR-147: v3/docs/adr/ADR-147-multi-signal-memory-retrieval.md

Proposes replacing Ruflo's vector-only memory retrieval with a parallel multi-signal strategy (semantic HNSW + BM25/FTS5 + entity matching) fused via Reciprocal Rank Fusion. No implementation code in this PR — architectural decision only, per dream-cycle protocol.

README row: v3/docs/adr/README.md updated with ADR-147 entry.
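For reviewers unfamiliar with Reciprocal Rank Fusion, the fusion step the ADR proposes can be sketched roughly as follows. This is an illustrative sketch only, not code from the ADR or from the Ruflo tree: `rrfFuse`, the document ids, and the constant `k = 60` (a commonly used default) are all assumptions.

```typescript
// Reciprocal Rank Fusion: each arm contributes 1 / (k + rank) per document.
// All names here are hypothetical; k = 60 is an assumed default.
type RankedList = string[]; // document ids, best first

function rrfFuse(lists: RankedList[], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  const fused: Array<{ id: string; score: number }> = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  // Highest fused score first.
  return fused.sort((a, b) => b.score - a.score);
}

// Toy ranked lists standing in for the three arms.
const semantic: RankedList = ["a", "b", "c"];
const bm25: RankedList = ["b", "a", "d"];
const entity: RankedList = ["b"];

const fused = rrfFuse([semantic, bm25, entity]);
console.log(fused.map((r) => `${r.id}:${r.score.toFixed(4)}`).join(" "));
```

In this toy input, `b` appears in all three lists and therefore accumulates three reciprocal-rank terms, while `a` accumulates only two. Because RRF uses ranks rather than raw scores, the arms do not need comparable score scales.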


Research Summary

The 2026-06-08 dream session found that SOTA agent memory retrieval has shifted decisively from single-vector to multi-signal:

  • Mem0 v2 (April 2026): 94.4% LongMemEval at ~6,900 tok/query — 75% token reduction vs full-context (Grade A — vendor benchmark crosschecked)
  • Memori (arXiv 2603.19935, March 2026): 81.95% LoCoMo at 1,294 tok/query — 20× fewer tokens than full-context (Grade A — peer-reviewed)
  • Supermemory ASMR: 98.6% LongMemEval-s via agentic ensemble (Grade B — single vendor source)

Ruflo gap: vector-only retrieval, no LoCoMo/LongMemEval scores published, fts5.ts and graceful-retrieval.ts exist but are unwired.

ADR-147 closes the gap in 4 phases: wire FTS5 → add entity tagger → async writes by default → publish benchmark scores.


ADR Link

ADR-147 — Multi-Signal Memory Retrieval


Gist

Dream Cycle report: SHA-256 fac0b931deb16a4c41d96a1864226fb9ca363ab8018bda88602d457566f962a9
Witness stamp: 419823cd839e080543d4d5ddae8e7e0d9642bbce0f9ca165c6e97574afc703d9


Merge Policy

Do not self-merge. Leave for human review. This is a nightly research artifact — merge when the engineering team decides to act on ADR-147's implementation plan.


Generated by Claude Code

… Mem0 SOTA (94.4% LongMemEval)

ADR-147 proposes adding parallel semantic+BM25/FTS5+entity retrieval with RRF
fusion to UnifiedMemoryService. FTS5 infrastructure already in codebase (fts5.ts,
graceful-retrieval.ts); architectural decision records the retrieval contract
change and 4-phase implementation plan.

https://claude.ai/code/session_01R4sETrdwMxLBZCWXowsagM
ruvnet (Owner, Author) commented Jun 8, 2026

Review (from #2324)

Renumber: ADR-147 → ADR-152 before merge. All 6 open dream-cycle PRs claim ADR-147; current highest accepted is ADR-146. Proposed sequence in PR-creation order:

Process concerns (zero merges in 14 nights, self-grading rubric vs. merge-rate signal) tracked in #2324.

Research itself is good — sources, gap analysis, and in-tree citations all check out. Collision is the only thing blocking merge here.

ruvnet added a commit that referenced this pull request Jun 8, 2026
…2327)

Adds entity matching as a third RRF arm alongside dense + sparse, plus per-result signal provenance (#2317).

What this lands:
- `entity-tagger.ts` — regex extractor for emails, URLs, file paths, quoted phrases, proper-noun 2-grams (12 unit tests)
- `hybridSearch` controller — entity arm wired into the existing RRF+MMR fusion at `controller-registry.ts`
- `signals: ('vector' | 'bm25' | 'entity')[]` on every fused result
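A regex extractor of the kind described above might look roughly like this. It is a minimal sketch under stated assumptions: the patterns, the `extractEntities` name, and the `Entity` shape are hypothetical and are not taken from `entity-tagger.ts`.

```typescript
// Hypothetical regex-based entity extractor. The patterns are illustrative
// and deliberately simple; they are NOT the ones in entity-tagger.ts.
type EntityKind = "email" | "url" | "path" | "quoted" | "proper-noun";

interface Entity {
  kind: EntityKind;
  text: string;
}

const PATTERNS: Array<[EntityKind, RegExp]> = [
  ["email", /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g],
  ["url", /https?:\/\/[^\s"')]+/g],
  ["path", /(?:\.{0,2}\/)?(?:[\w.-]+\/)+[\w.-]+\.\w+/g],
  ["quoted", /"[^"]+"/g],
  ["proper-noun", /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g], // capitalized 2-grams
];

function extractEntities(text: string): Entity[] {
  const found: Entity[] = [];
  for (const [kind, pattern] of PATTERNS) {
    const matches: string[] = text.match(pattern) ?? [];
    for (const raw of matches) {
      // Strip the surrounding quotes from quoted phrases.
      const value = kind === "quoted" ? raw.slice(1, -1) : raw;
      found.push({ kind, text: value });
    }
  }
  return found;
}

const entities = extractEntities(
  'Alice Smith emailed alice@example.com about "token refresh" in src/auth/login.ts'
);
console.log(entities);
```

Overlapping matches are not deduplicated in this sketch (a URL would also trigger the path pattern), so a real tagger would need some precedence rule between kinds.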

End-to-end capability smoke against built dist (not just TS source):
- Corpus: 30 generic "authentication" entries + 1 "Alice Smith" needle
- Query: `"Alice Smith authentication"`
- Result: **Alice ranks #1** with `signals=["vector","bm25","entity"]`; runners-up only have `["vector","bm25"]`
- RRF score gap: 0.0477 vs 0.0323 = **47% boost** from the entity signal
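The provenance field and the reported score gap can be illustrated with a small sketch. The helper names (`collectSignals`, `relativeGap`) and the document ids are assumptions made for illustration; only the two scores, 0.0477 and 0.0323, come from the smoke result above.

```typescript
type Signal = "vector" | "bm25" | "entity";

// Hypothetical helper: record which arms returned each document id.
function collectSignals(arms: Record<Signal, string[]>): Map<string, Signal[]> {
  const byId = new Map<string, Signal[]>();
  const order: Signal[] = ["vector", "bm25", "entity"];
  for (const signal of order) {
    for (const id of arms[signal]) {
      const list = byId.get(id) ?? [];
      if (list.indexOf(signal) === -1) list.push(signal);
      byId.set(id, list);
    }
  }
  return byId;
}

// Relative gap between the top score and the runner-up score.
function relativeGap(top: number, runnerUp: number): number {
  return (top - runnerUp) / runnerUp;
}

// Toy arm outputs; "alice" and "auth-1" are made-up ids.
const signals = collectSignals({
  vector: ["alice", "auth-1"],
  bm25: ["auth-1", "alice"],
  entity: ["alice"],
});

// Scores as reported in the smoke result.
const gap = relativeGap(0.0477, 0.0323);
console.log(signals.get("alice"), signals.get("auth-1"), gap.toFixed(3));
```

With the rounded scores quoted above, the relative gap comes out near 0.477, in line with the roughly 47% boost reported.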

16/16 tests pass in touched modules. Full memory suite: 416/420 passing (4 pre-existing Windows-env failures in unrelated files).

Closes part of #2317. The ADR's stated P1 ("wire FTS5 + RRF") was already shipped before this PR; this lands the actual P2 (entity arm + provenance) that the dream-cycle author missed when surveying the tree.
ruvnet added a commit that referenced this pull request Jun 8, 2026
#2327)

@claude-flow/memory 3.0.0-alpha.19 → 3.0.0-alpha.20. Adds the entity arm
to hybridSearch alongside the existing dense + sparse RRF fusion, plus
per-result signals: ('vector'|'bm25'|'entity')[] provenance.

End-to-end capability smoke against built dist confirmed: Alice needle in
31-doc corpus ranks #1 with all three signals; runner-up has only
vector+bm25 — RRF score gap of ~47%.

@claude-flow/cli, claude-flow, ruflo 3.10.38 → 3.10.39. CLI also pins
@claude-flow/memory to ^3.0.0-alpha.20 so the wrapper users pick up the
entity arm automatically.

All four packages published with latest+alpha+v3alpha aligned.
Lockfile regen included (lesson from #2311 — bumping a workspace dep
without regenerating v3/pnpm-lock.yaml breaks frozen-lockfile CI).

Co-Authored-By: RuFlo <ruv@ruv.net>