Progressive note loading — compact index with on-demand full content #686

@bm-clawd

Description

Idea

Inspired by OpenViking's L0/L1/L2 progressive loading and Hermes Agent's "hot memory vs cold recall" architecture.

Problem

When agents work with BM notes, they currently load full note content. For large knowledge graphs, this means either:

  • Stuffing too many tokens into context (expensive, slow)
  • Loading too few notes and missing relevant context

Proposal

Implement tiered note loading through the MCP tools:

L0 — Index (always cheap to inject)

  • Title, type, tags, status
  • Relation summary (connected entity names)
  • ~50-100 tokens per note

L1 — Observations (medium cost)

  • All observations with categories
  • Key relations with types
  • ~200-500 tokens per note

L2 — Full content (on demand)

  • Complete markdown body
  • All relations with context
  • Attachments/references
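To make the tiering concrete, here is a minimal sketch of the three levels as projections over one note record. The `Note`, `Depth`, and `render` names are hypothetical illustrations, not existing BM APIs; the field set follows the tier descriptions above.

```python
from dataclasses import dataclass, field
from enum import IntEnum

class Depth(IntEnum):
    L0 = 0  # index: title, type, tags, status, connected entity names
    L1 = 1  # + observations with categories, typed relations
    L2 = 2  # + full markdown body and attachments

@dataclass
class Note:
    title: str
    type: str
    tags: list
    status: str
    relations: dict                 # relation type -> connected entity names
    observations: list              # (category, text) pairs
    body: str = ""
    attachments: list = field(default_factory=list)

def render(note: Note, depth: Depth = Depth.L0) -> dict:
    """Project a note down to the requested tier."""
    view = {
        "title": note.title,
        "type": note.type,
        "tags": note.tags,
        "status": note.status,
        # L0 carries only the names of connected entities, not relation details.
        "related": sorted({e for ents in note.relations.values() for e in ents}),
    }
    if depth >= Depth.L1:
        view["observations"] = note.observations
        view["relations"] = note.relations
    if depth >= Depth.L2:
        view["body"] = note.body
        view["attachments"] = note.attachments
    return view
```

The key property is that L0 is a strict subset of L1, which is a strict subset of L2, so escalating a note never invalidates what the agent already saw.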

The search_notes and build_context tools could return L0 by default, with a depth or detail parameter to request L1/L2. This keeps agent prompts compact while making the full graph accessible on demand.
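The scan-then-escalate flow this enables could look roughly like the following. Everything here is a hypothetical sketch: `scan_then_read`, `estimate_tokens`, the `~4 chars/token` heuristic, and the entry shape are illustrative assumptions, not existing tool signatures.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): ~4 characters per token.
    return max(1, len(text) // 4)

def scan_then_read(index_entries, fetch_full, relevance, budget=2000):
    """Scan compact L0 index entries, then escalate the most relevant
    notes to full L2 content while staying within a token budget."""
    # The whole L0 index is cheap enough to count against the budget up front.
    spent = sum(estimate_tokens(e["summary"]) for e in index_entries)
    escalated = []
    for entry in sorted(index_entries, key=relevance, reverse=True):
        body = fetch_full(entry["id"])      # on-demand L2 fetch
        cost = estimate_tokens(body)
        if spent + cost > budget:
            break                           # budget exhausted; stop escalating
        spent += cost
        escalated.append((entry["id"], body))
    return escalated
```

Because the index cost is fixed and small, the budget is spent almost entirely on the handful of notes the agent actually decides to read in full.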

Benefits

  • Dramatically fewer tokens for knowledge graph navigation
  • Agents can scan more notes before deciding which to read fully
  • Better prompt cache stability (compact index stays stable)
  • Aligns with how humans browse — scan titles, then dive in

References

  • OpenViking L0/L1/L2: github.com/volcengine/OpenViking
  • Hermes Agent memory layers (hot prompt memory + cold FTS recall)
  • Relates to Momentum UI progressive rendering

Metadata

Labels

  • enhancement — New feature or request