
docs(readme): reposition around auto-learning #115

Merged
kaghni merged 6 commits into main from docs/readme-auto-learning on May 8, 2026

Conversation

@kaghni (Collaborator) commented May 8, 2026

Summary

Reframes the README from "shared memory / agent amnesia" toward auto-learning: trace capture → skill codification → skill propagation → compounding capability. Tagline kept ("One brain for all your agents").

Source: the growth team's positioning draft (SUGGENSTED_README.md, kept locally and not committed). We adopted the language shifts but rejected the technical regressions in their version (renamed env vars, dropped the unified hivemind install flow, per-agent install simplified to a manual git clone, a single-agent hook lifecycle table).

Changes

  • Subtitle: "Persistent, cloud-backed shared memory" → "Auto-learning, cloud-backed shared brain"
  • Pull quote: agent-amnesia → Monday→Tuesday migration framing
  • Lead paragraph reframed around traces / codified skills / propagated capability
  • Feature bullets: Captures · Codifies · Searches · Propagates · Intercepts · Summarizes (was: Captures · Searches · Shares · Intercepts · Summarizes)
  • "How it works" section: replaced 3-block diagram with the 4-stage pipeline (trace capture → skill codification → skill propagation → compounding capability), each stage explained in prose, plus a larger 4-block diagram
  • Natural search examples mention traces and codified skills
  • Team sharing: "see your memory" → "draw from your team's traces and skills"
  • Data collection notice: added Codified skills row and workspace-isolation framing
  • Skills (skilify) intro framed as "where auto-learning becomes concrete"
  • New Roadmap section before Security: trajectory export for fine-tuning, dense vector retrieval, skill versioning/review, more agents

Not changed

All technical surface intact: env vars, the unified `npm install -g @deeplake/hivemind && hivemind install` quick start, per-agent install/uninstall commands, agent-detail collapsibles, embeddings section, summaries env-var table, skilify CLI/triggers/gate-CLI tables, integration-model architecture table, monorepo structure, Security, Development.

Test plan

  • Render preview on GitHub looks right (block quotes, ASCII diagrams)
  • All anchor links still resolve (#skills-skilify from the new How-it-works section, etc.)
  • No remaining stale references — grepping for "memory" over the file shows only legitimate technical identifiers (paths, SQL table name, env var, OpenClaw's separate plugin, skill folder name)
  • Growth team signs off on the language pass

Summary by CodeRabbit

  • Documentation
    • Updated product positioning and core feature descriptions
    • Enhanced "How it works" section with numbered phases and improved architecture diagram
    • Refreshed natural search examples and clarified team sharing capabilities
    • Expanded data collection notice with additional workspace isolation details
    • Added roadmap section with planned feature development items

Reframes the README from "shared memory / agent amnesia" toward
auto-learning: trace capture -> skill codification -> skill propagation
-> compounding capability. Tagline kept ("One brain for all your
agents"). No technical surface changes: HIVEMIND_* env vars, install
commands, agent list, embeddings/summaries/skilify CLI tables,
architecture, and security all intact.

Changes:
- Subtitle: "shared memory" -> "auto-learning, cloud-backed shared brain"
- Pull quote: agent-amnesia -> Monday->Tuesday migration framing
- Lead paragraph reframed around traces / codified skills / propagated
  capability
- Feature bullets: Captures, Codifies, Searches, Propagates, Intercepts,
  Summarizes (was: Captures, Searches, Shares, Intercepts, Summarizes)
- "How it works": replaced 3-block diagram with 4-stage pipeline (trace
  capture, skill codification, skill propagation, compounding capability)
  plus larger diagram
- Natural search examples mention traces and codified skills
- Team sharing: "see your memory" -> "draw from your team's traces and
  skills"
- Data collection notice: added "Codified skills" row and workspace-
  isolation framing
- Skills (skilify) intro: framed as "where auto-learning becomes concrete"
- New Roadmap section before Security: trajectory export for fine-tuning,
  dense vector retrieval, skill versioning/review, more agents

@claude (bot) commented May 8, 2026

Claude finished @kaghni's task in 1m 45s.


Review

  • Read PR diff
  • Verify anchors (#skills-skilify resolves correctly)
  • Cross-check claims vs. rest of README
  • Post inline comments

Two accuracy regressions and one verifiability concern flagged inline:

  1. L196 — "statistical pattern matching across the trace store" overclaims; skilify is a single Haiku gate call per the section below.
  2. L606 — "Currently using BM25 + grep fallback" contradicts the existing semantic-search-via-embeddings section (nomic-embed daemon, 768-dim columns).
  3. L605 — Customer claim ("a handful of advanced customers are already doing this") is worth confirming before public release.

The language pass otherwise reads well and the technical surface is intact as you noted.

@github-actions (bot) commented May 8, 2026

Coverage Report

No src/*.ts files changed in this PR.

Generated for commit 36ed9cf.

Comment thread README.md Outdated
## Roadmap

- **Trajectory export for fine-tuning.** Because traces are stored in Deeplake's tensor format, they're export-ready as PyTorch datasets. Teams running their own open-source models can fine-tune on their org's accumulated trajectories. A handful of advanced customers are already doing this against the trajectories their Claude Code and Codex agents generated.
- **Dense vector retrieval.** Currently using BM25 + grep fallback. Dense retrieval over GPU-accelerated vector search is in active development.

"Currently using BM25 + grep fallback" contradicts the "Optional: enable semantic search (embeddings)" section above (lines 298–348), which documents a working nomic-embed-text-v1.5 daemon producing 768-dim message_embedding / summary_embedding columns and hybrid semantic+lexical ranking. Dense retrieval already exists — it's just opt-in. The roadmap item should be GPU-accelerated / scaled dense retrieval, not "dense retrieval" as if none ships today.

Suggested change
- **Dense vector retrieval.** Currently using BM25 + grep fallback. Dense retrieval over GPU-accelerated vector search is in active development.
- **GPU-accelerated dense retrieval at scale.** Hivemind already supports local CPU embeddings via the optional nomic-embed daemon (see "Optional: enable semantic search" above). Roadmap: GPU-accelerated vector search over the full trace store, on by default.
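
To unpack what "hybrid semantic+lexical ranking" means here, a minimal, illustrative sketch follows. This is not Hivemind's implementation: the toy lexical score, the blending weight `alpha`, and the function names are assumptions for illustration; in practice the dense vectors would come from the 768-dim nomic-embed columns the embeddings section documents, and the lexical side would be BM25 rather than simple term overlap.

```python
# Illustrative sketch only (not Hivemind's code): the general shape of
# hybrid dense (cosine) + lexical ranking over a set of candidate traces.
import numpy as np

def lexical_score(query: str, doc: str) -> float:
    """Toy lexical signal: fraction of query terms that appear in the doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def hybrid_rank(query: str, query_vec: np.ndarray,
                docs: list[str], doc_vecs: list[np.ndarray],
                alpha: float = 0.5) -> list[tuple[str, float]]:
    """Blend dense and lexical scores; alpha (an assumed weight) favors the dense side."""
    scored = [
        (doc, alpha * cosine(query_vec, vec) + (1 - alpha) * lexical_score(query, doc))
        for doc, vec in zip(docs, doc_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Per the comment above, the hybrid ranking itself already ships in the opt-in semantic search path; the roadmap item is about moving the dense side from a local CPU loop to GPU-accelerated vector search at scale.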

Comment thread README.md

## Roadmap

- **Trajectory export for fine-tuning.** Because traces are stored in Deeplake's tensor format, they're export-ready as PyTorch datasets. Teams running their own open-source models can fine-tune on their org's accumulated trajectories. A handful of advanced customers are already doing this against the trajectories their Claude Code and Codex agents generated.

Worth confirming before this ships publicly — "A handful of advanced customers are already doing this against the trajectories their Claude Code and Codex agents generated" is a specific, falsifiable claim about real customer usage. If this is the growth team's framing rather than a verified fact, consider softening (e.g. "designed so teams running their own open-source models can fine-tune on their org's accumulated trajectories") or dropping the sentence — public README claims about customer behavior tend to attract scrutiny.
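
As an aside on the "export-ready as PyTorch datasets" phrasing in the quoted roadmap bullet, a minimal sketch of the consuming side is below. Everything in it is an assumption for illustration: the JSONL file, its `prompt`/`response` fields, and the file name are hypothetical placeholders, not Hivemind's or Deeplake's actual export format or API.

```python
# Hypothetical sketch: assumes trajectories were already exported to a JSONL
# file with "prompt" and "response" fields. Not the real export API.
import json
from torch.utils.data import Dataset

class TrajectoryDataset(Dataset):
    """Wraps exported agent trajectories for supervised fine-tuning."""

    def __init__(self, path: str):
        with open(path) as f:
            self.rows = [json.loads(line) for line in f]

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int) -> tuple[str, str]:
        row = self.rows[idx]
        return row["prompt"], row["response"]

# ds = TrajectoryDataset("trajectories.jsonl")  # hypothetical export file
```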

@coderabbitai (bot) commented May 8, 2026

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

📝 Walkthrough

README.md updated with reframed product positioning as an "auto-learning, cloud-backed shared brain," restructured "How it works" into four explicit phases, generalized skill-codification language away from Claude-Code specifics, refreshed examples and phrasing, expanded data collection transparency disclosures, and added a new Roadmap section listing trajectory export, vector retrieval, skill versioning, and additional agent support.

Changes

Documentation Positioning and Content Refresh

| Layer / File(s) | Summary |
| --- | --- |
| Intro and Positioning Framing (README.md) | Introductory paragraph reframed from "shared memory" to "auto-learning, cloud-backed shared brain"; feature bullet list updated to emphasize capture, codification, lexical search, and real-time propagation. |
| How It Works and Architecture (README.md) | Core narrative restructured into four numbered stages—Trace capture, Skill codification, Skill propagation, Compounding capability—with refreshed ASCII architecture diagram. |
| Natural Search Examples (README.md) | Search section intro and query examples updated to reference "your agent" instead of product-specific names, maintaining consistency with generalized positioning. |
| Team Sharing and Capability (README.md) | Team sharing section updated to explicitly describe bidirectional team capability using Deeplake org/shared traces/skills concept with revised wording. |
| Data Collection Transparency (README.md) | Data collection notice table expanded to include assistant responses, subagent activity, and codified skills; privacy notice clarified to highlight workspace-level isolation between orgs. |
| Skills Codification Description (README.md) | Skills (skilify) section intro generalized from Claude-Code-specific phrasing to broader "recurring patterns" codification that propagates team-wide. |
| Roadmap Addition (README.md) | New Roadmap section added listing trajectory export, dense vector retrieval, skill versioning/review, and support for more agents. |

🎯 2 (Simple) | ⏱️ ~10 minutes

🐰 A shared brain now learns from all its mates,
Four phases capture, codify, and propagate,
No Claude-code walls—just skills for the team,
A roadmap unfolds the collective dream! 🧠✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description covers the PR objectives and changes comprehensively, but omits required template sections (Version Bump and Test plan checkboxes). | Add the Version Bump section with version bump guidance and Test plan section with checkboxes as specified in the repository template. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: repositioning the README documentation around an auto-learning narrative theme. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai (bot) left a comment

Actionable comments posted: 2


📥 Commits

Reviewing files that changed from the base of the PR and between 7e49f28 and 4e4c486.

📒 Files selected for processing (1)
  • README.md

Comment thread README.md Outdated
Comment on lines 202 to 229
```
┌─────────────────────────────────────────────────────┐
│              Your Coding Agents                     │
│    Claude Code · Codex · OpenClaw · Cursor · ...    │
└──────────────────────────┬──────────────────────────┘
        ┌──────────────────▼──────────────────┐
        │  📥 Trace capture (every turn)      │
        │  prompts · tool calls · responses   │
        └──────────────────┬──────────────────┘
        ┌──────────────────▼──────────────────┐
        │  🧠 Skill codification              │
        │  pattern detection · LLM extraction │
        │  workspace-scoped                   │
        └──────────────────┬──────────────────┘
        ┌──────────────────▼──────────────────┐
        │  🔗 Skill propagation               │
        │  injected into agent context        │
        │  every agent · every teammate       │
        └──────────────────┬──────────────────┘
        ┌──────────────────▼──────────────────┐
        │  🌊 Deeplake                        │
        │  Shared across all agents           │
        │  Tensor storage · Postgres · S3     │
        └─────────────────────────────────────┘
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language specifier to fenced code block.

The ASCII diagram should specify a language identifier for proper rendering.

📝 Proposed fix
-```
+```text
 ┌─────────────────────────────────────────────────────┐
 │              Your Coding Agents                     │
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 202-202: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Comment thread README.md
Comment on lines 239 to 244
```
"What was Emanuele working on?"
"Search memory for authentication bugs"
"Search traces for authentication bugs we've solved"
"What did we decide about the API design?"
"Show me skills my team has codified for handling migrations"
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language specifier to fenced code block.

The code example should specify a language identifier for proper rendering.

📝 Proposed fix
-```
+```text
 "What was Emanuele working on?"
 "Search traces for authentication bugs we've solved"
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 239-239: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


kaghni added 4 commits May 8, 2026 20:14
Cuts the README from 643 lines to 322 (~50% reduction), in line with peer
projects (mem0 ~250, supermemory 362, letta 122, mastra 82). Detail moves
to dedicated files under docs/ that the README now points to.

Dedup cuts:
- Lead paragraph: drop "Repeat work stops being repeat work..." marketing
  fluff (the auto-learning thesis is already in the bullets and How-it-works)
- Quick start: drop the parenthetical agent list (duplicates the Supported-
  assistants table 20 lines later)
- How it works: drop the 4-step narrative paragraphs and ASCII diagram;
  collapse to a one-paragraph "Capture -> Codify -> Propagate -> Compound"
  summary. The bullets above already convey the same flow
- Features: cut the AI-summaries and Team-sharing teaser subsections
  (already covered by the Summaries H2 and the Data-collection notice)
- Skilify intro: drop the "this is where auto-learning becomes concrete"
  editorial opener

Extract to docs/:
- docs/EMBEDDINGS.md - semantic-search daemon, install/uninstall, lexical
  fallback (~50 inline lines moved)
- docs/SUMMARIES.md - wiki-worker triggers, generation flow, env-var
  reference (~64 inline lines moved)
- docs/SKILIFY.md - skilify worker, pull/unpull, gate-CLI per agent,
  configuration, logs (~150 inline lines moved)
- docs/ARCHITECTURE.md - integration model per agent + monorepo tree
  (~36 inline lines moved)

Each section now has a one-paragraph summary + link to the full guide.

Modernization (Tier 1):
- Add npm-version and GitHub-stars badges to the badge row
- Drop the ASCII flow diagram (modern peer READMEs use prose + image, not
  ASCII art)

No technical surface changes: HIVEMIND_* env vars, install commands,
agent list, and all command references are intact (just relocated).

Local CPU embeddings already ship via the optional nomic-embed daemon
(documented in the Semantic search section and docs/EMBEDDINGS.md), so
"currently using BM25 + grep fallback" was wrong. The actual roadmap
item is GPU-accelerated dense retrieval at scale, on by default.

Caught by claude[bot] PR review.

Wrap the white logo paths in a #141214 background rect (matches the landing-page dark surface) and scale the paths to 70% so they don't touch the edges. Without this the logo was invisible on light/white backdrops (GitHub light theme, Google SERP, browser tabs).

Five prose mentions in README plus the docs/SKILIFY.md file (created in
this PR) renamed to docs/SKILLIFY.md to match. Source-code rename is
landing in a separate PR.
Comment thread docs/SKILLIFY.md

Skillify

Comment thread docs/EMBEDDINGS.md

I'd leave it in the README

Comment thread README.md
```
hivemind skilify unpull --all              # ALSO remove flat-layout (locally-mined) skills — destructive
hivemind skilify unpull --legacy-cleanup   # ALSO remove pre-`--author`-layout `<projectkey>/` dirs from older skilify versions
hivemind skillify                          # show current scope, team, install, per-project state
hivemind skillify scope <me|team|org>      # who counts as "in scope" for mining
```

remove it. it's enough in the specific md

@kaghni kaghni merged commit d052f99 into main May 8, 2026
4 checks passed