`README.md` — 36 additions, 330 deletions (large diff not rendered).

`docs/ARCHITECTURE.md` — new file, 36 additions.

# Architecture

## Integration model per agent

| Agent | Mechanism | Hooks/tools wired |
|-------------------|------------------------------------|-----------------------------------------------------------------------------------------|
| **Claude Code** | Marketplace plugin | `SessionStart` · `UserPromptSubmit` · `PreToolUse` · `PostToolUse` · `Stop` · `SubagentStop` · `SessionEnd` |
| **Codex** | `~/.codex/hooks.json` | `SessionStart` · `UserPromptSubmit` · `PreToolUse(Bash)` · `PostToolUse` · `Stop` |
| **OpenClaw** | Native extension at `~/.openclaw/extensions/hivemind/` | `agent_end` capture · `before_agent_start` recall · contracted tools (`hivemind_search`/`read`/`index`) |
| **Cursor (1.7+)** | `~/.cursor/hooks.json` | `sessionStart` · `beforeSubmitPrompt` · `postToolUse` · `afterAgentResponse` · `stop` · `sessionEnd` |
| **Hermes** | Skill at `~/.hermes/skills/hivemind-memory/` | recall via grep on `~/.deeplake/memory/` |
| **pi** | `~/.pi/agent/AGENTS.md` + skill | recall via grep on `~/.deeplake/memory/` |
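
For the JSON-file mechanisms, the wiring might look roughly like this — a hypothetical sketch of `~/.codex/hooks.json` (the field names and command strings are assumptions, not the actual schema):

```json
{
  "hooks": {
    "SessionStart": "hivemind-hook session-start",
    "UserPromptSubmit": "hivemind-hook prompt-submit",
    "PreToolUse": { "matcher": "Bash", "command": "hivemind-hook pre-tool" },
    "PostToolUse": "hivemind-hook post-tool",
    "Stop": "hivemind-hook stop"
  }
}
```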

## Monorepo structure

```
hivemind/
├── src/ ← shared core (API client, auth, config, SQL utils)
│ ├── hooks/ ← Claude Code hooks
│ ├── hooks/codex/ ← Codex hooks
│ ├── hooks/cursor/ ← Cursor hooks
│ ├── hooks/hermes/ ← Hermes shell hooks
│ ├── hooks/pi/ ← pi wiki-worker (extension lives in pi/extension-source/)
│ ├── embeddings/ ← nomic embed-daemon + protocol + SQL helpers
│ ├── mcp/ ← MCP server (used by Hermes; available to any future MCP-aware client)
│ ├── commands/ ← auth, auth-creds, auth-login, session-prune
│ └── cli/ ← unified `hivemind install` CLI + per-agent installers
├── claude-code/ ← Claude Code plugin source (marketplace-distributed)
├── codex/ ← Codex plugin build output (npm-distributed)
├── cursor/ ← Cursor plugin build output (npm-distributed)
├── hermes/ ← Hermes plugin build output (npm-distributed)
├── mcp/ ← MCP server build output (shared by Hermes + future MCP clients)
├── openclaw/ ← OpenClaw plugin source + build output (ClawHub-distributed)
├── pi/ ← pi extension source (ships raw .ts; pi compiles at load)
└── bundle/ ← unified `hivemind` CLI build output
```

`docs/EMBEDDINGS.md` — new file, 39 additions.

> **Collaborator:** I'd leave it in the README

# Embeddings (semantic search)

Hivemind can run a local embedding daemon (nomic-embed-text-v1.5, ~130 MB) so that `Grep` over `~/.deeplake/memory/` uses hybrid semantic + lexical ranking instead of pure BM25. This is **off by default** — the daemon depends on `@huggingface/transformers`, which pulls in onnxruntime-node and sharp (~600 MB total with native binaries). Shipping that with every agent install would inflate the install size roughly 60× for a feature most users don't need.

## Install

```bash
hivemind embeddings install
```

This installs `@huggingface/transformers` **once** into a shared directory (`~/.hivemind/embed-deps/`) and symlinks every detected agent's plugin to it, so the 600 MB cost is paid one time regardless of how many agents you have wired up. Re-run the same command after installing a new agent and the new symlink is added (the npm install is skipped because it's cached).
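
The share-once pattern can be sketched with plain symlinks (temp dirs stand in here for the real `~/.hivemind/embed-deps/` and per-agent plugin paths):

```shell
# Demonstrate the share-once / symlink-per-agent pattern with temp dirs
shared=$(mktemp -d)           # stands in for ~/.hivemind/embed-deps
agent=$(mktemp -d)/plugin     # stands in for one agent's plugin dir
mkdir -p "$shared/node_modules" "$agent"
ln -s "$shared/node_modules" "$agent/node_modules"
# The agent resolves dependencies through the symlink; the payload exists once
readlink "$agent/node_modules"
```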

Or do it in one shot at install time:

```bash
hivemind install --with-embeddings # all detected agents
hivemind <agent> install --with-embeddings # a single agent
```

## Other commands

```bash
hivemind embeddings status # show shared deps + per-agent state
hivemind embeddings uninstall # remove the per-agent symlinks
hivemind embeddings uninstall --prune # also delete the shared dir (~600 MB)
```

Restart your agents after enabling. From the next session, captured messages and AI-generated summaries will include a 768-dim embedding, and semantic recall queries will route through the local daemon (the nomic model is downloaded on first use and cached in `~/.cache/huggingface/`).

## Lexical-only fallback

If `@huggingface/transformers` is **not** present, Hivemind silently degrades to lexical-only mode:

- ✅ Capture continues; rows still land in Deeplake.
- ✅ `Grep` still works via BM25 / `ILIKE` matching on text columns.
- ⚪ The `message_embedding` / `summary_embedding` columns stay `NULL`.
- ⚪ The hook log notes `embeddings: no-transformers` once at session start.

You can also force lexical-only mode explicitly with `HIVEMIND_EMBEDDINGS=false` (useful for CI or air-gapped environments).
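
A minimal sketch of that decision order (the real check lives inside the hooks; the function and message names here are illustrative):

```shell
# Illustrative fallback decision: forced-off env var wins, then dep presence
embeddings_mode() {
  if [ "${HIVEMIND_EMBEDDINGS:-}" = "false" ]; then
    echo "lexical-only (forced)"
  elif node -e "require.resolve('@huggingface/transformers')" >/dev/null 2>&1; then
    echo "hybrid"
  else
    echo "lexical-only (no-transformers)"
  fi
}
export HIVEMIND_EMBEDDINGS=false
embeddings_mode   # → lexical-only (forced)
```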

`docs/SKILLIFY.md` — new file, 109 additions.

# Skills (skillify)

Hivemind **codifies recurring patterns from your team's recent sessions into reusable skills** that propagate to every agent on your team — automatically. Same architecture as the wiki worker: an async background process that fires on Stop / SessionEnd, mines recent sessions in scope, asks Haiku whether the activity contains something worth keeping, and writes a `SKILL.md` if so.

## When the skillify worker fires

| Trigger | When it fires |
|------------------|--------------------------------------------------------------------------------|
| **Stop counter** | Mid-session, after every `HIVEMIND_SKILLIFY_EVERY_N_TURNS` (default 20) turns. |
| **SessionEnd** | Always at end-of-session, regardless of counter — catches tail-of-session knowledge. |

Per-project counter state lives at `~/.deeplake/state/skillify/<project-key>.json`. Project key is the sha1 of `git config remote.origin.url` (with the absolute path as fallback for non-git dirs).
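
The key derivation can be sketched as follows (the exact normalization of the URL is an assumption):

```shell
# Sketch: sha1 of the remote URL, falling back to the absolute path
project_key() {
  { git config --get remote.origin.url 2>/dev/null || pwd; } \
    | tr -d '\n' | sha1sum | cut -d' ' -f1
}
project_key   # 40-char hex, stable per repo
```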

## How a skill is generated

1. The worker pulls the **last 10 sessions in scope** from the `sessions` Deeplake table — strictly newer than the watermark in the state file.
2. It strips each session to **prompt + assistant text only** (tool calls and thinking blocks are dropped — they're noise for skill mining).
3. It builds a gate prompt: existing project skill bodies + the 10 stripped exchanges + decision rules.
4. It runs the host agent's gate CLI (e.g. `claude -p --model haiku --permission-mode bypassPermissions`; see "Per-agent gate CLI" below) with the prompt. The model returns a verdict:
- `KEEP <name> <body>` — write a new skill.
- `MERGE <existing-name> <merged-body>` — update an existing skill, bump version.
- `SKIP <reason>` — pattern is one-off / generic / already covered.
5. On KEEP/MERGE the skill is written to `<project>/.claude/skills/<name>/SKILL.md` (or `~/.claude/skills/...` if you've set `install` to `global`), with provenance frontmatter (`source_sessions`, `version`, `created_by_agent`, timestamps).
6. A row is also inserted into the `skills` Deeplake table for org-wide provenance (append-only — never UPDATE, sidesteps the UPDATE-coalescing quirk).
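
The verdict parsing in step 4 can be sketched as (the verdict text is hypothetical):

```shell
# First token is the action; for KEEP/MERGE the next token is the skill name
verdict='KEEP retry-flaky-ci Re-run failing integration jobs once before paging.'
action=${verdict%% *}     # KEEP / MERGE / SKIP
rest=${verdict#* }
case "$action" in
  KEEP|MERGE) name=${rest%% *}; body=${rest#* }; echo "$action -> $name" ;;
  SKIP)       echo "SKIP: $rest" ;;
esac
# → KEEP -> retry-flaky-ci
```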

## `/skillify` — managing scope, team, install location

The `/skillify` slash command (Claude Code, Codex) and the `hivemind skillify` CLI control mining behaviour.

```bash
hivemind skillify # show current scope, team, install, per-project state
hivemind skillify scope <me|team|org> # who counts as "in scope" for mining
hivemind skillify install <project|global> # where new skills are written
hivemind skillify promote <skill-name> # move a project skill to ~/.claude/skills/
hivemind skillify team add <username> # add to the team list (used when scope=team)
hivemind skillify team remove <username> # remove from team
hivemind skillify team list # list current team members
```

The team list flows into the worker's session-fetch SQL: `scope=me` filters by your own username, `scope=team` filters by `author IN (<team>)`, `scope=org` applies no author filter.
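
That mapping can be sketched as (the SQL shape and addresses are illustrative, not the worker's actual query):

```shell
# Illustrative scope -> author-filter mapping
scope=team
team_list="'alice@example.com','bob@example.com'"
case "$scope" in
  me)   filter="author = 'me@example.com'" ;;
  team) filter="author IN ($team_list)" ;;
  org)  filter="" ;;                       # no author filter
esac
echo "${filter:-<no author filter>}"
```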

Config persists at `~/.deeplake/state/skillify/config.json` (one global file shared across projects).

## `pull` / `unpull` — sharing skills across the org

Once a teammate's skills are mined into the Deeplake `skills` table, you can install them locally with `pull`. Layout written to disk:

```text
<root>/<name>--<author>/SKILL.md ← pulled skills (e.g. deploy--alice/)
<root>/<name>/SKILL.md ← your locally-mined skills (flat, no suffix)
```

The `--<author>` suffix keeps cross-author entries with the same name disjoint and lets Claude Code's single-depth skill loader find pulled skills without any symlink trickery. `<root>` is `~/.claude/skills` for `--to global` and `<cwd>/.claude/skills` for `--to project`.

```bash
hivemind skillify pull # all authors, install globally
hivemind skillify pull --user alice@example.com # only this author
hivemind skillify pull --users a@x.com,b@y.com # multiple authors (CSV)
hivemind skillify pull --all-users # explicit "no author filter" (default)
hivemind skillify pull --to project # install under <cwd>/.claude/skills
hivemind skillify pull --dry-run # preview, no disk writes
hivemind skillify pull --force # overwrite even when local version >= remote
hivemind skillify pull <skill-name> # pull only that skill (combinable with --user)
```

Every successful pull records an entry in `~/.deeplake/state/skillify/pulled.json`. That manifest is the source of truth for `unpull` — anything not in the manifest is **never** touched by default, even if its directory follows the `<name>--<author>` shape (this protects user-authored variant skills like `deploy--blue-green`).
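
A hypothetical sketch of a `pulled.json` entry (the field names are assumptions, not the actual manifest schema):

```json
{
  "pulled": [
    {
      "name": "deploy",
      "author": "alice@example.com",
      "dir": "deploy--alice",
      "scope": "global",
      "version": 3,
      "pulledAt": "2025-01-01T00:00:00Z"
    }
  ]
}
```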

```bash
hivemind skillify unpull # remove every pulled entry under the install scope
hivemind skillify unpull --user alice@example.com # remove only this author's pulls
hivemind skillify unpull --users a@x.com,b@y.com # multiple authors
hivemind skillify unpull --not-mine # remove all pulls except your own
hivemind skillify unpull --dry-run # preview, no disk writes
hivemind skillify unpull --to project # operate on <cwd>/.claude/skills instead of global
hivemind skillify unpull --all # ALSO remove flat-layout (locally-mined) skills — destructive
hivemind skillify unpull --legacy-cleanup # ALSO remove pre-`--author`-layout `<projectkey>/` dirs from older skillify versions
```

Drift handling: if a manifest entry's directory was deleted out-of-band (e.g. `rm -rf` by hand), the next `unpull` reports it as `manifest-orphan` and prunes the entry from the manifest without errors.

Cross-project caveat: same `(name, author)` from two different projects collides on disk under the new flat layout — the more recently pulled row wins, and the prior `SKILL.md` is preserved as `SKILL.md.bak`. The underlying row stays in the Deeplake `skills` table, so re-pulling from the other project recovers it.

## Configuration

| Env var | Default | Effect |
|--------------------------------------|---------|---------------------------------------------------------|
| `HIVEMIND_SKILLIFY_EVERY_N_TURNS` | `20` | Stop-counter threshold for mid-session worker fires |
| `HIVEMIND_SKILLS_TABLE` | `skills`| Deeplake table name for org-wide provenance |
| `HIVEMIND_SKILLIFY_WORKER=1` | unset | Recursion guard (set automatically inside the worker) |
| `HIVEMIND_CURSOR_MODEL` | `auto` | (cursor only) model passed to the cursor-agent gate call |
| `HIVEMIND_HERMES_PROVIDER` | `openrouter` | (hermes only) provider passed to the hermes gate call |
| `HIVEMIND_HERMES_MODEL` | `anthropic/claude-haiku-4-5` | (hermes only) model passed to hermes |

## Per-agent gate CLI

The skillify worker calls each agent's own headless CLI for the gate prompt — so a user who only has codex / cursor / hermes installed never needs `claude` in their PATH:

| Agent | Gate command |
|-------------|----------------------------------------------------------------------------------------|
| claude_code | `claude -p <prompt> --no-session-persistence --model haiku --permission-mode bypassPermissions` |
| codex | `codex exec --dangerously-bypass-approvals-and-sandbox <prompt>` |
| cursor | `cursor-agent --print --model <HIVEMIND_CURSOR_MODEL> --force --output-format text <prompt>` |
| hermes | `hermes -z <prompt> --provider <HIVEMIND_HERMES_PROVIDER> -m <HIVEMIND_HERMES_MODEL> --yolo --ignore-user-config` |

For hermes via OpenRouter (the default), set `OPENROUTER_API_KEY` in the environment; the worker inherits the parent process env. Other providers (anthropic, openai, etc.) need their respective API keys.

## Logs

Worker activity logs to `~/.claude/hooks/skillify.log`. Each line shows which session pool was mined, what the gate decided, and whether a file was written.

`docs/SUMMARIES.md` — new file, 42 additions.

# Summaries

Hivemind doesn't just capture raw events — it also generates an **AI-written wiki summary** for each session and stores it in the `memory` table (alongside its 768-dim `summary_embedding`). The summary is what shows up when you `Grep` for past sessions or follow links from `~/.deeplake/memory/index.md`.

## When summaries are written

Each agent (Claude Code / Codex / Cursor / Hermes / pi) fires a wiki worker on two triggers:

| Trigger | When it fires |
|-------------------|-------------------------------------------------------------------------------|
| **Final** | At session end (Stop / SessionEnd / session_shutdown), once. |
| **Periodic** | Mid-session, when **either** of two thresholds is hit since the last summary: |
| | • messages-since-last-summary ≥ `HIVEMIND_SUMMARY_EVERY_N_MSGS` (default 50) |
| | • elapsed time ≥ `HIVEMIND_SUMMARY_EVERY_HOURS` (default 2) |

The first message after a long pause therefore triggers a fresh summary; long sessions naturally checkpoint every ~50 messages.

A per-session JSON sidecar at `~/.claude/hooks/summary-state/<sessionId>.json` tracks `{lastSummaryAt, lastSummaryCount, totalCount}`. The dir is shared across all agents (session ids are UUIDs so no collisions). It is **never deleted**, so resuming a session via `--resume` / `--continue` picks up where it left off.
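
The periodic-trigger check can be sketched from those sidecar fields (the values here are illustrative):

```shell
# Illustrative check: fire when either threshold since the last summary is hit
lastSummaryCount=12
totalCount=65
lastSummaryAt=$(( $(date +%s) - 3 * 3600 ))   # 3 hours ago
every_n=${HIVEMIND_SUMMARY_EVERY_N_MSGS:-50}
every_h=${HIVEMIND_SUMMARY_EVERY_HOURS:-2}
msgs=$(( totalCount - lastSummaryCount ))
elapsed=$(( $(date +%s) - lastSummaryAt ))
if [ "$msgs" -ge "$every_n" ] || { [ "$elapsed" -ge $(( every_h * 3600 )) ] && [ "$msgs" -ge 1 ]; }; then
  echo "trigger periodic summary"
fi
```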

## How a summary is generated

1. The wiki worker queries the `sessions` table for every event tied to that session.
2. It builds a structured prompt asking the host agent's CLI to extract entities, decisions, files modified, open questions, etc.
3. It shells out to that agent's CLI (`claude -p`, `codex exec`, `pi --print`, …) with the prompt — no separate API key is needed; the agent's existing credentials are used.
4. The generated markdown is uploaded to the `memory` table at `/summaries/<user>/<sessionId>.md`. The shared embedding daemon produces the 768-dim `summary_embedding` so the summary is recallable via semantic search.

A lock file at `~/.claude/hooks/summary-state/<sessionId>.lock` prevents two workers from running concurrently for the same session.
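
The lock acquisition can be sketched as a noclobber-create on the lock file (the real mechanism may differ):

```shell
# Sketch: atomic create-if-absent; a second worker on the same session backs off
lock="$(mktemp -d)/session.lock"
acquire() { ( set -C; : > "$1" ) 2>/dev/null; }
if acquire "$lock"; then echo "first worker: acquired"; fi
if acquire "$lock"; then echo "unexpected"; else echo "second worker: backs off"; fi
```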

## Configuration

| Env var | Default | Effect |
|------------------------------------|----------------|-----------------------------------------------------|
| `HIVEMIND_SUMMARY_EVERY_N_MSGS` | `50` | Trigger periodic when messages-since-last ≥ this |
| `HIVEMIND_SUMMARY_EVERY_HOURS` | `2` | Trigger periodic after this many hours, with ≥1 msg |
| `HIVEMIND_CURSOR_MODEL` | `auto` | (cursor only) model passed to `cursor-agent --print --model` |
| `HIVEMIND_HERMES_PROVIDER` | `openrouter` | (hermes only) provider passed to `hermes -z --provider` |
| `HIVEMIND_HERMES_MODEL` | `anthropic/claude-haiku-4-5` | (hermes only) model passed to `hermes -z -m` |
| `HIVEMIND_PI_PROVIDER` | `google` | (pi only) provider passed to `pi --print --provider`|
| `HIVEMIND_PI_MODEL` | `gemini-2.5-flash` | (pi only) model passed to `pi --print --model` |
| `HIVEMIND_CAPTURE=false` | unset | Disable both capture and summary generation |

For pi specifically, the wiki worker is bundled separately at `~/.pi/agent/hivemind/wiki-worker.js` (deposited by `hivemind pi install`). The other agents ship the wiki worker inside their per-agent plugin bundle.