Abbenay is a unified AI daemon and library written in TypeScript/Node.js that provides:
- A reusable core library (`@abbenay/core`) for LLM engine abstraction, streaming chat, and config
- A gRPC API for chat and configuration
- A web dashboard for provider/model management
- A VS Code extension that registers models with VS Code's Language Model API

```text
┌─────────────────────────────────────────────────────────────────────────┐
│                          Consumer Applications                          │
│                                                                         │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────┐  │
│  │  VS Code Ext    │  │  Python Apps    │  │  Web Dashboard          │  │
│  │  (gRPC)         │  │  (gRPC)         │  │  (HTTP → DaemonState)   │  │
│  └────────┬────────┘  └────────┬────────┘  └────────────┬────────────┘  │
│           │                    │                        │               │
│           └────────────────────┼────────────────────────┘               │
│                                │                                        │
└────────────────────────────────┼────────────────────────────────────────┘
                                 │ gRPC over Unix Socket (or named pipe)
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                       abbenay daemon (TypeScript)                       │
│                                                                         │
│  ┌─ @abbenay/core ───────────────────────────────────────────────────┐  │
│  │  CoreState          Engines (Vercel AI SDK)   Config (YAML)       │  │
│  │  SecretStore i/f    Streaming chat + tools    Model discovery     │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─ daemon layer ────────────────────────────────────────────────────┐  │
│  │  DaemonState        gRPC Server               VS Code Backchannel │  │
│  │  CLI (Commander)    Web Dashboard (Express)   KeychainSecretStore │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                                                         │
└──────────────────────────────────────────┬──────────────────────────────┘
                                           │
         ┌─────────────────────────────────┼───────────────────┐
         │                                 │                   │
         ▼                                 ▼                   ▼
┌─────────────────┐               ┌─────────────────┐  ┌───────────────┐
│    LLM APIs     │               │     keytar      │  │ Config Files  │
│     (HTTP)      │               │   (keychain)    │  │    (YAML)     │
└─────────────────┘               └─────────────────┘  └───────────────┘
```

The source tree is organized into two layers:

The core layer (`@abbenay/core`) is a reusable library with zero transport dependencies. It can be used standalone by agent developers, web developers, or any Node.js application.

| File | Purpose |
|---|---|
| `core/state.ts` | `CoreState` class: provider resolution, model listing, chat |
| `core/engines.ts` | Engine registry with Vercel AI SDK providers (dynamically loaded) |
| `core/config.ts` | YAML config loader/saver, merge logic |
| `core/secrets.ts` | `SecretStore` interface + `MemorySecretStore` |
| `core/paths.ts` | Platform-aware path utilities |
| `core/mock.ts` | Mock engine for testing |
| `core/policies.ts` | Policy system: built-in + custom policies, resolution, flattening |
| `core/tool-registry.ts` | Tool collection, namespacing, policy filtering, executor builder |
| `core/session-store.ts` | File-based session persistence (CRUD, index, messages) |
| `core/session-summarizer.ts` | Periodic LLM-generated session summaries (DR-022) |
| `core/index.ts` | Public API surface |

The daemon layer is the full application layer. It extends core with transport, UI, and CLI.


| File | Purpose |
|---|---|
| `daemon/state.ts` | `DaemonState` extends `CoreState`: client registry, VS Code backchannel |
| `daemon/daemon.ts` | Process lifecycle, gRPC server startup, signal handling |
| `daemon/transport.ts` | Unix socket and PID file management |
| `daemon/tool-router.ts` | Tool execution routing (VS Code, MCP, local) |
| `daemon/mcp-client-pool.ts` | MCP server connection pool |
| `daemon/mcp-server.ts` | Embedded MCP server (exposes daemon as MCP) |
| `daemon/index.ts` | CLI entry point (Commander) |
| `daemon/server/abbenay-service.ts` | gRPC service handlers |
| `daemon/web/server.ts` | Express web server + REST API |
| `daemon/web/openai-compat.ts` | OpenAI-compatible `/v1/*` routes (models, chat completions) |
| `daemon/web/grpc-web-control.ts` | gRPC client for web server control |
| `daemon/secrets/keychain.ts` | `KeychainSecretStore` (keytar native addon) |

The daemon is the core TypeScript/Node.js process that runs in the background.

Subcommands:

- `abbenay start` - Start all services (daemon, web dashboard, OpenAI API, MCP server)
- `abbenay daemon` - Start the gRPC server on Unix socket (or named pipe on Windows)
- `abbenay web` - Start the web dashboard (embedded in daemon or started via gRPC if daemon already running)
- `abbenay serve` - Start the OpenAI-compatible API server (same as `web` but framed for API use)
- `abbenay status` - Check if daemon is running
- `abbenay stop` - Stop the running daemon


Socket location:

- Linux/macOS: `$XDG_RUNTIME_DIR/abbenay/daemon.sock` or `/run/user/{uid}/abbenay/daemon.sock`
- Windows: `\\.\pipe\abbenay-daemon`

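
A minimal sketch of the socket lookup above, assuming the `/run/user/{uid}` path is a fallback used when `$XDG_RUNTIME_DIR` is unset. The function name `resolveSocketPath` and its parameters are hypothetical, not part of Abbenay's API; the environment is passed in so the function stays pure and easy to test.

```typescript
// Hypothetical helper: the name, parameters, and fallback order are assumptions.
type Platform = "linux" | "darwin" | "win32";

function resolveSocketPath(
  platform: Platform,
  env: Record<string, string | undefined>,
  uid: number,
): string {
  if (platform === "win32") {
    // Windows uses a named pipe instead of a Unix socket.
    return "\\\\.\\pipe\\abbenay-daemon";
  }
  // Prefer $XDG_RUNTIME_DIR; otherwise assume the conventional /run/user/{uid}.
  const runtimeDir = env["XDG_RUNTIME_DIR"] || `/run/user/${uid}`;
  return `${runtimeDir}/abbenay/daemon.sock`;
}
```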

The web dashboard runs inside the daemon process via Express:

- Port: `localhost:8787` (configurable)
- Static assets: Served from `packages/daemon/static/`
- API routes: `/api/*` -> Direct calls to `DaemonState` (no gRPC in the loop)
- Chat SSE: `POST /api/chat` -> Streaming responses via Server-Sent Events
- OpenAI-compatible API: `/v1/models`, `/v1/chat/completions` -> Drop-in replacement for any OpenAI-compatible client (see DR-020)

The web server is started either:

- In-process when `abbenay web` or `abbenay serve` runs and no daemon is running
- Via gRPC `StartWebServer` when a daemon is already running and `abbenay web` or `abbenay serve` is invoked


The VS Code extension acts as a thin gRPC client to the daemon:

- On activation: Connects to the daemon (starts it if not running)
- Registers as a `LanguageModelChatProvider` with VS Code
- Provides workspace paths via gRPC backchannel (`VSCodeStream`)
- Opens web dashboard on command

Key files:

- `extension.ts` - Activation, commands, status bar
- `daemon/client.ts` - gRPC client wrapper
- `daemon/backchannel.ts` - Bidirectional stream handler
- `providers/AbbenayLanguageModelProvider.ts` - VS Code LM API integration

All LLM providers are implemented via the Vercel AI SDK with a data-driven engine registry in core/engines.ts.
Each engine entry carries metadata AND its factory function. Adding a new engine is a single registry entry — no switch statements anywhere.

```typescript
// core/engines.ts (simplified)
const ENGINES: Record<string, EngineInfo> = {
  openai: {
    id: 'openai',
    requiresKey: true,
    defaultBaseUrl: 'https://api.openai.com/v1',
    defaultEnvVar: 'OPENAI_API_KEY',
    supportsTools: true,
    createModel: (modelId, config) =>
      dedicatedProvider('@ai-sdk/openai', 'createOpenAI', config, modelId),
  },
  // ... 18 more engines
};
```

AI SDK provider packages (`@ai-sdk/openai`, `@ai-sdk/anthropic`, etc.) are loaded via dynamic `import()` at runtime, only when that engine is actually used. This means:

- For the core library: consumers install only the providers they need
- For the daemon: all providers are bundled into the SEA binary

If a provider package is missing, the error message tells you exactly what to install.

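
The lazy-loading behavior described above might look roughly like this. It is a sketch, not the actual `core/engines.ts`: `REGISTRY`, `EngineEntry`, `loadFactory`, and the injected `Loader` are illustrative names, and the loader is a parameter so the example runs without any `@ai-sdk/*` package installed.

```typescript
// Sketch only: all names here are hypothetical. In real code the loader
// would be something like (pkg) => import(pkg).
type Loader = (pkg: string) => Promise<Record<string, unknown>>;

interface EngineEntry {
  id: string;
  packageName: string;   // npm package that provides the engine
  factoryExport: string; // name of the factory function inside that package
}

const REGISTRY: Record<string, EngineEntry> = {
  openai: { id: "openai", packageName: "@ai-sdk/openai", factoryExport: "createOpenAI" },
  anthropic: { id: "anthropic", packageName: "@ai-sdk/anthropic", factoryExport: "createAnthropic" },
};

async function loadFactory(engineId: string, load: Loader): Promise<unknown> {
  const entry = REGISTRY[engineId];
  if (!entry) throw new Error(`Unknown engine: ${engineId}`);

  let mod: Record<string, unknown>;
  try {
    // The package is loaded here, only when the engine is actually requested.
    mod = await load(entry.packageName);
  } catch {
    // Turn a failed load into an actionable install hint.
    throw new Error(
      `Engine "${engineId}" requires ${entry.packageName}. ` +
        `Install it with: npm install ${entry.packageName}`,
    );
  }

  const factory = mod[entry.factoryExport];
  if (typeof factory !== "function") {
    throw new Error(`${entry.packageName} does not export ${entry.factoryExport}`);
  }
  return factory;
}
```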

Engines fall into three groups:

- Dedicated providers: Each has its own `@ai-sdk/*` package (OpenAI, Anthropic, Gemini, Mistral, xAI, DeepSeek, Groq, Cohere, Bedrock, Fireworks, Together AI, Perplexity)
- OpenAI-compatible: Use `@ai-sdk/openai-compatible` (Azure, OpenRouter, Ollama, LM Studio, Cerebras, Meta)
- Mock: Built-in, no external package needed


Secrets are managed explicitly per provider with two options:

- Keychain: uses keytar for cross-platform keychain access
  - macOS: Keychain
  - Linux: libsecret (GNOME Keyring / KDE Wallet)
  - Windows: Credential Vault
  - Config references the key by name: `api_key_keychain_name: "OPENAI_API_KEY"`
- Environment variable:
  - Config specifies the env var name: `api_key_env_var_name: "OPENAI_API_KEY"`
  - Value is read from `process.env` at runtime

Important: These options are mutually exclusive per provider. The web UI provides a toggle to choose between them.

```typescript
interface SecretStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<boolean>;
  has(key: string): Promise<boolean>;
}
```

`CoreState` accepts any `SecretStore` via constructor injection. `DaemonState` uses `KeychainSecretStore` (keytar-backed) by default. Tests and library consumers can use `MemorySecretStore`.

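
To illustrate the injection point, here is a minimal sketch of an in-memory store that satisfies the `SecretStore` interface, plus a resolver that enforces the keychain/env-var exclusivity. `InMemorySecretStore`, `ProviderKeyConfig`, and `resolveProviderKey` are hypothetical names; the real `MemorySecretStore` and `CoreState.resolveApiKey()` may differ.

```typescript
interface SecretStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<boolean>;
  has(key: string): Promise<boolean>;
}

// Map-backed store: a guess at what an in-memory implementation might look like.
class InMemorySecretStore implements SecretStore {
  private readonly secrets = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.secrets.get(key) ?? null;
  }
  async set(key: string, value: string): Promise<void> {
    this.secrets.set(key, value);
  }
  async delete(key: string): Promise<boolean> {
    return this.secrets.delete(key); // true only if the key existed
  }
  async has(key: string): Promise<boolean> {
    return this.secrets.has(key);
  }
}

interface ProviderKeyConfig {
  api_key_keychain_name?: string;
  api_key_env_var_name?: string;
}

// Hypothetical resolver: rejects configs that set both options at once.
async function resolveProviderKey(
  cfg: ProviderKeyConfig,
  store: SecretStore,
  env: Record<string, string | undefined>,
): Promise<string | null> {
  if (cfg.api_key_keychain_name && cfg.api_key_env_var_name) {
    throw new Error("Set either api_key_keychain_name or api_key_env_var_name, not both");
  }
  if (cfg.api_key_keychain_name) return store.get(cfg.api_key_keychain_name);
  if (cfg.api_key_env_var_name) return env[cfg.api_key_env_var_name] ?? null;
  return null; // no key configured, e.g. an engine that does not need one
}
```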

Configuration is read from two locations:

- User level: `~/.config/abbenay/config.yaml`
- Workspace level: `<workspace>/.config/abbenay/config.yaml`

```yaml
providers:
  my-openai:                  # Virtual provider name (user-defined)
    engine: openai            # Actual engine type
    api_key_keychain_name: "OPENAI_API_KEY"
    models:                   # Map of virtual model name -> config
      gpt-4o: {}              # Enabled with defaults
      gpt-4o-mini:
        temperature: 0.3
        max_tokens: 4096
```

User and workspace configs are merged (workspace overrides user):

```typescript
// core/config.ts
// loadConfig(), loadWorkspaceConfig(), mergeConfigs()
// Provider config: engine, api_key_keychain_name | api_key_env_var_name, base_url, models
```

Policies are named bundles of behavioral defaults that can be assigned to virtual models. A model references a policy by name; the policy's fields act as defaults that the model's explicit config can override.

Settings resolve in the following order, where each layer overrides the one to its left:

Engine defaults ← Policy defaults ← Explicit ModelConfig ← Request params


| Policy | Temperature | max_tokens | Purpose |
|---|---|---|---|
| `precise` | 0.15 | 2048 | Factual, concise responses |
| `balanced` | 0.5 | 4096 | General-purpose |
| `creative` | 0.9 | 8192 | Exploratory, generative |
| `coder` | 0.2 | 4096 | Complete, runnable code |
| `json_strict` | 0.2 | 2048 | JSON-only output with retry |
| `long_context_chat` | - | 4096 | Concise follow-ups in long conversations |

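
The precedence chain can be sketched as a left-to-right merge in which later layers win. The example below uses the `coder` values from the table above; `resolveParams`, the `SamplingParams` type, and the engine default of `1.0` are assumptions made for illustration.

```typescript
// Illustrative only: the type and function names are hypothetical.
interface SamplingParams {
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
}

// Layers are applied left to right. A later layer wins, but only for the
// fields it actually defines; undefined never erases an earlier value.
function resolveParams(...layers: SamplingParams[]): SamplingParams {
  const result: SamplingParams = {};
  for (const layer of layers) {
    if (layer.temperature !== undefined) result.temperature = layer.temperature;
    if (layer.max_tokens !== undefined) result.max_tokens = layer.max_tokens;
    if (layer.top_p !== undefined) result.top_p = layer.top_p;
  }
  return result;
}

const engineDefaults: SamplingParams = { temperature: 1.0 }; // assumed value
const coderPolicy: SamplingParams = { temperature: 0.2, max_tokens: 4096 };
const modelConfig: SamplingParams = { temperature: 0.1 };    // explicit override
const requestParams: SamplingParams = { max_tokens: 512 };   // per-request override

// Engine defaults <- Policy defaults <- Explicit ModelConfig <- Request params
const effective = resolveParams(engineDefaults, coderPolicy, modelConfig, requestParams);
// effective.temperature is 0.1 (model config), effective.max_tokens is 512 (request)
```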

```yaml
# ~/.config/abbenay/policies.yaml (user-level only)
my-policy:
  sampling:
    temperature: 0.3
    top_p: 0.8
  output:
    max_tokens: 4096
    system_prompt_snippet: "Be concise."
    system_prompt_mode: prepend   # prepend | append | replace
    format: text                  # text | json_only | markdown
  reliability:
    retry_on_invalid_json: false
    timeout: 30000
```

```yaml
# In config.yaml
providers:
  my-openai:
    engine: openai
    models:
      gpt-4o:
        policy: coder       # References a built-in or custom policy
        temperature: 0.1    # Explicit config overrides the policy
```

The `ToolRegistry` collects tools from multiple sources and namespaces them to prevent collisions. It is part of `@abbenay/core` and usable without the daemon.


Sources and namespace prefixes:

| Source | Prefix | Example |
|---|---|---|
| VS Code workspace | `ws:` | `ws:myproject/readFile` |
| MCP server | `mcp:` | `mcp:github/searchCode` |
| Local (agent-registered) | `local:` | `local:myAgent/search` |

Tool policy controls which tools the LLM sees:

| Tier | Config field | Behavior |
|---|---|---|
| Auto-approve | `auto_approve` | Execute without confirmation |
| Require approval | `require_approval` | Pause and ask user |
| Disabled | `disabled_tools` | Never sent to LLM |

Patterns support glob matching (e.g., `mcp:filesystem/*`).

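
A rough sketch of how namespacing and tier matching could work. The helper names, the glob dialect (only `*` is a wildcard), the precedence between tiers, and the default tier for unmatched tools are all assumptions, not documented behavior.

```typescript
// Sketch under assumptions: "disabled" wins over "require approval", which
// wins over "auto-approve"; unmatched tools fall back to requiring approval.
type Tier = "auto_approve" | "require_approval" | "disabled";

interface ToolPolicy {
  auto_approve?: string[];
  require_approval?: string[];
  disabled_tools?: string[];
}

// Builds names such as "mcp:github/searchCode".
function namespaced(prefix: "ws" | "mcp" | "local", source: string, tool: string): string {
  return `${prefix}:${source}/${tool}`;
}

function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters, then turn each "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function matchesAny(name: string, patterns: string[] = []): boolean {
  return patterns.some((p) => globToRegExp(p).test(name));
}

function classify(name: string, policy: ToolPolicy): Tier {
  if (matchesAny(name, policy.disabled_tools)) return "disabled";
  if (matchesAny(name, policy.require_approval)) return "require_approval";
  if (matchesAny(name, policy.auto_approve)) return "auto_approve";
  return "require_approval";
}

// Disabled tools are filtered out before the tool list reaches the LLM.
function visibleTools(names: string[], policy: ToolPolicy): string[] {
  return names.filter((n) => classify(n, policy) !== "disabled");
}
```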

The daemon's `ToolRouter` provides the execution backend for remote tools:

- VS Code tools → routed via gRPC backchannel (`VSCodeStream`)
- MCP tools → routed via `McpClientPool`
- Local tools → called directly via inline executor

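
Routing by namespace prefix could be sketched as follows. `Backend`, `Backends`, `parseToolName`, and `routeTool` are hypothetical names, and the real `ToolRouter` may dispatch differently.

```typescript
// Hypothetical sketch: each backend receives the tool name without its prefix.
type Backend = (tool: string, args: unknown) => Promise<unknown>;

interface Backends {
  ws: Backend;    // VS Code tools, e.g. via the VSCodeStream backchannel
  mcp: Backend;   // MCP tools, e.g. via a client pool
  local: Backend; // in-process executors
}

function parseToolName(name: string): { prefix: keyof Backends; rest: string } {
  const idx = name.indexOf(":");
  const prefix = idx === -1 ? "" : name.slice(0, idx);
  if (prefix !== "ws" && prefix !== "mcp" && prefix !== "local") {
    throw new Error(`Unknown tool namespace in "${name}"`);
  }
  return { prefix, rest: name.slice(idx + 1) };
}

async function routeTool(name: string, args: unknown, backends: Backends): Promise<unknown> {
  const { prefix, rest } = parseToolName(name);
  // The prefix alone decides which backend executes the call.
  return backends[prefix](rest, args);
}
```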

`McpClientPool` manages connections to external MCP servers defined in config. It uses `@ai-sdk/mcp` for the client implementation.

- Supports stdio and HTTP/SSE transports
- Auto-discovers tools on connect and registers them in `ToolRegistry`
- Hot-reloads when config changes (connects new, disconnects removed)

```yaml
# In config.yaml
mcp_servers:
  filesystem:
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
    enabled: true
  github:
    transport: http
    url: http://localhost:3001/sse
    enabled: true
```

The gRPC API is defined in `proto/abbenay/v1/service.proto`. The daemon loads protos dynamically via `@grpc/proto-loader` (no code generation for the daemon).


| RPC | Description |
|---|---|
| `Chat` | Streaming chat with a model |
| `SessionChat` | Streaming chat within a session |
| `CreateSession` / `GetSession` / `ListSessions` / `DeleteSession` | Session CRUD |
| `SummarizeSession` | On-demand or cached session summary |
| `ListModels` | List available models from providers |
| `ListProviders` | List configured providers |
| `GetSecret` / `SetSecret` / `DeleteSecret` / `ListSecrets` | Secret management |
| `Register` / `Unregister` | Client registration |
| `VSCodeStream` | Bidirectional backchannel |
| `GetStatus` / `HealthCheck` | Daemon status |
| `GetConfig` / `UpdateConfig` | Configuration |
| `GetProviderStatus` | Provider status |
| `GetConnectedWorkspaces` | Workspace paths from VS Code |
| `StartWebServer` / `StopWebServer` | Embedded web dashboard lifecycle |
| `ListEngines` | List available engine types |
| `ListPolicies` | List built-in and custom policies |
| `Shutdown` | Daemon shutdown |

| RPC | Description |
|---|---|
| `WatchSessions` / `ReplaySession` | Session replay / real-time events |
| `ForkSession` / `ExportSession` / `ImportSession` | Session branching and sharing |
| `ListTools` / `ExecuteTool` | Tool execution via gRPC |
| `RegisterMcpServer` / `UnregisterMcpServer` | MCP server registration via gRPC |


The `VSCodeStream` RPC enables bidirectional communication:

Daemon -> VS Code requests:

- `GetWorkspace` - Get connected workspace paths
- `InvokeTool` - Invoke VS Code tools (future)
- `ListModels` - List VS Code LM models (future)

VS Code -> Daemon responses:

- Workspace folder paths
- Tool results
- Error responses


Sessions are persisted as JSON files in `$XDG_DATA_HOME/abbenay/sessions/`
(Linux) or `~/Library/Application Support/abbenay/sessions/` (macOS). See DR-021.

The `SessionStore` class (core layer) handles CRUD operations and maintains an
`index.json` for fast listing without reading every session file.

Available transports:

- gRPC: `CreateSession`, `GetSession`, `ListSessions`, `DeleteSession`, `SessionChat`, `SummarizeSession`
- Web API: `POST/GET/DELETE /api/sessions`, `POST /api/sessions/:id/chat` (SSE), `GET /api/sessions/:id/summary`
- CLI: `aby sessions list/show/delete`, `aby chat --session <id|new>`

Periodic summaries: Every 10 user messages, a background LLM call generates
a 2-3 sentence summary stored on the session (see DR-022). Summaries are also
available on demand via `SummarizeSession` (gRPC) or `GET /api/sessions/:id/summary`.

Not yet implemented: `ForkSession`, `ExportSession`, `ImportSession`,
`ReplaySession`, web dashboard session sidebar, context window compression
using summaries (`context.context_threshold` / `compression_strategy`),
internal MCP tool for cross-session retrieval.

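
The "every 10 user messages" rule for periodic summaries, described above, could be checked with a small predicate like this one. `shouldSummarize` and the `Message` shape are hypothetical, and the actual summarizer in `core/session-summarizer.ts` may count differently.

```typescript
// Sketch: decides whether a background summary should be scheduled.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

const SUMMARY_INTERVAL = 10; // "every 10 user messages"

// True right after the 10th, 20th, 30th, ... user message is appended.
function shouldSummarize(messages: Message[]): boolean {
  const last = messages[messages.length - 1];
  if (!last || last.role !== "user") return false;
  const userCount = messages.filter((m) => m.role === "user").length;
  return userCount % SUMMARY_INTERVAL === 0;
}
```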

Chat request flow:

```text
1. Client sends ChatRequest via gRPC (or POST /api/chat for web)
       ↓
2. DaemonState.chat() → CoreState.chat() resolves provider/model from composite ID
       ↓
3. CoreState.resolveApiKey() gets API key (keychain or env var based on config)
       ↓
4. engines.ts streamChat() dynamically loads the AI SDK provider and calls streamText()
       ↓
5. Response chunks streamed back to client as ChatChunk objects
```
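
The steps above can be sketched as an async generator. This is illustrative only: the `provider/model` composite ID format, the `ChatChunk` fields, and the injected `EngineStream` function (a stand-in for the engine layer) are assumptions.

```typescript
// Sketch with assumed shapes; not the actual CoreState.chat() signature.
interface ChatChunk {
  text: string;
  done: boolean;
}

type EngineStream = (model: string, apiKey: string, prompt: string) => AsyncIterable<string>;

function parseCompositeId(id: string): { provider: string; model: string } {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`Expected "provider/model", got "${id}"`);
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}

async function* chat(
  compositeId: string,
  prompt: string,
  keys: Record<string, string | undefined>, // provider name -> resolved API key
  engine: EngineStream,
): AsyncGenerator<ChatChunk> {
  const { provider, model } = parseCompositeId(compositeId); // step 2
  const apiKey = keys[provider];                              // step 3
  if (apiKey === undefined) {
    throw new Error(`No API key for provider "${provider}"`);
  }
  for await (const text of engine(model, apiKey, prompt)) {   // step 4
    yield { text, done: false };                              // step 5
  }
  yield { text: "", done: true }; // final marker chunk
}
```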

Model listing flow:

```text
1. Client calls ListModels (gRPC or GET /api/models)
       ↓
2. CoreState.listModels() iterates configured providers
       ↓
3. For each configured provider:
   - Load API key from config (keychain name or env var name)
   - Resolve key value via secretStore or process.env
   - Call fetchModels(engineId, apiKey) → provider API
       ↓
4. Aggregate and return all models as ModelInfo[]
```
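
The aggregation in steps 2 to 4 might be sketched like this. The `ProviderConfig` and `ModelInfo` fields, the injected `FetchModels` function, and the choice to skip providers whose lookup fails are assumptions.

```typescript
// Sketch: field names and error handling are illustrative, not Abbenay's API.
interface ModelInfo {
  id: string;       // assumed composite form: "<provider>/<model>"
  provider: string;
  engine: string;
}

interface ProviderConfig {
  engine: string;
  apiKey: string | null; // already resolved from keychain or env var
}

type FetchModels = (engineId: string, apiKey: string | null) => Promise<string[]>;

async function listModels(
  providers: Record<string, ProviderConfig>,
  fetchModels: FetchModels,
): Promise<ModelInfo[]> {
  const perProvider = await Promise.all(
    Object.entries(providers).map(async ([name, cfg]): Promise<ModelInfo[]> => {
      try {
        const ids = await fetchModels(cfg.engine, cfg.apiKey);
        return ids.map((id) => ({ id: `${name}/${id}`, provider: name, engine: cfg.engine }));
      } catch {
        // One failing provider should not hide models from the others.
        return [];
      }
    }),
  );
  // Flatten the per-provider lists, keeping provider order.
  return ([] as ModelInfo[]).concat(...perProvider);
}
```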

Web dashboard flow:

```text
1. Browser loads http://localhost:8787
       ↓
2. Express serves static HTML/JS from packages/daemon/static/
       ↓
3. Frontend makes API calls to Express routes:
   - GET  /api/providers            → state.listProviders()
   - GET  /api/models               → state.listModels()
   - GET  /api/config               → loadConfig()
   - POST /api/config               → saveConfig()
   - POST /api/secrets              → state.secretStore.set()
   - POST /api/chat                 → state.chat() (SSE stream)
   - POST/GET/DELETE /api/sessions  → state.sessionStore.*()
   - POST /api/sessions/:id/chat    → session-scoped chat (SSE)
   - GET  /v1/models                → state.listModels() (OpenAI format)
   - POST /v1/chat/completions      → state.chat() (OpenAI format, streaming or JSON)
       ↓
4. Web server has direct DaemonState access (no gRPC in the loop)
```
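
For the `/v1/chat/completions` streaming case, each piece of text is typically framed as a Server-Sent Event carrying a `chat.completion.chunk` object, with `data: [DONE]` closing the stream, following the OpenAI streaming format. A sketch with a hypothetical `toSseFrames` helper (not Abbenay's actual implementation):

```typescript
// Sketch: toSseFrames and its parameters are hypothetical names.
interface OpenAiChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: { index: number; delta: { content?: string }; finish_reason: string | null }[];
}

function toSseFrames(id: string, model: string, created: number, pieces: string[]): string[] {
  const frame = (delta: { content?: string }, finish: string | null): string => {
    const chunk: OpenAiChunk = {
      id,
      object: "chat.completion.chunk",
      created,
      model,
      choices: [{ index: 0, delta, finish_reason: finish }],
    };
    // Each SSE event is a "data:" line followed by a blank line.
    return `data: ${JSON.stringify(chunk)}\n\n`;
  };
  return [
    ...pieces.map((content) => frame({ content }, null)),
    frame({}, "stop"),   // empty delta with the finish reason
    "data: [DONE]\n\n",  // stream terminator
  ];
}
```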

| File | Path | Purpose |
|---|---|---|
| Socket (Linux/macOS) | `$XDG_RUNTIME_DIR/abbenay/daemon.sock` | gRPC server socket |
| Socket (Windows) | `\\.\pipe\abbenay-daemon` | gRPC named pipe |
| PID file | `$XDG_RUNTIME_DIR/abbenay/abbenay.pid` | Daemon process ID |
| User Config | `~/.config/abbenay/config.yaml` | User-level provider config |
| Workspace Config | `<ws>/.config/abbenay/config.yaml` | Workspace-level config |
| Session Data | `$XDG_DATA_HOME/abbenay/sessions/` | Persisted chat sessions |
| Logs | Stdout/stderr | Daemon logs |

- Secrets: Stored in system keychain via keytar when available; never in config files
- Socket: Unix socket (or named pipe) with user-only permissions
- Web dashboard: Listens on localhost only
- No remote access: Daemon designed for local use only
- Config files: Created with mode `0o600` (user read/write only)