Abbenay Architecture

Overview

Abbenay is a unified AI daemon and library written in TypeScript/Node.js that provides:

A reusable core library (@abbenay/core) for LLM engine abstraction, streaming chat, and config
A gRPC API for chat and configuration
A web dashboard for provider/model management
A VS Code extension that registers models with VS Code's Language Model API

┌─────────────────────────────────────────────────────────────────────────┐
│                         Consumer Applications                            │
│                                                                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────┐  │
│  │   VS Code Ext   │  │   Python Apps   │  │   Web Dashboard         │  │
│  │   (gRPC)        │  │   (gRPC)        │  │   (HTTP → DaemonState)   │  │
│  └────────┬────────┘  └────────┬────────┘  └────────────┬────────────┘  │
│           │                    │                        │               │
│           └────────────────────┼────────────────────────┘               │
│                                │                                         │
└────────────────────────────────┼─────────────────────────────────────────┘
                                 │ gRPC over Unix Socket (or named pipe)
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                     abbenay daemon (TypeScript)                          │
│                                                                          │
│  ┌─ @abbenay/core ──────────────────────────────────────────────────┐   │
│  │  CoreState          Engines (Vercel AI SDK)    Config (YAML)      │   │
│  │  SecretStore i/f    Streaming chat + tools     Model discovery    │   │
│  └───────────────────────────────────────────────────────────────────┘   │
│                                                                          │
│  ┌─ daemon layer ────────────────────────────────────────────────────┐   │
│  │  DaemonState        gRPC Server               VS Code Backchannel │   │
│  │  CLI (Commander)    Web Dashboard (Express)    KeychainSecretStore │   │
│  └───────────────────────────────────────────────────────────────────┘   │
│                                                                          │
└──────────────────────────────────────────┬──────────────────────────────┘
                                           │
         ┌─────────────────────────────────┼───────────────────┐
         │                                 │                   │
         ▼                                 ▼                   ▼
┌─────────────────┐              ┌─────────────────┐    ┌───────────────┐
│   LLM APIs      │              │  keytar         │    │  Config Files │
│   (HTTP)        │              │  (keychain)     │    │  (YAML)       │
└─────────────────┘              └─────────────────┘    └───────────────┘

Core/Full Package Split

The source tree is organized into two layers:

@abbenay/core (`src/core/`)

Reusable library with zero transport dependencies. Can be used standalone by agent developers, web developers, or any Node.js application.

File	Purpose
`core/state.ts`	`CoreState` class — provider resolution, model listing, chat
`core/engines.ts`	Engine registry with Vercel AI SDK providers (dynamically loaded)
`core/config.ts`	YAML config loader/saver, merge logic
`core/secrets.ts`	`SecretStore` interface + `MemorySecretStore`
`core/paths.ts`	Platform-aware path utilities
`core/mock.ts`	Mock engine for testing
`core/policies.ts`	Policy system — built-in + custom policies, resolution, flattening
`core/tool-registry.ts`	Tool collection, namespacing, policy filtering, executor builder
`core/session-store.ts`	File-based session persistence (CRUD, index, messages)
`core/session-summarizer.ts`	Periodic LLM-generated session summaries (DR-022)
`core/index.ts`	Public API surface

@abbenay/daemon (`src/daemon/`)

Full application layer. Extends core with transport, UI, and CLI.

File	Purpose
`daemon/state.ts`	`DaemonState extends CoreState` — client registry, VS Code backchannel
`daemon/daemon.ts`	Process lifecycle, gRPC server startup, signal handling
`daemon/transport.ts`	Unix socket and PID file management
`daemon/tool-router.ts`	Tool execution routing (VS Code, MCP, local)
`daemon/mcp-client-pool.ts`	MCP server connection pool
`daemon/mcp-server.ts`	Embedded MCP server (exposes daemon as MCP)
`daemon/index.ts`	CLI entry point (Commander)
`daemon/server/abbenay-service.ts`	gRPC service handlers
`daemon/web/server.ts`	Express web server + REST API
`daemon/web/openai-compat.ts`	OpenAI-compatible `/v1/*` routes (models, chat completions)
`daemon/web/grpc-web-control.ts`	gRPC client for web server control
`daemon/secrets/keychain.ts`	`KeychainSecretStore` (keytar native addon)

Components

abbenay daemon

The main TypeScript/Node.js process, which runs as a background daemon.

Subcommands:

abbenay start - Start all services (daemon, web dashboard, OpenAI API, MCP server)
abbenay daemon - Start the gRPC server on Unix socket (or named pipe on Windows)
abbenay web - Start the web dashboard (embedded in daemon or started via gRPC if daemon already running)
abbenay serve - Start the OpenAI-compatible API server (same as web but framed for API use)
abbenay status - Check if daemon is running
abbenay stop - Stop the running daemon

Socket location:

Linux/macOS: $XDG_RUNTIME_DIR/abbenay/daemon.sock or /run/user/{uid}/abbenay/daemon.sock
Windows: \\.\pipe\abbenay-daemon
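
The lookup above can be sketched as a pure function. This is a minimal sketch rather than the actual `core/paths.ts` code: the name `resolveSocketPath` and its parameters are hypothetical, and it assumes that `/run/user/{uid}` is used only as a fallback when `$XDG_RUNTIME_DIR` is unset.

```typescript
// Hypothetical helper mirroring the socket locations listed above.
type Env = Record<string, string | undefined>;

function resolveSocketPath(platform: string, env: Env, uid: number): string {
  if (platform === "win32") {
    // Windows uses a named pipe rather than a socket file.
    return "\\\\.\\pipe\\abbenay-daemon";
  }
  // Assumption: prefer $XDG_RUNTIME_DIR, otherwise fall back to /run/user/{uid}.
  const runtimeDir = env["XDG_RUNTIME_DIR"] ?? `/run/user/${uid}`;
  return `${runtimeDir}/abbenay/daemon.sock`;
}
```

Passing the platform, environment, and uid as arguments keeps the function free of process globals, which makes it easy to exercise in tests.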

Web Dashboard (Embedded)

The web dashboard runs inside the daemon process via Express:

Port: localhost:8787 (configurable)
Static assets: Served from packages/daemon/static/
API routes: /api/* -> Direct calls to DaemonState (no gRPC in the loop)
Chat SSE: POST /api/chat -> Streaming responses via Server-Sent Events
OpenAI-compatible API: /v1/models, /v1/chat/completions -> Drop-in replacement for any OpenAI-compatible client (see DR-020)
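
As an illustration of the OpenAI-compatible routes, the sketch below builds a standard chat completion request against the default port. The helper name `buildChatRequest` is hypothetical, and the model identifier is left as a placeholder because the exact ID format is whatever `GET /v1/models` reports.

```typescript
// Assumption: the web server is listening on the default port 8787.
const BASE_URL = "http://localhost:8787/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds an OpenAI-style chat completion request.
function buildChatRequest(model: string, messages: ChatMessage[], stream = false) {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Usage (requires a running daemon; pick a model ID from GET /v1/models):
//   const { url, init } = buildChatRequest("<model-id>", [{ role: "user", content: "Hello" }]);
//   const res = await fetch(url, init);
//   const data = await res.json();
```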

The web server is started either:

In-process when abbenay web or abbenay serve runs and no daemon is running
Via gRPC StartWebServer when a daemon is already running and abbenay web/abbenay serve is invoked

VS Code Extension

The extension acts as a thin gRPC client to the daemon:

On activation: Connects to the daemon (starting it if it is not running)
Registers as a LanguageModelChatProvider with VS Code
Provides workspace paths via gRPC backchannel (VSCodeStream)
Opens web dashboard on command

Key files:

extension.ts - Activation, commands, status bar
daemon/client.ts - gRPC client wrapper
daemon/backchannel.ts - Bidirectional stream handler
providers/AbbenayLanguageModelProvider.ts - VS Code LM API integration

Engine Architecture

All LLM providers are implemented via the Vercel AI SDK with a data-driven engine registry in core/engines.ts.

Engine registry

Each engine entry carries metadata AND its factory function. Adding a new engine is a single registry entry — no switch statements anywhere.

// core/engines.ts — simplified
const ENGINES: Record<string, EngineInfo> = {
  openai: {
    id: 'openai',
    requiresKey: true,
    defaultBaseUrl: 'https://api.openai.com/v1',
    defaultEnvVar: 'OPENAI_API_KEY',
    supportsTools: true,
    createModel: (modelId, config) =>
      dedicatedProvider('@ai-sdk/openai', 'createOpenAI', config, modelId),
  },
  // ... 18 more engines
};
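
To show why a data-driven registry removes the need for switch statements, here is a self-contained sketch of a lookup over such a map. The trimmed `EngineInfo` shape, the `getEngine` name, and the return value of `createModel` are assumptions for illustration; the real registry carries more fields and returns AI SDK model objects.

```typescript
// Hypothetical, trimmed-down shape; the real EngineInfo has more fields.
interface EngineInfo {
  id: string;
  requiresKey: boolean;
  supportsTools: boolean;
  createModel: (modelId: string) => { engine: string; modelId: string };
}

const ENGINES: Record<string, EngineInfo> = {
  mock: {
    id: "mock",
    requiresKey: false,
    supportsTools: false,
    createModel: (modelId) => ({ engine: "mock", modelId }),
  },
};

// Lookup is a plain map access, so adding an engine never touches this function.
function getEngine(engineId: string): EngineInfo {
  // hasOwnProperty guards against inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(ENGINES, engineId)) {
    const known = Object.keys(ENGINES).join(", ");
    throw new Error(`Unknown engine "${engineId}". Known engines: ${known}`);
  }
  return ENGINES[engineId];
}
```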

Dynamic provider loading

AI SDK provider packages (@ai-sdk/openai, @ai-sdk/anthropic, etc.) are loaded via dynamic import() at runtime — only when that engine is actually used. This means:

For the core library: consumers install only the providers they need
For the daemon: all providers are bundled into the SEA binary

If a provider package is missing, the error message tells you exactly what to install.
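
The lazy-loading behavior described above might look roughly like the following. This is a sketch under assumptions: `loadProvider` is a hypothetical name, and the error wording is illustrative rather than the daemon's actual message.

```typescript
// Hypothetical loader: imports a provider package only when it is first needed.
async function loadProvider(pkg: string, factoryName: string): Promise<unknown> {
  let mod: Record<string, unknown>;
  try {
    // Resolved at runtime, so providers that are never used are never loaded.
    mod = await import(pkg);
  } catch {
    // Simplification: any import failure is reported as a missing package.
    throw new Error(`Provider package "${pkg}" is not installed. Run: npm install ${pkg}`);
  }
  const factory = mod[factoryName];
  if (typeof factory !== "function") {
    throw new Error(`Package "${pkg}" does not export a "${factoryName}" function`);
  }
  return factory;
}

// Usage, matching the registry entry shown earlier:
//   const createOpenAI = await loadProvider("@ai-sdk/openai", "createOpenAI");
```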

Engine categories

Dedicated providers: Each has its own @ai-sdk/* package (OpenAI, Anthropic, Gemini, Mistral, xAI, DeepSeek, Groq, Cohere, Bedrock, Fireworks, Together AI, Perplexity)
OpenAI-compatible: Use @ai-sdk/openai-compatible (Azure, OpenRouter, Ollama, LM Studio, Cerebras, Meta)
Mock: Built-in, no external package needed

Secret Management

Secrets are managed explicitly per-provider with two options:

Option 1: Keychain Storage (keytar)

Uses keytar for cross-platform keychain access:
- macOS: Keychain
- Linux: libsecret (GNOME Keyring / KDE Wallet)
- Windows: Credential Vault
Config references key by name: api_key_keychain_name: "OPENAI_API_KEY"

Option 2: Environment Variable Reference

Config specifies env var name: api_key_env_var_name: "OPENAI_API_KEY"
Value read from process.env at runtime

Important: These options are mutually exclusive per provider. The web UI provides a toggle to choose between them.

SecretStore interface

interface SecretStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<boolean>;
  has(key: string): Promise<boolean>;
}

CoreState accepts any SecretStore via constructor injection. DaemonState uses KeychainSecretStore (keytar-backed) by default. Tests and library consumers can use MemorySecretStore.
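
The sketch below shows an in-memory implementation of the interface together with a resolver that honors the two mutually exclusive options. `InMemorySecretStore`, `ProviderSecretConfig`, and `resolveKey` are hypothetical names (the shipped `MemorySecretStore` and `CoreState.resolveApiKey()` may differ), and throwing when both options are set is an assumption about how exclusivity is enforced.

```typescript
interface SecretStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  delete(key: string): Promise<boolean>;
  has(key: string): Promise<boolean>;
}

// Illustrative in-memory store, suitable for tests.
class InMemorySecretStore implements SecretStore {
  private secrets = new Map<string, string>();
  async get(key: string): Promise<string | null> { return this.secrets.get(key) ?? null; }
  async set(key: string, value: string): Promise<void> { this.secrets.set(key, value); }
  async delete(key: string): Promise<boolean> { return this.secrets.delete(key); }
  async has(key: string): Promise<boolean> { return this.secrets.has(key); }
}

interface ProviderSecretConfig {
  api_key_keychain_name?: string;
  api_key_env_var_name?: string;
}

// Hypothetical resolver: reads from exactly one of the two configured sources.
async function resolveKey(
  cfg: ProviderSecretConfig,
  store: SecretStore,
  env: Record<string, string | undefined>,
): Promise<string | null> {
  if (cfg.api_key_keychain_name && cfg.api_key_env_var_name) {
    // Assumption: configuring both is treated as an error.
    throw new Error("api_key_keychain_name and api_key_env_var_name are mutually exclusive");
  }
  if (cfg.api_key_keychain_name) return store.get(cfg.api_key_keychain_name);
  if (cfg.api_key_env_var_name) return env[cfg.api_key_env_var_name] ?? null;
  return null; // no key configured, e.g. for an engine that does not require one
}
```

Because the store is injected, the same resolver works against a keychain-backed store in the daemon and an in-memory store in tests.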

Configuration

Config Files

User level: ~/.config/abbenay/config.yaml
Workspace level: <workspace>/.config/abbenay/config.yaml

Config Format

providers:
  my-openai:              # Virtual provider name (user-defined)
    engine: openai        # Actual engine type
    api_key_keychain_name: "OPENAI_API_KEY"
    models:               # Map of virtual model name -> config
      gpt-4o: {}          # Enabled with defaults
      gpt-4o-mini:
        temperature: 0.3
        max_tokens: 4096

Config Loader

User and workspace configs are merged (workspace overrides user):

// core/config.ts
// loadConfig(), loadWorkspaceConfig(), mergeConfigs()
// Provider config: engine, api_key_keychain_name | api_key_env_var_name, base_url, models
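
A minimal sketch of the merge, assuming provider entries are merged field by field and `models` maps are merged key by key. The real `mergeConfigs()` may merge at a different depth; `mergeConfigsSketch` and the types here are hypothetical.

```typescript
type ModelConfig = Record<string, unknown>;

interface ProviderConfig {
  engine?: string;
  base_url?: string;
  api_key_keychain_name?: string;
  api_key_env_var_name?: string;
  models?: Record<string, ModelConfig>;
}

interface Config {
  providers?: Record<string, ProviderConfig>;
}

// Workspace values override user values; unrelated providers and models are kept.
function mergeConfigsSketch(user: Config, workspace: Config): Required<Config> {
  const providers: Record<string, ProviderConfig> = { ...(user.providers ?? {}) };
  for (const [name, ws] of Object.entries(workspace.providers ?? {})) {
    const base: ProviderConfig = providers[name] ?? {};
    providers[name] = {
      ...base,
      ...ws,
      // Assumption: model maps are combined rather than replaced wholesale.
      models: { ...(base.models ?? {}), ...(ws.models ?? {}) },
    };
  }
  return { providers };
}
```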

Policies

Policies are named bundles of behavioral defaults that can be assigned to virtual models. A model references a policy by name; the policy's fields act as defaults that the model's explicit config can override.

Resolution order (later wins)

Engine defaults  ←  Policy defaults  ←  Explicit ModelConfig  ←  Request params

Built-in policies

Policy	Temperature	max_tokens	Purpose
`precise`	0.15	2048	Factual, concise responses
`balanced`	0.5	4096	General-purpose
`creative`	0.9	8192	Exploratory, generative
`coder`	0.2	4096	Complete, runnable code
`json_strict`	0.2	2048	JSON-only output with retry
`long_context_chat`	—	4096	Concise follow-ups in long conversations

Policy config structure

# ~/.config/abbenay/policies.yaml (user-level only)
my-policy:
  sampling:
    temperature: 0.3
    top_p: 0.8
  output:
    max_tokens: 4096
    system_prompt_snippet: "Be concise."
    system_prompt_mode: prepend   # prepend | append | replace
    format: text                  # text | json_only | markdown
  reliability:
    retry_on_invalid_json: false
    timeout: 30000

Assigning a policy to a model

# In config.yaml
providers:
  my-openai:
    engine: openai
    models:
      gpt-4o:
        policy: coder            # References a built-in or custom policy
        temperature: 0.1         # Explicit config overrides the policy
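
The resolution order can be traced with a small worked example using the `coder` policy and the explicit override from the config above. This is a sketch only: `resolveParams` is a hypothetical name, just two fields are modeled, and the engine default of 1.0 is an invented placeholder.

```typescript
type Params = { temperature?: number; max_tokens?: number };

// Later layers win; a field left undefined never overwrites an earlier value.
function resolveParams(...layers: Params[]): Params {
  const out: Params = {};
  for (const layer of layers) {
    if (layer.temperature !== undefined) out.temperature = layer.temperature;
    if (layer.max_tokens !== undefined) out.max_tokens = layer.max_tokens;
  }
  return out;
}

const engineDefaults: Params = { temperature: 1.0 };                // hypothetical value
const coderPolicy: Params = { temperature: 0.2, max_tokens: 4096 }; // from the built-in table
const modelConfig: Params = { temperature: 0.1 };                   // explicit config above
const requestParams: Params = {};                                   // caller set nothing

const resolved = resolveParams(engineDefaults, coderPolicy, modelConfig, requestParams);
// resolved → { temperature: 0.1, max_tokens: 4096 }
```

The explicit `temperature` replaces the policy's 0.2, while `max_tokens` falls through from the policy because nothing later in the chain sets it.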

Tool System

ToolRegistry (core)

The ToolRegistry collects tools from multiple sources and namespaces them to prevent collisions. Part of @abbenay/core, usable without the daemon.

Sources and namespace prefixes:

Source	Prefix	Example
VS Code workspace	`ws:`	`ws:myproject/readFile`
MCP server	`mcp:`	`mcp:github/searchCode`
Local (agent-registered)	`local:`	`local:myAgent/search`

Tool policy controls which tools the LLM sees and whether their execution requires user approval:

Tier	Config field	Behavior
Auto-approve	`auto_approve`	Execute without confirmation
Require approval	`require_approval`	Pause and ask user
Disabled	`disabled_tools`	Never sent to LLM

Patterns support glob matching (e.g., mcp:filesystem/*).
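
A sketch of how a namespaced tool name could be matched against these tiers. The glob support here is deliberately minimal (only `*`), and the precedence between tiers (disabled first, then approval, then auto-approve) is an assumption; `classifyTool` and `globToRegExp` are hypothetical names.

```typescript
type Tier = "auto_approve" | "require_approval" | "disabled";

interface ToolPolicy {
  auto_approve?: string[];
  require_approval?: string[];
  disabled_tools?: string[];
}

// Minimal glob: only `*` is special and matches any run of characters.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except `*`
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function matchesAny(name: string, patterns: string[] = []): boolean {
  return patterns.some((p) => globToRegExp(p).test(name));
}

// Assumed precedence: disabled, then require_approval, then auto_approve.
function classifyTool(name: string, policy: ToolPolicy): Tier | undefined {
  if (matchesAny(name, policy.disabled_tools)) return "disabled";
  if (matchesAny(name, policy.require_approval)) return "require_approval";
  if (matchesAny(name, policy.auto_approve)) return "auto_approve";
  return undefined; // no rule matched; the default tier is not specified here
}
```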

ToolRouter (daemon)

The daemon's ToolRouter provides the execution backend, dispatching each tool call according to its source:

VS Code tools → routed via gRPC backchannel (VSCodeStream)
MCP tools → routed via McpClientPool
Local tools → called directly via inline executor
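
Routing by namespace prefix could be sketched as follows. The parsing rule (`prefix:scope/tool`) is inferred from the examples in the namespace table; `parseToolName`, `routeTool`, and the handler signature are hypothetical and stand in for the backchannel, MCP pool, and inline executor.

```typescript
type ToolSource = "vscode" | "mcp" | "local";

interface ParsedToolName {
  source: ToolSource;
  scope: string; // workspace, MCP server, or agent name
  tool: string;
}

const PREFIXES = new Map<string, ToolSource>([
  ["ws", "vscode"],
  ["mcp", "mcp"],
  ["local", "local"],
]);

// Splits e.g. "mcp:github/searchCode" into its source, scope, and tool name.
function parseToolName(qualified: string): ParsedToolName {
  const colon = qualified.indexOf(":");
  const slash = qualified.indexOf("/", colon + 1);
  const source = colon > 0 ? PREFIXES.get(qualified.slice(0, colon)) : undefined;
  if (!source || slash < 0) {
    throw new Error(`Not a namespaced tool name: ${qualified}`);
  }
  return { source, scope: qualified.slice(colon + 1, slash), tool: qualified.slice(slash + 1) };
}

type Handler = (scope: string, tool: string, args: unknown) => Promise<unknown>;

// Dispatches to one handler per source; no per-tool branching is needed.
async function routeTool(
  name: string,
  args: unknown,
  handlers: Record<ToolSource, Handler>,
): Promise<unknown> {
  const { source, scope, tool } = parseToolName(name);
  return handlers[source](scope, tool, args);
}
```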

McpClientPool (daemon)

Manages connections to external MCP servers defined in config. Uses @ai-sdk/mcp for the client implementation.

Supports stdio and HTTP/SSE transports
Auto-discovers tools on connect and registers them in ToolRegistry
Hot-reloads when config changes (connects new, disconnects removed)

# In config.yaml
mcp_servers:
  filesystem:
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
    enabled: true
  github:
    transport: http
    url: http://localhost:3001/sse
    enabled: true
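
The hot-reload step amounts to diffing the previous and the new `mcp_servers` maps. The sketch below assumes a server counts as active unless `enabled` is explicitly `false`, and it ignores the case where an existing server's settings change; `diffServers` is a hypothetical name.

```typescript
interface McpServerConfig {
  transport: "stdio" | "http";
  command?: string;
  args?: string[];
  url?: string;
  enabled?: boolean;
}

type McpServers = Record<string, McpServerConfig>;

// Assumption: a missing `enabled` field means the server is enabled.
function enabledNames(servers: McpServers): Set<string> {
  const names = Object.keys(servers).filter((name) => servers[name].enabled !== false);
  return new Set(names);
}

// Returns the servers to connect (newly enabled) and disconnect (removed or disabled).
function diffServers(prev: McpServers, next: McpServers) {
  const before = enabledNames(prev);
  const after = enabledNames(next);
  return {
    connect: Array.from(after).filter((name) => !before.has(name)),
    disconnect: Array.from(before).filter((name) => !after.has(name)),
  };
}
```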

gRPC Protocol

Defined in proto/abbenay/v1/service.proto. The daemon loads protos dynamically via @grpc/proto-loader (no code generation for the daemon).

Core RPCs (Implemented)

RPC	Description
`Chat`	Streaming chat with a model
`SessionChat`	Streaming chat within a session
`CreateSession` / `GetSession` / `ListSessions` / `DeleteSession`	Session CRUD
`SummarizeSession`	On-demand or cached session summary
`ListModels`	List available models from providers
`ListProviders`	List configured providers
`GetSecret` / `SetSecret` / `DeleteSecret` / `ListSecrets`	Secret management
`Register` / `Unregister`	Client registration
`VSCodeStream`	Bidirectional backchannel
`GetStatus` / `HealthCheck`	Daemon status
`GetConfig` / `UpdateConfig`	Configuration
`GetProviderStatus`	Provider status
`GetConnectedWorkspaces`	Workspace paths from VS Code
`StartWebServer` / `StopWebServer`	Embedded web dashboard lifecycle
`ListEngines`	List available engine types
`ListPolicies`	List built-in and custom policies
`Shutdown`	Daemon shutdown

Stub RPCs (Deferred)

RPC	Description
`WatchSessions` / `ReplaySession`	Real-time session events / session replay
`ForkSession` / `ExportSession` / `ImportSession`	Session branching and sharing
`ListTools` / `ExecuteTool`	Tool execution via gRPC
`RegisterMcpServer` / `UnregisterMcpServer`	MCP server registration via gRPC

VS Code Backchannel

The VSCodeStream RPC enables bidirectional communication:

Daemon -> VS Code requests:

GetWorkspace - Get connected workspace paths
InvokeTool - Invoke VS Code tools (future)
ListModels - List VS Code LM models (future)

VS Code -> Daemon responses:

Workspace folder paths
Tool results
Error responses

Session Management

Sessions are persisted as JSON files in $XDG_DATA_HOME/abbenay/sessions/ (Linux) or ~/Library/Application Support/abbenay/sessions/ (macOS). See DR-021.

The SessionStore class (core layer) handles CRUD operations and maintains an index.json for fast listing without reading every session file.

Available transports:

gRPC: CreateSession, GetSession, ListSessions, DeleteSession, SessionChat, SummarizeSession
Web API: POST/GET/DELETE /api/sessions, POST /api/sessions/:id/chat (SSE), GET /api/sessions/:id/summary
CLI: aby sessions list/show/delete, aby chat --session <id|new>

Periodic summaries: Every 10 user messages, a background LLM call generates a 2-3 sentence summary stored on the session (see DR-022). Summaries are also available on demand via SummarizeSession (gRPC) or GET /api/sessions/:id/summary.
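
The trigger for periodic summaries can be expressed as a simple count check. This is a sketch that assumes the check runs once each time a user message is appended; `shouldSummarize` and the `Message` shape are hypothetical.

```typescript
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

const SUMMARY_INTERVAL = 10; // user messages between summaries, per the text above

// True when the number of user messages has just reached a multiple of the interval.
function shouldSummarize(messages: Message[]): boolean {
  const userCount = messages.filter((m) => m.role === "user").length;
  return userCount > 0 && userCount % SUMMARY_INTERVAL === 0;
}
```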

Not yet implemented: ForkSession, ExportSession, ImportSession, ReplaySession, web dashboard session sidebar, context window compression using summaries (context.context_threshold / compression_strategy), internal MCP tool for cross-session retrieval.

Data Flow

Chat Request Flow

1. Client sends ChatRequest via gRPC (or POST /api/chat for web)
   ↓
2. DaemonState.chat() → CoreState.chat() resolves provider/model from composite ID
   ↓
3. CoreState.resolveApiKey() gets API key (keychain or env var based on config)
   ↓
4. engines.ts streamChat() dynamically loads the AI SDK provider and calls streamText()
   ↓
5. Response chunks streamed back to client as ChatChunk objects

Model Discovery Flow

1. Client calls ListModels (gRPC or GET /api/models)
   ↓
2. CoreState.listModels() iterates configured providers
   ↓
3. For each configured provider:
   - Read the API key reference from config (keychain name or env var name)
   - Resolve key value via secretStore or process.env
   - Call fetchModels(engineId, apiKey) → provider API
   ↓
4. Aggregate and return all models as ModelInfo[]

Web Dashboard Flow

1. Browser loads http://localhost:8787
   ↓
2. Express serves static HTML/JS from packages/daemon/static/
   ↓
3. Frontend makes API calls to Express routes:
   - GET /api/providers → state.listProviders()
   - GET /api/models → state.listModels()
   - GET /api/config → loadConfig()
   - POST /api/config → saveConfig()
   - POST /api/secrets → state.secretStore.set()
   - POST /api/chat → state.chat() (SSE stream)
   - POST/GET/DELETE /api/sessions → state.sessionStore.*()
   - POST /api/sessions/:id/chat → session-scoped chat (SSE)
   - GET /v1/models → state.listModels() (OpenAI format)
   - POST /v1/chat/completions → state.chat() (OpenAI format, streaming or JSON)
   ↓
4. Web server has direct DaemonState access (no gRPC in the loop)

File Locations

File	Path	Purpose
Socket (Linux/macOS)	`$XDG_RUNTIME_DIR/abbenay/daemon.sock`	gRPC server socket
Socket (Windows)	`\\.\pipe\abbenay-daemon`	gRPC named pipe
PID file	`$XDG_RUNTIME_DIR/abbenay/abbenay.pid`	Daemon process ID
User Config	`~/.config/abbenay/config.yaml`	User-level provider config
Workspace Config	`<ws>/.config/abbenay/config.yaml`	Workspace-level config
Session Data	`$XDG_DATA_HOME/abbenay/sessions/`	Persisted chat sessions
Logs	Stdout/stderr	Daemon logs

Security

Secrets: Stored in system keychain via keytar when available; never in config files
Socket: Unix socket (or named pipe) with user-only permissions
Web dashboard: Listens on localhost only
No remote access: Daemon designed for local use only
Config files: Created with mode 0o600 (user read/write only)
