Reliable LangChain agents with persistent memory and automatic recovery from failures.
Wraps any LangChain chain with step-level checkpointing, heuristic verification, and automatic retry. Runs fully locally with Ollama — no API keys required.
A 100-step agent pipeline crashes at step 70. Without checkpointing you lose everything and pay to redo it. With AgentVault:
PHASE 1 — Interrupted at step 7 of 12
Vanilla: 💥 Lost all progress. 7 API calls wasted.
ReliableAgent: ✅ 7 checkpoints saved to SQLite.
PHASE 2 — Resuming
Vanilla: Starts from 0 → 12 more calls (19 total)
ReliableAgent: Resumes from step 8 → 5 more calls (12 total)
╭───────────────────┬──────────┬───────────────────╮
│                   │ Vanilla  │ ReliableAgent     │
├───────────────────┼──────────┼───────────────────┤
│ Total API calls   │ 19       │ 12                │
│ Calls saved       │ —        │ 7 (37% less)      │
│ State after crash │ Lost all │ 7 steps in SQLite │
╰───────────────────┴──────────┴───────────────────╯
Run the demo: python3 demo_interruption.py
Resilience benchmark — 30 steps with 30% injected failure rate:
| Metric | Vanilla | ReliableAgent |
|---|---|---|
| Completion rate | 56.7% | 96.7% |
| Failed steps | 13 | 1 |
| Recovery rate | — | 92% |
- LangChain — agent/chain abstraction
- Ollama — local LLM inference (no API keys)
- OpenViking — persistent context database (auto-detected)
- SQLite — fallback storage when OpenViking is not available
- Superpowers — agentic skills framework (inspiration)
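The OpenViking-with-SQLite-fallback behavior noted above ("auto-detected") can be sketched as a simple import probe. This is an illustration only: the package name `openviking` and the helper `pick_backend` are assumptions, not AgentVault's actual API.

```python
import importlib.util
import sqlite3

def pick_backend(db_path: str = "checkpoints.db"):
    """Return ("openviking", None) when the OpenViking client package is
    importable, otherwise fall back to a local SQLite connection.
    The module name "openviking" is illustrative and may differ."""
    if importlib.util.find_spec("openviking") is not None:
        return "openviking", None
    return "sqlite", sqlite3.connect(db_path)

backend, conn = pick_backend(":memory:")
```

On a machine without the context database installed, `backend` is `"sqlite"` and all checkpoints land in the local database file.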
pip install -e .

Pull a model:
ollama pull llama3.2
ollama serve

from langchain_ollama import OllamaLLM
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from core import ReliableAgent
llm = OllamaLLM(model="llama3.2")
chain = PromptTemplate.from_template("Answer: {input}") | llm | StrOutputParser()
agent = ReliableAgent(chain=chain, session_id="my_session", max_retries=2)
result = agent.run_steps([
{"input": "What is 2+2?"},
{"input": "Name the planets in the solar system."},
])
print(result)

If the run is interrupted, re-running with the same session_id resumes from the last successful step automatically.
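The resume mechanics can be illustrated with a minimal sketch: record each completed step index per session, and skip recorded steps on re-run. The function `run_checkpointed` and the table schema are hypothetical, not AgentVault's actual internals.

```python
import sqlite3

def run_checkpointed(conn, session_id, steps, chain, crash_after=None):
    """Hypothetical checkpointed runner: completed step indices are saved
    per session; a re-run with the same session_id skips them."""
    conn.execute("CREATE TABLE IF NOT EXISTS checkpoints "
                 "(session TEXT, step INTEGER, output TEXT, "
                 "PRIMARY KEY (session, step))")
    done = {row[0] for row in conn.execute(
        "SELECT step FROM checkpoints WHERE session = ?", (session_id,))}
    calls = 0
    for i, step in enumerate(steps):
        if i in done:
            continue  # already checkpointed: no repeat call
        if crash_after is not None and calls == crash_after:
            raise RuntimeError("simulated crash")
        out = chain(step)
        calls += 1
        conn.execute("INSERT INTO checkpoints VALUES (?, ?, ?)",
                     (session_id, i, out))
        conn.commit()
    return calls

conn = sqlite3.connect(":memory:")
steps = [{"input": f"q{i}"} for i in range(12)]
chain = lambda s: s["input"].upper()
try:
    run_checkpointed(conn, "demo", steps, chain, crash_after=7)  # dies at step 8
except RuntimeError:
    pass
resumed = run_checkpointed(conn, "demo", steps, chain)  # only 5 more calls
```

This reproduces the numbers from the demo above: 7 checkpoints survive the crash, so the second run needs only 5 calls instead of 12.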
Injects 30% failure rate per step (timeouts, connection resets, empty responses) with a fixed seed:
python -m benchmarks.run_benchmark_flaky

30 arithmetic/factual steps, vanilla vs ReliableAgent:
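Seeded failure injection of this kind can be sketched as a wrapper around the step function. The names here (`flaky`, `run_with_retries`) are illustrative, not the benchmark's actual code:

```python
import random

def flaky(chain, failure_rate=0.3, seed=42):
    """Wrap a step function so roughly failure_rate of invocations raise,
    with a fixed seed so the benchmark is reproducible."""
    rng = random.Random(seed)
    def wrapped(step):
        if rng.random() < failure_rate:
            raise ConnectionResetError("injected failure")
        return chain(step)
    return wrapped

def run_with_retries(chain, steps, max_retries=2):
    """Naive retry loop: a step counts as failed only after exhausting
    max_retries extra attempts, then the run continues."""
    completed = failed = 0
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                chain(step)
                completed += 1
                break
            except ConnectionResetError:
                if attempt == max_retries:
                    failed += 1
    return completed, failed
```

With a 30% per-attempt failure rate, three attempts drop the per-step failure probability to about 2.7%, which is why retry alone recovers most injected failures.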
python -m benchmarks.run_benchmark

Run the minimal example:

python examples/simple_agent.py

core/
agent.py # ReliableAgent — wraps any LangChain chain
memory.py # OpenViking + SQLite persistence layer
verifier.py # Heuristic step verification (no model calls)
benchmarks/
run_benchmark.py # Basic 30-step benchmark (Ollama)
run_benchmark_flaky.py # Resilience benchmark with injected failures
run_benchmark_portfolio.py # 20-step financial analysis (Ollama)
run_benchmark_portfolio_claude.py # Same benchmark with Claude API
examples/
simple_agent.py # Minimal working demo
demo_interruption.py # Crash + recovery demo (shows the core value prop)
run_steps(steps)
│
├─ Resume from last checkpoint (if any)
│
└─ For each step:
├─ chain.invoke(step_input)
├─ Verifier.verify(result, history) ← no model call
│ ├─ empty output? → retry
│ ├─ loop detected? → retry
│ └─ inconsistent? → retry
├─ Save to memory (OpenViking or SQLite)
└─ If max_retries exceeded → mark failed, continue
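The flow above can be sketched in a few lines. This is a minimal illustration under assumed names (`verify`, `run_steps` here are simplified stand-ins, not AgentVault's actual Verifier or memory layer):

```python
def verify(result, history):
    """Heuristic checks only, no model calls: reject empty output
    and exact-repeat loops."""
    if not result or not result.strip():
        return False          # empty output? -> retry
    if history and result == history[-1]:
        return False          # loop detected? -> retry
    return True

def run_steps(chain, steps, max_retries=2):
    """Invoke, verify, retry; once retries are exhausted the step is
    marked failed and the run continues with the next step."""
    history, statuses = [], []
    for step in steps:
        status = "failed"
        for _ in range(max_retries + 1):
            result = chain(step)
            if verify(result, history):
                history.append(result)   # stands in for the memory save
                status = "ok"
                break
        statuses.append(status)          # a failed step does not abort the run
    return statuses
```

The key design point is the last line of the loop: verification failures trigger retries, but a permanently failing step is recorded and skipped rather than crashing the whole pipeline.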
- volcengine/OpenViking — context database for AI agents
- obra/superpowers — agentic skills framework