Reliable LangChain agents with persistent memory and automatic recovery from failures.
Wraps any LangChain chain with step-level checkpointing, heuristic verification, and automatic retry. Runs fully locally with Ollama — no API keys required.
A 100-step agent pipeline crashes at step 70. Without checkpointing you lose everything and pay to redo it. With AgentVault:
PHASE 1 — Interrupted at step 7 of 12
Vanilla: 💥 Lost all progress. 7 API calls wasted.
ReliableAgent: ✅ 7 checkpoints saved to SQLite.
PHASE 2 — Resuming
Vanilla: Starts from 0 → 12 more calls (19 total)
ReliableAgent: Resumes from step 8 → 5 more calls (12 total)
╭───────────────────┬──────────┬───────────────────╮
│                   │ Vanilla  │ ReliableAgent     │
├───────────────────┼──────────┼───────────────────┤
│ Total API calls   │ 19       │ 12                │
│ Calls saved       │ —        │ 7 (37% less)      │
│ State after crash │ Lost all │ 7 steps in SQLite │
╰───────────────────┴──────────┴───────────────────╯
Run the demo: python3 demo_interruption.py
Resilience benchmark — 30 steps with 30% injected failure rate:
| Metric | Vanilla | ReliableAgent |
|---|---|---|
| Completion rate | 56.7% | 96.7% |
| Failed steps | 13 | 1 |
| Recovery rate | — | 92% |
- LangChain — agent/chain abstraction
- Ollama — local LLM inference (no API keys)
- OpenViking — persistent context database (auto-detected)
- SQLite — fallback storage when OpenViking is not available
- Superpowers — agentic skills framework (inspiration)
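The OpenViking-with-SQLite-fallback behavior noted above ("auto-detected") can be sketched as a simple import probe. This is an illustration only: the package name `openviking` and the helper `pick_backend` are assumptions, not AgentVault's actual API.

```python
import importlib.util
import sqlite3

def pick_backend(db_path: str = "checkpoints.db"):
    """Return ("openviking", None) when the OpenViking client package is
    importable, otherwise fall back to a local SQLite connection.
    The module name "openviking" is illustrative and may differ."""
    if importlib.util.find_spec("openviking") is not None:
        return "openviking", None
    return "sqlite", sqlite3.connect(db_path)

backend, conn = pick_backend(":memory:")
```

On a machine without the context database installed, `backend` is `"sqlite"` and all checkpoints land in the local database file.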
pip install -e .

Pull a model:
ollama pull llama3.2
ollama serve

from langchain_ollama import OllamaLLM
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from core import ReliableAgent
llm = OllamaLLM(model="llama3.2")
chain = PromptTemplate.from_template("Answer: {input}") | llm | StrOutputParser()
agent = ReliableAgent(chain=chain, session_id="my_session", max_retries=2)
result = agent.run_steps([
{"input": "What is 2+2?"},
{"input": "Name the planets in the solar system."},
])
print(result)

If the run is interrupted, re-running with the same session_id resumes from the last successful step automatically.
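The resume mechanics can be illustrated with a minimal sketch: record each completed step index per session, and skip recorded steps on re-run. The function `run_checkpointed` and the table schema are hypothetical, not AgentVault's actual internals.

```python
import sqlite3

def run_checkpointed(conn, session_id, steps, chain, crash_after=None):
    """Hypothetical checkpointed runner: completed step indices are saved
    per session; a re-run with the same session_id skips them."""
    conn.execute("CREATE TABLE IF NOT EXISTS checkpoints "
                 "(session TEXT, step INTEGER, output TEXT, "
                 "PRIMARY KEY (session, step))")
    done = {row[0] for row in conn.execute(
        "SELECT step FROM checkpoints WHERE session = ?", (session_id,))}
    calls = 0
    for i, step in enumerate(steps):
        if i in done:
            continue  # already checkpointed: no repeat call
        if crash_after is not None and calls == crash_after:
            raise RuntimeError("simulated crash")
        out = chain(step)
        calls += 1
        conn.execute("INSERT INTO checkpoints VALUES (?, ?, ?)",
                     (session_id, i, out))
        conn.commit()
    return calls

conn = sqlite3.connect(":memory:")
steps = [{"input": f"q{i}"} for i in range(12)]
chain = lambda s: s["input"].upper()
try:
    run_checkpointed(conn, "demo", steps, chain, crash_after=7)  # dies at step 8
except RuntimeError:
    pass
resumed = run_checkpointed(conn, "demo", steps, chain)  # only 5 more calls
```

This reproduces the numbers from the demo above: 7 checkpoints survive the crash, so the second run needs only 5 calls instead of 12.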
Injects 30% failure rate per step (timeouts, connection resets, empty responses) with a fixed seed:
python -m benchmarks.run_benchmark_flaky

30 arithmetic/factual steps, vanilla vs ReliableAgent:
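Seeded failure injection of this kind can be sketched as a wrapper around the step function. The names here (`flaky`, `run_with_retries`) are illustrative, not the benchmark's actual code:

```python
import random

def flaky(chain, failure_rate=0.3, seed=42):
    """Wrap a step function so roughly failure_rate of invocations raise,
    with a fixed seed so the benchmark is reproducible."""
    rng = random.Random(seed)
    def wrapped(step):
        if rng.random() < failure_rate:
            raise ConnectionResetError("injected failure")
        return chain(step)
    return wrapped

def run_with_retries(chain, steps, max_retries=2):
    """Naive retry loop: a step counts as failed only after exhausting
    max_retries extra attempts, then the run continues."""
    completed = failed = 0
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                chain(step)
                completed += 1
                break
            except ConnectionResetError:
                if attempt == max_retries:
                    failed += 1
    return completed, failed
```

With a 30% per-attempt failure rate, three attempts drop the per-step failure probability to about 2.7%, which is why retry alone recovers most injected failures.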
python -m benchmarks.run_benchmark

Run the minimal example:

python examples/simple_agent.py

core/
agent.py # ReliableAgent — wraps any LangChain chain
memory.py # OpenViking + SQLite persistence layer
verifier.py # Heuristic step verification (no model calls)
benchmarks/
run_benchmark.py # Basic 30-step benchmark (Ollama)
run_benchmark_flaky.py # Resilience benchmark with injected failures
run_benchmark_portfolio.py # 20-step financial analysis (Ollama)
run_benchmark_portfolio_claude.py # Same benchmark with Claude API
examples/
simple_agent.py # Minimal working demo
demo_interruption.py # Crash + recovery demo (shows the core value prop)
run_steps(steps)
│
├─ Resume from last checkpoint (if any)
│
└─ For each step:
├─ chain.invoke(step_input)
├─ Verifier.verify(result, history) ← no model call
│ ├─ empty output? → retry
│ ├─ loop detected? → retry
│ └─ inconsistent? → retry
├─ Save to memory (OpenViking or SQLite)
└─ If max_retries exceeded → mark failed, continue
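The flow above can be sketched in a few lines. This is a minimal illustration under assumed names (`verify`, `run_steps` here are simplified stand-ins, not AgentVault's actual Verifier or memory layer):

```python
def verify(result, history):
    """Heuristic checks only, no model calls: reject empty output
    and exact-repeat loops."""
    if not result or not result.strip():
        return False          # empty output? -> retry
    if history and result == history[-1]:
        return False          # loop detected? -> retry
    return True

def run_steps(chain, steps, max_retries=2):
    """Invoke, verify, retry; once retries are exhausted the step is
    marked failed and the run continues with the next step."""
    history, statuses = [], []
    for step in steps:
        status = "failed"
        for _ in range(max_retries + 1):
            result = chain(step)
            if verify(result, history):
                history.append(result)   # stands in for the memory save
                status = "ok"
                break
        statuses.append(status)          # a failed step does not abort the run
    return statuses
```

The key design point is the last line of the loop: verification failures trigger retries, but a permanently failing step is recorded and skipped rather than crashing the whole pipeline.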
- volcengine/OpenViking — context database for AI agents
- obra/superpowers — agentic skills framework