archi is a retrieval-augmented generation framework for research and education teams that need a private, configurable, and extensible assistant with a low barrier to entry. It was first developed at MIT for the SubMIT computing project and now powers chat, ticketing, and course-support workflows across academia and research organizations.
archi provides:
- Customizable AI pipelines that combine data retrieval and LLMs (and more tools to come!).
- Data ingestion connectors: web links, git repositories, local files, JIRA, and more.
- Interfaces: chat app, ticketing assistant, email bot, and more.
- Support for running or interacting with local and API-based LLMs.
- Modular design that allows custom data sources, LLM backends, and deployment targets.
- Containerized services and CLI utilities for repeatable deployments.
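To give a feel for how these pieces fit together, here is a purely illustrative configuration sketch. The field names below are hypothetical, not the actual schema; see the Configuration section for the full YAML reference.

```yaml
# Hypothetical sketch only -- field names are illustrative;
# consult the Configuration reference for the real schema.
name: course-assistant

data_sources:
  - type: web
    urls:
      - https://example.edu/course/syllabus
  - type: git
    repo: https://github.com/example/course-notes

model:
  provider: openai          # or a locally hosted LLM
  name: gpt-4o
  api_key_env: OPENAI_API_KEY   # BYOK: key read from the environment

interfaces:
  - chat
```

The idea is that swapping a data source, model provider, or interface is a config change, not a code change.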
The docs are organized as follows:
- Install — system requirements and installation.
- Quickstart — deploy your first archi instance in minutes.
- User Guide — overview of all capabilities.
- Data Sources — configure web links, git, JIRA, Redmine, and more.
- Services — chat, uploader, data manager, Piazza, and other interfaces.
- Models & Providers — LLM providers, embeddings, and BYOK.
- Agents & Tools — agent specs, tools, MCP integration.
- Configuration — full YAML config schema reference.
- CLI Reference — all CLI commands and options.
- API Reference — REST API endpoints.
- Benchmarking — evaluate retrieval and response quality.
- Advanced Setup — GPU setup and production deployment.
- Developer Guide — architecture, contributing, and extension patterns.
- Troubleshooting — common issues and fixes.
Follow the Install and Quickstart guides to set up prerequisites, configure data sources, and launch your first instance.
We welcome fixes and new integrations—see the Developer Guide for coding standards, testing instructions, and contribution tips. Please open issues or pull requests on the GitHub repository.
archi is released under the MIT License. For project inquiries, contact paus@mit.edu.
