A production-ready multi-agent system that answers questions about Python codebases. Built with the Model Context Protocol (MCP), each agent is an independent MCP server with specialized capabilities, collaborating through a central orchestrator to provide accurate, context-aware responses.
┌──────────────┐
│ User │
└──────┬───────┘
│
┌──────▼───────┐
│ FastAPI │
│ Gateway │
│ :8000 │
└──────┬───────┘
│ MCP (Streamable HTTP)
┌──────▼───────┐
│ Orchestrator │
│ Agent :9000 │
  └──┬──┬─────┬──┘
┌─────────┘ │ └─────────┐
▼ │ ▼
┌───────────┐  ┌─────▼────┐  ┌──────────┐
│ Indexer │ │ Graph │ │ Code │
│ Agent │ │ Query │ │ Analyst │
│ :9001 │ │ :9002 │ │ :9003 │
└─────┬─────┘ └─────┬────┘ └─────┬────┘
│ │ │
▼ ▼ ▼
┌──────────────────────────────────────┐
│ Neo4j :7687 │
└──────────────────────────────────────┘
┌──────────────────────────────────────┐
│ Redis :6379 │
└──────────────────────────────────────┘
┌──────────┐
│Monitoring│ (also connected to Neo4j & Redis)
│ Agent │
│ :9004 │
└──────────┘
- User sends a query to the FastAPI Gateway (REST or WebSocket).
- The Gateway calls the Orchestrator Agent via MCP Streamable HTTP.
- The Orchestrator classifies the query using GPT-4o, determines which agents to invoke, and routes requests in parallel or sequentially.
- Downstream agents (Indexer, Graph Query, Code Analyst, Monitoring) execute their specialized tools and return structured data.
- The Orchestrator synthesizes all outputs via GPT-4o into a coherent response.
- The Gateway returns the final response to the user.
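The flow above can be sketched as a minimal pipeline. This is an illustrative stand-in, not the orchestrator's actual code: `analyze_query` uses a rule-based classifier in place of GPT-4o, and the agent calls are stubbed.

```python
# Minimal sketch of the orchestrator pipeline; all names are illustrative.
from dataclasses import dataclass, field

@dataclass
class RoutingPlan:
    intent: str
    agents: list = field(default_factory=list)  # (agent, tool) pairs to invoke

def analyze_query(query: str) -> str:
    # Rule-based stand-in for the GPT-4o classifier.
    lowered = query.lower()
    if "import" in lowered or "depend" in lowered:
        return "dependency_lookup"
    if "how" in lowered or "explain" in lowered:
        return "code_explanation"
    return "entity_search"

def route(intent: str) -> RoutingPlan:
    # Map each intent to the downstream agent tools it needs.
    table = {
        "entity_search": [("graph_query", "find_entity")],
        "dependency_lookup": [("graph_query", "get_dependencies")],
        "code_explanation": [("graph_query", "find_entity"),
                             ("code_analyst", "explain_implementation")],
    }
    return RoutingPlan(intent, table[intent])

def handle(query: str) -> dict:
    # analyze -> route -> call agents (stubbed) -> synthesize (stubbed).
    plan = route(analyze_query(query))
    results = [{"agent": a, "tool": t} for a, t in plan.agents]
    return {"intent": plan.intent, "sources": results}

print(handle("How does FastAPI handle request validation?"))
```

In the real system each `(agent, tool)` pair becomes an MCP tool call over Streamable HTTP, and synthesis is a second GPT-4o call over the collected results.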
Coordinates all queries, manages conversation context, and synthesizes responses.
| Tool | Description |
|---|---|
| `analyze_query` | Classifies query intent (entity_search, dependency_lookup, code_explanation, pattern_analysis, comparison, general) and extracts entities using GPT-4o with a rule-based fallback. |
| `route_to_agents` | Builds a routing plan mapping the query to specific agent tools. |
| `get_conversation_context` | Retrieves conversation history and summary from Redis. |
| `synthesize_response` | Combines multiple agent outputs into a coherent response via GPT-4o. |
| `chat` | End-to-end entry point: analyze → route → call agents → synthesize. |
| `set_user_preferences` | Stores session-level preferences that influence response synthesis. |
Multi-Turn Context: Conversation context is loaded from Redis before query analysis and injected into the LLM classifier, so follow-up questions like "how does it handle errors?" correctly resolve pronouns from prior turns.
Fallback Strategy: If Graph Query fails → Code Analyst reads source files directly (filesystem-based AST fallback). If Code Analyst fails → return raw graph data. If all fail → return honest partial answer.
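The degradation chain can be sketched as nested exception handling. The helper and exception names below are hypothetical; in the real system each call is an MCP tool invocation.

```python
# Sketch of the fallback strategy; stubs simulate a Neo4j outage.
class GraphError(Exception): ...
class AnalystError(Exception): ...

def query_graph(entity):
    # Stub for the Graph Query agent; here it always fails.
    raise GraphError("Neo4j unavailable")

def analyze_code(entity, graph_data):
    # Stub for the Code Analyst agent enriching graph data with an LLM.
    return {"explanation": f"{entity} (from graph + LLM)"}

def analyze_from_filesystem(entity):
    # Stub for the filesystem-based AST fallback (no graph needed).
    return {"explanation": f"{entity} (AST fallback)", "partial": True}

def answer_with_fallbacks(entity: str) -> dict:
    graph_data = None
    try:
        graph_data = query_graph(entity)          # primary path
        return analyze_code(entity, graph_data)
    except AnalystError:
        # Code Analyst failed: return the raw graph data we already have.
        return {"partial": True, "graph": graph_data}
    except GraphError:
        try:
            return analyze_from_filesystem(entity)  # AST fallback
        except AnalystError:
            # Everything failed: honest partial answer.
            return {"partial": True, "note": f"no data for {entity}"}

print(answer_with_fallbacks("FastAPI"))
```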
Parses the configured Python repository and populates the Neo4j knowledge graph.
| Tool | Description |
|---|---|
| `index_repository` | Full repo indexing: clone, parse all .py files, populate Neo4j. Returns a job ID immediately; indexing runs as a background task. Poll `get_index_status` for progress. |
| `index_file` | Re-index a single file (clears old data first). |
| `parse_python_ast` | Returns a simplified AST representation of a Python file. |
| `extract_entities` | Extracts code entities and relationships from a file. |
| `get_index_status` | Reports indexing progress for a job ID. |
Provides read-only access to the knowledge graph.
| Tool | Description |
|---|---|
| `find_entity` | Full-text search for classes, functions, methods, or modules. |
| `get_dependencies` | Find what an entity imports, inherits from, or calls. |
| `get_dependents` | Find what imports, inherits from, or calls an entity. |
| `trace_imports` | Follow the import chain from a module (depth 1-5). |
| `find_related` | Get entities connected by a specific relationship type. |
| `execute_query` | Run custom read-only Cypher (write operations are blocked). |
Provides deep code understanding using source code analysis and GPT-4o.
| Tool | Description |
|---|---|
| `analyze_function` | Deep analysis of a function: complexity, calls, LLM explanation. |
| `analyze_class` | Comprehensive class analysis: methods, inheritance, patterns. |
| `find_patterns` | Detect design patterns (Singleton, Factory, Strategy, etc.) recursively across all subpackages. |
| `get_code_snippet` | Extract code with surrounding context lines. |
| `explain_implementation` | Generate a detailed explanation of how code works. |
| `compare_implementations` | Compare two code entities side-by-side. |
Partial Neo4j Independence: Entity resolution falls back to filesystem-based AST scanning of the cloned repository when Neo4j is unavailable. Graph-dependent helpers (calls, methods, bases) return empty results gracefully, so tool responses remain valid but may be incomplete without Neo4j.
Provides system health checks, query analytics, and agent performance metrics.
| Tool | Description |
|---|---|
| `get_system_health` | Check health of Neo4j and Redis infrastructure. |
| `get_query_analytics` | Query counts by intent, average agents per query, cache hit rate. |
| `get_agent_performance` | Per-agent call count, success rate, and average latency. |
| `get_index_metrics` | Knowledge graph node and relationship counts by type. |
| `get_active_sessions` | List active conversation sessions with message counts. |
| Label | Key Properties |
|---|---|
| `File` | `path`, `name`, `extension` |
| `Module` | `name`, `qualified_name`, `docstring` |
| `Class` | `name`, `qualified_name`, `line_start`, `line_end`, `docstring`, `is_abstract` |
| `Function` | `name`, `qualified_name`, `line_start`, `line_end`, `is_async`, `return_type` |
| `Method` | `name`, `qualified_name`, `is_async`, `is_static`, `is_classmethod`, `is_property` |
| `Parameter` | `name`, `type_annotation`, `default_value`, `is_required` |
| `Decorator` | `name`, `qualified_name`, `arguments` |
| `Import` | `module_name`, `alias`, `is_from_import` |
| `Docstring` | `content`, `style` (google/numpy/plain) |
CONTAINS, HAS_METHOD, IMPORTS, INHERITS_FROM, CALLS, DECORATED_BY, HAS_PARAMETER, DOCUMENTED_BY, DEPENDS_ON, DEFINED_IN, RETURNS_TYPE
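As an illustration, a read-only query over this schema might look like the following. Treat it as a sketch: the relationship directions are assumptions and should be verified against docker/neo4j/init.cypher.

```cypher
// Classes inheriting from a given base, with their methods.
// "APIRouter" is just an example value; directions are assumed.
MATCH (c:Class)-[:INHERITS_FROM]->(:Class {name: "APIRouter"})
MATCH (c)-[:HAS_METHOD]->(m:Method)
RETURN c.qualified_name, collect(m.name) AS methods
LIMIT 10
```

Queries like this can be run through the Graph Query agent's `execute_query` tool or directly in the Neo4j Browser.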
The system indexes the FastAPI framework repository (master branch). After a full indexing run, the knowledge graph typically contains ~6,000 nodes and ~7,000 relationships across 528 Python files.
| URL | Description |
|---|---|
| http://localhost:8000/docs | Swagger UI — interactive API documentation for all gateway endpoints |
| http://localhost:8000/redoc | ReDoc — alternative API documentation |
| http://localhost:7474 | Neo4j Browser — explore the knowledge graph (user: neo4j, password from .env) |
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/chat` | Send a message, receive AI response. Supports `session_id` for multi-turn. |
| POST | `/api/index` | Trigger repository indexing (full or incremental). |
| GET | `/api/index/status/{job_id}` | Check indexing job progress. |
| GET | `/api/agents/health` | Health check for all agents and infrastructure. |
| GET | `/api/graph/statistics` | Knowledge graph node/relationship counts. |
| Endpoint | Description |
|---|---|
| `WS /ws/chat` | Real-time chat with streaming status updates and chunked responses. |
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the FastAPI class?", "session_id": ""}'

Example response:

{
  "session_id": "a1b2c3d4e5f6",
  "response": "The `FastAPI` class is the main entry point...",
  "sources": [{"agent": "graph_query", "tool": "find_entity"}],
  "agents_used": ["graph_query", "code_analyst"],
  "duration_ms": 2340
}

- Docker and Docker Compose
- An OpenAI API key
# 1. Clone the repository
git clone https://github.com/Anuj-cs20/python-repository-chat-agent.git
cd python-repository-chat-agent
# 2. Configure environment
cp .env.example .env
# Edit .env and set your OPENAI_API_KEY
# 3. Start all services
docker-compose up --build
# 4. Trigger repository indexing (indexes the repo configured in .env by default)
curl -X POST http://localhost:8000/api/index \
-H "Content-Type: application/json" \
-d '{"mode": "full"}'
# 5. Start chatting
curl -X POST http://localhost:8000/api/chat \
-H "Content-Type: application/json" \
  -d '{"message": "How does FastAPI handle request validation?"}'

Note: Indexing is not automatic on `docker compose up`. You must trigger it via the `/api/index` endpoint. The default repo is configured in `.env` (`FASTAPI_REPO_URL`). To index a different Python repo, pass it in the request:

curl -X POST http://localhost:8000/api/index \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "https://github.com/django/django.git", "branch": "main", "mode": "full"}'

The Swagger UI at http://localhost:8000/docs has pre-filled examples for all endpoints.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync --extra dev
# Run tests
uv run python -m pytest tests/ -v
# Start individual agents (requires Neo4j and Redis running locally)
uv run python -m src.agents.indexer.server
uv run python -m src.agents.graph_query.server
uv run python -m src.agents.code_analyst.server
uv run python -m src.agents.monitoring.server
uv run python -m src.agents.orchestrator.server
uv run python -m src.gateway.main

All configuration is managed through environment variables with Pydantic Settings. See .env.example for the complete list.
| Category | Variables | Description |
|---|---|---|
| General | `ENVIRONMENT`, `LOG_LEVEL` | Environment mode and logging level |
| OpenAI | `OPENAI_API_KEY`, `OPENAI_MODEL` | LLM provider configuration |
| Neo4j | `NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD` | Knowledge graph database |
| Redis | `REDIS_URL`, `REDIS_SESSION_TTL` | Session storage and caching |
| Agent URLs | `ORCHESTRATOR_URL`, `INDEXER_URL`, etc. | Inter-agent communication endpoints |
| Timeouts | `AGENT_TIMEOUT_SECONDS`, `AGENT_MAX_RETRIES` | Resilience configuration |
Each agent has its own Pydantic Settings class that inherits from SharedSettings, enabling independent configuration while sharing common infrastructure settings.
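The inheritance pattern can be illustrated with plain dataclasses. The real project uses pydantic-settings; the class and field names below mirror the shape only and are not the actual code.

```python
# Illustration of per-agent settings inheriting shared infrastructure
# settings. Stdlib-only analogue of the Pydantic Settings pattern.
import os
from dataclasses import dataclass, field

@dataclass
class SharedSettings:
    # Common infrastructure, read from the environment with fallbacks.
    neo4j_uri: str = field(
        default_factory=lambda: os.getenv("NEO4J_URI", "bolt://localhost:7687"))
    redis_url: str = field(
        default_factory=lambda: os.getenv("REDIS_URL", "redis://localhost:6379"))

@dataclass
class IndexerSettings(SharedSettings):
    # Agent-specific knobs layered on top of the shared ones.
    repo_url: str = field(
        default_factory=lambda: os.getenv("FASTAPI_REPO_URL", ""))

settings = IndexerSettings()
print(settings.neo4j_uri, settings.redis_url)
```

Each agent gets its own subclass, so an agent can be reconfigured independently while Neo4j and Redis settings stay defined in one place.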
We chose Streamable HTTP over stdio because each agent runs in its own Docker container. HTTP enables standard container networking, health checks, and independent scaling. The trade-off is slightly higher latency per call compared to stdio, but this is negligible for our use case.
Neo4j's native graph storage and Cypher query language are ideal for code relationship traversal (imports, inheritance, calls). The trade-off is the additional infrastructure dependency, but it enables queries that would be complex with a relational database.
The assignment requires Docstring as a node type. We store docstrings both as a property on entities (for quick access) and as separate Docstring nodes linked via DOCUMENTED_BY (for graph queries about documentation).
The Orchestrator uses GPT-4o for query classification but includes a rule-based fallback that handles queries when the LLM is unavailable. This ensures the system degrades gracefully.
Identical queries are cached in Redis for 5 minutes. This reduces LLM API costs and improves response time for repeated questions.
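The caching idea reduces to hashing a normalized query into a Redis key with a TTL. The key format and helper below are illustrative, not the project's actual scheme:

```python
# Sketch of the query cache key: identical queries (modulo case and
# whitespace) hash to the same Redis key with a 5-minute TTL.
import hashlib

CACHE_TTL_SECONDS = 300  # 5 minutes, as described above

def cache_key(query: str) -> str:
    normalized = " ".join(query.lower().split())  # trim + collapse whitespace
    digest = hashlib.sha256(normalized.encode()).hexdigest()[:16]
    return f"query_cache:{digest}"

# Two superficially different spellings of one question share a key.
assert cache_key("What is FastAPI?") == cache_key("  what is   fastapi?  ")
print(cache_key("What is FastAPI?"))
```

On a hit, the cached synthesized response is returned directly, skipping both agent calls and the GPT-4o synthesis step.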
Every request receives a unique correlation ID at the Gateway. The ID is forwarded via X-Correlation-ID headers to each MCP agent, where ASGI middleware (CorrelationMiddleware in run_mcp_server()) extracts it for structured logging. This enables end-to-end request tracing across all five agents.
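A minimal version of that middleware fits in a few lines of raw ASGI. This is a simplified stand-in for the project's CorrelationMiddleware, not its actual implementation:

```python
# Sketch of correlation-ID propagation as ASGI middleware: reuse the
# incoming X-Correlation-ID header, or mint a fresh ID if absent.
import asyncio
import uuid

class CorrelationIdMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        headers = dict(scope.get("headers", []))
        cid = headers.get(b"x-correlation-id", b"").decode() or uuid.uuid4().hex
        scope["correlation_id"] = cid  # downstream handlers log this ID
        await self.app(scope, receive, send)

async def demo():
    seen = {}
    async def app(scope, receive, send):
        seen["cid"] = scope["correlation_id"]
    mw = CorrelationIdMiddleware(app)
    await mw({"type": "http",
              "headers": [(b"x-correlation-id", b"abc123")]}, None, None)
    return seen["cid"]

print(asyncio.run(demo()))  # abc123
```

Because the Gateway forwards the same header on every MCP call, one user request produces log lines with a single correlation ID across all five agents.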
The Graph Query agent validates all generated Cypher queries against an allowlist of read-only clauses (MATCH, WHERE, RETURN, WITH, OPTIONAL MATCH, ORDER BY, LIMIT, SKIP, UNWIND, UNION, COUNT, COLLECT, EXISTS, DISTINCT). Mutation keywords (CREATE, DELETE, SET, REMOVE, MERGE, DROP, CALL) are explicitly blocked to prevent data modification.
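The blocking side of that check can be sketched as a word-boundary scan for mutation keywords. The regex approach below is illustrative; the agent's actual validator may differ:

```python
# Sketch of read-only Cypher validation: reject any query containing a
# mutation keyword as a whole word.
import re

BLOCKED = ("CREATE", "DELETE", "SET", "REMOVE", "MERGE", "DROP", "CALL")

def is_read_only(cypher: str) -> bool:
    # Whole-word match only, so a property name like "offset" is not
    # rejected just because it contains the letters "set".
    for kw in BLOCKED:
        if re.search(rf"\b{kw}\b", cypher, re.IGNORECASE):
            return False
    return True

assert is_read_only("MATCH (c:Class) RETURN c.name LIMIT 5")
assert not is_read_only("MATCH (c:Class) DETACH DELETE c")
```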
├── pyproject.toml # Dependencies and project config
├── docker-compose.yml # Full stack orchestration
├── Dockerfile # Legacy multi-stage build
├── .env.example # Environment variable documentation
│
├── src/
│ ├── shared/ # Shared infrastructure
│ │ ├── config.py # Pydantic Settings
│ │ ├── models.py # Shared data models
│ │ ├── exceptions.py # Custom exception hierarchy
│ │ ├── logging.py # Structured JSON logging + run_mcp_server()
│ │ ├── neo4j_client.py # Async Neo4j driver
│ │ ├── redis_client.py # Async Redis client
│ │ └── mcp_utils.py # MCP client helper (async with httpx)
│ │
│ ├── agents/
│ │ ├── orchestrator/server.py # MCP server (6 tools)
│ │ ├── indexer/server.py # MCP server (5 tools)
│ │ ├── graph_query/server.py # MCP server (6 tools)
│ │ ├── code_analyst/server.py # MCP server (6 tools)
│ │ └── monitoring/server.py # MCP server (5 tools)
│ │
│ └── gateway/
│ ├── main.py # FastAPI application
│ └── routes/ # API endpoints
│
├── tests/
│ ├── unit/ # 238 tests (83%+ coverage)
│ └── integration/ # Gateway & MCP transport tests
│
└── docker/
├── neo4j/init.cypher # Graph schema initialization
└── services/ # Per-service Dockerfiles
├── Dockerfile.orchestrator
├── Dockerfile.indexer
├── Dockerfile.graph-query
├── Dockerfile.code-analyst
├── Dockerfile.monitoring
└── Dockerfile.gateway
# Run all unit tests
uv run python -m pytest tests/unit/ -v
# Run with coverage
uv run python -m pytest tests/ --cov=src --cov-report=html
# Run specific test suite
uv run python -m pytest tests/unit/test_ast_parser.py -v

Current status: 238 tests covering AST parsing, entity extraction, query building, query classification, routing, pattern detection, snippet extraction, Pydantic models, monitoring, MCP transport, gateway integration, SSE streaming, and WebSocket chat. Coverage threshold is 70% (currently ~83%).
- CALLS relationship accuracy: Function call detection via `ast.Call` nodes identifies direct calls but may miss dynamic dispatch or calls through variables. A post-parse resolution pass maps short names (`self.method`, `BaseModel`) to qualified names when unambiguous.
- Cross-module resolution: Import and dependency relationships undergo best-effort name resolution against the codebase entity index. External dependencies (stdlib, third-party) remain unresolved because their entities are not in the graph.
- LLM dependency: Code analysis and response synthesis require a valid OpenAI API key. The system falls back to raw data when the LLM is unavailable, but those responses are less useful.
- Single-node deployment: The current Docker Compose setup runs all agents on a single host. Horizontal scaling would require a service mesh or load balancer in front of each agent.