🔧 DevAgent - AI-Powered Developer Tool Testing Pipeline

Automatically crawl and analyze any developer tool's documentation, test the tool against it, and generate intelligent reports.

DevAgent is a comprehensive testing system that uses AI to evaluate developer tools by analyzing their documentation, generating test cases, executing them, and providing detailed insights for improvement.

🌟 What Does DevAgent Do?

DevAgent automates the entire process of evaluating developer tools and APIs:

🕷️ Intelligent Web Crawling

Uses Crawl4AI with deep crawling strategies to discover documentation pages
Supports multiple crawling modes: simple, deep (BFS/DFS), and adaptive crawling
Smart URL normalization to avoid duplicates (for example, /api and /api/ are treated as the same page)
Filters and focuses on relevant documentation content
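
As a rough illustration of the URL normalization idea above, here is a minimal sketch using only the Python standard library. The function names `normalize_url` and `dedupe` are hypothetical and are not taken from DevAgent or Crawl4AI; the actual normalization rules in the project may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return a canonical form so trivially different spellings compare equal.

    Hypothetical helper: the rules here are assumptions, not DevAgent's own.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Strip trailing slashes, but keep a single "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment: it does not change which page is fetched.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def dedupe(urls):
    """Keep the first occurrence of each normalized URL, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

With this sketch, `https://example.com/api` and `https://example.com/api/` collapse to a single entry, while URLs that differ in their query string are kept apart.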

🧠 AI-Powered Content Analysis

DSPy-powered document analysis that extracts:
- API operations and capabilities
- Authentication methods and requirements
- Usage patterns and workflows
- Error scenarios and edge cases
- Code examples and integration guides

🎯 Automated Test Generation

Generates comprehensive test cases across multiple categories:
- Authentication testing - API key validation, OAuth flows
- Basic usage - Core functionality verification
- Core workflows - Multi-step process testing
- Error handling - Edge cases and failure scenarios

Tests are prioritized by complexity and importance.

⚡ Parallel Test Execution

Runs tests in parallel with configurable worker pools
Thread-safe execution with isolated contexts
Real-time progress tracking and error reporting
Graceful fallback to sequential execution when needed
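
The parallel execution model described above could look roughly like the following sketch, built on `concurrent.futures` from the standard library. The function name `run_tests`, the result shape, and the conditions that trigger the sequential fallback are all assumptions made for illustration; DevAgent's real executor may work differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_tests(tests, max_workers=4):
    """Run (name, callable) pairs and return {name: (passed, detail)}.

    Hypothetical sketch: a failing test is recorded, never re-raised.
    """

    def run_one(name, fn):
        try:
            return name, (True, fn())
        except Exception as exc:  # keep one bad test from stopping the run
            return name, (False, repr(exc))

    results = {}
    if max_workers > 1:
        try:
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                futures = [pool.submit(run_one, name, fn) for name, fn in tests]
                for future in as_completed(futures):
                    name, outcome = future.result()
                    results[name] = outcome
        except RuntimeError:
            # Assumed fallback trigger: the pool could not start its threads.
            pass
    # Sequential path: used for max_workers <= 1 or for anything left over.
    for name, fn in tests:
        if name not in results:
            results[name] = run_one(name, fn)[1]
    return results
```

Each test reports its own outcome, so the progress display can update as futures complete instead of waiting for the whole batch.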

📊 Intelligent Reporting

AI-generated insights analyzing test failures against documentation
Page-level reports with specific recommendations
Overall quality scores and improvement suggestions
Gap analysis identifying missing examples and unclear documentation
Web-based dashboard with modern, interactive UI

🚀 Key Features

🌐 Modern Web Interface - FastAPI-powered dashboard with real-time updates
🔧 Flexible Configuration - Customize crawling depth, test parameters, and API keys
📈 Progress Tracking - Monitor pipeline execution across all stages
💾 Persistent Results - Save and review past testing runs
🎨 Beautiful UI - Modern, responsive design with dark theme
🔄 Real-time Updates - Auto-refreshing status and progress indicators
📋 Comprehensive Logging - Detailed execution traces and error reporting

🛠️ Installation & Setup

Prerequisites

Python 3.11+
uv (recommended) or pip for package management

1. Clone the Repository

git clone <repository-url>
cd devagent

2. Install Dependencies

Using uv (recommended):

uv sync

Or using pip:

pip install -e .

3. Install Playwright Browsers

⚠️ IMPORTANT: After installing the Python packages, you must install the browser binaries:

uv run playwright install

Or if using pip:

playwright install

This downloads the required browser binaries (Chromium, Firefox, WebKit) that Crawl4AI needs for web scraping.

4. Configure AI Models

Set up your preferred AI model by setting environment variables:

# For Gemini (recommended)
export GEMINI_API_KEY="your-gemini-api-key"

# OR create a .env file
echo "GEMINI_API_KEY=your-key-here" > .env

🎮 Usage

Web Interface (Recommended)

Start the web server:

uv run devagent-web

Then open your browser to: http://localhost:8005

The web interface allows you to:

Configure tool testing parameters
Set API keys and context variables
Monitor real-time progress
View comprehensive reports
Access historical test runs

Command Line Interface

For programmatic usage:

uv run devagent-cli

Or run directly:

python agents/test.py

🎯 Example Usage

Testing an API Documentation Site

1. Open the web interface at http://localhost:8005
2. Enter tool details:
   - Tool Name: OpenWeatherMap API
   - Base URL: https://openweathermap.org/api
3. Add API keys (KEY:VALUE format):
   - OPENWEATHER_API_KEY: your-api-key-here
4. Configure options:
   - Max Pages: 20
   - Max Depth: 3
   - Keywords: api, documentation, guide
5. Click "Start Testing Pipeline"
6. Monitor progress in real-time
7. Review results including:
   - Overall quality score
   - AI-generated insights
   - Page-level analysis
   - Specific improvement recommendations
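
The KEY:VALUE format used for API keys in the walkthrough above can be parsed along these lines. This is only a sketch: `parse_key_values` is a hypothetical helper, and splitting on the first colon is an assumption about how values that themselves contain colons (such as URLs) should be handled.

```python
def parse_key_values(lines):
    """Parse 'KEY:VALUE' strings into a dict, splitting on the first colon only.

    Hypothetical helper, not part of DevAgent's documented API.
    """
    parsed = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank entries
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected KEY:VALUE, got {line!r}")
        parsed[key.strip()] = value.strip()
    return parsed
```

For example, `parse_key_values(["OPENWEATHER_API_KEY: your-api-key-here"])` yields a dictionary with a single `OPENWEATHER_API_KEY` entry.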

📊 Pipeline Stages

The testing pipeline consists of 5 main stages:

1. 🕷️ Fetching - Crawl and discover documentation pages
2. 🔍 Analysis - AI-powered content extraction and categorization
3. 📝 Test Planning - Generate comprehensive test scenarios
4. ⚡ Execution - Run tests in parallel with isolated contexts
5. 📊 Reporting - Generate insights and recommendations

Each stage provides detailed progress updates and error handling.
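
To make the stage ordering concrete, the sketch below models the five stages as an enum and runs one handler per stage in order, recording status and timing. The names `Stage` and `run_pipeline`, the handler signature, and the stop-on-first-failure behavior are assumptions for illustration, not DevAgent's actual internals.

```python
import time
from enum import Enum


class Stage(Enum):
    # Stage names follow the list above; the enum itself is hypothetical.
    FETCHING = "fetching"
    ANALYSIS = "analysis"
    TEST_PLANNING = "test_planning"
    EXECUTION = "execution"
    REPORTING = "reporting"


def run_pipeline(handlers, initial):
    """Run one handler per stage in order, feeding each output to the next.

    Returns the last successful output and a log of (stage, status, seconds).
    Assumption: the pipeline stops at the first failing stage.
    """
    data = initial
    log = []
    for stage in Stage:  # Enum iteration follows definition order
        started = time.perf_counter()
        try:
            data = handlers[stage](data)
            status = "ok"
        except Exception as exc:
            status = f"failed: {exc!r}"
        log.append((stage.value, status, time.perf_counter() - started))
        if status != "ok":
            break
    return data, log
```

The log gives a UI or CLI enough information to show stage-by-stage progress and to report which stage failed.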

🔧 Configuration Options

Crawling Configuration

Max Pages: Maximum number of pages to crawl (1-100)
Max Depth: How deep to crawl from the base URL (1-5)
Keywords: Focus keywords for relevance scoring
URLs to Exclude: Skip specific URLs or patterns

Execution Configuration

Max Workers: Number of parallel workers (1-16)
API Keys: Set testing credentials and context variables
Timeouts: Configure request and execution timeouts

AI Configuration

Model Selection: Choose between OpenAI GPT and Claude models
Analysis Depth: Configure how thorough the AI analysis should be
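
The documented ranges for the crawling and execution options can be enforced with a small validated configuration object, as in this sketch. `PipelineConfig` and its field names are hypothetical. The defaults for `max_pages` and `max_depth` mirror the example walkthrough, while the `max_workers` default of 4 is purely an assumption.

```python
from dataclasses import dataclass, field


def _check_range(name, value, low, high):
    """Raise ValueError if value falls outside the inclusive range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass
class PipelineConfig:
    # Hypothetical container; ranges come from the option lists above.
    max_pages: int = 20      # documented range: 1-100
    max_depth: int = 3       # documented range: 1-5
    max_workers: int = 4     # documented range: 1-16 (default is assumed)
    keywords: list = field(default_factory=list)
    exclude_urls: list = field(default_factory=list)

    def __post_init__(self):
        _check_range("max_pages", self.max_pages, 1, 100)
        _check_range("max_depth", self.max_depth, 1, 5)
        _check_range("max_workers", self.max_workers, 1, 16)
```

Validating at construction time means an out-of-range value is rejected before any crawling starts.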

🎨 Web Interface Features

📋 Configuration Form

Modern, responsive design with dark theme
Dynamic API key management - add/remove key-value pairs
Advanced options with collapsible sections
Form validation and user-friendly error messages

📊 Results Dashboard

Real-time progress tracking with auto-refresh
Interactive report viewing with expandable sections
Search and filtering for large result sets
Export capabilities for reports and raw data

🔄 Pipeline Monitoring

Live status updates during execution
Detailed error reporting with stack traces
Stage-by-stage progress with timing information
Background execution without blocking the UI

Development Setup

Clone and install as described above
Install development dependencies:
```
uv sync --dev
```
Run tests:
```
uv run pytest
```
