Demonstration repository showcasing GEPA with SuperOptiX - achieving substantial accuracy improvements through reflective prompt evolution.
GEPA is a breakthrough optimization technique that uses reflective prompt evolution to dramatically improve AI agent performance. Unlike traditional optimizers that rely on trial-and-error, GEPA acts like an expert tutor - analyzing what went wrong and writing better instructions.
MINIMUM SYSTEM REQUIREMENTS:
- RAM: 8GB minimum, 16GB+ recommended, 32GB+ for production
- CPU: 4+ cores recommended
- Storage: 20GB+ free space for models
- Time: 2-10 minutes depending on configuration
CLOUD COST WARNING:
- Local execution strongly recommended for cost control
- Cloud instances require 8GB+ RAM minimum
- Estimated cost: $0.50-$5.00 per optimization run depending on instance size
- Extended optimization sessions can become expensive quickly
RESPONSIBLE AI USAGE:
- GEPA is computationally intensive - monitor resource usage
- Consider environmental impact of extended optimization runs
- Start with lightweight demos before full optimization
- Use appropriate hardware tier for your use case
Perfect for testing on laptops or systems with 8GB+ RAM:
# 1. Clone and setup
git clone https://github.com/SuperagenticAI/gepa_eval.git
cd gepa_eval
./scripts/setup.sh
# 2. Run lightweight demo (2-3 minutes, 8GB+ RAM)
./scripts/run_light_demo.sh
# 3. Quick results with minimal resource usage!

For systems with 16GB+ RAM and better hardware:
# 1. Clone and setup
git clone https://github.com/SuperagenticAI/gepa_eval.git
cd gepa_eval
./scripts/setup.sh
# 2. Run complete demo (5-10 minutes, 16GB+ RAM)
./scripts/run_demo.sh
# 3. See the full GEPA transformation!

Before GEPA Optimization:
Input: "Solve x² - 5x + 6 = 0"
Output: "Using the quadratic formula: x = 2 or x = 3"
After GEPA Optimization:
Input: "Solve x² - 5x + 6 = 0"
Output: "**Method 1: Factoring**
Step 1: Look for two numbers that multiply to 6 and add to -5
These numbers are -2 and -3
Factor: (x - 2)(x - 3) = 0
Solutions: x = 2 or x = 3
**Method 2: Quadratic Formula**
For ax² + bx + c = 0, x = (-b ± √(b²-4ac)) / 2a
Here: a=1, b=-5, c=6
x = (5 ± √(25-24)) / 2 = (5 ± 1) / 2
x = 3 or x = 2
**Verification:**
x = 2: (2)² - 5(2) + 6 = 4 - 10 + 6 = 0 ✓
x = 3: (3)² - 5(3) + 6 = 9 - 15 + 6 = 0 ✓"
- Domain: Mathematical problem solving
- Scenarios: Quadratic equations, geometry, calculus optimization
- Key Feature: Multi-method solutions with verification
cd gepa_eval
super agent compile advanced_math_gepa
super agent optimize advanced_math_gepa
super agent run advanced_math_gepa --goal "Find the maximum area of a rectangle with perimeter 20"

| Aspect | Other Optimizers | GEPA |
|---|---|---|
| Sample Efficiency | Needs 100+ examples | Works with 3-10 examples |
| Domain Adaptation | Generic optimization | Domain-specific feedback |
| Interpretability | Black box improvements | Human-readable prompt evolution |
| Quality Focus | Quantity-driven | Quality-driven with reflection |
✅ Perfect for:
- Specialized domains (math, medicine, law, security)
- Limited training data
- Quality over speed requirements
- Interpretable improvements needed
❌ Consider alternatives for:
- Simple, general-purpose tasks
- Large datasets (>100 examples)
- Tight resource constraints
- Speed-critical applications
| Model | Optimization | Time | Use case |
|---|---|---|---|
| llama3.2:1b (lightweight) | auto: minimal | 2-3 minutes | Testing, learning, low-end machines |
| llama3.1:8b + qwen3:8b | auto: light | 5-8 minutes | Development, good balance of speed/quality |
| llama3.1:8b + qwen3:8b | auto: heavy | 15-30 minutes | Production deployments, best quality |
- Python 3.11+
- 8GB+ RAM minimum (16GB+ recommended)
- SuperOptiX framework
Option 1: Conda (Recommended)
conda env create -f environment.yml
conda activate gepa-eval
pip install -e .

Option 2: UV (Fastest)
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt
uv pip install -e .

# Install required models
ollama pull llama3.1:8b # Main processing
ollama pull qwen3:8b # GEPA reflection
ollama pull llama3.2:1b  # Lightweight testing

# Step-by-step agent optimization
super agent evaluate advanced_math_gepa # Baseline
super agent optimize advanced_math_gepa # GEPA optimization
super agent evaluate advanced_math_gepa # Measure improvement
# Test with custom problems
super agent run advanced_math_gepa --goal "Solve the system: x + 2y = 7, 3x - y = 4"
- Execute & Analyze: Run agent on training examples
- Reflect: Reflection LM analyzes failures and successes
- Evolve: Generate improved prompt candidates
- Select: Choose best performers using Pareto optimization
- Iterate: Repeat to build a tree of improvements
# GEPA Configuration Example
optimization:
  optimizer:
    name: GEPA
    params:
      metric: advanced_math_feedback   # Domain-specific feedback
      auto: light                      # Budget control
      reflection_lm: qwen3:8b          # Reflection model
      reflection_minibatch_size: 3     # Efficiency tuning

GEPA includes specialized metrics for different domains; refer to the docs for more details:
- advanced_math_feedback - Mathematical problem solving
- multi_component_enterprise_feedback - Business document analysis
- vulnerability_detection_feedback - Security analysis
- privacy_preservation_feedback - Data privacy protection
- medical_accuracy_feedback - Healthcare applications
- legal_analysis_feedback - Legal document processing
# Create new GEPA-enabled agent
super agent design my_custom_agent
# Add GEPA optimization
# Edit playbook to include GEPA configuration
super agent compile my_custom_agent
super agent optimize my_custom_agent

Example of a custom domain-specific feedback metric:

def custom_domain_feedback(example, pred, trace=None, *args, **kwargs):
    """Implement domain-specific feedback for GEPA."""
    from dspy.primitives import Prediction

    # Analyze prediction quality and generate textual feedback
    # (analyze_prediction and generate_domain_feedback are your own domain helpers).
    score = analyze_prediction(example, pred)
    feedback = generate_domain_feedback(example, pred)
    return Prediction(score=score, feedback=feedback)
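Before pointing the playbook's metric at a custom function, it can help to smoke-test it in isolation. The sketch below is a hypothetical, self-contained example: simple_math_feedback, the question/answer fields, and the substring check are illustrative assumptions, not part of the SuperOptiX API.

import dspy

# Hypothetical smoke test for a feedback metric. The example fields and the scoring
# rule are illustrative assumptions, not a SuperOptiX requirement.
def simple_math_feedback(example, pred, trace=None, *args, **kwargs):
    correct = str(example.answer).strip() in str(pred.answer)
    score = 1.0 if correct else 0.0
    feedback = ("Correct final answer." if correct
                else f"Expected {example.answer}; show each solution step and verify the result.")
    return dspy.Prediction(score=score, feedback=feedback)

example = dspy.Example(question="Solve x^2 - 5x + 6 = 0", answer="x = 2 or x = 3")
pred = dspy.Prediction(answer="x = 2 or x = 3")
print(simple_math_feedback(example, pred))  # score=1.0, feedback='Correct final answer.'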
For systems with limited resources:

# Optimized for 16GB+ systems
language_model:
  model: llama3.1:8b          # ~8GB
optimization:
  optimizer:
    reflection_lm: qwen3:8b   # ~8GB
    auto: light               # Conservative budget

# Budget options for different needs
auto: light   # 3-5 minutes, good results
auto: medium  # 8-12 minutes, better results
auto: heavy   # 15-30 minutes, best results

GEPA Timeout (Normal Behavior)
Error: Command timed out after 2m 0.0s
Solution: GEPA typically needs 3-5 minutes. This is expected behavior.
super agent optimize agent_name --timeout 300  # 5 minutes

Memory Issues
# Reduce memory usage
# Edit playbook: reflection_minibatch_size: 2
# Edit playbook: auto: light

Model Availability
# Ensure models are available
ollama list
ollama pull llama3.1:8b
ollama pull qwen3:8b

- GEPA Paper - Original research
- DSPy GEPA Tutorial - Technical guide
- SuperOptiX Docs - Framework documentation
- GEPA Optimization Guide - Comprehensive guide
We welcome contributions! See our contribution guide for details.
Ways to contribute:
- Add new domain-specific agents
- Implement custom feedback metrics
- Improve benchmark coverage
- Enhance documentation
- Report issues and bugs
This project is licensed under the MIT License - see the LICENSE file for details.
- GEPA Research Team - Original algorithm development
- DSPy Framework - Core optimization infrastructure
Ready to see GEPA in action? Run ./scripts/run_demo.sh and watch the AI agent optimization revolution unfold! 🚀