Skip to content

Arvind-puthucode/MiniAxiom

Repository files navigation

Motivation

I'm interested in understanding how formal proof engines like Coq work, and how large language models (LLMs) can be used to translate natural language queries into formal representations that such systems can process.

This project is inspired by the work of Symbolica and George Morgan, with the long-term goal of grasping and contributing to the kind of reasoning systems they’re building.

MathGraph: Hybrid Mathematical Reasoning System

A novel hybrid approach to mathematical problem solving that combines Large Language Models (LLMs) with formal reasoning engines to achieve both natural language understanding and mathematical rigor.

🎯 Core Innovation

Hybrid Architecture: LLMs handle the ambiguous natural language ↔ formal representation mapping, while deterministic algorithms ensure mathematical rigor and correctness in the reasoning process.

🏗️ System Architecture

Natural Language Input
    ↓ [LLM Agent: Problem Extraction]
Formal Problem Representation 
    ↓ [Validation Engine]
Verified Facts, Rules, Goals
    ↓ [Forward Chaining Proof Engine]
Proof Path or Failure
    ↓ [LLM Agent: Explanation Generation]
Human-Readable Proof

🚀 Key Features

✅ Completed Implementation

  • 🧮 Natural Language Processing: Azure OpenAI GPT-4o for problem extraction and explanation
  • 🔬 Formal Mathematical Reasoning: Deterministic proof engine with pattern matching
  • 📊 Expression Parsing: SymPy-integrated parser for mathematical expressions
  • 🔗 Rule System: 25+ mathematical rules across algebra, arithmetic, and number theory
  • ⚡ Forward Chaining: Systematic proof search with termination guarantees
  • 🌐 Web Interface: Streamlit-based interactive demo

🎯 Problem Types Supported

  1. Linear Algebraic Equations

    • If x + 5 = 12, find x
    • If 3y = 21, what is y?
  2. Number Theory Proofs

    • If n is even, prove that 2n is even
    • If a is odd and b is odd, prove that a + b is even
  3. Inequality Reasoning

    • If a > b and b > c, prove a > c
    • If x > 5 and 5 > 2, prove x > 2

🛠️ Technology Stack

  • Backend: Python 3.11+
  • LLM Integration: Azure OpenAI GPT-4o API
  • Mathematical Processing: SymPy for expression handling
  • Web Framework: Streamlit for rapid prototyping
  • Data Structures: NetworkX for proof graph representation
  • Testing: pytest with 65+ comprehensive tests
  • Package Management: uv for fast dependency resolution

📦 Installation & Setup

Prerequisites

  • Python 3.11+
  • Azure OpenAI API access
  • uv package manager

Installation

# Clone the repository
git clone <repository-url>
cd MiniAxiom

# Install dependencies
uv sync

# Set up environment variables
cp .env.template .env
# Edit .env with your Azure OpenAI credentials

Environment Configuration

Create a .env file with your Azure OpenAI credentials:

AZURE_OPENAI_API_KEY=your_api_key_here
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-02-01
AZURE_OPENAI_DEPLOYMENT_NAME=your-deployment-name

🎮 Usage

Web Interface

uv run streamlit run streamlit_app.py

Open http://localhost:8501 in your browser for the interactive interface.

Python API

from src.mathgraph import MathGraphAPI

# Initialize the system
api = MathGraphAPI()

# Solve a problem
result = api.solve("If x + 7 = 15, find x")

print(result["explanation"])

Command Line Demos

# Core system demo
uv run python demo_core_system.py

# Proof engine demo
uv run python demo_proof_engine.py

# LLM integration demo
uv run python demo_llm_integration.py

# Complete system demo
uv run python demo_complete_system.py

📊 Performance Metrics

Based on comprehensive testing:

  • Success Rate: 100% on supported problem types
  • Average Processing Time: ~8-12 seconds per problem
  • Proof Steps: Typically 1-3 steps for target problems
  • Extraction Confidence: 0.9-0.95 for well-formed problems

🔧 System Components

Core Components

  1. Expression System (src/formal/expressions.py)

    • Mathematical expression hierarchy
    • Fact and rule representations
    • Problem containers
  2. Parser (src/formal/parser.py)

    • SymPy-integrated expression parsing
    • Predicate and rule parsing
    • Validation and error handling
  3. Pattern Matching (src/reasoning/pattern_matching.py)

    • Variable substitution system
    • Rule application matching
    • Proof step validation
  4. Rule System (src/reasoning/rules.py)

    • 25+ mathematical inference rules
    • Categorized rule management
    • Dynamic rule activation
  5. Proof Engine (src/reasoning/proof_engine.py)

    • Forward chaining algorithm
    • Proof search with termination
    • Step-by-step proof tracking
  6. LLM Integration (src/extraction/ & src/explanation/)

    • Natural language problem extraction
    • Formal proof explanation generation
    • Azure OpenAI API integration
  7. System Integration (src/mathgraph.py)

    • End-to-end pipeline
    • Error handling and recovery
    • Performance monitoring

🔬 Mathematical Foundations

Formal Language

The system uses a restricted formal language with these predicates:

  • Equality & Comparison: eq, gt, lt, gte, lte
  • Number Properties: even, odd, prime, positive, negative
  • Arithmetic Relations: divides, multiple

Inference Rules

Sample rules implemented:

R1: eq(X + A, B) → eq(X, B - A)           [Subtraction Property]
R2: eq(A * X, B) → eq(X, B / A)           [Division Property]
R3: eq(X, Y) ∧ eq(Y, Z) → eq(X, Z)        [Transitivity]
R4: even(X) → even(X * Y)                 [Even Multiplication]
R5: gt(X, Y) ∧ gt(Y, Z) → gt(X, Z)        [Greater Than Transitivity]

🎯 Demo Results

The system successfully demonstrates:

✅ Successful Solves

  • Linear Equations: If x + 7 = 15, find xx = 8
  • Multiplication: If 3y = 21, what is y?y = 7
  • Number Theory: If n is even, prove 2n is even → Valid proof
  • Inequalities: If a > b and b > c, prove a > c → Transitivity proof

📈 Performance Metrics

  • 100% Success Rate on target problem types
  • ~1-3 Proof Steps for typical problems
  • 8-12 Second Processing including LLM calls
  • High Confidence Extraction (0.9-0.95)

🔄 Future Extensions

Easier TODOS

  • Backward Chaining: Goal-directed proof search
  • Rule Learning: Extract patterns from successful proofs
  • Multi-step Algebra: More complex equation solving

Medium TODOS

  • Higher-order Logic: Quantifiers and functions
  • Geometric Reasoning: Basic geometric predicates
  • Calculus Integration: Derivatives and limits
  • Group Theory: Convert generic theorems to group theory problems

Harder TODOs (Dream lol)

  • Educational Platform: Interactive tutoring system
  • Research Tool: Mathematician proof assistance
  • Automated Grading: Homework evaluation

📋 Project Structure

MiniAxiom/
├── src/
│   ├── formal/                 # Core mathematical structures
│   │   ├── expressions.py      # Expression hierarchy
│   │   └── parser.py          # Mathematical parsing
│   ├── reasoning/             # Proof engine components
│   │   ├── pattern_matching.py # Rule application
│   │   ├── rules.py           # Mathematical rules
│   │   └── proof_engine.py    # Forward chaining
│   ├── extraction/            # LLM problem extraction
│   │   ├── llm_client.py      # Azure OpenAI client
│   │   └── problem_extractor.py # Natural language processing
│   ├── explanation/           # Proof explanation
│   │   └── proof_explainer.py # Natural language generation
│   └── mathgraph.py          # Main system integration
├── tests/                     # Comprehensive test suite
├── examples/                  # Demo problems
├── streamlit_app.py          # Web interface
├── demo_*.py                 # Various demo scripts
└── README.md                 # This file

🤝 Contributing

This is my vibecoding and experimenting this prototype demonstrating hybrid AI mathematical reasoning. Eventually this needs to be formalized for hypergraphss. Where can we use this?

  • Educational Tools: Interactive math tutoring
  • Research Platforms: Formal reasoning experiments
  • Production Systems: Automated mathematical problem solving

MathGraph v1.0 - Demonstrating the future of AI mathematics through hybrid LLM + formal reasoning approaches.

About

we want to convert any generic math theorem eventually a group theory problem if possible thr llm and then prove it algorithmically

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages