A comprehensive, modular audio analysis toolkit designed specifically for composers and synthesizer music creators. This toolkit goes beyond academic metrics to provide intuitive, creative insights into your music using descriptors like "spacey," "organic," "crystalline," and "oozy."
- 117 Creative Descriptors: Core (9), Extended (8), Advanced (100) - from spacey and organic to tempestuous, celestial, velvet, and microscopic
- Character Classification: 59 tags across synthesis types, textures, and processing - from analog synths to crystalline textures to reverbed atmospheres
- Phase-by-Phase Analysis: Detailed breakdown of each musical section with mood descriptors
- Automatic Playlist Generation: Creates optimal listening sequences based on musical flow principles
- Key Compatibility: Uses circle of fifths relationships for smooth transitions
- Energy Arc Management: Builds natural progression from atmospheric to climactic sections
- Tempo Progression: Ensures gradual tempo changes for better flow
- Musical Phase Detection: Automatically identifies intro, development, climax, and conclusion sections
- Cluster Analysis: Groups similar tracks for themed playlists
- Creative Insights: Understand your compositional patterns and sound palette
- Multiple Export Formats: CSV, JSON, and Markdown reports optimized for further analysis
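The key-compatibility idea in the list above can be illustrated with a small distance function on the circle of fifths. This is a minimal sketch, not the toolkit's implementation: the names `CIRCLE_OF_FIFTHS`, `fifths_distance`, and `is_smooth_transition` are hypothetical, and only major keys spelled with sharps are handled.

```python
# Major keys ordered clockwise around the circle of fifths.
# Hypothetical helper names; the toolkit's own sequencing code may differ.
CIRCLE_OF_FIFTHS = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]


def fifths_distance(key_a: str, key_b: str) -> int:
    """Return the number of steps (0-6) separating two keys on the circle."""
    i = CIRCLE_OF_FIFTHS.index(key_a)
    j = CIRCLE_OF_FIFTHS.index(key_b)
    diff = abs(i - j)
    # The circle wraps around, so take the shorter direction.
    return min(diff, len(CIRCLE_OF_FIFTHS) - diff)


def is_smooth_transition(key_a: str, key_b: str, max_steps: int = 1) -> bool:
    """Treat keys at most `max_steps` apart as compatible (assumed threshold)."""
    return fifths_distance(key_a, key_b) <= max_steps
```

Under this sketch, C to G and C to F are both one step apart, while C to F# is the maximum distance of six steps.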
The toolkit has been completely refactored into a professional Python package structure with comprehensive parallel processing capabilities:
audio_analysis/
├── __init__.py                          # Main package interface with parallel components
├── core/                                # Core analysis algorithms
│   ├── feature_extraction.py            # 80+ audio features with detailed comments
│   ├── feature_extraction_base.py       # Shared feature extraction core
│   ├── parallel_feature_extraction.py   # Parallel feature extraction
│   ├── phase_detection.py               # Musical structure detection
│   ├── clustering.py                    # K-means clustering for track grouping
│   ├── parallel_clustering.py           # Distributed clustering with tensor support
│   ├── tensor_operations.py             # Hardware-agnostic tensor processing
│   └── sequencing.py                    # Intelligent song ordering
├── analysis/                            # Creative interpretation modules
│   ├── mood_analyzer.py                 # 117 creative mood descriptors
│   ├── character_analyzer.py            # 59 character tags (synthesis, texture, processing)
│   └── descriptors.py                   # Comprehensive descriptor definitions and thresholds
├── utils/                               # Support utilities
│   ├── audio_io.py                      # File loading and validation
│   ├── data_processing.py               # Data cleaning and standardization
│   ├── visualization.py                 # Publication-quality plots
│   ├── type_conversion.py               # Centralized type conversion utilities
│   ├── validation.py                    # Shared validation functions
│   └── statistics.py                    # Statistical calculation utilities
├── exporters/                           # Output format handlers
│   ├── csv_exporter.py                  # CSV export for spreadsheets
│   ├── json_exporter.py                 # JSON for programmatic access
│   └── markdown_exporter.py             # Human-readable reports
├── api/                                 # Main interfaces
│   ├── analyzer.py                      # Primary AudioAnalyzer class
│   ├── parallel_analyzer.py             # ParallelAudioAnalyzer for scalable processing
│   └── mcp_server.py                    # FastMCP server integration
└── cli/                                 # Command-line interface
    └── main.py                          # Enhanced CLI with extensive options
Parallel Processing:
- 6x+ Performance: Multi-core processing with automatic optimization
- Tensor-Ready: Data structures optimized for hardware acceleration
- Scalable: Configurable batch sizes and memory limits
- Hardware-Agnostic: CPU, GPU, and future Tenstorrent processor support
🔧 Code Quality:
- Single Source of Truth: Shared feature extraction core eliminates duplication
- Consistent Results: Traditional and parallel processing produce identical outputs
- Easy Extension: Add new moods and characters in one location
- Better Testing: Comprehensive test coverage with shared utilities
⚡ Performance Benchmarks:
- Feature Extraction: 6.7x speedup (8 cores)
- Phase Detection: 5.6x speedup (8 cores)
- Clustering: 5.0x speedup (8 cores)
- Total Analysis: 6.1x speedup (8 cores)
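To put these figures in context, speedup can be divided by core count to get parallel efficiency, the fraction of ideal linear scaling actually achieved. The short calculation below uses only the numbers quoted above; the function name `parallel_efficiency` is introduced here for illustration.

```python
CORES = 8

# Speedups as quoted in the benchmark list above.
speedups = {
    "Feature Extraction": 6.7,
    "Phase Detection": 5.6,
    "Clustering": 5.0,
    "Total Analysis": 6.1,
}


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling: 1.0 would mean a perfect N-fold speedup."""
    return speedup / cores


efficiencies = {stage: parallel_efficiency(s, CORES) for stage, s in speedups.items()}

for stage, eff in efficiencies.items():
    print(f"{stage}: {eff:.1%} of ideal scaling on {CORES} cores")
```

Feature extraction at 6.7x on 8 cores works out to roughly 84% efficiency, and clustering at 5.0x to about 63%.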
# Clone or download the project
cd analyze_synths
# Create virtual environment
python -m venv .
# Activate virtual environment
source bin/activate # On macOS/Linux
# or
.\Scripts\activate # On Windows
# Install dependencies
pip install -r requirements.txt

Option 1: Wrapper Script (Easiest)
# Basic analysis - uses wrapper script
python analyze_library.py /path/to/your/music/directory
# With custom clustering
python analyze_library.py /path/to/music --clusters 3
# Export in specific format
python analyze_library.py /path/to/music --export-format markdown
# Get processing time estimate
python analyze_library.py /path/to/music --estimate
# Verbose output for debugging
python analyze_library.py /path/to/music --verbose
# Show supported formats and capabilities
python analyze_library.py --info

Option 2: Direct Module Access
# Same functionality, different invocation method
python -m audio_analysis.cli.main /path/to/your/music/directory
# All other options work the same way
python -m audio_analysis.cli.main /path/to/music --clusters 3
python -m audio_analysis.cli.main --info

Option 3: Parallel Processing Demo (NEW v2.1)
# Run parallel processing demonstration
python parallel_demo.py /path/to/music --workers 8 --batch-size 16
# Enable tensor optimizations for hardware acceleration
python parallel_demo.py /path/to/music --enable-tensor --device cpu
# Run specific demo modes
python parallel_demo.py /path/to/music --demo extraction # Feature extraction only
python parallel_demo.py /path/to/music --demo clustering # Clustering only
python parallel_demo.py /path/to/music --demo complete # Full analysis

Option 1: Wrapper Script (Easiest)
# Start MCP server - uses wrapper script
python mcp_server.py
# For use with FastMCP directly
fastmcp run mcp_server.py

Option 2: Via analyze_library.py wrapper
# Start MCP server with default settings
python analyze_library.py --mode mcp
# Start on custom host/port
python analyze_library.py --mode mcp --host 0.0.0.0 --port 8080

Option 3: Direct Module Access
# Direct module invocation
python -m audio_analysis.cli.main --mode mcp
# With custom settings
python -m audio_analysis.cli.main --mode mcp --host 0.0.0.0 --port 8080

Standard Analysis:
from audio_analysis import AudioAnalyzer
# Initialize analyzer
analyzer = AudioAnalyzer('/path/to/audio/files')
# Run complete analysis
df = analyzer.analyze_directory()
# Perform clustering
cluster_labels, centers, features = analyzer.perform_clustering()
# Generate sequence recommendations
sequence = analyzer.recommend_sequence()
# Export all results (default: all formats)
export_info = analyzer.export_comprehensive_analysis()
# Export specific formats
export_info = analyzer.export_comprehensive_analysis(export_format="markdown")
export_info = analyzer.export_comprehensive_analysis(export_format="json", base_name="my_analysis")

Parallel Processing (NEW v2.1):
from audio_analysis import ParallelAudioAnalyzer, ProcessingConfig
# Configure parallel processing
config = ProcessingConfig(
    max_workers=8,                    # Use 8 CPU cores
    batch_size=16,                    # Process 16 files per batch
    enable_tensor_optimization=True,  # Enable tensor operations
    memory_limit_mb=4096              # 4GB memory limit
)
# Initialize parallel analyzer
analyzer = ParallelAudioAnalyzer('/path/to/audio/files', config)
# Same interface as standard analyzer
df = analyzer.analyze_directory()
cluster_labels, centers, features = analyzer.perform_clustering()
sequence = analyzer.recommend_sequence()
export_info = analyzer.export_comprehensive_analysis(export_format="all")
# Get parallel processing statistics
stats = analyzer.get_processing_statistics()
print(f"Parallel speedup: {stats['parallel_processing_stats']['parallel_speedup']:.1f}x")
print(f"Throughput: {stats['parallel_processing_stats']['throughput']:.1f} files/second")

Hardware Acceleration (Future-Ready):
from audio_analysis import TensorFeatureExtractor
# CPU processing with tensor optimizations
extractor = TensorFeatureExtractor(device="cpu")
features = extractor.extract_features_from_paths(audio_file_paths)
# Tenstorrent processing (when available)
extractor = TensorFeatureExtractor(device="tenstorrent", device_id=0)
features = extractor.extract_features_from_paths(audio_file_paths)

Export Format Options (NEW v2.2):
from audio_analysis import AudioAnalyzer
analyzer = AudioAnalyzer('/path/to/audio/files')
analyzer.analyze_directory()
# Export all formats (default: CSV data + JSON + Markdown + visualizations)
analyzer.export_comprehensive_analysis()
# Export only specific formats for faster processing
analyzer.export_comprehensive_analysis(export_format="markdown") # Human-readable report only
analyzer.export_comprehensive_analysis(export_format="json") # Programmatic data only
analyzer.export_comprehensive_analysis(export_format="csv") # Spreadsheet data only
# Customize file naming
analyzer.export_comprehensive_analysis(
    export_format="json",
    base_name="my_project"
)  # → Creates: my_project_data.json
# Full customization
analyzer.export_comprehensive_analysis(
    export_dir="/custom/path",
    export_format="markdown",
    base_name="album_analysis",
    show_plots=True
)  # → Creates: album_analysis_comprehensive_report.md

Each analysis creates a timestamped directory with organized results:
- audio_features.csv - Complete feature matrix (80+ features per track)
- cluster_analysis.csv - Musical groupings and characteristics
- phase_analysis.csv - Detailed phase breakdown with mood descriptors
- sequence_recommendations.csv - Optimal track ordering with reasoning
- summary_statistics.csv - High-level collection insights

- phase_timeline.png - Visual timeline of musical phases for all tracks
- cluster_analysis.png - Multi-panel cluster visualization with PCA
- mood_distribution.png - Mood analysis across your collection
- sequence_recommendations.png - Visual representation of optimal flow

comprehensive_analysis_report.md - The main report with:
- Executive summary with key insights
- Recommended listening sequence with detailed reasoning
- Track-by-track mood and character analysis
- Phase-by-phase structural breakdown
- Cluster analysis for playlist creation
- Creative insights and compositional recommendations
analysis_data.json - Complete analysis data for programmatic access
Core Moods:
- Spacey: Low energy, ethereal, expansive atmospheres
- Organic: Natural textures, acoustic-like characteristics
- Synthetic: Clean, precise, distinctly electronic
- Oozy: Slow, flowing, liquid-like textures
- Pensive: Contemplative, moderate energy, thoughtful
- Tense: High energy, sharp, angular characteristics
- Exuberant: Joyful, high energy, bright
- Glitchy: Fragmented, stuttering, digital artifacts
- Chaos: Extreme energy, unpredictable, intense
Extended Moods:
- Ethereal: Delicate, floating, otherworldly
- Atmospheric: Environmental, ambient, immersive
- Crystalline: Clear, precise, bell-like
- Warm: Comfortable, enveloping, intimate
- Melodic: Tuneful, songlike, memorable
- Driving: Forward-moving, rhythmic, propulsive
- Percussive: Rhythmic, transient, beat-focused
- Droning: Sustained, minimal, hypnotic
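One way to picture how descriptors like these could be assigned is a table of threshold rules over normalized audio features. The sketch below is purely illustrative: the feature names (`energy`, `brightness`), the numeric thresholds, and the helper `match_moods` are assumptions made for this example, not the definitions found in descriptors.py.

```python
from typing import Callable, Dict, List

# Hypothetical features normalized to the range 0.0-1.0.
Features = Dict[str, float]

# Illustrative rules only; the thresholds are invented for this sketch and
# loosely follow the verbal descriptions (low, moderate, high, extreme energy).
MOOD_RULES: Dict[str, Callable[[Features], bool]] = {
    "spacey": lambda f: f["energy"] < 0.3 and f["brightness"] < 0.5,
    "pensive": lambda f: 0.3 <= f["energy"] < 0.6,
    "tense": lambda f: f["energy"] >= 0.6 and f["brightness"] >= 0.6,
    "chaos": lambda f: f["energy"] >= 0.9,
}


def match_moods(features: Features) -> List[str]:
    """Return every mood whose rule accepts the given feature values."""
    return [mood for mood, rule in MOOD_RULES.items() if rule(features)]
```

Because the rules are independent, a track can match several moods at once; in this sketch a very loud, bright track matches both "tense" and "chaos".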
The expanded character system now includes three comprehensive categories:
Synthesis Types (25 total):
- Analog Synth: Warm, vintage synthesizer characteristics
- Digital Synth: Clean, precise digital synthesis
- FM Synth: Complex harmonic interactions from frequency modulation
- Granular Synth: Fragmented, particulate textures
- Wavetable Synth: Morphing spectral evolution
- Physical Modeling: Realistic acoustic behavior through physics simulation
- Subtractive/Additive Synth: Filtered or harmonic-stacked synthesis
- Modular Synth: Experimental, unpredictable characteristics
- Pad/Lead/Bass/Arp Synth: Role-specific synthesizer voices
- Piano/Organ/Choir/Brass/Woodwind: Acoustic instrument emulations
- Drum Machine/Sampler: Rhythmic and sampling-based sources
- Plus 10 additional specialized synthesis types
Texture Types (20 total):
- Rich/Pure Texture: Complex layered vs. simple clean textures
- Bright/Warm Harmonics: High vs. low frequency emphasis
- Smooth/Rough Texture: Silky vs. gritty surface characteristics
- Crystalline/Organic Texture: Glass-like precision vs. natural breathing
- Mechanical/Liquid Texture: Industrial precision vs. flowing fluidity
- Metallic/Wooden/Glassy: Material-inspired sonic characteristics
- Fabric/Sandy/Rubbery: Tactile texture associations
- Plus 8 additional texture descriptors
Processing Types (14 total):
- Reverbed/Delayed: Spatial effects and echo patterns
- Chorused/Flanged/Phased: Modulation-based effects
- Distorted/Filtered: Harmonic saturation and frequency shaping
- Compressed/Pitched: Dynamic control and frequency shifting
- Ring Modulated: Metallic, inharmonic characteristics
- Bit Crushed: Digital degradation and lo-fi character
- Tape/Tube Saturated: Vintage warmth and harmonic distortion
- Plus 5 additional processing characteristics
- Study Your Patterns: Understand your compositional tendencies through data
- Album Sequencing: Use AI-generated optimal listening sequences
- Sound Palette Analysis: Identify your dominant musical characteristics
- Structural Insights: Learn from phase analysis of your work
- Mood Development: Track emotional progression in your compositions
- Playlist Creation: Use cluster analysis for themed collections
- Remix Planning: Find compatible phases and sections across tracks
- Energy Management: Plan DJ sets using detailed energy progression data
- Mood Matching: Pair tracks with complementary emotional characteristics
- Track Selection: Find similar tracks for consistent album flow
- Practice Planning: Sequence practice sessions using energy flow principles
- Performance Sets: Create dynamic live performance sequences
- Collaboration: Share detailed analysis with band members and collaborators
- Inspiration: Discover hidden patterns in your creative process
- Learning: Understand how your music affects listeners emotionally
- Python 3.8+ (Developed and tested on 3.13)
- Audio Formats: WAV, AIFF, MP3 (WAV recommended for best quality)
- Memory: 4GB+ RAM for large collections (100+ files)
- Storage: Analysis exports typically 1-50MB per session
- Dependencies: All handled by requirements.txt
$ python analyze_library.py /Users/composer/my_tracks
Initializing audio analyzer for: /Users/composer/my_tracks
Found 12 audio files. Processing...
Processing 1/12: ambient_dawn.wav
Processing 2/12: crystalline_patterns.wav
...
✓ CSV files exported to data/
✓ Plots saved to images/
✓ Markdown report saved to comprehensive_analysis_report.md
✓ JSON results saved to analysis_data.json
============================================================
ANALYSIS COMPLETE
============================================================
Files processed: 12
Features extracted: 89
Phases detected: 47
Clusters created: 3
Export directory: /Users/composer/my_tracks/audio_analysis_20241213_143022

RECOMMENDED LISTENING SEQUENCE
==============================================================
1. ambient_dawn.wav
atmospheric • analog_synth • 3:45 • 72 BPM • C
Opening track - sets the mood with atmospheric atmosphere
2. crystalline_patterns.wav
crystalline • digital_synth • 4:20 • 85 BPM • G
Early exploration - introduces digital_synth textures
3. organic_flow.wav
organic • mellotron • 5:12 • 92 BPM • G
Core development - showcases organic at 92 BPM
When running in MCP mode, six powerful tools are available for remote analysis:
Comprehensive mood and character analysis with confidence scores
- Input: Audio files (base64 encoded)
- Output: 117 mood descriptors, 59 character tags, musical metrics
Musical structure detection with mood analysis per section
- Input: Audio files (base64 encoded)
- Output: Detailed phase breakdown with timing and characteristics
AI-powered optimal listening sequence generation
- Input: Multiple audio files (base64 encoded)
- Output: Recommended order with detailed reasoning
K-means clustering for musical similarity grouping
- Input: Audio files and optional cluster count
- Output: Cluster analysis with musical groupings and characteristics
Complete analysis pipeline with all features
- Input: Audio files and export format preference
- Output: Full analysis with mood, phases, clustering, and sequencing
System capabilities and format information
- Input: None
- Output: Supported formats, descriptors, analysis capabilities
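Several of the tools above take base64-encoded audio as input. The sketch below shows one way a client might prepare such a payload using only the standard library. The function name `encode_audio_file` and the payload keys `filename` and `content_base64` are assumptions for illustration; check the server's actual tool schema before relying on them.

```python
import base64
from pathlib import Path


def encode_audio_file(path: str) -> dict:
    """Read an audio file and return a JSON-friendly dict with base64 content.

    The dictionary keys are hypothetical; the real MCP tools may expect
    different field names.
    """
    file_path = Path(path)
    raw_bytes = file_path.read_bytes()
    return {
        "filename": file_path.name,
        # b64encode returns bytes; decode to str so the payload serializes as JSON.
        "content_base64": base64.b64encode(raw_bytes).decode("ascii"),
    }
```

Base64 increases payload size by roughly a third, so large WAV files may be better converted or trimmed before being sent to a remote server.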
- Multi-Core Processing: Automatic utilization of all available CPU cores
- Configurable Batching: Process multiple files simultaneously with optimized memory usage
- Performance Scaling: 6x+ speedup on multi-core systems
- Hardware Acceleration: Tensor-optimized data structures for future acceleration
- Memory Management: Intelligent memory usage with configurable limits
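The batching idea above can be sketched with the standard library alone. This is not the toolkit's implementation: `make_batches` and `process_in_batches` are hypothetical names, and a thread pool is used here only to keep the example short. CPU-bound feature extraction would more plausibly use `concurrent.futures.ProcessPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def make_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def process_in_batches(
    paths: Sequence[str],
    worker: Callable[[str], T],
    batch_size: int = 16,
    max_workers: int = 8,
) -> List[T]:
    """Apply `worker` to every path, one batch at a time, preserving input order."""
    results: List[T] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in make_batches(paths, batch_size):
            # Only one batch is in flight at a time, which bounds peak memory.
            results.extend(pool.map(worker, batch))
    return results
```

A smaller `batch_size` lowers peak memory at the cost of more idle time between batches, which is the trade-off behind the batch-size and memory-limit settings.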
- Tensor Operations: Data structures optimized for Tenstorrent processors
- Device Abstraction: Hardware-agnostic interface supports CPU, GPU, and specialized processors
- Batch Processing: Optimal utilization of parallel processing units
- Memory Efficiency: Minimize data movement between processing units
- Future-Proof: Easy integration with emerging hardware acceleration platforms
- Processing Statistics: Detailed performance metrics and throughput analysis
- Parallel Speedup: Real-time calculation of performance improvements
- Memory Usage: Monitoring and optimization of memory consumption
- Error Tracking: Comprehensive error handling and reporting
- Benchmarking: Built-in performance benchmarking tools
- Single Source of Truth: Add new mood descriptors in one place
- Consistent Analysis: Traditional and parallel processing produce identical results
- Easy Extension: Simple framework for adding new creative descriptors
- Modular Design: Clean separation of concerns for easy maintenance
- Comprehensive Testing: Shared utilities ensure consistent behavior
Every analytical approach is thoroughly documented with:
- Why this method: Explanation of creative relevance
- How it works: Technical implementation details
- Parameter choices: Justification for thresholds and settings
- Musical context: Connection to composition and music theory
- File validation: Comprehensive format and content checking
- Graceful degradation: Analysis continues even with problematic files
- Memory management: Efficient processing of large collections
- Progress reporting: Detailed feedback during long operations
- CSV: Optimized for spreadsheet analysis and data science
- JSON: Structured data for programmatic access and APIs
- Markdown: Human-readable reports with musical insights
- Visualizations: Publication-quality plots and charts
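As a rough illustration of how one set of results can feed all three text formats, the sketch below serializes the same records to CSV, JSON, and a Markdown table using only the standard library. The helper names and the `tempo_bpm` column are assumptions for this example; the toolkit's exporters produce richer output.

```python
import csv
import io
import json
from typing import Dict, List

Record = Dict[str, object]

# Sample rows modeled on the example output earlier in this README.
records: List[Record] = [
    {"filename": "ambient_dawn.wav", "primary_mood": "atmospheric", "tempo_bpm": 72},
    {"filename": "crystalline_patterns.wav", "primary_mood": "crystalline", "tempo_bpm": 85},
]


def to_csv(rows: List[Record]) -> str:
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows: List[Record]) -> str:
    """Serialize rows as indented JSON."""
    return json.dumps(rows, indent=2)


def to_markdown(rows: List[Record]) -> str:
    """Serialize rows as a simple Markdown table."""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)
```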
- Out of Memory: Process files in smaller batches, increase system RAM
- MP3 Issues: Install additional codecs, convert to WAV for best results
- No Files Found: Verify audio files are in supported formats (WAV, AIFF, MP3)
- Import Errors: Ensure all dependencies are installed: pip install -r requirements.txt
- MCP Server Issues: Install FastMCP: pip install fastmcp
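When chasing a "No Files Found" error, it can help to check independently which files in a directory have a supported extension. The snippet below is a small diagnostic sketch, not part of the toolkit: `find_audio_files` is a hypothetical helper, and both the recursive search and the exact extension set are assumptions based on the formats listed above.

```python
from pathlib import Path
from typing import List

# Extensions taken from the supported formats listed above (WAV, AIFF, MP3).
# Whether the toolkit also accepts variants such as ".aif" is not confirmed here.
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".mp3"}


def find_audio_files(directory: str) -> List[Path]:
    """Return all files under `directory` with a supported extension, sorted by path."""
    root = Path(directory)
    return sorted(
        path
        for path in root.rglob("*")
        # Compare case-insensitively so "TRACK.WAV" is also found.
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

If this returns an empty list for your music directory, the files are probably in an unsupported format or in a different location than the path passed to the analyzer.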
# Show detailed help (any method works)
python analyze_library.py --help
python -m audio_analysis.cli.main --help
# Show format information
python analyze_library.py --info
python -m audio_analysis.cli.main --info
# Estimate processing time
python analyze_library.py /path/to/files --estimate
python -m audio_analysis.cli.main /path/to/files --estimate
# Run with verbose output for debugging
python analyze_library.py /path/to/files --verbose
python -m audio_analysis.cli.main /path/to/files --verbose
# Parallel processing demo and help
python parallel_demo.py --help
python example_mood_extension.py # Shows how to add new mood descriptors

# For large collections, use parallel processing
python parallel_demo.py /path/to/files --workers 16 --batch-size 32
# Enable tensor optimizations
python parallel_demo.py /path/to/files --enable-tensor --device cpu
# Monitor memory usage and adjust batch size
python parallel_demo.py /path/to/files --batch-size 8 --memory-limit 2048

- Start with a small collection (5-10 tracks) to understand the output
- Read the comprehensive report - Focus on the markdown file first
- Try the recommended sequence - Play tracks in suggested order
- Explore the visualizations - Understand your musical patterns
- Experiment with clustering - Create themed playlists
- Use the Python API - Integrate with your existing workflow
- Customize analysis parameters - Adjust clustering and export options
- Run MCP server - Enable AI assistant integration
- Process large collections - Use batch processing techniques
- Contribute improvements - The modular architecture welcomes enhancements
# Example for Claude/ChatGPT integration
from audio_analysis import AudioAnalyzer
analyzer = AudioAnalyzer('/path/to/music')
results = analyzer.analyze_directory()
# Send results to AI for creative interpretation
mood_analysis = results[['filename', 'primary_mood', 'mood_descriptors']]
# "Please analyze these mood patterns and suggest creative directions..."

# Export data for DAW integration
analyzer.export_comprehensive_analysis(export_format='json')
# Import JSON into your DAW or music management software

# Process multiple directories (using wrapper script)
for dir in /path/to/albums/*; do
    python analyze_library.py "$dir" --export-format csv
done
# Or using direct module access
for dir in /path/to/albums/*; do
    python -m audio_analysis.cli.main "$dir" --export-format csv
done

Transform your music analysis from academic metrics to creative insights. Perfect for composers who want to understand their art through an intuitive, musical lens. 🎶
New in v2.1: Comprehensive parallel processing capabilities with 6x+ performance improvements, hardware acceleration readiness for Tenstorrent processors, tensor-optimized data structures, and a refactored architecture that eliminates code duplication while maintaining full backward compatibility.
Previous v2.0: Complete modular refactor with extensive inline documentation, enhanced CLI, robust error handling, professional-grade architecture, and convenient wrapper scripts for easy access.
The toolkit now includes a complete Hugging Face Spaces deployment for web-based audio analysis, making it accessible to users worldwide without any installation required.
- Easy Access: Upload audio files directly through your web browser for instant analysis
- Multiple Analysis Types: Comprehensive analysis, mood-only, or phase detection only
- Export Options: Download results in Markdown, JSON, or CSV formats
- Public Demo: Share and demonstrate your audio analysis capabilities
- No Installation: Works immediately in any web browser
gradio/
├── app.py             # Complete Gradio web interface
├── requirements.txt   # HF-specific dependencies
├── README.md          # Model card with proper YAML frontmatter
└── CLAUDE.md          # Deployment-specific guidance
Option 1: Hugging Face Spaces (Recommended)
- Create new Space at hf.co/new-space
- Choose "Gradio" SDK
- Upload contents of the gradio/ directory
- Automatic deployment at https://huggingface.co/spaces/yourusername/spacename
Option 2: Hugging Face Model Repository
- Upload complete Python package as model repository
- Users install via pip install git+https://huggingface.co/username/repo.git
- Include comprehensive documentation and examples
Option 3: PyPI + HF Community
- Package for PyPI distribution (pip install audio-analysis-toolkit)
- List as community resource on Hugging Face Hub
Supported Audio: WAV (recommended), AIFF, MP3 up to 100MB, 1s-30min duration

Analysis Results: Same 17 mood descriptors, 9 character tags, and phase detection as desktop version

Export Formats: Human-readable reports, structured JSON data, and spreadsheet-ready CSV files

Real-time Processing: Immediate results with downloadable complete analysis
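The upload limits quoted above (100MB, 1s-30min) can be checked locally before uploading. The sketch below does this for WAV files only, using the standard library `wave` module. It is an assumption-laden example rather than the web app's own validation: `validate_wav_upload` is a hypothetical helper, and 100MB is interpreted here as 100 * 1024 * 1024 bytes.

```python
import os
import wave
from typing import List

MAX_BYTES = 100 * 1024 * 1024  # "100MB", interpreted as mebibytes (assumption)
MIN_SECONDS = 1.0
MAX_SECONDS = 30 * 60


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def validate_wav_upload(path: str) -> List[str]:
    """Return a list of problems; an empty list means the file is within limits."""
    problems: List[str] = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds 100MB")
    duration = wav_duration_seconds(path)
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.1f}s is outside the 1s-30min range")
    return problems
```

AIFF and MP3 files would need a different duration check, since the `wave` module reads only WAV data.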
- Accessibility: No Python knowledge or installation required
- Educational: Interactive exploration of audio analysis concepts
- Professional: API-quality outputs for integration workflows
- Gateway: Introduction to full desktop toolkit capabilities
- Shareable: Public demos for collaboration and teaching
The web interface maintains full compatibility with all desktop analysis features while providing an intuitive, browser-based experience that makes advanced audio analysis accessible to a broader audience.