Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
94 changes: 94 additions & 0 deletions CYTHON_README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# Cython Optimization for python-redux

This document details the advanced Cython optimization implemented for `python-redux`, achieving **4.6x faster dispatch throughput** and significantly reduced CPU usage.

## Overview

We utilize a **Full Store Cythonization** approach. The entire `Store` class is implemented as a high-performance `cdef class` in Cython, minimizing Python overhead for hot paths while maintaining full API compatibility.

- **Cython Implementation (`redux/_store_core.pyx`):**
- Complete `Store` class replacement.
- Optimized internal data structures (C-lists, direct attribute access).
- **Hyper Optimization**: Fast-path checks for `asyncio.iscoroutine` (skipping overhead for synchronous listeners).
- Inline type checks for `CompleteReducerResult` to avoid function call overhead.
- **Pure Python Fallback (`redux/_store_py.py`):**
- Original implementation preserved.
- Automatically used if the Cython extension is missing or disabled.

## Benchmark Results (Hyper Optimization)

Performance comparison between the Pure Python baseline and the Hyper-Optimized Cython version:

| Test Case | Baseline (Python) | Optimized (Cython) | Speedup |
|-----------|-------------------|--------------------|-------------|
| **Simple Dispatch** | 38.3 μs | 18.4 μs | **2.08x** |
| **With Event Handlers** | 15.9 μs | 7.8 μs | **2.04x** |
| **With Subscribers** | 45.6 μs | 9.8 μs | **4.65x** |

> **Note**: Times are per dispatch loop (100 actions). Lower is better.

The **4.65x speedup** for subscribers is a result of "Hyper Optimization" (Phase 8), which eliminated 66% of the overhead associated with checking for coroutines in synchronous listeners.

## Autorun Optimization (Phase 9)

In Phase 9, we fully Cythonized the `Autorun` class, embedding it directly within `_store_core.pyx` to access internal store state efficiently.

| Test Case | Baseline (Python) | Optimized (Cython) | Speedup |
|-----------|-------------------|--------------------|-------------|
| **Autorun Creation** | 52.6 μs | 15.5 μs | **3.40x** |
| **Reactivity Check** | 9.6 μs | 5.6 μs | **1.71x** |
| **Complex Selector** | 9.7 μs | 5.9 μs | **1.64x** |
| **Notifications** | 668 μs | 383 μs | **1.74x** |

This provides substantial improvements for applications that rely heavily on reactive state derived from the store.

## Combine Reducers Optimization (Phase 10)

In Phase 10, we optimized `combine_reducers` by moving the dispatch loop and state aggregation logic to Cython.

| Test Case | Baseline (Python) | Optimized (Cython) | Speedup |
|-----------|-------------------|--------------------|-------------|
| **10 Reducers** | 12.9 μs | 10.5 μs | **1.23x** |
| **50 Reducers** | 69.5 μs | 55.3 μs | **1.26x** |
| **100 Reducers** | 178.0 μs | 154.5 μs | **1.15x** |

The dispatch loop avoids Python attribute access overhead (`getattr`) during state decomposition and aggregates results efficiently, though gains are capped by the execution time of the underlying Python reducers.

## Build & Reproduction

To reproduce these results, you can build the extension and run the benchmarks.

### 1. Build the Extension
```bash
pip install cython
python setup.py build_ext --inplace
```

### 2. Run Benchmarks
Run the benchmark suite using `pytest-benchmark`. We recommend disabling timeouts to allow sufficient warmup time:
```bash
pytest -o "python_files=bench_*.py" --timeout=0 benchmarks/
```

### 3. Compare with Python
You can force the use of the Pure Python implementation by setting `REDUX_FORCE_PYTHON=1`. This allows you to verify the performance gains directly.

```bash
# Run Python Baseline
REDUX_FORCE_PYTHON=1 pytest -o "python_files=bench_*.py" --timeout=0 benchmarks/ --benchmark-json=baseline.json

# Run Cython Optimized
pytest -o "python_files=bench_*.py" --timeout=0 benchmarks/ --benchmark-json=optimized.json

# Compare
pytest-benchmark compare baseline.json optimized.json
```

## Files

- `redux/_store_core.pyx`: The optimized Cython `Store` implementation.
- `redux/_store_py.py`: The pure Python fallback.
- `redux/main.py`: The selector module that handles the import logic.
- `benchmarks/bench_dispatch.py`: The performance test suite for dispatch.
- `benchmarks/bench_autorun.py`: The performance test suite for Autorun.
- `benchmarks/bench_combine_reducers.py`: The performance test suite for combine_reducers.
1 change: 1 addition & 0 deletions benchmarks/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
"""Benchmark package for python-redux."""
Loading
Loading