Skip to content

Comments

Remove non-functional joblib parallelization from TrendFit#428

Merged
blalterman merged 2 commits intomasterfrom
refactor/remove-joblib-parallelization
Feb 17, 2026
Merged

Remove non-functional joblib parallelization from TrendFit#428
blalterman merged 2 commits intomasterfrom
refactor/remove-joblib-parallelization

Conversation

@blalterman
Copy link
Owner

Pull Request Description

Remove joblib parallelization from TrendFit.make_1dfits(). It was added in commit 8e4ffb2c but has never worked on the target platform (macOS + Python 3.12):

Backend Result
loky (default) Deadlocks — macOS spawn start method + loky process pool
multiprocessing Pickle error — nested closure can't be serialized
threading Works but no speedup — GIL blocks CPU-bound scipy curve_fit
Sequential (n_jobs=1) ✅ Always worked

Changes (-530 lines)

  • trend_fits.py: Remove joblib imports, JOBLIB_AVAILABLE flag, and simplify make_1dfits() to sequential-only (100% coverage)
  • test_trend_fits_advanced.py: Remove TestTrendFitParallelization class (7 tests), unused imports, update integration test
  • pyproject.toml: Remove performance = ["joblib>=1.3.0"] optional dependency
  • benchmarks/fitfunctions_performance.py: Delete entirely (179 lines, sole purpose was parallel vs sequential benchmarking)

All Phase 4 features in core.py (residuals use_all, in-place mask ops, xoutside/youtside) are preserved — zero joblib dependency.

Type of Change

  • Code refactoring

Testing

  • All tests pass locally (pytest -q) — 2188 passed
  • Test coverage ≥95% (pytest --cov=solarwindpy --cov-report=term) — trend_fits.py at 100%; project-wide at 82% (pre-existing)
  • Physics validation passed (.claude/hooks/test-runner.sh --physics)
  • New tests added for new functionality — N/A (removal only)

Code Quality

  • Code follows NumPy docstring conventions
  • Formatted with black (black solarwindpy/ tests/)
  • Linted with flake8 (flake8 solarwindpy/ tests/)
  • Conventional commit messages used
  • Documentation built locally without warnings (cd docs && make html)
  • doc8 check passed

Attribution & Documentation

  • AI-assisted code includes "Generated with Claude Code" in commits
  • No code with incompatible licenses (GPL, proprietary, unknown)
  • Documentation updated (README, CHANGELOG, docstrings)

Breaking Changes

make_1dfits() no longer accepts n_jobs, verbose, or backend kwargs. Callers passing these will get a TypeError. This was never in a public release, so no deprecation path is needed.

Additional Context

The [performance] optional dependency group was never tested in CI and never installed in any workflow. The benchmark file existed solely for parallel-vs-sequential comparison.

🤖 Generated with Claude Code

blalterman and others added 2 commits February 10, 2026 17:51
Documents the two-repo architecture for private development with
public releases via rsync-based export script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Joblib's loky backend deadlocks on macOS with Python 3.12's spawn
start method. The multiprocessing backend fails due to unpicklable
closures, and the threading backend provides no CPU-bound speedup.

Remove the non-functional parallel code path, related tests,
benchmarks, and the performance optional dependency. All Phase 4
features in core.py (residuals use_all, in-place mask ops,
xoutside/youtside) are preserved — they have zero joblib dependency.

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
@claude
Copy link

claude bot commented Feb 17, 2026

Claude finished @blalterman's task —— View job


PR Review: Remove joblib parallelization from TrendFit

I've completed a thorough review of this pull request. Here's my comprehensive analysis:

🎯 Summary

This is an excellent refactoring that removes 540 lines of non-functional joblib parallelization code. The changes are well-executed, maintain backward compatibility (except for breaking changes), and improve code maintainability.

Strengths

Code Quality & Best Practices

  • Clean removal: No orphaned imports or dead code remaining
  • Consistent style: NumPy docstring conventions maintained throughout
  • Proper error handling: return_exception parameter logic preserved
  • Sequential robustness: The remaining sequential path is well-tested and reliable

Architecture & Design

  • Simplified API: make_1dfits() now has a clean, single-responsibility design
  • Preserved functionality: All Phase 4 features (residuals, mask operations) intact
  • Dependency cleanup: Properly removed performance = ["joblib>=1.3.0"] optional dependency

Testing Strategy

  • Comprehensive coverage: 7 parallelization tests removed, but integration tests preserved
  • Edge case coverage: Phase 6 tests cover all code paths including error conditions
  • Performance verification: Integration test confirms 25 measurements run efficiently sequentially

🔍 Technical Analysis

Performance Considerations

  • Sequential efficiency: pandas.Series.apply() with lambda functions is appropriate for scipy curve_fit operations
  • GIL-bound workload: Parallelization provided no benefit for CPU-bound scipy operations anyway
  • Memory efficiency: Eliminates joblib process spawning overhead
  • Platform reliability: Removes macOS-specific deadlock and pickle serialization issues

Security & Dependencies 🔒

  • Reduced attack surface: Fewer dependencies mean fewer security vectors
  • Supply chain: Removes dependency on joblib ecosystem
  • Process isolation: No more multiprocessing concerns with untrusted data

Breaking Changes Assessment ⚠️

Justified breaking changes for unused parameters:

# These kwargs will now raise TypeError
tf.make_1dfits(n_jobs=4, verbose=True, backend='loky')
  • Risk: Low - feature was never functional on target platform
  • Mitigation: Clear documentation in PR description
  • Timeline: Not in public release, so no deprecation path needed

🔍 Potential Issues & Recommendations

Minor Documentation Gap

  • Line 175: Commented code # self.make_popt_frame() should be removed for clarity
    # Remove this commented line:
    #         self.make_popt_frame()

Test Coverage Verification

  • ✅ File is at 100% coverage according to PR description
  • ⚠️ Coverage monitoring hook detected pytest/pytest-cov not available in environment
  • Recommendation: Ensure CI validates coverage ≥95% requirement before merge

Docstring Consistency

The simplified make_1dfits() docstring is accurate and clear. The method signature change is properly documented.

🏆 Overall Assessment

Rating: APPROVED ⭐⭐⭐⭐⭐

This PR exemplifies excellent refactoring practices:

  1. Clear problem identification: Documented platform-specific failures
  2. Surgical removal: No collateral damage to working features
  3. Comprehensive testing: Integration tests verify all functionality preserved
  4. Clean dependency management: Proper cleanup of optional dependencies
  5. Performance improvement: Eliminates overhead without losing functionality

📋 Recommendations Before Merge

  1. Remove commented code on line 175 in trend_fits.py
  2. Verify CI passes with coverage ≥95% requirement
  3. Confirm integration tests pass with --physics validation

The breaking changes are well-justified given the non-functional state of the removed features. This significantly improves code maintainability while preserving all working functionality.


@github-actions
Copy link
Contributor

SolarWindPy Documentation Validation - Python 3.11

Status: PASSED

Essential Metrics:

  • Files: 44
  • Tests: 20
  • Failed: 0
  • Success Rate: 100.0%
  • Approach: Targeted validation (core physics modules)

🎉 All essential documentation examples working!

@blalterman blalterman merged commit cac6fde into master Feb 17, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant