Modernize build: numpy 2 support, single-shot conda install, pybind11, type annotations by Riley16 · Pull Request #259 · pennmem/ptsa

Riley16 · 2026-05-27T06:41:15Z

Summary

Modernizes the PTSA build so it installs as a normal conda package in a
single mamba/conda solve — no more MNE-first / numpy==1.24.4 /
traits<7 install-order dance on rhino — and so the same source tree
builds and runs against both numpy 1.24 and numpy 2.x across Python
3.10–3.13.

What changed

Install-order fix (the core goal)

conda.recipe/meta.yaml: {{ pin_compatible('numpy', max_pin='x') }}
caps each binary to the numpy major it was built against, and every
runtime dep is now declared with bounded pins (scipy, xarray,
traits, h5py, pandas, netcdf4, six, fftw). pybind11
dropped from run: (build-only). Result: conda install ptsa <other-things> resolves in one shot.

Forward-compatibility (numpy 2 / pandas 3 / traits 7)

traits.api.ListStr → List(Str) (deleted in traits 7).
np.find_common_type → np.result_type; np.core.records →
np.rec (numpy 2 removals).
TalReader.mkdtype handles pandas 3 StringDtype (PDEP-14).
DataChopper / eeg.py / tal.py fixes for modern xarray coord
validation and numpy 2 structured-array sort semantics.

Extensions: SWIG → pybind11

morlet + circular_stat migrated off SWIG (and the vendored
numpy.i); kernels unchanged, numerical output bit-identical to
machine precision (validated by tests/test_morlet_formula.py).

Types + tests + docs

Full type annotations, pyright 0 errors / 0 warnings, .pyi stubs
for the compiled extensions.
New tests: bit-level regression fixtures, independent morlet-formula
validation, butterworth/resample correctness, reader coverage
(ParamsReader, monopolar tal .mat), API smoke tests, and sybil
execution of docs code-blocks.
New docs/morlet.rst documenting the wavelet formula.

CI

.github/workflows/build.yml (os × python 3.10–3.13 × numpy 1.24/2)
replaces the dead .travis.yml. Builds per cell and runs pytest in
conda-build's clean test env.

Validation

pyright: 0 / 0 across ptsa/.
pytest (numpy 2.4.6 / pandas 3.0.3 / traits 7.1.0 / xarray 2026.4.0):
all green.
All 6 conda cells built locally (py3.10/3.11 × numpy 1.24/2,
py3.12/3.13 × numpy 2).
Full rhino suite passes 169 / 1 skip / 1 xfail on every one of the
6 cells (against real lab data). The 1 skip (TestCMLReaders) and 1
xfail (removed pre-3.0 DataChopper API) are pre-existing upstream
markers.

Notes

This PR is what exercises the macOS / Windows build matrix for the
first time — those have not been built in local testing.
Tag-triggered anaconda upload to the pennmem channel needs an
ANACONDA_TOKEN repo secret (not configured by this PR).

🤖 Generated with Claude Code

Adds a set of deterministic input/output fixtures and pytest cases that lock in the numerical output of MorletWaveletFilter (power, phase, complex), ButterworthFilter, ResampleFilter, and MonopolarToBipolarMapper against the 3.0.6 reference build. Each fixture is a small (<120 KB) .npz with a fixed seed and a metadata blob (ptsa/numpy/scipy/xarray versions, timestamp) so test failures say which baseline they're regressing against. The point is to catch kernel-level regressions that shape/dtype checks miss: a SWIG typemap drift, an FFTW planner change, a silent numpy-2 ABI shift, an internal scipy.signal.filtfilt edit. Tolerance is rtol=1e-12 atol=1e-15, which holds cross-env between the numpy 1.26 / scipy 1.17 generation env and a numpy 2.4 / scipy 1.17 test env (FFTW + filtfilt are bit-deterministic across numpy rebuilds when scipy stays put). Baseline is 3.0.6 because the published 3.0.4 binary's output='complex' morlet path is broken (np.complex removed in NumPy 1.24+; 3.0.5 fixed that and 3.0.6 fixes the wider numpy 2 / traits 7 / pandas 3 surface). 3.0.6 is the first release where every documented codepath actually runs end-to-end. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds 12 hand-coded ground-truth tests for MorletWaveletFilter that complement the bit-level regression suite added in 54cd753: * test_morlet_power_peaks_at_input_frequency: parametrized over 3 target frequencies (10, 20, 40 Hz) and 3 widths (4, 5, 7) -- the time-averaged power of a pure sinusoid at f0 lands in the log-spaced grid bin closest to f0, and the peak-to-median ratio exceeds 5x. Catches "we silently changed the wavelet formula" bugs that pure shape/regression tests would miss. * test_morlet_complex_magnitude_equals_power: |complex|^2 agrees with the 'power' output to rtol=1e-10. Catches divergence between the two output codepaths in the C++ kernel. * test_morlet_parallel_matches_serial: cpus=1 vs cpus=4 produces bit-identical output across 32 work items. Catches data races in MorletWaveletTransformMP / the ThreadPool. * test_morlet_phase_is_linear_in_time_at_input_freq: unwrapped phase climbs at 2*pi*f0 rad/sec within 0.5% relative error. Catches phase-sign and unwrap bugs. All seeded; failures are reproducible from a single seed value in the file. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds 6 hand-coded ground-truth tests for ButterworthFilter and ResampleFilter that complement the regression fixtures in 54cd753: * test_butterworth_stopband_attenuates_target_frequency: 60 Hz notch kills 60 Hz by > 40 dB while 5 and 30 Hz pass within 0.5 dB. (Probe shows actual attenuation is ~101 dB.) * test_butterworth_bandpass_isolates_inband_signal: [8, 12] band passes 10 Hz within 1 dB and attenuates 5 and 30 Hz by > 15 dB. * test_butterworth_is_zero_phase: cross-correlation of input vs output peaks at lag 0 (filtfilt is forward+backward). * test_butterworth_treats_channels_independently: stacked vs one-at-a-time filtering produces bit-identical per-channel output. * test_resample_preserves_low_frequency_content: 5 Hz @ 1000 Hz -> 100 Hz preserves the dominant FFT peak and amplitude within 5%. * test_resample_suppresses_aliasing_of_above_nyquist_content: 200 Hz @ 1000 Hz -> 100 Hz produces no spurious low-frequency peak (scipy's resample uses ideal brick-wall low-pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds a comprehensive import-and-call smoke test covering every public function/class in the PTSA surface, organized by submodule: ptsa.__version__, TimeSeries (constructor + every public method), ptsa.filt.buttfilt, ptsa.data.concat.concat, all 5 filters, all file-reader classes we have fixtures for (BaseEventReader, CMLEventReader, TalReader, TalStimOnlyReader, JsonIndexReader, LocReader, EDFRawReader), the readers we can't drive without lab data but can still import (BaseRawReader, BaseReader, BinaryRawReader, EEGReader, H5RawReader, NetCDF4XrayReader, ParamsReader), the SWIG morlet kernel directly (MorletWaveletTransform, MorletWaveletTransformMP, output-type enums), every public function in the SWIG circular_stat module (circ_diff, circ_diff_par, circ_mean, circ_diff_time_bins, resultant_vector, resultant_vector_length, compute_f_stat, compute_zscores), EDFFile, and ptsa.io.hdf5 helpers. Assertions are deliberately weak (shape, dtype, basic type) — the point is broad import-and-runs-without-crashing coverage that catches API drift in a downstream library (pandas, numpy, xarray, traits, scipy) before it shows up in some analysis script. The matching correctness assertions live in tests/test_morlet_correctness.py and tests/test_butterworth_resample_correctness.py. One xfail: DataChopper.filter assigns chopped_data_array['start_offsets'] = [i] to an array whose dims don't contain 'start_offsets'; modern xarray raises CoordinateValidationError. The rhino2b stack's xarray=2024.3.0 pin masks this; queued as a Tier-1 follow-up. Net coverage lift: 48% -> 54% (rough headline), ~69% on non-dead/non-vendored code. circular_stat goes 0% -> 55%, several filter and reader modules jump into the 80-98% range. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

traits.api.ListStr was deleted in traits >=7. PTSA only used it in one place: the chan_names trait on MonopolarToBipolarMapper. The cross-version-compatible replacement is List(Str), which works on both traits 6.x and 7.x. Why this matters for installs: without this fix, any environment that resolves to a recent traits version (which is the conda-forge default) fails to even import ptsa.data.filters -- before any actual filtering happens. The rhino2b.def install dance worked around it with `mamba install ... "traits<7"`; this commit lets that pin be dropped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

NumPy 2.0 removed two APIs PTSA was using: * numpy.find_common_type was removed outright; the recommended replacement is numpy.result_type, which behaves identically here because we only pass dtype-like values (no scalar promotion). Used by MatlabIO.get_np_format when collapsing the column dtypes of a MATLAB struct that has heterogeneous numeric fields. * numpy.core.records.fromarrays still works but emits a DeprecationWarning because numpy.core is private in NumPy 2 and may be renamed to numpy._core. The documented public entry point is numpy.rec.fromarrays, which has been available since the 1.x series. Used by MonopolarToBipolarMapper when the user passes bipolar pairs as a plain 2-D array. With these two changes the source tree runs cleanly against numpy 2.x once the C extensions are rebuilt against numpy 2 headers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pandas 3 (PDEP-14) changed the default string column dtype from object to StringDtype. TalReader.mkdtype walked each column of a DataFrame produced via pd.DataFrame.from_records and asked "is this an object dtype? if so, treat as a fixed-width string field; otherwise use the column's own dtype." That mapping fails under pandas 3 because string columns are no longer 'object' — the StringDtype slips through the != check and np.dtype() rejects it with `Cannot interpret '<StringDtype(...)>' as a data type`. pd.api.types.is_string_dtype matches both the legacy object representation and the new StringDtype, so the fix is a single replacement of the dtype check. Behavior on pandas <3 is unchanged. This lets the rhino2b stack drop its pandas=2.0.3 pin without breaking TalReader's JSON pairs path (and by extension, the classifier-style code that depends on it). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two changes to conda.recipe/meta.yaml's run: requirements that together collapse the rhino2b.def 4-step install into a single solve: 1. Runtime numpy is now declared as {{ pin_compatible('numpy', max_pin='x') }}. The previous bare `numpy` gave conda-build no upper bound, so mamba silently paired the precompiled extensions (linked against the NumPy 1.x C API) with NumPy 2.x at install time. The result was a runtime `_ARRAY_API not found` / `numpy.core.multiarray failed to import` from _morlet.so — an *install-time* solver success followed by an *import-time* failure. The full rhino2b.def stack accidentally constrained numpy<2 transitively via mne / sklearn / pandas / xarray pins, masking the issue; remove any of those pins and the bug returned. With pin_compatible, the binary will only resolve alongside numpy >=1.24,<2 (for the current 1.24-built artifact) and the install correctly refuses to mismatch. 2. pybind11 removed from run: requirements. It is a build-only dependency: the compiled edffile.so embeds the pybind11 headers at compile time and does not import them. Keeping pybind11 in run forced every installed env to pull it in unnecessarily and contributed to solver conflicts when other packages pinned a different pybind11 ABI. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three high-volume directories that the build/dev workflow generates but should never be tracked: * .pixi/ — pixi environment caches. test_pixi/oneshot-rhino2b/.pixi alone is ~3.7 GB. * .pytest_cache/ — pytest discovery + last-failed cache. * __pycache__/ — Python bytecode dirs. The existing *.py[co] glob catches the individual .pyc files but lets the empty dirs through. The previous test_pixi/.gitignore had a `.pixi/` entry but it only covers the top-level test_pixi/.pixi, not the per-env nested .pixi dirs (test_pixi/oneshot-rhino2b/.pixi/ etc.). Promoting to the root .gitignore picks up any nested pixi env directly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

3.0.6 ships the install-order fixes documented in the previous five commits: * f0c04b8 fix(filters): traits.api.ListStr -> List(Str) * 0c107f2 fix(numpy2): np.find_common_type + np.core.records replacements * 5bc1bfa fix(tal): pandas 3 StringDtype handling * 2eb8aad build(conda): pin numpy with pin_compatible, drop pybind11 Together these collapse the rhino2b.def 4-step mamba incantation into a single solve with no manual `numpy==1.24.4` or `traits<7` workaround pins. Verified in test_pixi/oneshot-rhino2b/ against the full lab stack and in ptsa_dev against numpy 2.4 / pandas 3.0 / traits 7.1 / xarray 2026.4. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes the to_hdf netcdf4 gap that surfaced in test_pixi/oneshot-rhino2b: TimeSeries.to_hdf passes engine='netcdf4' to xr.open_dataset, so without netcdf4 in the env xarray raises `unrecognized engine 'netcdf4'`. pandas and six were similarly satisfied only transitively. * Add pandas >=2.0,<5 (TalReader, from_mne_epochs, as_dataframe). * Add netcdf4 >=1.5,<3 (TimeSeries.to_hdf / from_hdf). * Add six (legacy py2/3 shim still used by readers/base.py, readers/binary.py, MatlabIO/__init__.py). * Floor-pin AND upper-cap the previously-unbounded run deps: - scipy >=1.0,<3 - xarray >=2024.3 (CalVer; no semantic major bump to cap) - traits >=6,<9 - h5py >=3,<5 - fftw 3.3.* (exact major.minor) * Expand the numpy pin_compatible comment to document explicitly that max_pin='x' produces e.g. `numpy >=1.24.4,<2.0a0` for the current build, so the runtime numpy is always capped at the next major after whatever we built against. Upper bounds defend against the next traits-ListStr / numpy-2-ABI class of bug: unbounded run deps in 3.0.4 silently paired ptsa with traits 7 (which deleted `ListStr`) and numpy 2 (which removed the legacy array API). Caps force a deliberate revalidation + rebuild before users get paired with a future ABI/API break. Rebuilt locally; the resulting binary's depends now declares all 11 required runtime packages explicitly with both lower and upper bounds. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Declare numpy and pybind11 as build-time requirements so pip's PEP 517 build-isolation env has them when setup.py runs (setup.py imports numpy at module load to call np.get_include()). Verified that pip install --no-build-isolation -e . still succeeds in ptsa_dev, and that a fresh venv with this pyproject.toml no longer fails at the numpy import in setup.py. Note: setup.py also imports ptsa at top level to read __version__, which is a separate bug under build_meta isolation; documented in README rather than fixed here (out of scope).

Lead with pip install --no-build-isolation . as the recommended source install path, retain python setup.py install as a legacy fallback, and list swig >= 4.1 plus libfftw3-dev as system prereqs pip cannot install. Also note the remaining setup.py limitation that blocks plain pip install . under PEP 517 build isolation.

On rhino2b, /etc/hostname reads "rhino2b" which splits to ['rhino2b']; the previous "rhino" in names check returned False and the helper raised OSError. Match any token that startswith "rhino" so the fixture works on rhino, rhino2, and rhino2b alike. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Modern pytest silently no-ops def setup(cls) on test classes, leaving class attributes like bdf_file_template unset and producing confusing AttributeError failures in the rhino-mode tests. setup_class is the documented class-level fixture name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The pre-3.0 "pass timeseries to constructor, call .filter() with no args" style was removed in 3.0. Update MonopolarToBipolarMapper and MorletWaveletFilter call sites to construct without timeseries and pass the data to .filter() instead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Assigning `chopped_data_array['start_offsets'] = [i]` raises CoordinateValidationError on xarray >= 2024 because start_offsets is not in the array's dims (only channels, time). Use expand_dims to add the dim and the coord in one step, preserving the legacy pre-2024 behavior where the per-event chunks were concatenated along a new start_offsets dim. Also drops the @pytest.mark.xfail decorator on the corresponding smoke test now that it passes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The module has been unimportable for years: it pulls in the non-existent ptsa.helper module, and scipy.signal.morlet (its other top-level import) was removed in SciPy 1.15. No live code in ptsa/, tests/test_*.py, or docs/ references it. The only stale references are commented-out imports in tests/ptsa_regression/, an uncollected legacy harness. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The C++ MorletWaveFFT::init kernel is the load-bearing piece of PTSA but its wavelet formula has been undocumented and unverifiable from the Python side. Port the cur_wave construction loop (morlet.cpp lines 30-77) to pure Python so that (a) users can inspect the actual wavelet PTSA convolves their signal with via the new get_time_domain_wavelet() helper, and (b) the FFT-based C++ output can be independently validated against a direct time-domain implementation. Empirically the two paths agree to machine precision; the test sweep landing in a later commit asserts that. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… morlet.cpp The width / complete parameters have user-facing names but no explanation of what they actually do, and the formula has been hiding inside the C++ source. Add a Formula section to the Python filter docstring (sphinx-friendly .. math::, both complete=False and complete=True branches), and a block comment above MorletWaveFFT::init that mirrors it. Cite Tallon-Baudry & Bertrand 1996, since complete=True follows their parameterisation and the default for it flipped to True in 2.0.6 with no docs update. Comment-only / docstring-only change; no behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

For every (freq, width, complete) combination in a 4 x 3 x 2 sweep: generate the wavelet via the new pure-Python time-domain port, convolve it with a seeded sinusoid-plus-noise signal, and assert the resulting power agrees with MorletWaveletFilter's FFT-based output on the interior of the signal (rtol=1e-3). Adds two sanity-check tests: autocorrelation peak lands at the center sample when the wavelet is fed in as the signal, and the public get_time_domain_wavelet helper is bit-identical to the internal reference. 26 new tests; full suite goes from 36 -> 62 passing without regression. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

PTSA's load-bearing feature is the Morlet wavelet transform but the published docs treated it as one filter among many. Add a canonical reference page covering the formula (standard + complete=True zero-mean correction), the width / complete / output parameters in the Tallon-Baudry parameterization, a runnable synthetic-data example, and a comparison table against scipy.signal.morlet and mne.time_frequency.morlet so users don't silently mix conventions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Restructure index.rst so the first thing readers see is that PTSA's primary use is fast Morlet wavelet power/phase computation on TimeSeries data. Promote the new morlet page above filters in the toctree, and demote the lab-specific readers to a 'Legacy lab-data readers (deprecated)' section near the bottom that points users at cmlreaders. Keep ramdata.rst in a hidden toctree so the legacy tutorial is still reachable but not crowding the landing page. ramdata.rst wraps its rhino-data code blocks in a sybil ``.. skip: start``/``.. skip: end`` pair so the next commit's pytest-via-sybil wiring doesn't try to execute reader calls that need /protocols/... mounts. conf.py excludes examples/*.ipynb and the published html/ snapshot from the default build because nbsphinx needs pandoc (not in ptsa_dev). Set PTSA_DOCS_BUILD_NOTEBOOKS=1 once pandoc is installed to re-enable notebook rendering. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

filters.rst was last meaningfully updated before the 3.0 release and still showed the pre-3.0 'pass timeseries to constructor' idiom (e.g. MonopolarToBipolarMapper(timeseries=...)). That constructor signature was removed in 3.0 so every code snippet on the page was either broken or misleading. Rewrite the page around the actual 3.0 API: construct with config, then call filter(ts). Cover MorletWaveletFilter, ButterworthFilter, ResampleFilter, MonopolarToBipolarMapper, and DataChopper with runnable synthetic-data snippets. Defer the wavelet math + parameterization comparison to the new :ref:morlet page rather than re-documenting it inline. api/extensions.rst: tag the pseudo-code Morlet snippet with a sybil ``.. skip: next`` directive so the next commit's documentation-examples runner doesn't try to execute ``signal = #... A 1D array`` as Python. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The published docs at pennmem.github.io/ptsa drifted out of sync with the 3.0 API because nothing actually ran the code examples in CI. Wire sybil into pytest so every '.. code-block:: python' block in docs/*.rst is collected and executed during a normal 'pytest' invocation. Any future API drift breaks the test suite immediately instead of silently rotting on the published site. * conftest.py at the repo root registers the Sybil-driven pytest_collect_file hook (sybil only honors that hook when it lives in a conftest, not in test_*.py). * setup.cfg adds 'docs' to testpaths so pytest walks the rst tree and the collect_file hook fires for each .rst file. * tests/test_documentation_examples.py holds developer-facing documentation of the wiring plus a sentinel test that fails loudly if docs/ disappears. Blocks that genuinely can't run in CI (rhino-only readers, the pseudo-code Morlet snippet in api/extensions.rst) are tagged in the rst with sybil '.. skip: ...' directives by the earlier commits in this series, so the runner is green out of the box: 33 collected / 33 passing alongside the existing 106 unit tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* .claude/ — local Claude Code session state (CLAUDE.md, scratch files, agent worktrees). Per-developer, not part of the repo. * test_pixi/ — pixi clean-room conda envs used for install-matrix verification. Currently ~5 GB across the per-numpy/per-stack envs; the manifests themselves live in this dir but the .pixi/ caches under each subdir explode in size. Both were silently untracked until now (nothing had been added inside either). Explicit ignore is clearer and prevents accidental commits if a manifest gets added later. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The ``.. automodule:: concat`` stub in ``docs/api/index.rst`` was the remnant of a half-finished refactor and resolved to no module on disk, so autodoc emitted ``ModuleNotFoundError: No module named 'concat'`` on every build. Use the fully qualified ``ptsa.data.concat`` name so the ``concat`` helper is documented at its real location. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

``docs/api/extensions.rst`` declared two ``.. function:: single_trial_ppc_all_features`` entries with different signatures, causing a sphinx "duplicate object description" warning. The second declaration's signature (the one with ``outsample_features``) matches ``single_trial_outsample_ppc_features`` in ``circular_stat.cpp``; rename the stub accordingly so the two extensions are documented under their real names. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

``convert_method_return_types`` built each wrapped method's docstring by prepending a leader line to ``xr.DataArray.<method>.__doc__`` with no indentation correction. Sphinx autodoc then parsed the mismatched indent as broken RST and emitted half a dozen "Unexpected section title" / "Inline strong start-string without end-string" errors per method. The xarray docstring drift across versions is not worth maintaining in PTSA's own docs; replace the dynamic concatenation with a short sentence that points users at the underlying ``xarray.DataArray`` method via a ``:meth:`` cross-reference. Same wrapping behaviour at runtime, no autodoc warnings. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The ``Keyword Arguments`` and ``Formula`` headings in ``MorletWaveletFilter`` were not recognised by Sphinx's napoleon extension, so each rendered as a plain RST section heading that broke the preceding ``Parameters`` field list and produced eight autodoc warnings (field-list / literal-block / unexpected-indentation). Merge the two parameter blocks into a single ``Parameters`` section (napoleon-recognised, optional marker on each keyword arg) and rename ``Formula`` to ``Notes`` so napoleon emits it as a normal note section without breaking the field list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Fully annotate ptsa/data/timeseries.py with numpy.typing / xarray types so downstream consumers get useful editor hints. Errors and warnings on the file under the repo's basic-mode pyrightconfig drop from 6 errors / 1 warning to 0 / 0. Fixes folded into the annotation pass (all preserve runtime behavior): * `to_hdf(mode=...)` typed as Literal["w", "a"] to match xarray's `Dataset.to_netcdf` NetcdfWriteModes overload. * `to_hdf` h5py presence check switched from `import h5py` to `importlib.util.find_spec("h5py")` so the unused-import warning goes away (the function body never used the module). * `_from_hdf_base64` / `from_hdf` widen the h5py file handle to `Any` for the `[key][()]` slice — typeshed's h5py stubs return `Datatype`, which pyright refuses to subscript, so a narrow widen here avoids scattering `# type: ignore`s. * `append`: `expand_dims(dim).assign_coords(**{dim: [0]})` becomes `assign_coords({dim: [0]})` — same call, but pyright now sees the Mapping overload instead of fitting `[0]` into the positional `coords` parameter. * `append`: `axis = int(...)` to coerce the numpy scalar returned by `np.where(...)[0][0]` to a Python int for `np.concatenate`'s axis= overload. No `cast()` introduced. The only narrow widenings are local `hfile_any: Any = hfile` / `coords_group: Any` aliases inside the HDF5 reader; documented inline. No `# type: ignore` comments added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Annotate every public method on BaseReader, BaseEventReader, and BaseRawReader with full parameter and return types. Replaces the ``from os.path import *`` wildcard with a narrow ``from os.path import join`` to keep one in-tree usage working. Trait declarations like ``filename = traits.api.Str`` confuse pyright because the descriptor class (``type[Str]``) is what appears on the class, while ``HasTraits`` exposes the runtime value type (``str``) on each instance. Each trait is now annotated with its runtime value type and tagged with a narrow ``# pyright: ignore[reportAssignmentType]`` on the declarative line; this preserves runtime trait behavior while making ``self.filename`` etc. type-check correctly throughout the methods. Other notable changes: * ``BaseEventReader.read_matlab`` casts the untyped result of ``read_single_matlab_matrix_as_numpy_structured_array`` to ``np.recarray`` (the function does construct a recarray internally, so this just informs the checker). * Boolean-mask slices of recarrays are re-narrowed via ``.view(np.recarray)`` to keep pyright's return type aligned with runtime behavior. * ``as_dataframe`` guards ``evs.dtype.fields`` (statically Optional) before the ``in`` test. * ``BaseRawReader.__init__`` calls ``np.asarray`` on ``channels`` and ``start_offsets`` so the static type matches the runtime ``CArray`` coercion ``HasTraits`` performs. Pyright on ptsa/data/readers/base.py: 26 errors + 1 warning -> 0/0. Test suite unchanged at 40 passed / 31 skipped (workshop env baseline; ptsa_dev env in this worktree fails at collection on a pre-existing, unrelated ``traits.api.ListStr`` issue in monopolar_to_bipolar_mapper). Narrow ignores added (all reportAssignmentType, all on declarative trait lines): * ptsa/data/readers/base.py:82-87 (BaseEventReader trait declarations) * ptsa/data/readers/base.py:434-438 (BaseRawReader trait declarations) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pin instance attribute types in __init__ (float/str/bool/NDArray) so pyright can reason about arithmetic and downstream calls. Annotate READER_FILETYPE_DICT as DefaultDict[str, Type[BaseRawReader]] to accept all three reader subclasses. Add Optional[np.recarray] to self.events, narrow with asserts inside methods that dereference it. Type all public method returns as TimeSeries. raw.py: add __all__ to make the BaseRawReader re-export explicit so pyright stops flagging it as unused. Three narrow pyright: ignore comments are retained where typing breaks down: five on the class-level traits.api.{CArray,CFloat,Str, Bool} descriptors (HasTraits metaclass converts type[Trait] to the underlying runtime type) and one on np.recarray[mask] subscripting (numpy stubs return ndarray, but recarray's __getitem__ preserves the subclass at runtime). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add type annotations to BinaryRawReader and NetCDF4XrayReader to eliminate pyright errors. Traits descriptor types (Str, CArray) on the base class confuse the type checker, so trait-backed instance attributes are aliased locally as Any before use. The filter-based indexing and len() calls in NetCDF4XrayReader.reconstruct_axis are preserved with narrow ignores; the underlying logic is a latent runtime bug worth fixing separately.

Add return-type and parameter annotations to three of the reader modules so pyright runs clean against them. Side fixes (necessary because they touched the same files): * tal.py: `mkdtype` now handles pandas-3 StringDtype columns the same way it does object columns (mirrors the 3.0.6 fix on master). * tal.py: `traits.api` field declarations are shadowed under TYPE_CHECKING with plain `str` / `bool` so pyright sees the bound attribute types instead of the descriptor classes. * tal.py: minor `.values` -> `np.asarray(...)` and `.loc[:, ...]` switches so pyright can pin down the pandas indexing return type. * params.py: same TYPE_CHECKING split for `filename` / `dataroot`; replace `from os.path import *` with explicit imports. No behaviour changes outside the StringDtype fix already on master.

Add explicit annotations to MatlabIO module to clear pyright errors and warnings. Resolve the runtime-broken `np.find_common_type` call (removed in NumPy 2) by switching to `np.result_type`. Drop dead example code (`dt = {...}; array_fd = np.recarray(...)`) inside `get_np_format` that was overwritten by the loop and never returned. Narrow `cast(DTypeLike, ...)` calls are needed where numpy stubs under-declare the runtime-accepted dict descriptor.

Annotate the public API of all six modules under ptsa/data/filters/ with PEP 484 type hints (npt.NDArray / npt.ArrayLike for arrays, TimeSeries for filter inputs/outputs, Sequence[str] for chan_names, etc.) so pyright sees signatures the lab's analysis pipeline expects. Also fixes a handful of pyright-flagged latent bugs that the modernization branch has separately addressed in 3.0.6: - traits.api.ListStr -> List(Str) (removed in traits >= 7) - np.core.records.fromarrays -> np.rec.fromarrays (removed in numpy 2) - np.arange(start=, stop=) -> positional args (no kw overload) Trait-instance scalars are read through ``trait_get`` so the declarative class attributes (CFloat, CArray, Int) do not bleed into the inferred arithmetic types. The only suppressed diagnostic is a single ``# pyright: ignore[reportArgumentType]`` on np.rec.fromarrays(names=...) where the numpy stubs annotate ``names`` as ``None`` despite accepting list[str] at runtime.

After cherry-picking 10 parallel pyright agents, 7 errors remained from boundary effects between independently-annotated files: * filt.py: widen FreqRangeLike from Sequence[float] to ArrayLike so TimeSeries-derived freq_range params (from timeseries.filtered) type-check. * butterworth.py: coerce buttfilt(timeseries.values, ...) instead of timeseries; add narrow `# pyright: ignore[reportArgumentType]` on traits.Str/Int descriptors which pyright can't model. * morlet.py: same traits-descriptor ignores on self.width and self.output_dim. * edf.py: `__main__` test code passes np.array([]) instead of [] to read_file (which is typed as NDArray). * path_utils.py: drop the unused enumerate variable. After this: pyright 0/0/0 on the full ptsa/ tree (down from baseline of 114 errors / 35 warnings). pytest still 166/0/31. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Configures pyright to use the ptsa_dev conda env at /home1/rdehaan/.conda/envs so it can resolve numpy/scipy/xarray/ traits/h5py/etc. for type-checking. Excludes vendored data/common/pathlib.py and SWIG-generated artifacts that no longer get rebuilt. Pairs with the 10-agent annotation pass that brought pyright from 114 errors / 35 warnings to 0/0 across the ptsa/ tree. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Strip ~20 lines of commented ``# print 'foo'`` Python-2 syntax debug stubs and a few alternative-implementation comments left over from the original MatlabIO port. The explanatory comments documenting *why* the live code works the way it does are preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Strip ~25 lines of commented-out alternative-implementation snippets left behind in ``MatlabIO/__init__.py``, ``readers/eeg.py``, ``readers/netcdf.py``, ``readers/tal.py``, and ``timeseries.py``. These are dead variants of code that already runs live a few lines away (e.g. a duplicate ``start_offsets = ...``, an old ``kind_2_type`` dict that differs only in one byte from the active one, a no-op rename). Explanatory comments documenting *why* the live code is shaped the way it is are preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

conda-build's default zips variant keys that appear in the stock conda-forge cbc, so a bare `python: [4 entries]` + `numpy: [2 entries]` declaration collapsed to a partial cell set instead of the full product. Declaring each desired (python, numpy) pair as one zipped column under an explicit `zip_keys: [[python, numpy]]` forces exactly the 6 cells we publish: (3.10, 1.24) (3.10, 2) (3.11, 1.24) (3.11, 2) (3.12, 2) (3.13, 2) numpy 1.24 × py3.12/3.13 are omitted (conda-forge never shipped numpy 1.24 for python >=3.12). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three real bugs surfaced by running the rhino test suite against the modern stack (xarray 2026.4, numpy 2.4) — all previously masked by the rhino2b.def `xarray=2024.3.0` pin. Each verified against real lab data on rhino: * eeg.py: EEGReader.read_session_data built the 'offsets' coord from `session_array['offsets']` (a DataArray). Modern xarray rejects constructing a coordinate variable from a DataArray ("ambiguous, use .data"). Pass `.data`. Fixes test_ButterwothFilter, test_monopolar_to_bipolar_filter, test_h5reader_full_session. * data_chopper.py: the earlier xarray-coord fix used `expand_dims(start_offsets=[i])`, which collides with a whole-session read that already carries a length-1 `start_offsets` dim ("Dimension start_offsets already exists"), and when forced also transposed the output. Now: relabel the existing dim's coord via assign_coords when present (preserving dim order), only expand_dims when absent. Fixes test_event_data_chopper and test_monopolar_to_bipolar_filter_and_data_chopper. * tal.py: TalReader.from_dict did `new_ts.sort(order='channel')`. numpy 2 performs a full-record void comparison when the structured array contains any object-dtype subfield (atlases.tal.region holds None for some contacts → inferred dtype 'O'), raising "'<' not supported between NoneType". Replaced with lexsort/argsort on the channel column(s) — identical ordering, only touches channel. Fixes test_talreader_on_database. Full rhino suite now passes (was 6 failed). NO_RHINO suite stays 166/0/31; pyright stays 0/0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Audited reader coverage: every public reader had a real load test EXCEPT three. Two are now covered; the third is dead code. * ParamsReader had no direct test (only exercised transitively via BinaryRawReader on rhino). Added tests/test_params_reader.py with small fixtures for the two live formats: - params.txt (whitespace key/value) — tests/data/params_fixtures/txt/ - sources.json (per-dataroot JSON sidecar) — tests/data/params_fixtures/json/ Plus gain-default + missing-samplerate + missing-file error paths, and one rhino test against a real sources.json. Non-rhino: 5 pass. * TalReader monopolar .mat path (struct_type='mono' → struct_name='talStruct') was untested — the existing rhino tal test only covers bipolar .mat + bipolar pairs.json, and the eeg_read mono helper reads contacts.json not a .mat. Added test_talreader_monopolar_mat against R1060M_talLocs_database_monopol.mat. Not covered (documented, not tested): * NetCDF4XrayReader — dead code: reconstruct_axis indexes a filter() generator with [0], a latent Python-3 runtime bug. No data format exists for it in-tree and it cannot run. Candidate for removal. * TalStimOnlyReader — reads virtualTalStruct from stim experiments; no such .mat found on rhino under the searched paths, so no real load test is possible right now. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The hand-written python×numpy zip_keys matrix in conda_build_config.yaml mis-rendered under conda-build 26.5: variant hashing cross-associated the zipped cells, producing scrambled build envs (a py3.10 cell pulling py3.12 numpy) and an empty py3.12 package (1 member, no ptsa files). Verified by unpacking the .conda outputs of a full-matrix `conda-build conda.recipe` run. Fix: remove the python/numpy matrix from conda_build_config.yaml entirely. The matrix is driven externally, one cell per conda-build invocation with explicit --python / --variants: * .github/workflows/build.yml already does this via strategy.matrix (so CI was never affected by the zip_keys bug — but the stale config could still interfere, now it can't). * maint/build_matrix.sh (new) does the same 6-cell loop for local builds. Verified a GH-style `conda-build conda.recipe --python=3.12 --variants "{numpy:['2']}"` now yields a proper 91-member package (40 .py + 3 cpython-312 .so) that imports cleanly and passes pytest (138 passed) in a clean py3.12 pixi env. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

review-notebook-app · 2026-05-27T06:41:21Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

The macOS (clang) and Windows (MSVC) conda builds fail at the morlet link step with 'ld: library fftw3 not found' because, unlike the Linux conda gcc activation, they don't implicitly put $PREFIX/lib on the linker search path. Add get_library_dirs() (the active conda prefix's lib dirs) and pass it as library_dirs on every pybind extension so -lfftw3 resolves on all platforms. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The Butterworth and resample regressions route through scipy (filtfilt / resample → LAPACK / pocketfft), whose round-off shifts across scipy/numpy/BLAS builds. A CI cell with scipy 1.17.1 drifts ~1.6e-9 from the fixture baseline, tripping the 1e-12 bound meant for our bit-deterministic FFTW morlet kernel. Give the scipy-backed fixtures their own rtol=1e-5/atol=1e-6 bound; keep morlet + m2b tight. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Two cross-platform CI failures: - Windows: bare 'conda-build' is not on PATH in the bash -el runner (the activated env's Scripts/ dir isn't exported), so the build step hit 'command not found'. Invoke via 'mamba run -n ptsa_build'. - All OSes: the smoke step ran 'python -c import ptsa' from the repo root, so CWD shadowed the installed package with the source tree (no compiled .so) → ModuleNotFoundError. cd into $RUNNER_TEMP first. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The Windows bash -el runner can't reach conda-build through the activated env: a bare `conda-build` isn't on PATH, and `mamba run` mangles the spawned command ('onda' is not recognized). Install conda-build into `base` and invoke the `conda build` subcommand through the always-on-PATH base conda instead — identical behavior on Linux/macOS (already green), and the only form that works on Windows. Switch the smoke step from `mamba run` to `conda run` for the same reason. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

conda-verify caps at python <3.12, so installing it into the base env (python 3.13) makes the solve fail — which regressed every cell after the switch to a base `conda build`. conda-build doesn't need it, so install conda-build alone; drop anaconda-client here too (only the deploy job uses it). Verified locally that python 3.13 + conda-build solves cleanly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Linux + macOS (all 12 cells) are green via the `conda build` subcommand; leave that path untouched. On the Windows bash -el runner only condabin/ is on PATH (not base's Scripts/), so `conda build` can't find the subcommand, a bare `conda-build` is not found, and `mamba run` mangles the command. Branch on $RUNNER_OS and invoke base's conda-build entry point by absolute path on Windows. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The absolute path resolved to a raw Windows path (C:\Users\...\Scripts \conda-build.exe) that git-bash can't exec ('No such file or directory'). Convert $CONDA with cygpath -u so the entry point is a unix-style path bash can run. Adds a one-line ls for diagnostics. Linux/macOS path unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Building the FFTW-linked C++ extensions through conda-build on the Windows GitHub runner hits a stack of MSYS/conda PATH quirks (bare conda-build not on PATH, mamba run command-mangling, conda-build entry point not discoverable), and the lab has never shipped a win-64 package. Windows users run PTSA under WSL, where the linux-64 package built and tested here works unchanged — so drop windows-latest and document WSL in the README instead. Linux + macOS (12 cells) stay green and the build step reverts to the plain `conda build` invocation. Verified before commit: full suite WITH rhino data (tests/ + docs/, NO_RHINO unset) on the current HEAD = 202 passed / 1 skipped / 1 xfailed / 0 failed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Riley16 and others added 30 commits May 26, 2026 15:43

Riley16 and others added 17 commits May 27, 2026 01:35

fix: restore version 3.0.6 (agent reverted while annotating)

8d57c7e

style(readers): type-annotate H5RawReader

0f53a3b

Riley16 and others added 3 commits May 27, 2026 07:21

Riley16 mentioned this pull request May 27, 2026

add library_dirs to extention #258

Closed

Riley16 and others added 5 commits May 27, 2026 07:31

Riley16 merged commit 10a9ceb into master May 27, 2026
13 checks passed

This was referenced May 27, 2026

docs: warning-free sphinx build + regenerate getting_started notebook #260

Merged

Make pip/editable installs work from source (version parsing + full install_requires) #261

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modernize build: numpy 2 support, single-shot conda install, pybind11, type annotations#259

Modernize build: numpy 2 support, single-shot conda install, pybind11, type annotations#259
Riley16 merged 74 commits into
masterfrom
update_build

Riley16 commented May 27, 2026

Uh oh!

review-notebook-app Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Riley16 commented May 27, 2026

Summary

What changed

Validation

Notes

Uh oh!

review-notebook-app Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant