Skip to content

Modernize build: numpy 2 support, single-shot conda install, pybind11, type annotations#259

Merged
Riley16 merged 74 commits into
masterfrom
update_build
May 27, 2026
Merged

Modernize build: numpy 2 support, single-shot conda install, pybind11, type annotations#259
Riley16 merged 74 commits into
masterfrom
update_build

Conversation

@Riley16
Copy link
Copy Markdown
Contributor

@Riley16 Riley16 commented May 27, 2026

Summary

Modernizes the PTSA build so it installs as a normal conda package in a
single mamba/conda solve — no more MNE-first / numpy==1.24.4 /
traits<7 install-order dance on rhino — and so the same source tree
builds and runs against both numpy 1.24 and numpy 2.x across Python
3.10–3.13.

What changed

Install-order fix (the core goal)

  • conda.recipe/meta.yaml: {{ pin_compatible('numpy', max_pin='x') }}
    caps each binary to the numpy major it was built against, and every
    runtime dep is now declared with bounded pins (scipy, xarray,
    traits, h5py, pandas, netcdf4, six, fftw). pybind11
    dropped from run: (build-only). Result: conda install ptsa <other-things> resolves in one shot.

Forward-compatibility (numpy 2 / pandas 3 / traits 7)

  • traits.api.ListStrList(Str) (deleted in traits 7).
  • np.find_common_typenp.result_type; np.core.records
    np.rec (numpy 2 removals).
  • TalReader.mkdtype handles pandas 3 StringDtype (PDEP-14).
  • DataChopper / eeg.py / tal.py fixes for modern xarray coord
    validation and numpy 2 structured-array sort semantics.

Extensions: SWIG → pybind11

  • morlet + circular_stat migrated off SWIG (and the vendored
    numpy.i); kernels unchanged, numerical output bit-identical to
    machine precision (validated by tests/test_morlet_formula.py).

Types + tests + docs

  • Full type annotations, pyright 0 errors / 0 warnings, .pyi stubs
    for the compiled extensions.
  • New tests: bit-level regression fixtures, independent morlet-formula
    validation, butterworth/resample correctness, reader coverage
    (ParamsReader, monopolar tal .mat), API smoke tests, and sybil
    execution of docs code-blocks.
  • New docs/morlet.rst documenting the wavelet formula.

CI

  • .github/workflows/build.yml (os × python 3.10–3.13 × numpy 1.24/2)
    replaces the dead .travis.yml. Builds per cell and runs pytest in
    conda-build's clean test env.

Validation

  • pyright: 0 / 0 across ptsa/.
  • pytest (numpy 2.4.6 / pandas 3.0.3 / traits 7.1.0 / xarray 2026.4.0):
    all green.
  • All 6 conda cells built locally (py3.10/3.11 × numpy 1.24/2,
    py3.12/3.13 × numpy 2).
  • Full rhino suite passes 169 / 1 skip / 1 xfail on every one of the
    6 cells
    (against real lab data). The 1 skip (TestCMLReaders) and 1
    xfail (removed pre-3.0 DataChopper API) are pre-existing upstream
    markers.

Notes

  • This PR is what exercises the macOS / Windows build matrix for the
    first time — those have not been built in local testing.
  • Tag-triggered anaconda upload to the pennmem channel needs an
    ANACONDA_TOKEN repo secret (not configured by this PR).

🤖 Generated with Claude Code

Riley16 and others added 30 commits May 26, 2026 15:43
Adds a set of deterministic input/output fixtures and pytest cases
that lock in the numerical output of MorletWaveletFilter (power,
phase, complex), ButterworthFilter, ResampleFilter, and
MonopolarToBipolarMapper against the 3.0.6 reference build. Each
fixture is a small (<120 KB) .npz with a fixed seed and a metadata
blob (ptsa/numpy/scipy/xarray versions, timestamp) so test failures
say which baseline they're regressing against.

The point is to catch kernel-level regressions that shape/dtype
checks miss: a SWIG typemap drift, an FFTW planner change, a silent
numpy-2 ABI shift, an internal scipy.signal.filtfilt edit. Tolerance
is rtol=1e-12 atol=1e-15, which holds cross-env between the
numpy 1.26 / scipy 1.17 generation env and a numpy 2.4 / scipy 1.17
test env (FFTW + filtfilt are bit-deterministic across numpy
rebuilds when scipy stays put).

Baseline is 3.0.6 because the published 3.0.4 binary's
output='complex' morlet path is broken (np.complex removed in
NumPy 1.24+; 3.0.5 fixed that and 3.0.6 fixes the wider numpy 2 /
traits 7 / pandas 3 surface). 3.0.6 is the first release where
every documented codepath actually runs end-to-end.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds 12 hand-coded ground-truth tests for MorletWaveletFilter that
complement the bit-level regression suite added in 54cd753:

* test_morlet_power_peaks_at_input_frequency: parametrized over 3
  target frequencies (10, 20, 40 Hz) and 3 widths (4, 5, 7) -- the
  time-averaged power of a pure sinusoid at f0 lands in the
  log-spaced grid bin closest to f0, and the peak-to-median ratio
  exceeds 5x. Catches "we silently changed the wavelet formula"
  bugs that pure shape/regression tests would miss.
* test_morlet_complex_magnitude_equals_power: |complex|^2 agrees
  with the 'power' output to rtol=1e-10. Catches divergence between
  the two output codepaths in the C++ kernel.
* test_morlet_parallel_matches_serial: cpus=1 vs cpus=4 produces
  bit-identical output across 32 work items. Catches data races in
  MorletWaveletTransformMP / the ThreadPool.
* test_morlet_phase_is_linear_in_time_at_input_freq: unwrapped
  phase climbs at 2*pi*f0 rad/sec within 0.5% relative error.
  Catches phase-sign and unwrap bugs.

All seeded; failures are reproducible from a single seed value in
the file.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds 6 hand-coded ground-truth tests for ButterworthFilter and
ResampleFilter that complement the regression fixtures in 54cd753:

* test_butterworth_stopband_attenuates_target_frequency: 60 Hz
  notch kills 60 Hz by > 40 dB while 5 and 30 Hz pass within
  0.5 dB. (Probe shows actual attenuation is ~101 dB.)
* test_butterworth_bandpass_isolates_inband_signal: [8, 12] band
  passes 10 Hz within 1 dB and attenuates 5 and 30 Hz by > 15 dB.
* test_butterworth_is_zero_phase: cross-correlation of input vs
  output peaks at lag 0 (filtfilt is forward+backward).
* test_butterworth_treats_channels_independently: stacked vs
  one-at-a-time filtering produces bit-identical per-channel output.
* test_resample_preserves_low_frequency_content: 5 Hz @ 1000 Hz
  -> 100 Hz preserves the dominant FFT peak and amplitude within
  5%.
* test_resample_suppresses_aliasing_of_above_nyquist_content:
  200 Hz @ 1000 Hz -> 100 Hz produces no spurious low-frequency
  peak (scipy's resample uses ideal brick-wall low-pass).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a comprehensive import-and-call smoke test covering every
public function/class in the PTSA surface, organized by submodule:
ptsa.__version__, TimeSeries (constructor + every public method),
ptsa.filt.buttfilt, ptsa.data.concat.concat, all 5 filters, all
file-reader classes we have fixtures for (BaseEventReader,
CMLEventReader, TalReader, TalStimOnlyReader, JsonIndexReader,
LocReader, EDFRawReader), the readers we can't drive without lab
data but can still import (BaseRawReader, BaseReader,
BinaryRawReader, EEGReader, H5RawReader, NetCDF4XrayReader,
ParamsReader), the SWIG morlet kernel directly
(MorletWaveletTransform, MorletWaveletTransformMP, output-type
enums), every public function in the SWIG circular_stat module
(circ_diff, circ_diff_par, circ_mean, circ_diff_time_bins,
resultant_vector, resultant_vector_length, compute_f_stat,
compute_zscores), EDFFile, and ptsa.io.hdf5 helpers.

Assertions are deliberately weak (shape, dtype, basic type) —
the point is broad import-and-runs-without-crashing coverage that
catches API drift in a downstream library (pandas, numpy, xarray,
traits, scipy) before it shows up in some analysis script. The
matching correctness assertions live in
tests/test_morlet_correctness.py and tests/test_butterworth_resample_correctness.py.

One xfail: DataChopper.filter assigns
chopped_data_array['start_offsets'] = [i] to an array whose dims
don't contain 'start_offsets'; modern xarray raises
CoordinateValidationError. The rhino2b stack's xarray=2024.3.0 pin
masks this; queued as a Tier-1 follow-up.

Net coverage lift: 48% -> 54% (rough headline), ~69% on
non-dead/non-vendored code. circular_stat goes 0% -> 55%, several
filter and reader modules jump into the 80-98% range.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
traits.api.ListStr was deleted in traits >=7. PTSA only used it in
one place: the chan_names trait on MonopolarToBipolarMapper. The
cross-version-compatible replacement is List(Str), which works on
both traits 6.x and 7.x.

Why this matters for installs: without this fix, any environment
that resolves to a recent traits version (which is the conda-forge
default) fails to even import ptsa.data.filters -- before any
actual filtering happens. The rhino2b.def install dance worked
around it with `mamba install ... "traits<7"`; this commit lets
that pin be dropped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
NumPy 2.0 removed two APIs PTSA was using:

* numpy.find_common_type was removed outright; the recommended
  replacement is numpy.result_type, which behaves identically here
  because we only pass dtype-like values (no scalar promotion).
  Used by MatlabIO.get_np_format when collapsing the column dtypes
  of a MATLAB struct that has heterogeneous numeric fields.
* numpy.core.records.fromarrays still works but emits a
  DeprecationWarning because numpy.core is private in NumPy 2 and
  may be renamed to numpy._core. The documented public entry point
  is numpy.rec.fromarrays, which has been available since the 1.x
  series. Used by MonopolarToBipolarMapper when the user passes
  bipolar pairs as a plain 2-D array.

With these two changes the source tree runs cleanly against numpy
2.x once the C extensions are rebuilt against numpy 2 headers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pandas 3 (PDEP-14) changed the default string column dtype from
object to StringDtype. TalReader.mkdtype walked each column of a
DataFrame produced via pd.DataFrame.from_records and asked
"is this an object dtype? if so, treat as a fixed-width string
field; otherwise use the column's own dtype." That mapping fails
under pandas 3 because string columns are no longer 'object' — the
StringDtype slips through the != check and np.dtype() rejects it
with `Cannot interpret '<StringDtype(...)>' as a data type`.

pd.api.types.is_string_dtype matches both the legacy object
representation and the new StringDtype, so the fix is a single
replacement of the dtype check. Behavior on pandas <3 is
unchanged.

This lets the rhino2b stack drop its pandas=2.0.3 pin without
breaking TalReader's JSON pairs path (and by extension, the
classifier-style code that depends on it).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two changes to conda.recipe/meta.yaml's run: requirements that
together collapse the rhino2b.def 4-step install into a single
solve:

1. Runtime numpy is now declared as
   {{ pin_compatible('numpy', max_pin='x') }}. The previous bare
   `numpy` gave conda-build no upper bound, so mamba silently
   paired the precompiled extensions (linked against the NumPy 1.x
   C API) with NumPy 2.x at install time. The result was a runtime
   `_ARRAY_API not found` / `numpy.core.multiarray failed to
   import` from _morlet.so — an *install-time* solver success
   followed by an *import-time* failure. The full rhino2b.def
   stack accidentally constrained numpy<2 transitively via mne /
   sklearn / pandas / xarray pins, masking the issue; remove any
   of those pins and the bug returned. With pin_compatible, the
   binary will only resolve alongside numpy >=1.24,<2 (for the
   current 1.24-built artifact) and the install correctly refuses
   to mismatch.

2. pybind11 removed from run: requirements. It is a build-only
   dependency: the compiled edffile.so embeds the pybind11
   headers at compile time and does not import them. Keeping
   pybind11 in run forced every installed env to pull it in
   unnecessarily and contributed to solver conflicts when other
   packages pinned a different pybind11 ABI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three high-volume directories that the build/dev workflow generates
but should never be tracked:

* .pixi/ — pixi environment caches. test_pixi/oneshot-rhino2b/.pixi
  alone is ~3.7 GB.
* .pytest_cache/ — pytest discovery + last-failed cache.
* __pycache__/ — Python bytecode dirs. The existing *.py[co] glob
  catches the individual .pyc files but lets the empty dirs through.

The previous test_pixi/.gitignore had a `.pixi/` entry but it only
covers the top-level test_pixi/.pixi, not the per-env nested
.pixi dirs (test_pixi/oneshot-rhino2b/.pixi/ etc.). Promoting to the
root .gitignore picks up any nested pixi env directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3.0.6 ships the install-order fixes documented in the previous five
commits:
* f0c04b8 fix(filters): traits.api.ListStr -> List(Str)
* 0c107f2 fix(numpy2): np.find_common_type + np.core.records replacements
* 5bc1bfa fix(tal): pandas 3 StringDtype handling
* 2eb8aad build(conda): pin numpy with pin_compatible, drop pybind11

Together these collapse the rhino2b.def 4-step mamba incantation
into a single solve with no manual `numpy==1.24.4` or `traits<7`
workaround pins. Verified in test_pixi/oneshot-rhino2b/ against the
full lab stack and in ptsa_dev against numpy 2.4 / pandas 3.0 /
traits 7.1 / xarray 2026.4.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the to_hdf netcdf4 gap that surfaced in
test_pixi/oneshot-rhino2b: TimeSeries.to_hdf passes
engine='netcdf4' to xr.open_dataset, so without netcdf4 in the
env xarray raises `unrecognized engine 'netcdf4'`. pandas and six
were similarly satisfied only transitively.

* Add pandas >=2.0,<5 (TalReader, from_mne_epochs, as_dataframe).
* Add netcdf4 >=1.5,<3 (TimeSeries.to_hdf / from_hdf).
* Add six (legacy py2/3 shim still used by readers/base.py,
  readers/binary.py, MatlabIO/__init__.py).
* Floor-pin AND upper-cap the previously-unbounded run deps:
  - scipy >=1.0,<3
  - xarray >=2024.3  (CalVer; no semantic major bump to cap)
  - traits >=6,<9
  - h5py >=3,<5
  - fftw 3.3.*       (exact major.minor)
* Expand the numpy pin_compatible comment to document explicitly
  that max_pin='x' produces e.g. `numpy >=1.24.4,<2.0a0` for the
  current build, so the runtime numpy is always capped at the next
  major after whatever we built against.

Upper bounds defend against the next traits-ListStr / numpy-2-ABI
class of bug: unbounded run deps in 3.0.4 silently paired ptsa
with traits 7 (which deleted `ListStr`) and numpy 2 (which removed
the legacy array API). Caps force a deliberate revalidation +
rebuild before users get paired with a future ABI/API break.

Rebuilt locally; the resulting binary's depends now declares all
11 required runtime packages explicitly with both lower and upper
bounds.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Declare numpy and pybind11 as build-time requirements so pip's PEP 517
build-isolation env has them when setup.py runs (setup.py imports
numpy at module load to call np.get_include()).

Verified that pip install --no-build-isolation -e . still succeeds in
ptsa_dev, and that a fresh venv with this pyproject.toml no longer
fails at the numpy import in setup.py.

Note: setup.py also imports ptsa at top level to read __version__,
which is a separate bug under build_meta isolation; documented in
README rather than fixed here (out of scope).
Lead with pip install --no-build-isolation . as the recommended source
install path, retain python setup.py install as a legacy fallback, and
list swig >= 4.1 plus libfftw3-dev as system prereqs pip cannot
install. Also note the remaining setup.py limitation that blocks plain
pip install . under PEP 517 build isolation.
On rhino2b, /etc/hostname reads "rhino2b" which splits to ['rhino2b'];
the previous "rhino" in names check returned False and the helper
raised OSError. Match any token that startswith "rhino" so the
fixture works on rhino, rhino2, and rhino2b alike.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Modern pytest silently no-ops def setup(cls) on test classes, leaving
class attributes like bdf_file_template unset and producing confusing
AttributeError failures in the rhino-mode tests. setup_class is the
documented class-level fixture name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The pre-3.0 "pass timeseries to constructor, call .filter() with no
args" style was removed in 3.0. Update MonopolarToBipolarMapper and
MorletWaveletFilter call sites to construct without timeseries and
pass the data to .filter() instead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Assigning `chopped_data_array['start_offsets'] = [i]` raises
CoordinateValidationError on xarray >= 2024 because start_offsets is
not in the array's dims (only channels, time). Use expand_dims to
add the dim and the coord in one step, preserving the legacy
pre-2024 behavior where the per-event chunks were concatenated along
a new start_offsets dim.

Also drops the @pytest.mark.xfail decorator on the corresponding
smoke test now that it passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The module has been unimportable for years: it pulls in the
non-existent ptsa.helper module, and scipy.signal.morlet (its other
top-level import) was removed in SciPy 1.15. No live code in
ptsa/, tests/test_*.py, or docs/ references it. The only stale
references are commented-out imports in tests/ptsa_regression/, an
uncollected legacy harness.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The C++ MorletWaveFFT::init kernel is the load-bearing piece of PTSA
but its wavelet formula has been undocumented and unverifiable from
the Python side. Port the cur_wave construction loop (morlet.cpp
lines 30-77) to pure Python so that (a) users can inspect the actual
wavelet PTSA convolves their signal with via the new
get_time_domain_wavelet() helper, and (b) the FFT-based C++ output
can be independently validated against a direct time-domain
implementation. Empirically the two paths agree to machine
precision; the test sweep landing in a later commit asserts that.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… morlet.cpp

The width / complete parameters have user-facing names but no
explanation of what they actually do, and the formula has been
hiding inside the C++ source. Add a Formula section to the Python
filter docstring (sphinx-friendly .. math::, both complete=False
and complete=True branches), and a block comment above
MorletWaveFFT::init that mirrors it. Cite Tallon-Baudry & Bertrand
1996, since complete=True follows their parameterisation and the
default for it flipped to True in 2.0.6 with no docs update.

Comment-only / docstring-only change; no behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
For every (freq, width, complete) combination in a 4 x 3 x 2 sweep:
generate the wavelet via the new pure-Python time-domain port,
convolve it with a seeded sinusoid-plus-noise signal, and assert the
resulting power agrees with MorletWaveletFilter's FFT-based output
on the interior of the signal (rtol=1e-3). Adds two
sanity-check tests: autocorrelation peak lands at the center sample
when the wavelet is fed in as the signal, and the public
get_time_domain_wavelet helper is bit-identical to the internal
reference. 26 new tests; full suite goes from 36 -> 62 passing
without regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PTSA's load-bearing feature is the Morlet wavelet transform but
the published docs treated it as one filter among many. Add a
canonical reference page covering the formula (standard +
complete=True zero-mean correction), the width / complete /
output parameters in the Tallon-Baudry parameterization, a
runnable synthetic-data example, and a comparison table against
scipy.signal.morlet and mne.time_frequency.morlet so users don't
silently mix conventions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Restructure index.rst so the first thing readers see is that
PTSA's primary use is fast Morlet wavelet power/phase computation
on TimeSeries data. Promote the new morlet page above filters in
the toctree, and demote the lab-specific readers to a 'Legacy
lab-data readers (deprecated)' section near the bottom that
points users at cmlreaders. Keep ramdata.rst in a hidden toctree
so the legacy tutorial is still reachable but not crowding the
landing page.

ramdata.rst wraps its rhino-data code blocks in a sybil
``.. skip: start``/``.. skip: end`` pair so the next commit's
pytest-via-sybil wiring doesn't try to execute reader calls that
need /protocols/... mounts.

conf.py excludes examples/*.ipynb and the published html/
snapshot from the default build because nbsphinx needs pandoc
(not in ptsa_dev). Set PTSA_DOCS_BUILD_NOTEBOOKS=1 once pandoc
is installed to re-enable notebook rendering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
filters.rst was last meaningfully updated before the 3.0 release
and still showed the pre-3.0 'pass timeseries to constructor'
idiom (e.g. MonopolarToBipolarMapper(timeseries=...)). That
constructor signature was removed in 3.0 so every code snippet
on the page was either broken or misleading.

Rewrite the page around the actual 3.0 API: construct with
config, then call filter(ts). Cover MorletWaveletFilter,
ButterworthFilter, ResampleFilter, MonopolarToBipolarMapper,
and DataChopper with runnable synthetic-data snippets. Defer
the wavelet math + parameterization comparison to the new
:ref:morlet page rather than re-documenting it inline.

api/extensions.rst: tag the pseudo-code Morlet snippet with a
sybil ``.. skip: next`` directive so the next commit's
documentation-examples runner doesn't try to execute
``signal = #... A 1D array`` as Python.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The published docs at pennmem.github.io/ptsa drifted out of sync
with the 3.0 API because nothing actually ran the code examples
in CI. Wire sybil into pytest so every '.. code-block:: python'
block in docs/*.rst is collected and executed during a normal
'pytest' invocation. Any future API drift breaks the test suite
immediately instead of silently rotting on the published site.

* conftest.py at the repo root registers the Sybil-driven
  pytest_collect_file hook (sybil only honors that hook when
  it lives in a conftest, not in test_*.py).
* setup.cfg adds 'docs' to testpaths so pytest walks the rst
  tree and the collect_file hook fires for each .rst file.
* tests/test_documentation_examples.py holds developer-facing
  documentation of the wiring plus a sentinel test that fails
  loudly if docs/ disappears.

Blocks that genuinely can't run in CI (rhino-only readers, the
pseudo-code Morlet snippet in api/extensions.rst) are tagged in
the rst with sybil '.. skip: ...' directives by the earlier
commits in this series, so the runner is green out of the box:
33 collected / 33 passing alongside the existing 106 unit tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* .claude/ — local Claude Code session state (CLAUDE.md, scratch
  files, agent worktrees). Per-developer, not part of the repo.
* test_pixi/ — pixi clean-room conda envs used for install-matrix
  verification. Currently ~5 GB across the per-numpy/per-stack
  envs; the manifests themselves live in this dir but the .pixi/
  caches under each subdir explode in size.

Both were silently untracked until now (nothing had been added
inside either). Explicit ignore is clearer and prevents accidental
commits if a manifest gets added later.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``.. automodule:: concat`` stub in ``docs/api/index.rst`` was the
remnant of a half-finished refactor and resolved to no module on disk,
so autodoc emitted ``ModuleNotFoundError: No module named 'concat'``
on every build. Use the fully qualified ``ptsa.data.concat`` name so
the ``concat`` helper is documented at its real location.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
``docs/api/extensions.rst`` declared two ``..  function::
single_trial_ppc_all_features`` entries with different signatures,
causing a sphinx "duplicate object description" warning. The second
declaration's signature (the one with ``outsample_features``) matches
``single_trial_outsample_ppc_features`` in
``circular_stat.cpp``; rename the stub accordingly so the two
extensions are documented under their real names.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
``convert_method_return_types`` built each wrapped method's docstring
by prepending a leader line to ``xr.DataArray.<method>.__doc__`` with
no indentation correction. Sphinx autodoc then parsed the mismatched
indent as broken RST and emitted half a dozen "Unexpected section
title" / "Inline strong start-string without end-string" errors per
method.

The xarray docstring drift across versions is not worth maintaining
in PTSA's own docs; replace the dynamic concatenation with a short
sentence that points users at the underlying ``xarray.DataArray``
method via a ``:meth:`` cross-reference. Same wrapping behaviour at
runtime, no autodoc warnings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``Keyword Arguments`` and ``Formula`` headings in
``MorletWaveletFilter`` were not recognised by Sphinx's napoleon
extension, so each rendered as a plain RST section heading that broke
the preceding ``Parameters`` field list and produced eight autodoc
warnings (field-list / literal-block / unexpected-indentation).

Merge the two parameter blocks into a single ``Parameters`` section
(napoleon-recognised, optional marker on each keyword arg) and rename
``Formula`` to ``Notes`` so napoleon emits it as a normal note section
without breaking the field list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Riley16 and others added 17 commits May 27, 2026 01:35
Fully annotate ptsa/data/timeseries.py with numpy.typing / xarray
types so downstream consumers get useful editor hints. Errors and
warnings on the file under the repo's basic-mode pyrightconfig drop
from 6 errors / 1 warning to 0 / 0.

Fixes folded into the annotation pass (all preserve runtime behavior):

* `to_hdf(mode=...)` typed as Literal["w", "a"] to match xarray's
  `Dataset.to_netcdf` NetcdfWriteModes overload.
* `to_hdf` h5py presence check switched from `import h5py` to
  `importlib.util.find_spec("h5py")` so the unused-import warning
  goes away (the function body never used the module).
* `_from_hdf_base64` / `from_hdf` widen the h5py file handle to
  `Any` for the `[key][()]` slice — typeshed's h5py stubs return
  `Datatype`, which pyright refuses to subscript, so a narrow widen
  here avoids scattering `# type: ignore`s.
* `append`: `expand_dims(dim).assign_coords(**{dim: [0]})` becomes
  `assign_coords({dim: [0]})` — same call, but pyright now sees the
  Mapping overload instead of fitting `[0]` into the positional
  `coords` parameter.
* `append`: `axis = int(...)` to coerce the numpy scalar returned
  by `np.where(...)[0][0]` to a Python int for `np.concatenate`'s
  axis= overload.

No `cast()` introduced. The only narrow widenings are local
`hfile_any: Any = hfile` / `coords_group: Any` aliases inside the
HDF5 reader; documented inline. No `# type: ignore` comments added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Annotate every public method on BaseReader, BaseEventReader, and
BaseRawReader with full parameter and return types. Replaces the
``from os.path import *`` wildcard with a narrow ``from os.path
import join`` to keep one in-tree usage working.

Trait declarations like ``filename = traits.api.Str`` confuse pyright
because the descriptor class (``type[Str]``) is what appears on the
class, while ``HasTraits`` exposes the runtime value type (``str``) on
each instance. Each trait is now annotated with its runtime value type
and tagged with a narrow ``# pyright: ignore[reportAssignmentType]``
on the declarative line; this preserves runtime trait behavior while
making ``self.filename`` etc. type-check correctly throughout the
methods.

Other notable changes:

* ``BaseEventReader.read_matlab`` casts the untyped result of
  ``read_single_matlab_matrix_as_numpy_structured_array`` to
  ``np.recarray`` (the function does construct a recarray internally,
  so this just informs the checker).
* Boolean-mask slices of recarrays are re-narrowed via ``.view(np.recarray)``
  to keep pyright's return type aligned with runtime behavior.
* ``as_dataframe`` guards ``evs.dtype.fields`` (statically Optional)
  before the ``in`` test.
* ``BaseRawReader.__init__`` calls ``np.asarray`` on ``channels`` and
  ``start_offsets`` so the static type matches the runtime ``CArray``
  coercion ``HasTraits`` performs.

Pyright on ptsa/data/readers/base.py: 26 errors + 1 warning -> 0/0.
Test suite unchanged at 40 passed / 31 skipped (workshop env baseline;
ptsa_dev env in this worktree fails at collection on a pre-existing,
unrelated ``traits.api.ListStr`` issue in monopolar_to_bipolar_mapper).

Narrow ignores added (all reportAssignmentType, all on declarative
trait lines):

* ptsa/data/readers/base.py:82-87 (BaseEventReader trait declarations)
* ptsa/data/readers/base.py:434-438 (BaseRawReader trait declarations)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pin instance attribute types in __init__ (float/str/bool/NDArray) so
pyright can reason about arithmetic and downstream calls. Annotate
READER_FILETYPE_DICT as DefaultDict[str, Type[BaseRawReader]] to
accept all three reader subclasses. Add Optional[np.recarray] to
self.events, narrow with asserts inside methods that dereference it.
Type all public method returns as TimeSeries.

raw.py: add __all__ to make the BaseRawReader re-export explicit so
pyright stops flagging it as unused.

Three narrow pyright: ignore comments are retained where typing
breaks down: five on the class-level traits.api.{CArray,CFloat,Str,
Bool} descriptors (HasTraits metaclass converts type[Trait] to the
underlying runtime type) and one on np.recarray[mask] subscripting
(numpy stubs return ndarray, but recarray's __getitem__ preserves
the subclass at runtime).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add type annotations to BinaryRawReader and NetCDF4XrayReader to
eliminate pyright errors. Traits descriptor types (Str, CArray) on
the base class confuse the type checker, so trait-backed instance
attributes are aliased locally as Any before use. The filter-based
indexing and len() calls in NetCDF4XrayReader.reconstruct_axis are
preserved with narrow ignores; the underlying logic is a latent
runtime bug worth fixing separately.
Add return-type and parameter annotations to three of the reader
modules so pyright runs clean against them. Side fixes (necessary
because they touched the same files):

* tal.py: `mkdtype` now handles pandas-3 StringDtype columns the same
  way it does object columns (mirrors the 3.0.6 fix on master).
* tal.py: `traits.api` field declarations are shadowed under
  TYPE_CHECKING with plain `str` / `bool` so pyright sees the bound
  attribute types instead of the descriptor classes.
* tal.py: minor `.values` -> `np.asarray(...)` and `.loc[:, ...]`
  switches so pyright can pin down the pandas indexing return type.
* params.py: same TYPE_CHECKING split for `filename` / `dataroot`;
  replace `from os.path import *` with explicit imports.

No behaviour changes outside the StringDtype fix already on master.
Add explicit annotations to MatlabIO module to clear pyright errors
and warnings. Resolve the runtime-broken `np.find_common_type` call
(removed in NumPy 2) by switching to `np.result_type`. Drop dead
example code (`dt = {...}; array_fd = np.recarray(...)`) inside
`get_np_format` that was overwritten by the loop and never returned.
Narrow `cast(DTypeLike, ...)` calls are needed where numpy stubs
under-declare the runtime-accepted dict descriptor.
Annotate the public API of all six modules under ptsa/data/filters/
with PEP 484 type hints (npt.NDArray / npt.ArrayLike for arrays,
TimeSeries for filter inputs/outputs, Sequence[str] for chan_names,
etc.) so pyright sees signatures the lab's analysis pipeline expects.

Also fixes a handful of pyright-flagged latent bugs that the
modernization branch has separately addressed in 3.0.6:
- traits.api.ListStr -> List(Str) (removed in traits >= 7)
- np.core.records.fromarrays -> np.rec.fromarrays (removed in numpy 2)
- np.arange(start=, stop=) -> positional args (no kw overload)
Trait-instance scalars are read through ``trait_get`` so the
declarative class attributes (CFloat, CArray, Int) do not bleed into
the inferred arithmetic types.

The only suppressed diagnostic is a single
``# pyright: ignore[reportArgumentType]`` on np.rec.fromarrays(names=...)
where the numpy stubs annotate ``names`` as ``None`` despite accepting
list[str] at runtime.
After cherry-picking 10 parallel pyright agents, 7 errors remained
from boundary effects between independently-annotated files:

* filt.py: widen FreqRangeLike from Sequence[float] to ArrayLike so
  TimeSeries-derived freq_range params (from timeseries.filtered)
  type-check.
* butterworth.py: coerce buttfilt(timeseries.values, ...) instead of
  timeseries; add narrow `# pyright: ignore[reportArgumentType]` on
  traits.Str/Int descriptors which pyright can't model.
* morlet.py: same traits-descriptor ignores on self.width and
  self.output_dim.
* edf.py: `__main__` test code passes np.array([]) instead of [] to
  read_file (which is typed as NDArray).
* path_utils.py: drop the unused enumerate variable.

After this: pyright 0/0/0 on the full ptsa/ tree (down from baseline
of 114 errors / 35 warnings). pytest still 166/0/31.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Configures pyright to use the ptsa_dev conda env at
/home1/rdehaan/.conda/envs so it can resolve numpy/scipy/xarray/
traits/h5py/etc. for type-checking. Excludes vendored
data/common/pathlib.py and SWIG-generated artifacts that no
longer get rebuilt.

Pairs with the 10-agent annotation pass that brought pyright from
114 errors / 35 warnings to 0/0 across the ptsa/ tree.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Strip ~20 lines of commented ``# print 'foo'`` Python-2 syntax debug
stubs and a few alternative-implementation comments left over from
the original MatlabIO port. The explanatory comments documenting
*why* the live code works the way it does are preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Strip ~25 lines of commented-out alternative-implementation snippets
left behind in ``MatlabIO/__init__.py``, ``readers/eeg.py``,
``readers/netcdf.py``, ``readers/tal.py``, and ``timeseries.py``. These
are dead variants of code that already runs live a few lines away
(e.g. a duplicate ``start_offsets = ...``, an old ``kind_2_type`` dict
that differs only in one byte from the active one, a no-op rename).
Explanatory comments documenting *why* the live code is shaped the
way it is are preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
conda-build's default zips variant keys that appear in the stock
conda-forge cbc, so a bare `python: [4 entries]` + `numpy: [2 entries]`
declaration collapsed to a partial cell set instead of the full
product. Declaring each desired (python, numpy) pair as one zipped
column under an explicit `zip_keys: [[python, numpy]]` forces exactly
the 6 cells we publish:
  (3.10, 1.24) (3.10, 2) (3.11, 1.24) (3.11, 2) (3.12, 2) (3.13, 2)
numpy 1.24 × py3.12/3.13 are omitted (conda-forge never shipped
numpy 1.24 for python >=3.12).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three real bugs surfaced by running the rhino test suite against
the modern stack (xarray 2026.4, numpy 2.4) — all previously masked
by the rhino2b.def `xarray=2024.3.0` pin. Each verified against real
lab data on rhino:

* eeg.py: EEGReader.read_session_data built the 'offsets' coord from
  `session_array['offsets']` (a DataArray). Modern xarray rejects
  constructing a coordinate variable from a DataArray ("ambiguous,
  use .data"). Pass `.data`. Fixes test_ButterwothFilter,
  test_monopolar_to_bipolar_filter, test_h5reader_full_session.

* data_chopper.py: the earlier xarray-coord fix used
  `expand_dims(start_offsets=[i])`, which collides with a
  whole-session read that already carries a length-1 `start_offsets`
  dim ("Dimension start_offsets already exists"), and when forced
  also transposed the output. Now: relabel the existing dim's coord
  via assign_coords when present (preserving dim order), only
  expand_dims when absent. Fixes test_event_data_chopper and
  test_monopolar_to_bipolar_filter_and_data_chopper.

* tal.py: TalReader.from_dict did `new_ts.sort(order='channel')`.
  numpy 2 performs a full-record void comparison when the structured
  array contains any object-dtype subfield (atlases.tal.region holds
  None for some contacts → inferred dtype 'O'), raising "'<' not
  supported between NoneType". Replaced with lexsort/argsort on the
  channel column(s) — identical ordering, only touches channel.
  Fixes test_talreader_on_database.

Full rhino suite now passes (was 6 failed). NO_RHINO suite stays
166/0/31; pyright stays 0/0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Audited reader coverage: every public reader had a real load test
EXCEPT three. Two are now covered; the third is dead code.

* ParamsReader had no direct test (only exercised transitively via
  BinaryRawReader on rhino). Added tests/test_params_reader.py with
  small fixtures for the two live formats:
  - params.txt (whitespace key/value) — tests/data/params_fixtures/txt/
  - sources.json (per-dataroot JSON sidecar) — tests/data/params_fixtures/json/
  Plus gain-default + missing-samplerate + missing-file error paths,
  and one rhino test against a real sources.json. Non-rhino: 5 pass.

* TalReader monopolar .mat path (struct_type='mono' →
  struct_name='talStruct') was untested — the existing rhino tal test
  only covers bipolar .mat + bipolar pairs.json, and the eeg_read mono
  helper reads contacts.json not a .mat. Added
  test_talreader_monopolar_mat against R1060M_talLocs_database_monopol.mat.

Not covered (documented, not tested):
* NetCDF4XrayReader — dead code: reconstruct_axis indexes a filter()
  generator with [0], a latent Python-3 runtime bug. No data format
  exists for it in-tree and it cannot run. Candidate for removal.
* TalStimOnlyReader — reads virtualTalStruct from stim experiments;
  no such .mat found on rhino under the searched paths, so no real
  load test is possible right now.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The hand-written python×numpy zip_keys matrix in
conda_build_config.yaml mis-rendered under conda-build 26.5: variant
hashing cross-associated the zipped cells, producing scrambled build
envs (a py3.10 cell pulling py3.12 numpy) and an empty py3.12
package (1 member, no ptsa files). Verified by unpacking the .conda
outputs of a full-matrix `conda-build conda.recipe` run.

Fix: remove the python/numpy matrix from conda_build_config.yaml
entirely. The matrix is driven externally, one cell per conda-build
invocation with explicit --python / --variants:

* .github/workflows/build.yml already does this via strategy.matrix
  (so CI was never affected by the zip_keys bug — but the stale
  config could still interfere, now it can't).
* maint/build_matrix.sh (new) does the same 6-cell loop for local
  builds.

Verified a GH-style `conda-build conda.recipe --python=3.12
--variants "{numpy:['2']}"` now yields a proper 91-member package
(40 .py + 3 cpython-312 .so) that imports cleanly and passes
pytest (138 passed) in a clean py3.12 pixi env.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Riley16 and others added 3 commits May 27, 2026 07:21
The macOS (clang) and Windows (MSVC) conda builds fail at the morlet
link step with 'ld: library fftw3 not found' because, unlike the Linux
conda gcc activation, they don't implicitly put $PREFIX/lib on the
linker search path. Add get_library_dirs() (the active conda prefix's
lib dirs) and pass it as library_dirs on every pybind extension so
-lfftw3 resolves on all platforms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Butterworth and resample regressions route through scipy
(filtfilt / resample → LAPACK / pocketfft), whose round-off shifts
across scipy/numpy/BLAS builds. A CI cell with scipy 1.17.1 drifts
~1.6e-9 from the fixture baseline, tripping the 1e-12 bound meant for
our bit-deterministic FFTW morlet kernel. Give the scipy-backed
fixtures their own rtol=1e-5/atol=1e-6 bound; keep morlet + m2b tight.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two cross-platform CI failures:
- Windows: bare 'conda-build' is not on PATH in the bash -el runner
  (the activated env's Scripts/ dir isn't exported), so the build step
  hit 'command not found'. Invoke via 'mamba run -n ptsa_build'.
- All OSes: the smoke step ran 'python -c import ptsa' from the repo
  root, so CWD shadowed the installed package with the source tree
  (no compiled .so) → ModuleNotFoundError. cd into $RUNNER_TEMP first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Riley16 and others added 5 commits May 27, 2026 07:31
The Windows bash -el runner can't reach conda-build through the
activated env: a bare `conda-build` isn't on PATH, and `mamba run`
mangles the spawned command ('onda' is not recognized). Install
conda-build into `base` and invoke the `conda build` subcommand
through the always-on-PATH base conda instead — identical behavior on
Linux/macOS (already green), and the only form that works on Windows.
Switch the smoke step from `mamba run` to `conda run` for the same
reason.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
conda-verify caps at python <3.12, so installing it into the base env
(python 3.13) makes the solve fail — which regressed every cell after
the switch to a base `conda build`. conda-build doesn't need it, so
install conda-build alone; drop anaconda-client here too (only the
deploy job uses it). Verified locally that python 3.13 + conda-build
solves cleanly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Linux + macOS (all 12 cells) are green via the `conda build`
subcommand; leave that path untouched. On the Windows bash -el runner
only condabin/ is on PATH (not base's Scripts/), so `conda build`
can't find the subcommand, a bare `conda-build` is not found, and
`mamba run` mangles the command. Branch on $RUNNER_OS and invoke
base's conda-build entry point by absolute path on Windows.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The absolute path resolved to a raw Windows path (C:\Users\...\Scripts
\conda-build.exe) that git-bash can't exec ('No such file or
directory'). Convert $CONDA with cygpath -u so the entry point is a
unix-style path bash can run. Adds a one-line ls for diagnostics.
Linux/macOS path unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Building the FFTW-linked C++ extensions through conda-build on the
Windows GitHub runner hits a stack of MSYS/conda PATH quirks (bare
conda-build not on PATH, mamba run command-mangling, conda-build entry
point not discoverable), and the lab has never shipped a win-64
package. Windows users run PTSA under WSL, where the linux-64 package
built and tested here works unchanged — so drop windows-latest and
document WSL in the README instead. Linux + macOS (12 cells) stay green
and the build step reverts to the plain `conda build` invocation.

Verified before commit: full suite WITH rhino data (tests/ + docs/,
NO_RHINO unset) on the current HEAD = 202 passed / 1 skipped /
1 xfailed / 0 failed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Riley16 Riley16 merged commit 10a9ceb into master May 27, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant