Conversation

@chrishalcrow (Member) commented Jul 7, 2025

Allows create_sorting_analyzer to accept a dict of recordings and a dict of sortings. Now the following code...

raw_rec = si.read_openephys("recording_path")        # read the raw recording
grouped_rec = raw_rec.split_by("group")              # dict of recordings, one per channel group
pp_rec = si.bandpass_filter(grouped_rec)             # preprocessing is applied per group
sort = si.run_sorter("mountainsort5", pp_rec)        # dict of sortings, one per group
analyzer = si.create_sorting_analyzer(sort, pp_rec)  # a single, aggregated analyzer

...works for both grouped and un-grouped recordings! So the user doesn't have to include any grouping logic in their pipeline for, e.g., Neuropixels 2.0 recordings.

For now, I've implemented the simple solution: internally, the create_sorting_analyzer function aggregates the recordings and sortings and combines them into a single sorting analyzer. The actual code inside create_sorting_analyzer is super simple! I've also updated the units aggregation function (aggregate_units) to accept dicts.
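
To make that concrete, here's a minimal sketch of the internal logic, assuming dict inputs with matching keys (the function name is illustrative; in the PR this logic lives inside create_sorting_analyzer itself and may differ):

  import spikeinterface.full as si
  from spikeinterface.core import aggregate_channels, aggregate_units

  # Illustrative sketch, not the PR's actual code:
  # aggregate the dicts into single objects, then build one analyzer.
  def create_analyzer_from_dicts(sorting_dict, recording_dict, **kwargs):
      assert set(sorting_dict.keys()) == set(recording_dict.keys()), "dict keys must match"
      recording = aggregate_channels(recording_dict)  # one recording; channels keep their group key
      sorting = aggregate_units(sorting_dict)         # one sorting; units keep their group key
      return si.create_sorting_analyzer(sorting, recording, **kwargs)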

(We've also discussed allowing it to output a dictionary of analyzers. This is slightly more painful: the user would have to write awkward code whenever they want to compute or export, and we'd need to write and test new save and load options etc. I think we should add this in the future if there's a need for it.)

Internally, we need to keep track of the splitting and aggregation (e.g. which units in the analyzer belonged to which sorting?). I've put that logic in the aggregate_channels and aggregate_units functions, which attach new properties, called "aggregate_channels_key" and "aggregate_units_key", to the channels and units, recording which underlying group each came from. Happy to hear other name ideas! So you can get a per-group analyzer by doing something like:

  group_0_keys = analyzer.unit_ids[analyzer.get_sorting_property("aggregate_units_key") == 0]
  group_0_analyzer = analyzer.select_units(unit_ids=group_0_keys)
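
(Extending the same idea, a short sketch that recovers one analyzer per group by looping over the stored keys; analyzers_by_group is just an illustrative name:)

  import numpy as np

  # Build a dict mapping each group key to its per-group analyzer
  unit_keys = analyzer.get_sorting_property("aggregate_units_key")
  analyzers_by_group = {}
  for key in np.unique(unit_keys):
      unit_ids = analyzer.unit_ids[unit_keys == key]
      analyzers_by_group[key] = analyzer.select_units(unit_ids=unit_ids)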

Docs: https://spikeinterface--4037.org.readthedocs.build/en/4037/how_to/process_by_channel_group.html#
Plus a comment here: https://spikeinterface--4037.org.readthedocs.build/en/4037/modules/sorters.html#spike-sorting-by-group

@chrishalcrow marked this pull request as ready for review July 8, 2025 12:29
@chrishalcrow requested a review from samuelgarcia July 8, 2025 12:29
@chrishalcrow added the enhancement (New feature or request) label Jul 8, 2025
@chrishalcrow added this to the 0.103.0 milestone Jul 8, 2025
@samuelgarcia (Member) commented:

Thank you very much.
I will read this in detail.

@alejoe91 (Member) left a comment:

A few small suggestions :)

@samuelgarcia (Member) commented:

This is ultra smart and useful.
The only improvement I would see is the handling of sparsity, which could be group-dependent.
For instance, if you play with tetrodes you would expect the sparsity to be by group, which is super fast to compute (instead of taking many random waveforms).
This could be done at line 145 in sorting_analyzer.py, maybe controlled by an option. Because for a big group (a Neuropixels shank), I think the sparsity from waveforms is still wanted, no?

@zm711 (Member) left a comment:

Two small points: (1) a rendering issue, and (2) a discussion of whether we should explain the dict layout better for beginners trying to use this strategy (at the level of inputs to the analyzer creation).


The docs text under review:

> The code above generates a dictionary of recording objects and a dictionary of sorting objects.
> When making a :ref:`SortingAnalyzer <modules/core:SortingAnalyzer>`, we can pass these dictionaries and
> a single analyzer will be created, with the recordings and sortings appropriately aggregated.
Member:

How? Do the keys of the dicts need to match up? Will the user know how the sorting dict looks vs the recording dict?

Member:

Maybe we could check that the keys are the same, in the same order, no?

Member:

Yeah, we either do that for the users internally or need to explain it here, so I think doing it ourselves is fine. Chris is doing a dict key comparison below, but I haven't tested it to see whether it checks the order or just the presence of the same keys.

Member Author:

Updated to hopefully make things clearer.

The order of keys doesn't matter: when you aggregate, the link between the recording channels and unit ids is independent of this dict logic I'm adding. I've added a test to check this is true.
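
(A minimal sketch of an order-independent key check consistent with that behavior; the function name and error message are illustrative, not the PR's actual code:)

  def check_matching_dict_keys(recording_dict, sorting_dict):
      # compare as sets: the same keys must be present, order is deliberately ignored
      if set(recording_dict.keys()) != set(sorting_dict.keys()):
          raise ValueError(
              "Keys of the recording dict and sorting dict must match: "
              f"{sorted(recording_dict)} vs {sorted(sorting_dict)}"
          )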

@chrishalcrow (Member, Author) commented:

> The only improvement I would see is the handling of sparsity, which could be group-dependent.
> For instance, if you play with tetrodes you would expect the sparsity to be by group, which is super fast to compute (instead of taking many random waveforms).
> This could be done at line 145 in sorting_analyzer.py, maybe controlled by an option. Because for a big group (a Neuropixels shank), I think the sparsity from waveforms is still wanted, no?

Hello. Hmm, I'm not sure how to handle this. Like you say: you definitely want this for tetrodes. You probably don't want this for Neuropixels, since you want an extra sparsity calculation on each shank. You maybe want this for 32-channel silicon probes?? Ideally, you would compute sparsity per group, but this seems fairly complicated code-wise. Maybe we can add a note to the tetrode docs, and these docs, saying that if you're using tetrodes you can pass the sparsity explicitly? Maybe it's easiest to discuss "in person".
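
(For context, a hedged sketch of what passing sparsity explicitly could look like for tetrodes, built by hand from the group properties this PR attaches; it assumes aggregated recording/sorting objects and is not the PR's code:)

  import numpy as np
  from spikeinterface.core import ChannelSparsity

  # Give each unit all channels of the group it was sorted on (tetrode-style sparsity)
  unit_groups = sorting.get_property("aggregate_units_key")
  channel_groups = recording.get_property("aggregate_channels_key")
  mask = np.array([channel_groups == g for g in unit_groups])
  sparsity = ChannelSparsity(mask, unit_ids=sorting.unit_ids, channel_ids=recording.channel_ids)

  analyzer = si.create_sorting_analyzer(sorting, recording, sparsity=sparsity)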

The docstring under review:

> You can control `estimate_sparsity()`: all extra arguments are propagated to it (including job_kwargs).
> sparsity : ChannelSparsity or None, default: None
>     The sparsity used to compute extensions. If this is given, `sparse` is ignored.
> set_sparsity_by_dict_key : bool, default: False
Member:

I support this.
What about setting it to True?

@chrishalcrow (Member, Author) commented Jul 25, 2025:

If it's False by default, some tetrode people will have un-sparsified tetrode bundles (not great, but OK).
If True by default, some silicon-probe people will have badly-sparsified probes (worse!).

So I vote False!
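
(For reference, with False as the default, tetrode users would opt in explicitly; a one-line sketch using the objects from the PR description:)

  # Opt in to group-based (dict-key) sparsity for tetrode-style probes
  analyzer = si.create_sorting_analyzer(sort, pp_rec, set_sparsity_by_dict_key=True)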

@chrishalcrow (Member, Author) commented:

Updated the docstring of set_sparsity_by_dict_key - now ready!

@alejoe91 added the core (Changes to core module) label Jul 28, 2025
@alejoe91 merged commit f2d6373 into SpikeInterface:main Jul 28, 2025
15 checks passed