Skip to content
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
0e98098
add functions
huizizhang949 Jul 17, 2025
dd0207c
Merge branch 'main' into add-noisecutoff2
huizizhang949 Jul 17, 2025
871c6b1
update references
huizizhang949 Jul 17, 2025
384fdc6
restore _get_amplitudes_by_units
huizizhang949 Jul 17, 2025
fc01b48
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 17, 2025
fea674d
Merge branch 'main' into add-noisecutoff2
huizizhang949 Jul 22, 2025
b13e353
Merge branch 'main' into add-noisecutoff2
huizizhang949 Jul 22, 2025
8247e41
Update doc/modules/qualitymetrics/noise_cutoff.rst
huizizhang949 Jul 22, 2025
d9356c8
Update doc/modules/qualitymetrics/noise_cutoff.rst
huizizhang949 Jul 22, 2025
a1890fd
Update src/spikeinterface/qualitymetrics/misc_metrics.py
huizizhang949 Jul 22, 2025
62d61d8
Update src/spikeinterface/qualitymetrics/misc_metrics.py
huizizhang949 Jul 22, 2025
f78cb70
Update src/spikeinterface/qualitymetrics/misc_metrics.py
huizizhang949 Jul 22, 2025
6b9522c
Update src/spikeinterface/qualitymetrics/misc_metrics.py
huizizhang949 Jul 22, 2025
be50e14
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 22, 2025
bfb5fe7
update doc
huizizhang949 Jul 22, 2025
a44c0fa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 22, 2025
36b1901
update latex
huizizhang949 Jul 22, 2025
1196efd
update doc
huizizhang949 Jul 22, 2025
5803fdb
doc
huizizhang949 Jul 22, 2025
d6b971f
Update doc/modules/qualitymetrics/noise_cutoff.rst
alejoe91 Jul 23, 2025
1a4a1dd
docs formatting
chrishalcrow Jul 24, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file added doc/modules/qualitymetrics/example_cutoff.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
111 changes: 104 additions & 7 deletions doc/modules/qualitymetrics/noise_cutoff.rst
Original file line number Diff line number Diff line change
@@ -1,30 +1,127 @@
Noise cutoff (not currently implemented)
Noise cutoff (:code:`noise_cutoff`)
========================================

Calculation
-----------


Metric describing whether an amplitude distribution is cut off, similar to _amp_cutoff :ref:`amplitude cutoff <amp_cutoff>` but without a Gaussian assumption.
A histogram of amplitudes is created and quantifies the distance between the low tail, mean number of spikes and high tail in terms of standard deviations.
Metric describing whether an amplitude distribution is cut off as it approaches zero, similar to :ref:`amplitude cutoff <amp_cutoff>` but without a Gaussian assumption.

A SpikeInterface implementation is not yet available.
The **noise cutoff** metric assesses whether a unit’s spike‐amplitude distribution is truncated
at the low-end, which may be due to the high amplitude detection threshold in the deconvolution step,
i.e., if low‐amplitude spikes were missed. It does not assume a Gaussian shape;
instead, it directly compares counts in the low‐amplitude bins to counts in high‐amplitude bins.

1. **Build a histogram**

For each unit, divide all amplitudes into `n_bins` equally spaced bins over the range of the amplitude.
If the number of spikes is large, you may consider using a larger `n_bins`. For a small number of spikes, consider a smaller `n_bins`.
Let :math:`n_i` denote the count in the :math:`i`-th bin.

2. **Identify the “low” region**
- Compute the amplitude value at the specified `low_quantile` (for example, 0.10 = 10th percentile), denoted as :math:`\text{amp}_{low}`.
- Find all histogram bins whose upper edge is below that quantile value. These bins form the “low‐quantile region”.
- Compute

.. math::
L_{\mathrm{bins}} = \bigl\{i : \text{upper_edge}_i \le \text{amp}_{low}\bigr\}, \quad
\mu_{\mathrm{low}} = \frac{1}{|L_{\mathrm{bins}}|}\sum_{i\in L_{\mathrm{bins}}} n_i.
3. **Identify the “high” region**

- Compute the amplitude value at the specified `high_quantile` (for example, 0.25 = top 25th percentile), denoted as :math:`\text{amp}_{high}`.
- Find all histogram bins whose lower edge is greater than that quantile value. These bins form the “high‐quantile region.”
- Compute

.. math::
H_{\mathrm{bins}} &= \bigl\{i : \text{lower_edge}_i \ge \text{amp}_{high}\bigr\}, \\
\mu_{\mathrm{high}} &= \frac{1}{|H_{\mathrm{bins}}|}\sum_{i\in H_{\mathrm{bins}}} n_i, \quad
\sigma_{\mathrm{high}} = \sqrt{\frac{1}{|H_{\mathrm{bins}}|-1}\sum_{i\in H_{\mathrm{bins}}}\bigl(n_i-\mu_{\mathrm{high}} \bigr)^2}.
4. **Compute cutoff**

The *cutoff* is given by how many standard deviations away the low-amplitude bins are from the high-amplitude bins, defined as

.. math::
\mathrm{cutoff} = \frac{\mu_{\mathrm{low}} - \mu_{\mathrm{high}}}{\sigma_{\mathrm{high}}}.
- If no low‐quantile bins exist, a warning is issued and `cutoff = NaN`.
- If no high‐quantile bins exist or :math:`\sigma_{\mathrm{high}} = 0`, a warning is issued and `cutoff = NaN`.

5. **Compute the low-to-peak ratio**

- Let :math:`M = \max_i\,n_i` be the height of the largest bin in the histogram.
- Define

.. math::
\mathrm{ratio} = \frac{\mu_{\mathrm{low}}}{M}.
- If there are no low bins, :math:`\mathrm{ratio} = NaN`.


Together, (cutoff, ratio) quantify how suppressed the low‐end of the amplitude distribution is relative to the top quantile and to the peak.

Expectation and use
-------------------

Noise cutoff attempts to describe whether an amplitude distribution is cut off.
The metric is loosely based on [Hill]_'s amplitude cutoff, but is here adapted (originally by [IBL]_) to avoid making the Gaussianity assumption on spike distributions.
Noise cutoff provides an estimate of false negative rate, so a lower value indicates fewer missed spikes (a more complete unit).
Larger values of `cutoff` and `ratio` suggest that the distribution is cut-off.
IBL uses the default value of 1 to choose the number of lower bins, with a suggested threshold of 5 for `cutoff` and 0.1 for `ratio` to determine whether a unit is cut off or not.
In practice, the IBL threshold is quite conservative, and a lower threshold might be better for your data.
We suggest plotting the data using the `plot_amplitudes` widget to view your data when choosing your threshold.
It is suggested to use this metric when the amplitude histogram is **unimodal**.

The metric is loosely based on [Hill]_'s amplitude cutoff, but is here adapted (originally by [IBL2024]_) to avoid making the Gaussianity assumption on spike distributions.

Example code
------------

.. code-block:: python
import numpy as np
import matplotlib.pyplot as plt
from spikeinterface.full as si
# Suppose `sorting_analyzer` has been computed with spike amplitudes:
# Select units you are interested in visualizing
unit_ids = ...
# Compute noise_cutoff:
summary_dict = compute_noise_cutoff(
sorting_analyzer=sorting_analyzer
high_quantile=0.25,
low_quantile=0.10,
n_bins=100,
unit_ids=unit_ids
)
Reference
---------

.. autofunction:: spikeinterface.qualitymetrics.misc_metrics.compute_noise_cutoff

Examples with plots
-------------------

Here is shown the histogram of two units, with the vertical lines separating low- and high-amplitude regions.

- On the left, we have a unit with no truncation at the left end, and the cutoff and ratio are small.
- On the right, we have a unit with truncation at -1, and the cutoff and ratio are much larger.

.. image:: example_cutoff.png
:width: 600

Links to original implementations
---------------------------------

* From `IBL implementation <https://github.com/int-brain-lab/ibllib/blob/2e1f91c622ba8dbd04fc53946c185c99451ce5d6/brainbox/metrics/single_units.py>`_

Note: Compared to the original implementation, we have added a comparison between the low-amplitude bins to the largest bin (`noise_ratio`).
The selection of low-amplitude bins is based on the `low_quantile` rather than the number of bins.

Literature
----------

Metric introduced by [IBL]_ (adapted from [Hill]_'s amplitude cutoff metric).
Metric introduced by [IBL2024]_.
2 changes: 2 additions & 0 deletions doc/references.rst
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,8 @@ References

.. [IBL] `Spike sorting pipeline for the International Brain Laboratory. 2022. <https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522/3>`_

.. [IBL2024] `Spike sorting pipeline for the International Brain Laboratory - Version 2. 2024. <https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522?file=49783080>`_

.. [Jackson] `Quantitative assessment of extracellular multichannel recording quality using measures of cluster separation. Society of Neuroscience Abstract. 2005. <https://www.sciencedirect.com/science/article/abs/pii/S0306452204008425>`_

.. [Jain] `UnitRefine: A Community Toolbox for Automated Spike Sorting Curation. 2025 <https://www.biorxiv.org/content/10.1101/2025.03.30.645770v1>`_
Expand Down
161 changes: 161 additions & 0 deletions src/spikeinterface/qualitymetrics/misc_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,167 @@
_default_params = dict()


def compute_noise_cutoffs(sorting_analyzer, high_quantile=0.25, low_quantile=0.1, n_bins=100, unit_ids=None):
"""
A metric to determine if a unit's amplitude distribution is cut off as it approaches zero, without assuming a Gaussian distribution.

Based on the histogram of the (transformed) amplitude:

1. This method compares counts in the lower-amplitude bins to counts in the top 'high_quantile' of the amplitude range.
It computes the mean and std of an upper quantile of the distribution, and calculates how many standard deviations away
from that mean the lower-quantile bins lie.

2. The method also compares the counts in the lower-amplitude bins to the count in the highest bin and return their ratio.

Parameters
----------
sorting_analyzer : SortingAnalyzer
A SortingAnalyzer object.
high_quantile : float, default: 0.25
Quantile of the amplitude range above which values are treated as "high" (e.g. 0.25 = top 25%), the reference region.
low_quantile : int, default: 0.1
Quantile of the amplitude range below which values are treated as "low" (e.g. 0.1 = lower 10%), the test region.
n_bins: int, default: 100
The number of bins to use to compute the amplitude histogram.
unit_ids : list or None
List of unit ids to compute the amplitude cutoffs. If None, all units are used.

Returns
-------
noise_cutoff_dict : dict of floats
Estimated metrics based on the amplitude distribution, for each unit ID.

References
----------
Inspired by metric described in [IBL2024]_

"""
res = namedtuple("cutoff_metrics", ["noise_cutoff", "noise_ratio"])
if unit_ids is None:
unit_ids = sorting_analyzer.unit_ids

noise_cutoff_dict = {}
noise_ratio_dict = {}
if not sorting_analyzer.has_extension("spike_amplitudes"):
warnings.warn(
"`compute_noise_cutoffs` requires the 'spike_amplitudes` extension. Please run sorting_analyzer.compute('spike_amplitudes') to be able to compute `noise_cutoff`"
)
for unit_id in unit_ids:
noise_cutoff_dict[unit_id] = np.nan
noise_ratio_dict[unit_id] = np.nan
return res(noise_cutoff_dict, noise_ratio_dict)

amplitude_extension = sorting_analyzer.get_extension("spike_amplitudes")
peak_sign = amplitude_extension.params["peak_sign"]
if peak_sign == "both":
raise TypeError(
'`peak_sign` should either be "pos" or "neg". You can set `peak_sign` as an argument when you compute spike_amplitudes.'
)

amplitudes_by_units = _get_amplitudes_by_units(sorting_analyzer, unit_ids, peak_sign)

for unit_id in unit_ids:
amplitudes = amplitudes_by_units[unit_id]

# We assume the noise (zero values) is on the lower tail of the amplitude distribution.
# But if peak_sign == 'neg', the noise will be on the higher tail, so we flip the distribution.
if peak_sign == "neg":
amplitudes = -amplitudes

cutoff, ratio = _noise_cutoff(amplitudes, high_quantile=high_quantile, low_quantile=low_quantile, n_bins=n_bins)
noise_cutoff_dict[unit_id] = cutoff
noise_ratio_dict[unit_id] = ratio

return res(noise_cutoff_dict, noise_ratio_dict)


_default_params["noise_cutoff"] = dict(high_quantile=0.25, low_quantile=0.1, n_bins=100)


def _noise_cutoff(amps, high_quantile=0.25, low_quantile=0.1, n_bins=100):
"""
A metric to determine if a unit's amplitude distribution is cut off as it approaches zero, without assuming a Gaussian distribution.

Based on the histogram of the (transformed) amplitude:

1. This method compares counts in the lower-amplitude bins to counts in the higher_amplitude bins.
It computes the mean and std of an upper quantile of the distribution, and calculates how many standard deviations away
from that mean the lower-quantile bins lie.

2. The method also compares the counts in the lower-amplitude bins to the count in the highest bin and return their ratio.

Parameters
----------
amps : array-like
Spike amplitudes.
high_quantile : float, default: 0.25
Quantile of the amplitude range above which values are treated as "high" (e.g. 0.25 = top 25%), the reference region.
low_quantile : int, default: 0.1
Quantile of the amplitude range below which values are treated as "low" (e.g. 0.1 = lower 10%), the test region.
n_bins: int, default: 100
The number of bins to use to compute the amplitude histogram.

Returns
-------
cutoff : float
(mean(lower_bins_count) - mean(high_bins_count)) / std(high_bins_count)
ratio: float
mean(lower_bins_count) / highest_bin_count

"""
n_per_bin, bin_edges = np.histogram(amps, bins=n_bins)

maximum_bin_height = np.max(n_per_bin)

low_quantile_value = np.quantile(amps, q=low_quantile)

# the indices for low-amplitude bins
low_indices = np.where(bin_edges[1:] <= low_quantile_value)[0]

high_quantile_value = np.quantile(amps, q=1 - high_quantile)

# the indices for high-amplitude bins
high_indices = np.where(bin_edges[:-1] >= high_quantile_value)[0]

if len(low_indices) == 0:
warnings.warn(
"No bin is selected to test cutoff. Please increase low_quantile. Setting noise cutoff and ratio to NaN"
)
return np.nan, np.nan

# compute ratio between low-amplitude bins and the largest bin
low_counts = n_per_bin[low_indices]
mean_low_counts = np.mean(low_counts)
ratio = mean_low_counts / maximum_bin_height

if len(high_indices) == 0:
warnings.warn(
"No bin is selected as the reference region. Please increase high_quantile. Setting noise cutoff to NaN"
)
return np.nan, ratio

if len(high_indices) == 1:
warnings.warn(
"Only one bin is selected as the reference region, and thus the standard deviation cannot be computed. "
"Please increase high_quantile. Setting noise cutoff to NaN"
)
return np.nan, ratio

# compute cutoff from low-amplitude and high-amplitude bins
high_counts = n_per_bin[high_indices]
mean_high_counts = np.mean(high_counts)
std_high_counts = np.std(high_counts)
if std_high_counts == 0:
warnings.warn(
"All the high-amplitude bins have the same size. Please consider changing n_bins. "
"Setting noise cutoff to NaN"
)
return np.nan, ratio

cutoff = (mean_low_counts - mean_high_counts) / std_high_counts
return cutoff, ratio


def compute_num_spikes(sorting_analyzer, unit_ids=None, **kwargs):
"""
Compute the number of spike across segments.
Expand Down
5 changes: 5 additions & 0 deletions src/spikeinterface/qualitymetrics/quality_metric_list.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
compute_firing_ranges,
compute_amplitude_cv_metrics,
compute_sd_ratio,
compute_noise_cutoffs,
)

from .pca_metrics import (
Expand Down Expand Up @@ -51,6 +52,7 @@
"firing_range": compute_firing_ranges,
"drift": compute_drift_metrics,
"sd_ratio": compute_sd_ratio,
"noise_cutoff": compute_noise_cutoffs,
}

# a dict converting the name of the metric for computation to the output of that computation
Expand Down Expand Up @@ -81,6 +83,7 @@
"nn_noise_overlap": ["nn_noise_overlap"],
"silhouette": ["silhouette"],
"silhouette_full": ["silhouette_full"],
"noise_cutoff": ["noise_cutoff", "noise_ratio"],
}

# this dict allows us to ensure the appropriate dtype of metrics rather than allow Pandas to infer them
Expand Down Expand Up @@ -116,4 +119,6 @@
"nn_noise_overlap": float,
"silhouette": float,
"silhouette_full": float,
"noise_cutoff": float,
"noise_ratio": float,
}
16 changes: 16 additions & 0 deletions src/spikeinterface/qualitymetrics/tests/test_metrics_functions.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,13 +43,29 @@
compute_quality_metrics,
)

from spikeinterface.qualitymetrics.misc_metrics import _noise_cutoff

from spikeinterface.core.basesorting import minimum_spike_dtype


job_kwargs = dict(n_jobs=2, progress_bar=True, chunk_duration="1s")


def test_noise_cutoff():
"""
Generate two artifical gaussian, one truncated and one not. Check the metrics are higher for the truncated one.
"""
np.random.seed(1)
amps = np.random.normal(0, 1, 1000)
amps_trunc = amps[amps > -1]

cutoff1, ratio1 = _noise_cutoff(amps=amps)
cutoff2, ratio2 = _noise_cutoff(amps=amps_trunc)

assert cutoff1 <= cutoff2
assert ratio1 <= ratio2


def test_compute_new_quality_metrics(small_sorting_analyzer):
"""
Computes quality metrics then computes a subset of quality metrics, and checks
Expand Down
Loading