diff --git a/docs/source/choosing_workflow.rst b/docs/source/choosing_workflow.rst index d393b988..8d82bd24 100644 --- a/docs/source/choosing_workflow.rst +++ b/docs/source/choosing_workflow.rst @@ -14,11 +14,11 @@ WASP2 supports four major data types. Use this guide to find your workflow. * - **Bulk RNA-seq** - BAM + phased VCF - Allele-specific expression (ASE) - - :doc:`tutorials/rna_seq` + - :doc:`tutorials/bulk_workflow` * - **Bulk ATAC-seq** - BAM + phased VCF - Allele-specific chromatin accessibility - - :doc:`tutorials/atac_seq_workflow` + - :doc:`tutorials/bulk_workflow` * - **scRNA-seq (10x)** - Cell Ranger BAM + VCF + barcodes - Per-cell or per-cell-type ASE @@ -38,15 +38,14 @@ Decision Flowchart **Step 2: Bulk or single-cell RNA-seq?** -* Bulk RNA-seq → :doc:`tutorials/rna_seq` +* Bulk RNA-seq → :doc:`tutorials/bulk_workflow` * 10x Chromium scRNA-seq → :doc:`tutorials/scrna_seq` * Other single-cell protocol → see :doc:`user_guide/single_cell` **Step 3: Bulk or single-cell ATAC-seq?** -* Bulk ATAC-seq → :doc:`tutorials/atac_seq_workflow` -* 10x scATAC-seq (fragments file) → :doc:`tutorials/scatac_workflow` -* 10x scATAC-seq (BAM with CB tag) → :doc:`tutorials/scatac_workflow` (BAM path) +* Bulk ATAC-seq → :doc:`tutorials/bulk_workflow` (use BED peak file as ``--region``) +* 10x scATAC-seq → :doc:`tutorials/scatac_workflow` Do I Need to Run the WASP Remapping Step? ------------------------------------------ diff --git a/docs/source/development.rst b/docs/source/development.rst index c568173c..4a4b71e3 100644 --- a/docs/source/development.rst +++ b/docs/source/development.rst @@ -1,272 +1,124 @@ Development Guide ================= -Contributing to WASP2 ---------------------- +Contributions are welcome. This page covers the WASP2-specific bits; for +generic Python tooling (pytest, black, pre-commit) refer to each project's +own docs. -We welcome contributions! This guide helps you get started. - -Development Setup ------------------ - -Clone Repository -~~~~~~~~~~~~~~~~ +Setup +----- .. code-block:: bash git clone https://github.com/mcvickerlab/WASP2 - cd WASP2-exp - -Install Development Dependencies -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - + cd WASP2 pip install -e ".[dev]" - -This installs: -* pytest (testing) -* mypy (type checking) -* black (code formatting) -* flake8 (linting) -* pre-commit (git hooks) - -Install Pre-commit Hooks -~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - pre-commit install -Hooks run automatically before each commit: -* Black formatting -* Flake8 linting -* mypy type checking -* Quick tests +The ``[dev]`` extra pulls in pytest, mypy, black, flake8, pre-commit, and +Sphinx. Pre-commit runs formatting + linting + mypy + a quick test subset +on every commit. -Code Standards +Code standards -------------- -Type Hints -~~~~~~~~~~ - -WASP2 has 100% type hint coverage. All new code must include type hints: - -.. code-block:: python - - def count_alleles( - bam_file: str, - vcf_file: str, - min_count: int = 10 - ) -> pd.DataFrame: - """Count alleles from BAM file.""" - ... - -Formatting -~~~~~~~~~~ - -Use Black with 100-character lines: +- **Type hints are required** on new public API. WASP2 maintains full + mypy coverage on ``src/counting``, ``src/mapping``, and ``src/analysis``. +- **Line length 100**, enforced by black. +- **Ruff** runs as the primary lint tool; flake8 is retained for + docstring-specific rules. +- **Tests** live in ``tests/``. All new features need a test. Maintain + ≥80% line coverage. .. 
code-block:: bash - black src/ --line-length=100 - -Linting -~~~~~~~ - -Pass Flake8 checks: - -.. code-block:: bash - - flake8 src/ --max-line-length=100 - -Testing -------- - -Run Tests Locally -~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - # All tests - pytest tests/ -v - - # Fast tests only (skip slow integration tests) - pytest tests/ -v -m "not slow" - - # With coverage - pytest tests/ --cov=src --cov-report=html - -Test Requirements -~~~~~~~~~~~~~~~~~ - -* All new features need tests -* Maintain >80% code coverage -* Tests must pass in CI before merge - -Type Checking -------------- - -Run mypy: - -.. code-block:: bash + pytest tests/ -v # full suite + pytest tests/ -v -m "not slow" # fast subset + pytest tests/ --cov=src # coverage mypy src/counting/ src/mapping/ src/analysis/ -All code must pass mypy with 0 errors. +Branching +--------- -CI/CD Pipeline --------------- - -GitHub Actions -~~~~~~~~~~~~~~ - -Tests run automatically on every push: -* Python 3.10 and 3.11 -* Type checking (mypy) -* Unit tests (pytest) -* Full pipeline validation -* Documentation build - -CI must pass before PR can be merged. - -Pre-commit Hooks -~~~~~~~~~~~~~~~~ - -Local checks before commit: -* Code formatting -* Type checking -* Quick tests - -To bypass (not recommended): - -.. code-block:: bash - - git commit --no-verify - -Pull Request Process --------------------- - -1. Fork & Branch -~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - git checkout -b feature/my-feature - -2. Develop & Test -~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - # Make changes - vim src/analysis/my_feature.py - - # Add type hints - # Write tests - # Run locally - pytest tests/ -v - mypy src/ +- ``master`` is the release branch. Tagged versions (e.g., ``v1.4.1``) + trigger PyPI + Docker publish via GitHub Actions. +- ``dev`` is the integration branch. Feature PRs target ``dev``, not + ``master``. +- Use short, kebab-case feature branches off ``dev``: ``feat/...``, + ``fix/...``, ``docs/...``. +- After a batch of features merges to ``dev``, a ``dev → master`` PR + promotes them; CI runs on both sides of that merge. -3. Commit -~~~~~~~~~ - -.. code-block:: bash +Pull requests +------------- - git add src/analysis/my_feature.py tests/test_my_feature.py - git commit -m "Add my feature" +1. Branch off ``dev``. +2. Commit changes with descriptive messages; the squash-merge commit on + ``dev`` is what ends up in the changelog, so the PR title and body + should be clean. +3. CI must be green: ruff + Rust tests + pytest (Python 3.10 / 3.11 / + 3.12) + CodeQL. +4. Code review looks at correctness, type safety, test coverage, docs, + and API surface. - # Pre-commit hooks run automatically +Rust layer +---------- -4. Push & PR -~~~~~~~~~~~~ +The ``wasp2_rust`` extension is built with maturin against Python 3.10+. .. code-block:: bash - git push origin feature/my-feature - - # Open PR on GitHub - # CI will run automatically - # Request review + cd rust/ + cargo fmt --check + cargo clippy -- -D warnings + cargo test + maturin develop --release --manifest-path Cargo.toml -Code Review ------------ +Changes to Rust code must keep the parity test +(``tests/test_rust_python_counting_parity.py``) green — the Rust counter +must produce byte-identical output to the Python reference implementation +on the shared chr21 fixture. -PRs are reviewed for: -* Correctness -* Type safety -* Test coverage -* Documentation -* Code style - -Project Structure ------------------ +Project layout +-------------- .. 
code-block:: text

-   WASP2-exp/
+   WASP2/
    ├── src/
-   │   ├── counting/       # Allele counting
-   │   ├── mapping/        # WASP remapping
-   │   └── analysis/       # Statistical analysis
-   ├── tests/
-   │   └── regression/     # Regression tests
-   ├── docs/               # Sphinx documentation
-   ├── scripts/            # Utility scripts
-   ├── baselines/          # Test baselines
-   └── test_data/          # Example data
+   │   ├── counting/   # allele counting (Python + Rust bridge)
+   │   ├── mapping/    # WASP remap-and-filter
+   │   └── analysis/   # beta-binomial LRT, FDR
+   ├── rust/           # Rust extension (bam filter, counter)
+   ├── tests/          # pytest + regression
+   ├── docs/           # Sphinx source
+   └── pipelines/      # Nextflow DSL2 workflows

-Building Documentation
-----------------------
+Docs
+----

 .. code-block:: bash

-   cd docs
-   make html
-   open build/html/index.html
+   cd docs && make html

-Documentation must build without warnings.
+The live site is built and deployed by GitHub Pages on every push to
+``master``. Docs must build without warnings; cross-references are
+checked by ``sphinx-build -n`` in CI.

-Release Process
----------------
+Releases
+--------

-1. Update version in ``pyproject.toml``
-2. Update ``docs/source/changelog.rst``
-3. Merge to main
-4. Tag release: ``git tag v1.1.0``
-5. Push tag: ``git push origin v1.1.0``
-6. Publish to PyPI: ``python -m build && twine upload dist/*``
+1. Bump the version in ``rust/Cargo.toml``; it is the single source of
+   truth, and ``pyproject.toml`` reads its version from Cargo.
+2. Update ``CHANGELOG.md``.
+3. Merge ``dev → master``.
+4. Tag: ``git tag vX.Y.Z && git push origin vX.Y.Z``.
+5. The ``release.yml`` workflow builds wheels and publishes to PyPI via
+   OIDC trusted publishing; ``docker.yml`` publishes the ghcr.io image.

-AI-Assisted Development
------------------------
-
-WASP2 pipeline development benefits from AI tooling. See the full integration guide:
-:doc:`/seqera_ai_integration`
-
-Recommended Workflow
-~~~~~~~~~~~~~~~~~~~~
-
-1. **Design**: Use Claude Code for architecture and complex logic
-2. **Generate**: Use Seqera AI for DSL2 syntax and nf-test templates
-3. **Validate**: Use Anthropic life-sciences scripts for environment checks
-4. **Review**: Use Claude Code for code review and optimization
-
-Tool Selection
-~~~~~~~~~~~~~~
-
-* **Architecture and design** → Claude Code
-* **Nextflow DSL2 syntax** → Seqera AI
-* **nf-test generation** → Seqera AI
-* **Environment validation** → ``nextflow run . -profile test -preview``
-
-Getting Help
+Getting help
 ------------

-* **Issues**: https://github.com/mcvickerlab/WASP2/issues
-* **Discussions**: GitHub Discussions
-* **Email**: Contact maintainers
-
-License
--------
-
-WASP2 is released under the MIT License. See LICENSE file.
+- Issues and feature requests: https://github.com/mcvickerlab/WASP2/issues +- Discussions: GitHub Discussions diff --git a/docs/source/index.rst b/docs/source/index.rst index 36fe4ba2..ab8dde66 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -71,8 +71,7 @@ Documentation :caption: Tutorials tutorials/quickstart_counting - tutorials/quickstart_mapping - tutorials/rna_seq + tutorials/bulk_workflow tutorials/scrna_seq tutorials/scatac_workflow tutorials/comparative_imbalance diff --git a/docs/source/methods/counting_algorithm.rst b/docs/source/methods/counting_algorithm.rst index 70969308..3839acc9 100644 --- a/docs/source/methods/counting_algorithm.rst +++ b/docs/source/methods/counting_algorithm.rst @@ -135,21 +135,11 @@ The Rust implementation provides: - **Memory efficiency**: Stream through BAM without loading all reads - **htslib integration**: Direct access to BAM index for efficient queries -Performance Characteristics -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. table:: Counting Performance - :widths: 30 35 35 - - =================== ======================== ======================== - Dataset Size Python (pysam) Rust (wasp2_rust) - =================== ======================== ======================== - 10,000 SNPs ~45 seconds ~5 seconds - 100,000 SNPs ~7 minutes ~40 seconds - 1,000,000 SNPs ~70 minutes ~6 minutes - =================== ======================== ======================== - -*Benchmarks on typical ATAC-seq data with 100M reads, single thread.* +The Rust counter is substantially faster than the pure-Python pysam path +(roughly an order of magnitude on typical ATAC-seq data in internal +testing). Exact throughput depends on read length, BAM index density, +storage type, and thread count — measure on your own hardware before +planning large batches. Quality Filtering ----------------- diff --git a/docs/source/methods/dispersion_estimation.rst b/docs/source/methods/dispersion_estimation.rst index 8c31de7f..9b1e4e94 100644 --- a/docs/source/methods/dispersion_estimation.rst +++ b/docs/source/methods/dispersion_estimation.rst @@ -1,62 +1,31 @@ Dispersion Parameter Estimation ================================ -This document describes how WASP2 estimates the overdispersion parameter -:math:`\rho` in the beta-binomial model. - -.. contents:: Contents - :local: - :depth: 2 - -The Dispersion Parameter ------------------------- - -The dispersion parameter :math:`\rho` quantifies the excess variance beyond -the binomial expectation. In the beta-binomial model: +The beta-binomial overdispersion parameter :math:`\rho \in (0, 1)` quantifies +variance inflation beyond the binomial expectation: .. math:: - \text{Var}(X) = N\mu(1-\mu)[1 + (N-1)\rho] - -where: - -- :math:`\rho = 0`: Binomial variance (no overdispersion) -- :math:`\rho > 0`: Variance inflation due to correlated sampling - -Estimation Methods ------------------- - -WASP2 supports two approaches for dispersion estimation: - -1. **Maximum Likelihood Estimation (MLE)**: Optimize the likelihood function -2. **Method of Moments (MoM)**: Solve equations based on sample moments + \text{Var}(X) = N\mu(1-\mu)\left[1 + (N-1)\rho\right] -Both methods have trade-offs in terms of efficiency, bias, and computational cost. +Dispersion arises from biological variation, PCR amplification, and +aggregation of counts across SNPs within a region. 
Estimation approaches +follow the beta-binomial ASE literature [Kumasaka2016]_; analogous methods +for negative-binomial dispersion in RNA-seq counts are reviewed in +[Robinson2010]_ and [Yu2013]_. -Maximum Likelihood Estimation ------------------------------ +Single Dispersion Model (default) +--------------------------------- -MLE finds the value of :math:`\rho` that maximizes the likelihood of the -observed data. - -Single Dispersion Model -^^^^^^^^^^^^^^^^^^^^^^^ - -WASP2's default approach estimates a single :math:`\rho` across all observations: - -.. math:: - - \hat{\rho}_{\text{MLE}} = \arg\max_{\rho} \sum_{i=1}^{n} \log P(X_i | N_i, \mu=0.5, \rho) - -Under the null hypothesis of no imbalance (:math:`\mu = 0.5`), the log-likelihood is: +WASP2's default ``single`` model estimates one pooled :math:`\rho` under the +null hypothesis :math:`\mu = 0.5` by maximum likelihood: .. math:: - \ell(\rho) = \sum_{i=1}^{n} \log \text{BetaBinom}(X_i; N_i, \alpha(\rho), \beta(\rho)) + \hat{\rho}_{\text{MLE}} = \arg\max_{\rho} \sum_{i=1}^{n} + \log \text{BetaBinom}(X_i; N_i, \alpha(\rho), \beta(\rho)) -where :math:`\alpha = \beta = 0.5(1-\rho)/\rho`. - -**Implementation**: +with :math:`\alpha = \beta = 0.5(1-\rho)/\rho`. .. code-block:: python @@ -69,46 +38,29 @@ where :math:`\alpha = \beta = 0.5(1-\rho)/\rho`. beta = 0.5 * (1 - rho) / rho return -np.sum(betabinom.logpmf(ref_counts, n_counts, alpha, beta)) - result = minimize_scalar( + rho_mle = minimize_scalar( neg_log_likelihood, args=(ref_array, n_array), method='bounded', - bounds=(1e-6, 1 - 1e-6) - ) - rho_mle = result.x - -**Properties of MLE**: + bounds=(1e-6, 1 - 1e-6), + ).x -- Asymptotically unbiased as :math:`n \to \infty` -- Achieves the Cramér-Rao lower bound (efficient) -- Computationally requires numerical optimization -- May be sensitive to outliers +:math:`\rho` is held at this null-model MLE when computing the likelihood-ratio +test statistic (see :doc:`statistical_models`). Linear Dispersion Model -^^^^^^^^^^^^^^^^^^^^^^^ +----------------------- -For large datasets with variable coverage, WASP2 offers a model where -dispersion varies linearly with total count on the logit scale: +For large datasets with wide coverage ranges, WASP2 provides a ``linear`` model +in which dispersion varies with total count on the logit scale: .. math:: \text{logit}(\rho_i) = \beta_0 + \beta_1 \cdot N_i -The logit link ensures :math:`\rho_i \in (0, 1)`: - -.. math:: - - \rho_i = \frac{\exp(\beta_0 + \beta_1 N_i)}{1 + \exp(\beta_0 + \beta_1 N_i)} - -**Motivation**: - -Empirically, regions with different coverage levels may exhibit different -dispersion characteristics: - -- **Low coverage**: Greater sampling noise, potentially higher apparent dispersion -- **High coverage**: More stable estimates, may reveal true biological variance - -**Implementation**: +Low-coverage regions typically show greater apparent dispersion from sampling +noise; high-coverage regions reveal more of the true biological variance. The +linear model captures this trend with one additional parameter. .. 
code-block:: python @@ -117,154 +69,58 @@ dispersion characteristics: def neg_ll_linear(params, ref_counts, n_counts): beta0, beta1 = params - logit_rho = beta0 + beta1 * n_counts - # Clip to avoid numerical issues - logit_rho = np.clip(logit_rho, -10, 10) - rho = expit(logit_rho) + rho = expit(np.clip(beta0 + beta1 * n_counts, -10, 10)) alpha = 0.5 * (1 - rho) / rho beta = 0.5 * (1 - rho) / rho return -np.sum(betabinom.logpmf(ref_counts, n_counts, alpha, beta)) - result = minimize( - neg_ll_linear, - x0=(0, 0), - method='Nelder-Mead', - args=(ref_array, n_array) - ) - beta0, beta1 = result.x - -Method of Moments ------------------ + beta0, beta1 = minimize(neg_ll_linear, x0=(0, 0), + method='Nelder-Mead', + args=(ref_array, n_array)).x -MoM estimates :math:`\rho` by equating theoretical and sample moments. +Choosing a model +---------------- -Variance-Based Estimator -^^^^^^^^^^^^^^^^^^^^^^^^ +.. table:: + :widths: 30 70 -For a beta-binomial with :math:`\mu = 0.5`, the variance is: + ================================ ============================================================= + Context Recommended model + ================================ ============================================================= + Small dataset (< 1,000 regions) ``single`` — linear model is underpowered + Uniform coverage ``single`` + Large dataset, variable coverage ``linear`` + Model-selection check Compare AIC/BIC; linear has one extra parameter + ================================ ============================================================= -.. math:: - - \text{Var}(X) = \frac{N}{4}[1 + (N-1)\rho] - -Solving for :math:`\rho`: - -.. math:: - - \hat{\rho}_{\text{MoM}} = \frac{4S^2/N - 1}{N - 1} +Model comparison via AIC (:math:`\text{AIC} = 2k - 2\ell(\hat\theta)`) can be +used to check whether the extra parameter is justified on a given dataset. -where :math:`S^2` is the sample variance of :math:`X/N` (the allelic ratio). +Convergence +----------- -**Pooled Estimator**: - -For observations with varying :math:`N`: - -.. math:: - - \hat{\rho}_{\text{MoM}} = \frac{\sum_i (X_i - N_i/2)^2 - \sum_i N_i/4}{\sum_i N_i(N_i-1)/4} - -**Properties of MoM**: - -- Closed-form solution (fast computation) -- May produce negative estimates (truncate to 0) -- Less efficient than MLE, especially for small samples -- More robust to model misspecification - -Comparison: MLE vs MoM ----------------------- - -.. table:: MLE vs MoM for Dispersion Estimation - :widths: 25 37 38 - - =================== ============================== ============================== - Property MLE MoM - =================== ============================== ============================== - Computation Iterative optimization Closed-form - Efficiency Optimal (achieves CRLB) Suboptimal - Bias Asymptotically unbiased May be biased for small n - Robustness Sensitive to outliers More robust - Boundary behavior Always in (0,1) May give ρ < 0 - WASP2 default Yes No - =================== ============================== ============================== - -WASP2 uses MLE because: - -1. The beta-binomial likelihood is well-behaved -2. Modern optimization is fast enough for typical datasets -3. 
MLE provides consistent estimates across sample sizes - -Practical Considerations ------------------------- - -Convergence Issues -^^^^^^^^^^^^^^^^^^ - -The MLE optimizer may fail to converge if: - -- :math:`\rho` is very close to 0 (nearly binomial data) -- :math:`\rho` is very close to 1 (extreme overdispersion) -- The data contains extreme outliers - -WASP2 handles these by: - -- Bounding :math:`\rho` away from 0 and 1 -- Clipping logit values to avoid overflow -- Using robust optimization methods (bounded, Nelder-Mead) - -Sample Size Requirements -^^^^^^^^^^^^^^^^^^^^^^^^ - -MLE performance depends on sample size: - -.. table:: Dispersion Estimate Quality by Sample Size - :widths: 25 25 50 - - ============ ================ ===================================== - n (regions) CV of estimate Recommendation - ============ ================ ===================================== - < 50 > 50% Use pooled estimate or prior - 50-200 20-50% MLE reasonable but uncertain - 200-1000 10-20% MLE reliable - > 1000 < 10% MLE highly accurate - ============ ================ ===================================== - -Model Selection -^^^^^^^^^^^^^^^ - -Choosing between single and linear dispersion models: - -**Use Single Dispersion When**: - -- Dataset is small (< 1000 regions) -- Coverage is relatively uniform -- Quick analysis is needed - -**Use Linear Dispersion When**: - -- Large dataset (> 10,000 regions) -- Wide range of coverage values -- Systematic coverage-dispersion relationship suspected - -Model comparison can be done via AIC/BIC: - -.. math:: - - \text{AIC} = 2k - 2\ell(\hat{\theta}) - -where :math:`k` is the number of parameters (1 for single, 2 for linear). +The MLE optimizer may have trouble when :math:`\rho` is very close to 0 or 1, +or when extreme outliers dominate the pooled likelihood. WASP2 bounds +:math:`\rho \in (10^{-6}, 1-10^{-6})` and clips logit values in the linear +model. Persistent convergence failures usually indicate data issues (CNVs, +mapping artifacts) rather than optimizer pathology. See Also -------- -- :doc:`fdr_correction` - Multiple testing correction after estimation +- :doc:`statistical_models` — how :math:`\rho` enters the LRT +- :doc:`fdr_correction` — multiple-testing correction downstream References ---------- +.. [Kumasaka2016] Kumasaka N, Knights AJ, Gaffney DJ (2016). Fine-mapping + cellular QTLs with RASQUAL and ATAC-seq. *Nature Genetics* 48:206-213. + .. [Robinson2010] Robinson MD, Smyth GK (2010). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. - *Biostatistics* 9:321-332. + *Biostatistics* 9:321-332. *Analogous NB-dispersion literature.* .. [Yu2013] Yu D, Huber W, Vitek O (2013). Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size. - *Bioinformatics* 29:1275-1282. + *Bioinformatics* 29:1275-1282. *Analogous NB-dispersion literature.* diff --git a/docs/source/methods/fdr_correction.rst b/docs/source/methods/fdr_correction.rst index 75894bc0..50532fac 100644 --- a/docs/source/methods/fdr_correction.rst +++ b/docs/source/methods/fdr_correction.rst @@ -1,246 +1,59 @@ False Discovery Rate Correction ================================ -This document describes the multiple testing correction methods used in WASP2 -to control false positive rates when testing many genomic regions. - -.. 
contents:: Contents - :local: - :depth: 2 - -The Multiple Testing Problem ----------------------------- - -When testing thousands of genomic regions for allelic imbalance, even a small -per-test false positive rate leads to many false discoveries: - -.. math:: - - E[\text{false positives}] = m \cdot \alpha - -For :math:`m = 10{,}000` tests at :math:`\alpha = 0.05`, we expect 500 false -positives by chance alone. - -**Example**: Testing 20,000 genes for ASE - -- At α = 0.05: ~1,000 expected false positives -- At α = 0.01: ~200 expected false positives -- Even stringent thresholds yield many false discoveries - -Error Rate Definitions ----------------------- - -Family-Wise Error Rate (FWER) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The probability of making **any** false discovery: - -.. math:: - - \text{FWER} = P(V \geq 1) - -where :math:`V` is the number of false positives. - -FWER control (e.g., Bonferroni) is very conservative for genomic studies. - -False Discovery Rate (FDR) -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -The expected **proportion** of false discoveries among rejections: - -.. math:: - - \text{FDR} = E\left[\frac{V}{R}\right] - -where :math:`V` is false positives and :math:`R` is total rejections. - -FDR is more appropriate for discovery-oriented genomic studies where: - -- Some false positives are acceptable -- The goal is to prioritize candidates for follow-up -- The number of tests is very large - -Benjamini-Hochberg Procedure ----------------------------- - -WASP2 uses the Benjamini-Hochberg (BH) procedure [Benjamini1995]_ for FDR control. - -Algorithm -^^^^^^^^^ - -Given :math:`m` p-values :math:`p_1, p_2, \ldots, p_m`: - -1. Sort p-values: :math:`p_{(1)} \leq p_{(2)} \leq \cdots \leq p_{(m)}` -2. Find the largest :math:`k` such that :math:`p_{(k)} \leq \frac{k}{m} \cdot q` -3. Reject all hypotheses with :math:`p_{(i)} \leq p_{(k)}` - -where :math:`q` is the target FDR level (typically 0.05 or 0.1). - -**Adjusted P-Values (q-values)**: - -WASP2 reports BH-adjusted p-values: - -.. math:: - - q_i = \min_{j \geq i} \left\{ \frac{m \cdot p_{(j)}}{j} \right\} - -These can be interpreted as the minimum FDR at which the hypothesis would be -rejected. - -Implementation -^^^^^^^^^^^^^^ +WASP2 corrects for multiple testing using the Benjamini-Hochberg (BH) procedure +[Benjamini1995]_, applied via :func:`scipy.stats.false_discovery_control`. .. code-block:: python from scipy.stats import false_discovery_control - # WASP2 uses the BH method fdr_pvals = false_discovery_control(pvals, method='bh') - # Equivalent manual implementation - def benjamini_hochberg(pvals): - n = len(pvals) - ranked = np.argsort(pvals) - adjusted = np.empty(n) - cummin = 1.0 - for i in range(n - 1, -1, -1): - idx = ranked[i] - adjusted[idx] = min(cummin, pvals[idx] * n / (i + 1)) - cummin = adjusted[idx] - return adjusted - -Properties -^^^^^^^^^^ +Each region receives both a raw p-value (``pval``) from the likelihood ratio test +and a BH-adjusted p-value (``fdr_pval``). Significance is typically declared at +``fdr_pval < 0.05``. -**Advantages**: +BH controls FDR under independence or positive regression dependence (PRDS). +The chi-squared p-values produced by WASP2's LRT satisfy these assumptions when +regions are analyzed independently. If regions share SNPs or otherwise induce +negative dependence, the more conservative Benjamini-Yekutieli procedure +(``method='by'``) can be substituted. 
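+
+Swapping in BY is a one-argument change. A minimal sketch, assuming only
+the documented :func:`scipy.stats.false_discovery_control` signature and
+the same ``pvals`` array as above:
+
+.. code-block:: python
+
+    from scipy.stats import false_discovery_control
+
+    # BY inflates the BH threshold by the harmonic number c(m) ~ ln(m),
+    # which keeps FDR control valid under arbitrary dependence.
+    fdr_pvals_by = false_discovery_control(pvals, method='by')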
-- Controls FDR at level :math:`q` under independence -- More powerful than FWER methods (fewer false negatives) -- Simple to compute and interpret -- Works well with continuous p-values +.. warning:: -**Assumptions**: + ``scipy.stats.false_discovery_control`` raises on NaN p-values, but a manual + BH implementation based on ``np.minimum.accumulate`` silently propagates NaN + through the cumulative minimum. Always drop or impute NaN p-values before BH + correction. -- P-values under the null are uniformly distributed -- Independence or positive regression dependence (PRDS) +For studies requiring higher-power FDR control with an estimated proportion of +true nulls, :math:`\pi_0`, see Storey's q-value [Storey2003]_. WASP2 does not +apply this by default because BH is sufficient for typical ASE workloads and +requires no tuning. -The chi-squared p-values from WASP2's likelihood ratio test satisfy these -assumptions when regions are independent. +Reporting +--------- -Alternative Methods -------------------- +When reporting results, state the correction method, threshold, and number +of tests: -While WASP2 uses BH by default, researchers may consider alternatives -for specific scenarios. + "BH correction was applied to N = ... chi-squared p-values; regions with + ``fdr_pval < 0.05`` were declared significant." -Benjamini-Yekutieli (BY) -^^^^^^^^^^^^^^^^^^^^^^^^ +Output columns +-------------- -For arbitrary dependence between tests: - -.. math:: - - p_{(k)} \leq \frac{k}{m \cdot c(m)} \cdot q - -where :math:`c(m) = \sum_{i=1}^{m} 1/i \approx \ln(m) + 0.577`. - -BY is more conservative but valid under any dependence structure. - -.. code-block:: python - - # Available in scipy - fdr_pvals = false_discovery_control(pvals, method='by') - -Storey's q-value -^^^^^^^^^^^^^^^^ - -Estimates the proportion of true nulls (:math:`\pi_0`) for improved power: - -.. math:: - - \hat{\pi}_0 = \frac{\#\{p_i > \lambda\}}{m(1 - \lambda)} - -The q-value procedure is more powerful when many tests are true discoveries. - -.. code-block:: python - - # Requires qvalue package - # pip install qvalue - from qvalue import qvalue - q = qvalue(pvals) - -Alternative Correction Methods ------------------------------- - -**Discrete FDR and Mid-P Adjustments** - -For exact tests with discrete p-values (Fisher's exact, exact binomial), -specialized methods like Gilbert's procedure [Gilbert2005]_ or mid-p -adjustments can reduce conservativeness. - -However, WASP2's likelihood ratio test produces continuous p-values from -the chi-squared distribution, so **standard BH is appropriate** and these -discrete methods are not needed. - -Practical Guidelines --------------------- - -Choosing an FDR Threshold -^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. 
table:: FDR Threshold Guidelines - :widths: 20 40 40 - - ======== ================================= ================================ - FDR Use Case Interpretation - ======== ================================= ================================ - 0.01 High-confidence discoveries ~1% false among significant - 0.05 Standard exploratory analysis ~5% false among significant - 0.10 Liberal discovery ~10% false, maximize sensitivity - 0.20 Hypothesis generation For follow-up validation - ======== ================================= ================================ - -When to Use Stricter Control -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Consider stricter FDR or FWER control when: - -- Results will directly inform clinical decisions -- Follow-up validation is expensive -- The number of true positives is expected to be small -- Independence assumptions may be violated - -Reporting Results -^^^^^^^^^^^^^^^^^ - -When reporting FDR-corrected results: - -1. **Report the method**: "FDR correction was performed using the - Benjamini-Hochberg procedure" -2. **Report the threshold**: "Significance was declared at FDR < 0.05" -3. **Report both p-values**: Include raw and adjusted p-values in supplements -4. **Report the number of tests**: "Among 15,234 tested regions..." - -Output Format -------------- - -WASP2 reports both raw and adjusted p-values: - -.. table:: P-value Columns in Output +.. table:: :widths: 20 20 60 ========= ======== ==================================================== Column Type Description ========= ======== ==================================================== pval float Raw p-value from likelihood ratio test - fdr_pval float BH-adjusted p-value (q-value) + fdr_pval float BH-adjusted p-value ========= ======== ==================================================== -**Interpretation**: - -- ``pval < 0.05``: Nominally significant (not corrected) -- ``fdr_pval < 0.05``: Significant after multiple testing correction - References ---------- @@ -251,8 +64,3 @@ References .. [Storey2003] Storey JD, Tibshirani R (2003). Statistical significance for genomewide studies. *Proceedings of the National Academy of Sciences* 100:9440-9445. - -.. [Gilbert2005] Gilbert PB (2005). A modified false discovery rate - multiple-comparisons procedure for discrete data, applied to human - immunodeficiency virus genetics. *Journal of the Royal Statistical - Society C* 54:143-158. diff --git a/docs/source/methods/mapping_filter.rst b/docs/source/methods/mapping_filter.rst index e44cc5b3..c196d0b4 100644 --- a/docs/source/methods/mapping_filter.rst +++ b/docs/source/methods/mapping_filter.rst @@ -65,12 +65,19 @@ A read passes the WASP filter if and only if: This criterion ensures that the read would have mapped identically regardless of which allele it carried, eliminating differential mappability. -**Theorem**: After WASP filtering, the probability of mapping is equal for -reads from either allele: +Under the assumption that the aligner's scoring is deterministic and +depends only on the read sequence and reference coordinate — the standard +case for seed-and-extend aligners at typical parameter settings — the +mapping probability becomes approximately equal for either allele after +filtering: .. math:: - P(\text{map} | \text{ref allele}) = P(\text{map} | \text{alt allele}) + P(\text{map} \mid \text{ref allele}) \approx P(\text{map} \mid \text{alt allele}) + +See [vandeGeijn2015]_ §Methods for the original argument. 
The equality is
+exact under deterministic mapping and approximate for aligners with
+stochastic tie-breaking or heuristic multi-mapper resolution.

 Implementation Details
 ----------------------
@@ -216,38 +223,29 @@ The Rust implementation provides:
 - **Streaming comparison**: Memory-efficient position matching
 - **htslib integration**: Native BAM format support

-Performance Characteristics
----------------------------
-
-.. table:: WASP Filter Performance
-   :widths: 30 35 35
-
-   =================== ======================== ========================
-   Reads to Filter     Python Implementation    Rust Implementation
-   =================== ======================== ========================
-   1 million           ~5 minutes               ~30 seconds
-   10 million          ~50 minutes              ~5 minutes
-   100 million         ~8 hours                 ~50 minutes
-   =================== ======================== ========================
+Performance
+-----------

-*Benchmarks with coordinate-sorted BAM, single thread.*
+The Rust implementation is substantially faster than a pure-Python filter
+path (roughly an order of magnitude on coordinate-sorted BAMs of 1–100M
+reads in internal testing). Exact throughput depends on storage type,
+thread count, and read length; run on your own hardware for accurate
+numbers.

 Expected Filter Rates
 ^^^^^^^^^^^^^^^^^^^^^

-Typical filter rates depend on data type and variant density:
+Filter rates vary with data type, variant density, aligner, and tolerance
+settings. As a rough guide only:

-.. table:: Typical Filter Rates by Data Type
-   :widths: 25 25 50
+- RNA-seq: on the order of a few percent to ~15%; higher near splice
+  junctions and in indel-dense regions.
+- ATAC-seq / ChIP-seq: on the order of a few percent.
+- WGS: typically lower than RNA-seq at comparable variant density.

-   ============ ============ ==========================================
-   Data Type    Filter Rate  Notes
-   ============ ============ ==========================================
-   RNA-seq      5-15%        Higher near splice junctions
-   ATAC-seq     2-8%         Lower due to shorter reads
-   ChIP-seq     3-10%        Depends on peak locations
-   WGS          1-5%         Lowest filter rate
-   ============ ============ ==========================================
+Outlier filter rates (well above these ranges) usually indicate a mismatch
+between the original and remap aligner versions/parameters, CNV-rich
+regions, or errors in the variant call set — not a WASP failure.

 Limitations and Considerations
 ------------------------------
@@ -316,4 +314,12 @@ The typical WASP2 workflow:
 See Also
 --------

-- :doc:`counting_algorithm` - Allele counting after WASP filtering
+- :doc:`counting_algorithm` — Allele counting after WASP filtering
+
+References
+----------
+
+.. [vandeGeijn2015] van de Geijn B, McVicker G, Gilad Y, Pritchard JK (2015).
+   WASP: allele-specific software for robust molecular quantitative trait
+   locus discovery. *Nature Methods* 12:1061-1063.
+   https://doi.org/10.1038/nmeth.3582
diff --git a/docs/source/methods/statistical_models.rst b/docs/source/methods/statistical_models.rst
index 6db7552c..b3037f8c 100644
--- a/docs/source/methods/statistical_models.rst
+++ b/docs/source/methods/statistical_models.rst
@@ -1,307 +1,173 @@
 Beta-Binomial Model for Allelic Imbalance
 ==========================================

-This document describes the statistical framework WASP2 uses to detect
-allelic imbalance from allele count data.
+WASP2 detects allelic imbalance via a beta-binomial likelihood-ratio test.
+The beta-binomial accommodates overdispersion from biological variation, +PCR amplification, and aggregation of counts across SNPs within a region — +sources of variance that a simple binomial model cannot absorb. -.. contents:: Contents - :local: - :depth: 2 +Model +----- -Motivation ----------- - -Why Not Use the Binomial? -^^^^^^^^^^^^^^^^^^^^^^^^^ - -The simplest model for allele counts is the binomial distribution. If reads -are sampled independently from two alleles with equal probability: - -.. math:: - - X \sim \text{Binomial}(N, 0.5) - -where :math:`X` is the reference count and :math:`N` is the total count. - -However, real allele count data exhibits **overdispersion**: the variance -exceeds the binomial expectation. Sources of overdispersion include: - -- **Biological variation**: True allelic imbalance varies across cells -- **Technical noise**: PCR amplification introduces correlated errors -- **Aggregation effects**: Combining counts across SNPs within a region -- **Sampling from a population**: Different individuals have different AI - -The Beta-Binomial Model ------------------------ - -The beta-binomial distribution naturally accommodates overdispersion by -modeling the success probability as a random variable: +With :math:`X` the reference count out of :math:`N` total at a site: .. math:: p &\sim \text{Beta}(\alpha, \beta) \\ - X | p &\sim \text{Binomial}(N, p) - -Marginalizing over :math:`p` gives the beta-binomial: - -.. math:: - - P(X = k) = \binom{N}{k} \frac{B(k + \alpha, N - k + \beta)}{B(\alpha, \beta)} - -where :math:`B(\cdot, \cdot)` is the beta function. - -Parameterization -^^^^^^^^^^^^^^^^ - -WASP2 uses the **mean-dispersion parameterization**: - -.. math:: - - \mu &= \frac{\alpha}{\alpha + \beta} \\ - \rho &= \frac{1}{\alpha + \beta + 1} - -The dispersion parameter :math:`\rho \in (0, 1)` controls overdispersion: - -- :math:`\rho \to 0`: Approaches binomial (no overdispersion) -- :math:`\rho \to 1`: Maximum overdispersion (all probability at 0 or N) - -The inverse transformation: - -.. math:: - - \alpha &= \mu \cdot \frac{1 - \rho}{\rho} \\ - \beta &= (1 - \mu) \cdot \frac{1 - \rho}{\rho} - -Variance Structure -^^^^^^^^^^^^^^^^^^ - -The beta-binomial variance is: - -.. math:: - - \text{Var}(X) = N\mu(1-\mu) \left[ 1 + (N-1)\rho \right] - -The factor :math:`[1 + (N-1)\rho]` is the **variance inflation factor**. -For typical values (:math:`\rho \approx 0.01`, :math:`N \approx 100`), this -gives roughly 2x the binomial variance. - -Hypothesis Testing ------------------- - -WASP2 tests for allelic imbalance using a likelihood ratio test: - -**Null Hypothesis** :math:`H_0`: No imbalance (:math:`\mu = 0.5`) - -**Alternative Hypothesis** :math:`H_1`: Imbalance present (:math:`\mu \neq 0.5`) - -Likelihood Functions -^^^^^^^^^^^^^^^^^^^^ + X \mid p &\sim \text{Binomial}(N, p) -Under the null hypothesis: +WASP2 uses the mean-dispersion parameterization, :math:`\mu = \alpha/(\alpha+\beta)` +and :math:`\rho = 1/(\alpha+\beta+1)`: .. math:: - \mathcal{L}_0 = \prod_{i=1}^{n} P(X_i | N_i, \mu=0.5, \rho) + \alpha = \mu \cdot \frac{1 - \rho}{\rho}, \qquad + \beta = (1 - \mu) \cdot \frac{1 - \rho}{\rho} -Under the alternative: +The variance, .. math:: - \mathcal{L}_1 = \prod_{i=1}^{n} P(X_i | N_i, \hat{\mu}_{\text{MLE}}, \rho) + \text{Var}(X) = N\mu(1-\mu)\left[1 + (N-1)\rho\right], -where the product is over SNPs within a region (peak, gene). +recovers the binomial as :math:`\rho \to 0`. Estimation of :math:`\rho` is +described in :doc:`dispersion_estimation`. 
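+
+A quick numeric check of this parameterization (a sketch assuming only
+``scipy.stats.betabinom``; the values below are illustrative, not WASP2
+defaults):
+
+.. code-block:: python
+
+    from scipy.stats import betabinom
+
+    def mu_rho_to_alpha_beta(mu, rho):
+        """Map the mean-dispersion pair (mu, rho) to the (alpha, beta)
+        shape parameters that scipy's betabinom expects."""
+        alpha = mu * (1 - rho) / rho
+        beta = (1 - mu) * (1 - rho) / rho
+        return alpha, beta
+
+    n, mu, rho = 100, 0.5, 0.01
+    alpha, beta = mu_rho_to_alpha_beta(mu, rho)   # alpha = beta = 49.5
+
+    # Variance inflation over Binomial(n, mu): 1 + (n - 1) * rho = 1.99
+    inflation = betabinom.var(n, alpha, beta) / (n * mu * (1 - mu))
+    print(round(inflation, 2))                    # 1.99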
-Likelihood Ratio Test -^^^^^^^^^^^^^^^^^^^^^ +Hypothesis Test +--------------- -The test statistic: +For each region, WASP2 tests .. math:: - \Lambda = -2 \left[ \log \mathcal{L}_0 - \log \mathcal{L}_1 \right] + H_0 : \mu = 0.5 \qquad \text{vs.} \qquad H_1 : \mu \neq 0.5 -Under the null hypothesis, :math:`\Lambda` follows a chi-squared distribution -with 1 degree of freedom: +with the likelihood ratio statistic .. math:: - \Lambda \sim \chi^2_1 + \Lambda = -2\left[\log \mathcal{L}_0 - \log \mathcal{L}_1\right], \qquad + \Lambda \sim \chi^2_1 \text{ under } H_0, -The p-value is: +where :math:`\mathcal{L}_1` is maximized over :math:`\mu` with :math:`\rho` +held at its null-model MLE (profile likelihood in :math:`\mu`). The p-value +is :math:`P(\chi^2_1 > \Lambda)`. -.. math:: - - p = P(\chi^2_1 > \Lambda) - -MLE Estimation -^^^^^^^^^^^^^^ - -The MLE for :math:`\mu` under the alternative is found by numerical optimization: +MLE for :math:`\mu` under :math:`H_1`: .. code-block:: python from scipy.optimize import minimize_scalar from scipy.stats import betabinom + import numpy as np def neg_log_likelihood(mu, ref_counts, n_counts, rho): alpha = mu * (1 - rho) / rho beta = (1 - mu) * (1 - rho) / rho return -np.sum(betabinom.logpmf(ref_counts, n_counts, alpha, beta)) - result = minimize_scalar( + mu_mle = minimize_scalar( neg_log_likelihood, args=(ref_counts, n_counts, rho), method='bounded', - bounds=(0, 1) - ) - mu_mle = result.x - -Implementation in WASP2 ------------------------ + bounds=(0, 1), + ).x -Single Dispersion Model -^^^^^^^^^^^^^^^^^^^^^^^ +Dispersion models +----------------- -The default model estimates a single :math:`\rho` for all data: +WASP2 provides a single pooled :math:`\rho` (default) and a coverage-dependent +``linear`` model (logit-linear in :math:`N`). See :doc:`dispersion_estimation` +for the choice criteria and code. .. code-block:: python - from analysis.as_analysis import single_model + from analysis.as_analysis import single_model, linear_model results = single_model(df, region_col="region", phased=False) - -This assumes homogeneous overdispersion across regions, which is often -reasonable for moderately-sized datasets. - -Linear Dispersion Model -^^^^^^^^^^^^^^^^^^^^^^^ - -For large datasets, WASP2 offers a linear model where dispersion varies -with total count: - -.. math:: - - \text{logit}(\rho) = \beta_0 + \beta_1 \cdot N - -This captures the observation that regions with higher coverage often show -different dispersion characteristics: - -.. code-block:: python - - from analysis.as_analysis import linear_model - results = linear_model(df, region_col="region", phased=False) -Phased vs Unphased Analysis ---------------------------- - -WASP2 supports both phased and unphased genotype data. - -Unphased Model -^^^^^^^^^^^^^^ - -When genotype phase is unknown, WASP2 uses a mixture model that marginalizes -over possible phase configurations: +Phased vs. unphased data +------------------------ -For a region with multiple SNPs, if we don't know which haplotype each -ref allele belongs to, we sum over phase assignments using dynamic programming: +**Phased** genotypes reduce to a direct beta-binomial per SNP, with +:math:`\mu` or :math:`1-\mu` depending on which haplotype carries the +reference allele: .. math:: - P(\mathbf{X}) = \sum_{\phi \in \text{phases}} P(\mathbf{X} | \phi) \cdot P(\phi) - -where :math:`\phi` indexes phase configurations with prior :math:`P(\phi) = 0.5^{n-1}`. 
- -Phased Model -^^^^^^^^^^^^ + P(X_i \mid \text{hap}_i) = \text{BetaBinom}(X_i; N_i, \mu_{\text{hap}_i}, \rho) -With phased genotypes (e.g., from read-backed phasing), the model is simpler: +**Unphased** analysis marginalizes over phase configurations with a uniform +prior across the :math:`2^{n-1}` phase assignments for a region of :math:`n` +SNPs, using dynamic programming over SNPs (approach introduced in +[Mayba2014]_): .. math:: - P(X_i | \text{hap}_i) = \text{BetaBinom}(X_i; N_i, \mu_{\text{hap}_i}, \rho) - -where :math:`\mu_{\text{hap}_i}` is :math:`\mu` or :math:`1-\mu` depending on -which haplotype the reference allele belongs to. - -Output Interpretation ---------------------- + P(\mathbf{X}) = \sum_{\phi} P(\mathbf{X} \mid \phi) \, P(\phi), + \qquad P(\phi) = 0.5^{n-1}. -WASP2 returns the following statistics for each region: +Output +------ -.. table:: Beta-Binomial Analysis Output +.. table:: Columns emitted by the analysis pipeline :widths: 20 15 65 ============ ======== ====================================================== Column Type Interpretation ============ ======== ====================================================== - null_ll float Log-likelihood under null (μ=0.5) - alt_ll float Log-likelihood under alternative (μ=MLE) - mu float MLE of imbalance proportion - lrt float Likelihood ratio test statistic - pval float p-value from χ² distribution - fdr_pval float FDR-corrected p-value (BH method) + null_ll float Log-likelihood under :math:`H_0` (:math:`\mu=0.5`) + alt_ll float Log-likelihood under :math:`H_1` (:math:`\mu=\hat\mu_{\text{MLE}}`) + mu float MLE of the reference-allele proportion + lrt float Likelihood ratio statistic + pval float :math:`\chi^2_1` p-value + fdr_pval float BH-adjusted p-value (see :doc:`fdr_correction`) ============ ======== ====================================================== -**Interpreting μ (mu)**: +:math:`\mu > 0.5` indicates a reference-allele preference; :math:`\mu < 0.5` +indicates an alternate preference; :math:`|\mu - 0.5|` is the effect size. -- :math:`\mu = 0.5`: No imbalance (equal allele expression) -- :math:`\mu > 0.5`: Reference allele preference -- :math:`\mu < 0.5`: Alternate allele preference -- :math:`|\mu - 0.5|`: Effect size (magnitude of imbalance) +Practical notes +--------------- -Practical Considerations ------------------------- - -Pseudocounts -^^^^^^^^^^^^ - -WASP2 adds pseudocounts to avoid log(0) issues: - -.. math:: +**Pseudocounts.** WASP2 adds :math:`c = 1` to both allele counts by default, +shrinking estimates slightly toward 0.5. This gives conservative inference and +avoids log-zero issues at extreme counts. - X' = X + c, \quad N' = N + 2c +**Minimum counts.** Regions with :math:`N < \text{min\_count}` (default 10) are +dropped — low coverage yields poor power and unstable MLEs. -Default :math:`c = 1` (Laplace smoothing). This slightly shrinks estimates -toward 0.5, providing conservative inference. +**Region aggregation.** Analyzing at the region level (peaks, genes) rather +than individual SNPs increases power, reduces the multiple-testing burden, and +better captures regulatory effects that act across a region. -Minimum Count Threshold -^^^^^^^^^^^^^^^^^^^^^^^ - -Regions with low total counts have poor statistical power. WASP2 filters -regions with :math:`N < \text{min\_count}` (default: 10). - -Power depends on coverage and effect size: - -.. 
table:: Approximate Power (α=0.05) - :widths: 20 20 20 20 20 - - ========= ======== ======== ======== ======== - Total N μ=0.55 μ=0.60 μ=0.65 μ=0.70 - ========= ======== ======== ======== ======== - 20 5% 10% 20% 35% - 50 8% 25% 50% 75% - 100 15% 45% 80% 95% - 200 30% 75% 95% 99% - ========= ======== ======== ======== ======== - -Region Aggregation -^^^^^^^^^^^^^^^^^^ - -Analyzing at the region level (genes, peaks) rather than individual SNPs: - -- **Increases power**: More counts per test -- **Reduces multiple testing burden**: Fewer tests to correct -- **Captures regulatory effects**: ASE often affects entire genes +**Power.** Power depends jointly on :math:`N`, :math:`\mu`, and :math:`\rho`. +Because WASP2 does not compute analytic power estimates at runtime, we do not +tabulate values here — power simulations for a given dataset should be run at +the dataset's own estimated :math:`\hat\rho` rather than relying on generic +tables. See Also -------- -- :doc:`dispersion_estimation` - Estimating the dispersion parameter -- :doc:`fdr_correction` - Multiple testing correction methods +- :doc:`dispersion_estimation` — estimating :math:`\rho` +- :doc:`fdr_correction` — multiple-testing correction +- :doc:`counting_algorithm` — how counts are produced +- :doc:`mapping_filter` — WASP remap-and-filter producing the BAM References ---------- -.. [Mayba2014] Mayba O, Gilbert HN, Liu J, et al. (2014). MBASED: allele-specific - expression detection in cancer tissues and cell lines. *Genome Biology* 15:405. +.. [Mayba2014] Mayba O, Gilbert HN, Liu J, et al. (2014). MBASED: + allele-specific expression detection in cancer tissues and cell lines. + *Genome Biology* 15:405. *Introduces the phase-marginalized mixture model.* + +.. [Kumasaka2016] Kumasaka N, Knights AJ, Gaffney DJ (2016). Fine-mapping + cellular QTLs with RASQUAL and ATAC-seq. *Nature Genetics* 48:206-213. + *Beta-binomial LRT framework in which WASP2's analysis sits.* + +.. [vandeGeijn2015] van de Geijn B, McVicker G, Gilad Y, Pritchard JK (2015). + WASP: allele-specific software for robust molecular quantitative trait + locus discovery. *Nature Methods* 12:1061-1063. *The mapping-bias + correction upstream of the count step.* diff --git a/docs/source/tutorials/atac_seq_workflow.ipynb b/docs/source/tutorials/atac_seq_workflow.ipynb deleted file mode 100644 index 27e016ca..00000000 --- a/docs/source/tutorials/atac_seq_workflow.ipynb +++ /dev/null @@ -1,944 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# ATAC-seq Allelic Imbalance Analysis Tutorial\n", - "\n", - "**Estimated time:** ~30 minutes\n", - "\n", - "This tutorial demonstrates a complete WASP2 workflow for detecting allelic imbalance in chromatin accessibility from ATAC-seq data. You will learn to:\n", - "\n", - "1. Prepare ATAC-seq peak files for analysis\n", - "2. Count alleles at heterozygous SNPs within accessibility peaks\n", - "3. Detect significant allelic imbalance using beta-binomial statistical testing\n", - "4. Visualize results with volcano plots and effect size distributions\n", - "5. Integrate with caQTL/eQTL data for biological interpretation\n", - "\n", - "## Background\n", - "\n", - "**Allelic Imbalance in Chromatin Accessibility**\n", - "\n", - "ATAC-seq (Assay for Transposase-Accessible Chromatin with sequencing) measures open chromatin regions genome-wide. 
When a heterozygous individual shows unequal accessibility between maternal and paternal alleles at a regulatory region, this indicates **allelic imbalance in chromatin accessibility**.\n", - "\n", - "Such imbalance often reflects:\n", - "- *cis*-regulatory variants affecting transcription factor binding\n", - "- Chromatin accessibility QTLs (caQTLs)\n", - "- Haplotype-specific enhancer activity\n", - "\n", - "WASP2 uses a **beta-binomial model** to detect significant departures from the expected 50:50 allelic ratio while accounting for overdispersion in sequencing data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": "## Prerequisites\n\n### Software\n\n- **WASP2** (`pip install wasp2`)\n- **Python 3.10+** with pandas, matplotlib, numpy\n- **samtools** (for BAM operations)\n- **tabix** (for VCF indexing)\n\n### Input Data\n\n| File | Description | Format |\n|------|-------------|--------|\n| `sample.bam` | ATAC-seq aligned reads | BAM (indexed) |\n| `variants.vcf.gz` | Phased heterozygous variants | VCF (indexed) |\n| `peaks.bed` | ATAC-seq peaks from MACS2/SEACR | BED or narrowPeak |\n\n**Note:** For best results, use WASP-filtered BAM files to correct reference mapping bias. See the [mapping documentation](../user_guide/mapping.rst) for details." - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Setup" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from pathlib import Path\n", - "\n", - "import matplotlib.pyplot as plt\n", - "import numpy as np\n", - "import pandas as pd\n", - "\n", - "# Configure plotting\n", - "plt.style.use(\"seaborn-v0_8-whitegrid\")\n", - "plt.rcParams[\"figure.figsize\"] = (10, 6)\n", - "plt.rcParams[\"font.size\"] = 11\n", - "\n", - "# Define paths (modify these for your data)\n", - "DATA_DIR = Path(\"data\")\n", - "RESULTS_DIR = Path(\"results\")\n", - "RESULTS_DIR.mkdir(exist_ok=True)\n", - "\n", - "# Input files\n", - "BAM_FILE = DATA_DIR / \"atac_sample.bam\"\n", - "VCF_FILE = DATA_DIR / \"phased_variants.vcf.gz\"\n", - "PEAKS_FILE = DATA_DIR / \"atac_peaks.narrowPeak\"\n", - "SAMPLE_ID = \"SAMPLE1\" # Sample name in VCF\n", - "\n", - "print(\"WASP2 ATAC-seq Tutorial\")\n", - "print(\"=\" * 40)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 1: Data Loading and Preparation\n", - "\n", - "### 1.1 Inspect Peak File Format\n", - "\n", - "ATAC-seq peaks from MACS2 are typically in **narrowPeak** format (BED6+4). WASP2 accepts both BED and narrowPeak formats." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load and inspect peaks\n", - "peak_columns = [\n", - " \"chr\",\n", - " \"start\",\n", - " \"end\",\n", - " \"name\",\n", - " \"score\",\n", - " \"strand\",\n", - " \"signalValue\",\n", - " \"pValue\",\n", - " \"qValue\",\n", - " \"peak\",\n", - "]\n", - "peaks_df = pd.read_csv(PEAKS_FILE, sep=\"\\t\", header=None, names=peak_columns)\n", - "\n", - "print(f\"Total peaks: {len(peaks_df):,}\")\n", - "print(\"\\nPeak size distribution:\")\n", - "peaks_df[\"size\"] = peaks_df[\"end\"] - peaks_df[\"start\"]\n", - "print(peaks_df[\"size\"].describe())\n", - "\n", - "print(\"\\nFirst 5 peaks:\")\n", - "peaks_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1.2 Verify BAM and VCF Files\n", - "\n", - "Check that your input files are properly formatted and indexed." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%bash -s \"$BAM_FILE\" \"$VCF_FILE\" \"$SAMPLE_ID\"\n", - "BAM_FILE=$1\n", - "VCF_FILE=$2\n", - "SAMPLE_ID=$3\n", - "\n", - "echo \"=== BAM File Check ===\"\n", - "echo \"File: $BAM_FILE\"\n", - "samtools view -H \"$BAM_FILE\" 2>/dev/null | head -5 || echo \"Note: Using example paths\"\n", - "\n", - "echo \"\"\n", - "echo \"=== VCF File Check ===\"\n", - "echo \"File: $VCF_FILE\"\n", - "echo \"Checking for sample: $SAMPLE_ID\"\n", - "bcftools query -l \"$VCF_FILE\" 2>/dev/null | head -5 || echo \"Note: Using example paths\"\n", - "\n", - "echo \"\"\n", - "echo \"=== Index Check ===\"\n", - "ls -la \"${BAM_FILE}.bai\" 2>/dev/null || echo \"BAM index (.bai): Using example paths\"\n", - "ls -la \"${VCF_FILE}.tbi\" 2>/dev/null || echo \"VCF index (.tbi): Using example paths\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 2: Allele Counting at Peaks\n", - "\n", - "WASP2 counts reads supporting reference and alternate alleles at each heterozygous SNP within ATAC-seq peaks. The `--region` parameter restricts counting to SNPs overlapping your peaks.\n", - "\n", - "### 2.1 Run Allele Counting\n", - "\n", - "**Key Parameters:**\n", - "- `--region`: Peak file to restrict SNPs to accessible regions\n", - "- `--samples`: Sample ID for genotype filtering (heterozygous sites only)\n", - "- `--out_file`: Output path for count results" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define output file\n", - "COUNTS_FILE = RESULTS_DIR / \"atac_allele_counts.tsv\"\n", - "\n", - "# Build the command\n", - "count_cmd = f\"\"\"\n", - "wasp2-count count-variants \\\\\n", - " {BAM_FILE} \\\\\n", - " {VCF_FILE} \\\\\n", - " --region {PEAKS_FILE} \\\\\n", - " --samples {SAMPLE_ID} \\\\\n", - " --out_file {COUNTS_FILE}\n", - "\"\"\"\n", - "\n", - "print(\"Command to run:\")\n", - "print(count_cmd)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%bash -s \"$BAM_FILE\" \"$VCF_FILE\" \"$PEAKS_FILE\" \"$SAMPLE_ID\" \"$COUNTS_FILE\"\n", - "# Uncomment to run (requires actual data files)\n", - "# wasp2-count count-variants \\\n", - "# \"$1\" \\\n", - "# \"$2\" \\\n", - "# --region \"$3\" \\\n", - "# --samples \"$4\" \\\n", - "# --out_file \"$5\"\n", - "\n", - "echo \"Note: Uncomment the command above to run with your data\"\n", - "echo \"For this tutorial, we'll use simulated example output.\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.2 Inspect Count Results\n", - "\n", - "The output contains per-SNP allele counts with peak annotations." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# For demonstration, create example data\n", - "# (Replace with: counts_df = pd.read_csv(COUNTS_FILE, sep='\\t') for real data)\n", - "\n", - "np.random.seed(42)\n", - "n_snps = 5000\n", - "\n", - "# Simulate realistic ATAC-seq allele counts\n", - "counts_df = pd.DataFrame(\n", - " {\n", - " \"chr\": np.random.choice([\"chr1\", \"chr2\", \"chr3\", \"chr4\", \"chr5\"], n_snps),\n", - " \"pos\": np.random.randint(1e6, 2e8, n_snps),\n", - " \"ref\": np.random.choice([\"A\", \"C\", \"G\", \"T\"], n_snps),\n", - " \"alt\": np.random.choice([\"A\", \"C\", \"G\", \"T\"], n_snps),\n", - " \"region_id\": [f\"peak_{i}\" for i in np.random.randint(0, 1500, n_snps)],\n", - " }\n", - ")\n", - "\n", - "# Generate counts with some true imbalanced regions\n", - "total_depth = np.random.negative_binomial(5, 0.3, n_snps) + 5\n", - "imbalance_prob = np.where(\n", - " np.random.random(n_snps) < 0.1, # 10% truly imbalanced\n", - " np.random.choice([0.3, 0.7], n_snps), # Imbalanced allele freq\n", - " 0.5, # Balanced\n", - ")\n", - "counts_df[\"ref_count\"] = np.random.binomial(total_depth, imbalance_prob)\n", - "counts_df[\"alt_count\"] = total_depth - counts_df[\"ref_count\"]\n", - "counts_df[\"other_count\"] = 0\n", - "\n", - "print(f\"Total SNPs counted: {len(counts_df):,}\")\n", - "print(f\"Unique peaks with SNPs: {counts_df['region_id'].nunique():,}\")\n", - "print(\"\\nCount statistics:\")\n", - "print(counts_df[[\"ref_count\", \"alt_count\"]].describe())\n", - "counts_df.head(10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.3 Quality Control: Count Distribution\n", - "\n", - "ATAC-seq typically has **lower coverage per peak** than RNA-seq genes. Check the distribution to set appropriate filtering thresholds." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Calculate total counts per SNP\n", - "counts_df[\"total\"] = counts_df[\"ref_count\"] + counts_df[\"alt_count\"]\n", - "\n", - "fig, axes = plt.subplots(1, 2, figsize=(12, 4))\n", - "\n", - "# Total count distribution\n", - "ax = axes[0]\n", - "ax.hist(counts_df[\"total\"], bins=50, edgecolor=\"black\", alpha=0.7)\n", - "ax.axvline(10, color=\"red\", linestyle=\"--\", label=\"min_count=10\")\n", - "ax.axvline(5, color=\"orange\", linestyle=\"--\", label=\"min_count=5\")\n", - "ax.set_xlabel(\"Total Read Count per SNP\")\n", - "ax.set_ylabel(\"Number of SNPs\")\n", - "ax.set_title(\"Read Depth Distribution\")\n", - "ax.legend()\n", - "ax.set_xlim(0, 100)\n", - "\n", - "# Allele ratio distribution\n", - "ax = axes[1]\n", - "ratio = counts_df[\"ref_count\"] / counts_df[\"total\"]\n", - "ax.hist(ratio[counts_df[\"total\"] >= 10], bins=50, edgecolor=\"black\", alpha=0.7)\n", - "ax.axvline(0.5, color=\"red\", linestyle=\"--\", label=\"Expected (0.5)\")\n", - "ax.set_xlabel(\"Reference Allele Frequency\")\n", - "ax.set_ylabel(\"Number of SNPs\")\n", - "ax.set_title(\"Allele Ratio Distribution (depth ≥10)\")\n", - "ax.legend()\n", - "\n", - "plt.tight_layout()\n", - "plt.savefig(RESULTS_DIR / \"count_qc.png\", dpi=150)\n", - "plt.show()\n", - "\n", - "# Summary statistics\n", - "print(\n", - " f\"\\nSNPs with depth ≥5: {(counts_df['total'] >= 5).sum():,} ({100 * (counts_df['total'] >= 5).mean():.1f}%)\"\n", - ")\n", - "print(\n", - " f\"SNPs with depth ≥10: {(counts_df['total'] >= 10).sum():,} ({100 * (counts_df['total'] >= 10).mean():.1f}%)\"\n", - ")\n", - "print(\n", - " f\"SNPs with depth ≥20: {(counts_df['total'] >= 20).sum():,} ({100 * (counts_df['total'] >= 20).mean():.1f}%)\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 3: Statistical Testing (Beta-Binomial)\n", - "\n", - "WASP2's `find-imbalance` command uses a **beta-binomial model** to test for allelic imbalance:\n", - "\n", - "- **Null hypothesis (H₀):** Reference allele frequency = 0.5 (balanced)\n", - "- **Alternative (H₁):** Reference allele frequency ≠ 0.5 (imbalanced)\n", - "\n", - "The beta-binomial distribution accounts for **overdispersion** - the extra variability beyond binomial sampling that's common in sequencing data.\n", - "\n", - "### 3.1 Run Imbalance Detection" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define output file\n", - "IMBALANCE_FILE = RESULTS_DIR / \"atac_imbalance_results.tsv\"\n", - "\n", - "# Build the command\n", - "# Note: Using --min 5 for ATAC-seq (lower coverage than RNA-seq)\n", - "# The --model single option uses a single dispersion parameter for all regions\n", - "analysis_cmd = f\"\"\"\n", - "wasp2-analyze find-imbalance \\\\\n", - " {COUNTS_FILE} \\\\\n", - " --min 5 \\\\\n", - " --model single \\\\\n", - " --output {IMBALANCE_FILE}\n", - "\"\"\"\n", - "\n", - "print(\"Command to run:\")\n", - "print(analysis_cmd)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": "%%bash -s \"$COUNTS_FILE\" \"$IMBALANCE_FILE\"\n# Uncomment to run (requires actual count file)\n# wasp2-analyze find-imbalance \\\n# \"$1\" \\\n# --min 5 \\\n# --model single \\\n# --output \"$2\"\n\necho \"Note: Uncomment the command above to run with your data\"" - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - 
"### 3.2 Simulate Results for Demonstration\n", - "\n", - "For this tutorial, we simulate realistic analysis results." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Aggregate counts by peak (region)\n", - "peak_counts = (\n", - " counts_df.groupby(\"region_id\")\n", - " .agg(\n", - " {\n", - " \"chr\": \"first\",\n", - " \"pos\": [\"min\", \"max\"],\n", - " \"ref_count\": \"sum\",\n", - " \"alt_count\": \"sum\",\n", - " }\n", - " )\n", - " .reset_index()\n", - ")\n", - "peak_counts.columns = [\"region\", \"chr\", \"start\", \"end\", \"ref_count\", \"alt_count\"]\n", - "\n", - "# Calculate statistics\n", - "peak_counts[\"total\"] = peak_counts[\"ref_count\"] + peak_counts[\"alt_count\"]\n", - "peak_counts[\"mu\"] = peak_counts[\"ref_count\"] / peak_counts[\"total\"]\n", - "# Add pseudocount (+1) to avoid log(0) and stabilize ratios for low-count regions\n", - "peak_counts[\"effect_size\"] = np.log2(\n", - " (peak_counts[\"ref_count\"] + 1) / (peak_counts[\"alt_count\"] + 1)\n", - ")\n", - "\n", - "# Simulate p-values (truly imbalanced peaks get low p-values)\n", - "# Note: This simulation uses binomial for simplicity. Real ATAC-seq data exhibits\n", - "# overdispersion, which is why WASP2 uses the beta-binomial model.\n", - "np.random.seed(42)\n", - "is_imbalanced = np.abs(peak_counts[\"mu\"] - 0.5) > 0.15\n", - "peak_counts[\"p_value\"] = np.where(\n", - " is_imbalanced,\n", - " 10 ** (-np.random.uniform(2, 10, len(peak_counts))), # Significant\n", - " np.random.uniform(0.05, 1, len(peak_counts)), # Not significant\n", - ")\n", - "\n", - "# FDR correction (Benjamini-Hochberg)\n", - "# Note: This manual BH implementation is for demonstration.\n", - "# WASP2 internally uses scipy.stats.false_discovery_control()\n", - "peak_counts = peak_counts.sort_values(\"p_value\")\n", - "n_tests = len(peak_counts)\n", - "peak_counts[\"rank\"] = range(1, n_tests + 1)\n", - "peak_counts[\"fdr_pval\"] = np.minimum(peak_counts[\"p_value\"] * n_tests / peak_counts[\"rank\"], 1.0)\n", - "peak_counts[\"fdr_pval\"] = peak_counts[\"fdr_pval\"][::-1].cummin()[::-1]\n", - "\n", - "# Filter to testable peaks\n", - "results_df = peak_counts[peak_counts[\"total\"] >= 5].copy()\n", - "results_df = results_df.drop(\"rank\", axis=1)\n", - "\n", - "print(f\"Peaks tested: {len(results_df):,}\")\n", - "print(f\"Significant (FDR < 0.05): {(results_df['fdr_pval'] < 0.05).sum():,}\")\n", - "print(f\"Significant (FDR < 0.01): {(results_df['fdr_pval'] < 0.01).sum():,}\")\n", - "results_df.head(10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 4: Result Interpretation and Visualization\n", - "\n", - "### 4.1 Volcano Plot\n", - "\n", - "The volcano plot shows effect size (x-axis) vs. statistical significance (y-axis), helping identify peaks with both strong and significant allelic imbalance." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, ax = plt.subplots(figsize=(10, 8))\n", - "\n", - "# Calculate -log10(p-value) for plotting\n", - "results_df[\"neg_log10_pval\"] = -np.log10(results_df[\"p_value\"].clip(lower=1e-50))\n", - "\n", - "# Define significance thresholds\n", - "sig_mask = results_df[\"fdr_pval\"] < 0.05\n", - "effect_mask = np.abs(results_df[\"effect_size\"]) > 0.5 # |log2FC| > 0.5\n", - "\n", - "# Plot non-significant points\n", - "ns = ~sig_mask\n", - "ax.scatter(\n", - " results_df.loc[ns, \"effect_size\"],\n", - " results_df.loc[ns, \"neg_log10_pval\"],\n", - " c=\"lightgray\",\n", - " s=15,\n", - " alpha=0.6,\n", - " label=f\"Not significant (n={ns.sum():,})\",\n", - ")\n", - "\n", - "# Plot significant but small effect\n", - "sig_small = sig_mask & ~effect_mask\n", - "ax.scatter(\n", - " results_df.loc[sig_small, \"effect_size\"],\n", - " results_df.loc[sig_small, \"neg_log10_pval\"],\n", - " c=\"steelblue\",\n", - " s=25,\n", - " alpha=0.7,\n", - " label=f\"FDR<0.05, small effect (n={sig_small.sum():,})\",\n", - ")\n", - "\n", - "# Plot significant and large effect\n", - "sig_large = sig_mask & effect_mask\n", - "ax.scatter(\n", - " results_df.loc[sig_large, \"effect_size\"],\n", - " results_df.loc[sig_large, \"neg_log10_pval\"],\n", - " c=\"firebrick\",\n", - " s=40,\n", - " alpha=0.8,\n", - " label=f\"FDR<0.05, |log2FC|>0.5 (n={sig_large.sum():,})\",\n", - ")\n", - "\n", - "# Add threshold lines\n", - "ax.axhline(-np.log10(0.05), color=\"black\", linestyle=\"--\", alpha=0.3, linewidth=1)\n", - "ax.axvline(0.5, color=\"gray\", linestyle=\":\", alpha=0.5)\n", - "ax.axvline(-0.5, color=\"gray\", linestyle=\":\", alpha=0.5)\n", - "ax.axvline(0, color=\"black\", linestyle=\"-\", alpha=0.2)\n", - "\n", - "ax.set_xlabel(\"Effect Size (log₂ Ref/Alt)\", fontsize=12)\n", - "ax.set_ylabel(\"-log₁₀(p-value)\", fontsize=12)\n", - "ax.set_title(\"ATAC-seq Allelic Imbalance\\nVolcano Plot\", fontsize=14)\n", - "ax.legend(loc=\"upper right\", fontsize=9)\n", - "\n", - "plt.tight_layout()\n", - "plt.savefig(RESULTS_DIR / \"volcano_plot.png\", dpi=150)\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.2 Effect Size Distribution" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(1, 2, figsize=(12, 4))\n", - "\n", - "# All peaks\n", - "ax = axes[0]\n", - "ax.hist(results_df[\"effect_size\"], bins=50, edgecolor=\"black\", alpha=0.7, color=\"steelblue\")\n", - "ax.axvline(0, color=\"red\", linestyle=\"--\", linewidth=2)\n", - "ax.set_xlabel(\"Effect Size (log₂ Ref/Alt)\")\n", - "ax.set_ylabel(\"Number of Peaks\")\n", - "ax.set_title(\"All Tested Peaks\")\n", - "\n", - "# Significant peaks only\n", - "ax = axes[1]\n", - "sig_effects = results_df.loc[sig_mask, \"effect_size\"]\n", - "ax.hist(sig_effects, bins=30, edgecolor=\"black\", alpha=0.7, color=\"firebrick\")\n", - "ax.axvline(0, color=\"black\", linestyle=\"--\", linewidth=2)\n", - "ax.set_xlabel(\"Effect Size (log₂ Ref/Alt)\")\n", - "ax.set_ylabel(\"Number of Peaks\")\n", - "ax.set_title(f\"Significant Peaks (FDR < 0.05, n={sig_mask.sum():,})\")\n", - "\n", - "plt.tight_layout()\n", - "plt.savefig(RESULTS_DIR / \"effect_size_distribution.png\", dpi=150)\n", - "plt.show()\n", - "\n", - "# Summary statistics\n", - "print(\"Effect size statistics (significant peaks):\")\n", - "print(f\" Mean: 
{sig_effects.mean():.3f}\")\n", - "print(f\" Median: {sig_effects.median():.3f}\")\n", - "print(f\" Std: {sig_effects.std():.3f}\")\n", - "print(f\" Ref-biased (log2FC > 0): {(sig_effects > 0).sum()}\")\n", - "print(f\" Alt-biased (log2FC < 0): {(sig_effects < 0).sum()}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.3 Top Imbalanced Peaks" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Get top hits by significance\n", - "top_hits = results_df[results_df[\"fdr_pval\"] < 0.05].nsmallest(20, \"fdr_pval\")\n", - "\n", - "print(\"Top 20 Peaks with Allelic Imbalance\")\n", - "print(\"=\" * 80)\n", - "display_cols = [\n", - " \"region\",\n", - " \"chr\",\n", - " \"ref_count\",\n", - " \"alt_count\",\n", - " \"mu\",\n", - " \"effect_size\",\n", - " \"p_value\",\n", - " \"fdr_pval\",\n", - "]\n", - "top_hits[display_cols].round(4)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 5: QTL Overlap Analysis\n", - "\n", - "Peaks with allelic imbalance often harbor **chromatin accessibility QTLs (caQTLs)** or overlap with **expression QTLs (eQTLs)**. Integrating your results with published QTL databases helps validate findings and identify regulatory mechanisms.\n", - "\n", - "### 5.1 Prepare BED File for Overlap Analysis" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Export significant peaks as BED for overlap analysis\n", - "sig_peaks = results_df[results_df[\"fdr_pval\"] < 0.05][\n", - " [\"chr\", \"start\", \"end\", \"region\", \"effect_size\", \"fdr_pval\"]\n", - "].copy()\n", - "sig_peaks.to_csv(RESULTS_DIR / \"significant_peaks.bed\", sep=\"\\t\", header=False, index=False)\n", - "\n", - "print(f\"Exported {len(sig_peaks)} significant peaks to: {RESULTS_DIR / 'significant_peaks.bed'}\")\n", - "print(\"\\nUse this file for overlap analysis with:\")\n", - "print(\" - GTEx eQTLs (https://gtexportal.org)\")\n", - "print(\" - ENCODE cCREs (https://www.encodeproject.org)\")\n", - "print(\" - Published caQTL datasets\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5.2 Example: GTEx eQTL Overlap\n", - "\n", - "This example shows how to intersect your imbalanced peaks with GTEx eQTL SNPs to identify potential regulatory relationships." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Example overlap analysis with bedtools (requires eQTL BED file)\n", - "overlap_cmd = \"\"\"\n", - "# Download GTEx eQTLs for your tissue of interest\n", - "# Example: Brain cortex significant eQTLs\n", - "\n", - "# Intersect imbalanced peaks with eQTL positions\n", - "bedtools intersect \\\\\n", - " -a results/significant_peaks.bed \\\\\n", - " -b gtex_brain_cortex_eqtls.bed \\\\\n", - " -wa -wb \\\\\n", - " > results/peak_eqtl_overlap.bed\n", - "\n", - "# Count overlaps\n", - "wc -l results/peak_eqtl_overlap.bed\n", - "\"\"\"\n", - "\n", - "print(\"Example bedtools command for eQTL overlap:\")\n", - "print(overlap_cmd)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Simulate overlap results for demonstration\n", - "np.random.seed(123)\n", - "\n", - "# Create simulated eQTL overlap data\n", - "n_overlap = 150\n", - "overlap_df = pd.DataFrame(\n", - " {\n", - " \"peak\": [f\"peak_{i}\" for i in np.random.choice(range(1500), n_overlap, replace=False)],\n", - " \"eqtl_gene\": [f\"GENE{i}\" for i in np.random.randint(1, 500, n_overlap)],\n", - " \"eqtl_pval\": 10 ** (-np.random.uniform(3, 15, n_overlap)),\n", - " \"tissue\": np.random.choice(\n", - " [\"Brain_Cortex\", \"Brain_Hippocampus\", \"Liver\", \"Heart\"], n_overlap\n", - " ),\n", - " }\n", - ")\n", - "\n", - "print(f\"Peaks overlapping eQTLs: {len(overlap_df)}\")\n", - "print(\"\\nOverlap by tissue:\")\n", - "print(overlap_df[\"tissue\"].value_counts())\n", - "\n", - "# Enrichment analysis\n", - "n_sig_peaks = sig_mask.sum()\n", - "n_total_peaks = len(results_df)\n", - "n_eqtl_overlap = len(overlap_df)\n", - "\n", - "# Fisher's exact test for enrichment\n", - "from scipy.stats import fisher_exact\n", - "\n", - "# Assume 10% of all peaks overlap eQTLs by chance\n", - "expected_overlap = int(n_sig_peaks * 0.10)\n", - "contingency = [\n", - " [n_eqtl_overlap, n_sig_peaks - n_eqtl_overlap],\n", - " [expected_overlap, n_sig_peaks - expected_overlap],\n", - "]\n", - "odds_ratio, p_value = fisher_exact(contingency)\n", - "\n", - "print(\"\\nEnrichment Analysis:\")\n", - "print(f\" Imbalanced peaks: {n_sig_peaks}\")\n", - "print(f\" Overlapping eQTLs: {n_eqtl_overlap}\")\n", - "print(f\" Expected by chance: ~{expected_overlap}\")\n", - "print(f\" Fold enrichment: {n_eqtl_overlap / max(expected_overlap, 1):.2f}x\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5.3 Visualization: eQTL Overlap" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fig, axes = plt.subplots(1, 2, figsize=(12, 4))\n", - "\n", - "# Pie chart: Overlap proportion\n", - "ax = axes[0]\n", - "overlap_counts = [n_eqtl_overlap, n_sig_peaks - n_eqtl_overlap]\n", - "labels = [\"Overlap with eQTL\", \"No eQTL overlap\"]\n", - "colors = [\"#e74c3c\", \"#95a5a6\"]\n", - "ax.pie(overlap_counts, labels=labels, colors=colors, autopct=\"%1.1f%%\", startangle=90)\n", - "ax.set_title(\"Imbalanced Peaks Overlapping eQTLs\")\n", - "\n", - "# Bar chart: Overlap by tissue\n", - "ax = axes[1]\n", - "tissue_counts = overlap_df[\"tissue\"].value_counts()\n", - "colors = plt.cm.Set2(range(len(tissue_counts)))\n", - "bars = ax.bar(tissue_counts.index, tissue_counts.values, color=colors, edgecolor=\"black\")\n", - "ax.set_xlabel(\"Tissue\")\n", - "ax.set_ylabel(\"Number of Overlapping eQTLs\")\n", - 
"ax.set_title(\"eQTL Overlap by Tissue\")\n", - "ax.tick_params(axis=\"x\", rotation=45)\n", - "\n", - "plt.tight_layout()\n", - "plt.savefig(RESULTS_DIR / \"eqtl_overlap.png\", dpi=150)\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Section 6: Downstream Analysis Hints\n", - "\n", - "### 6.1 Motif Enrichment Analysis\n", - "\n", - "Imbalanced peaks may disrupt transcription factor binding sites. Use tools like HOMER or MEME-ChIP for motif analysis." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "motif_cmd = \"\"\"\n", - "# Extract sequences around imbalanced SNPs\n", - "bedtools slop -i significant_peaks.bed -g genome.chrom.sizes -b 50 | \\\\\n", - "bedtools getfasta -fi genome.fa -bed - -fo imbalanced_seqs.fa\n", - "\n", - "# Run HOMER motif analysis\n", - "findMotifsGenome.pl significant_peaks.bed hg38 motif_results/ -size 200\n", - "\n", - "# Alternative: MEME-ChIP\n", - "meme-chip -oc meme_results imbalanced_seqs.fa\n", - "\"\"\"\n", - "\n", - "print(\"Example commands for motif enrichment analysis:\")\n", - "print(motif_cmd)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6.2 Gene Ontology Enrichment\n", - "\n", - "Identify biological processes associated with genes near imbalanced peaks." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "go_cmd = \"\"\"\n", - "# Annotate peaks with nearest genes using GREAT or bedtools\n", - "bedtools closest -a significant_peaks.bed -b genes.bed -d > peak_gene_assignments.bed\n", - "\n", - "# Extract gene list\n", - "cut -f8 peak_gene_assignments.bed | sort -u > imbalanced_genes.txt\n", - "\n", - "# Use DAVID, Enrichr, or clusterProfiler for GO enrichment\n", - "# Web interface: https://david.ncifcrf.gov/\n", - "# Web interface: https://maayanlab.cloud/Enrichr/\n", - "\"\"\"\n", - "\n", - "print(\"Example commands for GO enrichment analysis:\")\n", - "print(go_cmd)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6.3 Single-Cell ATAC-seq Extension\n", - "\n", - "For single-cell ATAC-seq (scATAC-seq), use WASP2's single-cell workflow to detect cell-type-specific allelic imbalance." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sc_cmd = \"\"\"\n", - "# Count alleles in single-cell ATAC-seq\n", - "wasp2-count count-variants-sc \\\\\n", - " scatac_possorted.bam \\\\\n", - " variants.vcf.gz \\\\\n", - " barcodes.tsv \\\\\n", - " --samples SAMPLE_ID \\\\\n", - " --feature peaks.bed \\\\\n", - " --out_file scatac_counts.h5ad\n", - "\n", - "# Detect imbalance per cell type\n", - "wasp2-analyze find-imbalance-sc \\\\\n", - " scatac_counts.h5ad \\\\\n", - " barcode_celltype_map.tsv \\\\\n", - " --sample SAMPLE_ID \\\\\n", - " --min 5 \\\\\n", - " --phased\n", - "\n", - "# Compare imbalance between cell types\n", - "wasp2-analyze compare-imbalance \\\\\n", - " scatac_counts.h5ad \\\\\n", - " barcode_celltype_map.tsv \\\\\n", - " --groups \"excitatory,inhibitory\" \\\\\n", - " --sample SAMPLE_ID\n", - "\"\"\"\n", - "\n", - "print(\"Commands for single-cell ATAC-seq analysis:\")\n", - "print(sc_cmd)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Summary\n", - "\n", - "In this tutorial, you learned to:\n", - "\n", - "1. **Prepare ATAC-seq data** - Load peaks and verify input file formats\n", - "2. 
**Count alleles** - Use `wasp2-count count-variants` with peak regions\n", - "3. **Detect imbalance** - Apply beta-binomial testing with `wasp2-analyze find-imbalance`\n", - "4. **Visualize results** - Create volcano plots and effect size distributions\n", - "5. **Integrate with QTLs** - Overlap with eQTL databases for biological validation\n", - "\n", - "### Key Takeaways\n", - "\n", - "- ATAC-seq has **lower coverage per peak** than RNA-seq; use `--min-count 5` instead of 10\n", - "- **FDR correction** is essential for multiple testing across thousands of peaks\n", - "- Consider **effect size** alongside significance for biological relevance\n", - "- **QTL overlap** helps validate findings and identify causal variants\n", - "\n", - "### Next Steps\n", - "\n", - "- [Comparative Imbalance Tutorial](./comparative_imbalance.rst) - Compare imbalance between conditions\n", - "- [Single-Cell Tutorial](./scrna_seq.rst) - Cell-type-specific analysis\n", - "- [Statistical Methods](../methods/statistical_models.rst) - Deep dive into the beta-binomial model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Troubleshooting\n", - "\n", - "### Common Issues\n", - "\n", - "**Low SNP counts in peaks:**\n", - "- Ensure VCF contains heterozygous variants for your sample\n", - "- Check that peak coordinates use the same reference genome as VCF\n", - "- Verify `--samples` matches the sample name in VCF header\n", - "\n", - "**Memory errors with large datasets:**\n", - "- Process chromosomes separately with `--region chr1_peaks.bed`, etc.\n", - "- Use `WASP2_RUST_THREADS=4` to limit parallel processing\n", - "\n", - "**No significant results:**\n", - "- Check read depth (may need deeper sequencing)\n", - "- Verify WASP filtering was applied to remove mapping bias\n", - "- Consider lowering `--min-count` threshold (with caution)\n", - "\n", - "### Diagnostic Commands\n", - "\n", - "```bash\n", - "# Check VCF sample names\n", - "bcftools query -l variants.vcf.gz\n", - "\n", - "# Count heterozygous SNPs in your sample\n", - "bcftools view -s SAMPLE_ID variants.vcf.gz | bcftools view -g het | wc -l\n", - "\n", - "# Check BAM read depth at a peak\n", - "samtools depth -r chr1:1000000-1001000 sample.bam | awk '{sum+=$3} END {print sum/NR}'\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Save final results\n", - "results_df.to_csv(RESULTS_DIR / \"final_imbalance_results.tsv\", sep=\"\\t\", index=False)\n", - "print(f\"Results saved to: {RESULTS_DIR / 'final_imbalance_results.tsv'}\")\n", - "print(\"\\nAnalysis complete!\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.0" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/docs/source/tutorials/bulk_workflow.rst b/docs/source/tutorials/bulk_workflow.rst new file mode 100644 index 00000000..674f2726 --- /dev/null +++ b/docs/source/tutorials/bulk_workflow.rst @@ -0,0 +1,175 @@ +Bulk Workflow (RNA-seq / ATAC-seq) +=================================== + +End-to-end walkthrough of WASP2 on bulk sequencing data: WASP mapping filter, +allele counting at regions, and the beta-binomial LRT for allelic imbalance. 
+The same pipeline works for RNA-seq at genes (GTF) and ATAC-seq at peaks +(BED); data-type differences are called out where they matter. + +Inputs: + +- Aligned BAM (coordinate-sorted, indexed) +- VCF with the donor's heterozygous variants (phased for RNA-seq haplotypes) +- Region file: GTF/GFF for RNA-seq genes, BED for ATAC-seq peaks + +Step 1 — WASP mapping filter +---------------------------- + +Reference alleles map more easily than alternate alleles, biasing ref counts +upward. The WASP remap-and-filter step removes reads whose mapping position +depends on which allele they carry. The test is "would this read map to the +same location if we swapped ref↔alt at the SNP?" — only reads that do are +kept. After filtering, + +.. math:: + + P(\text{map} \mid \text{ref}) \approx P(\text{map} \mid \text{alt}) + +under the usual assumption that the aligner's scoring is deterministic and +position-dependent (see :doc:`/methods/mapping_filter` and [vandeGeijn2015]_). + +.. code-block:: bash + + # Step 1a: produce allele-swapped FASTQ + BAMs + wasp2-map make-reads sample.bam variants.vcf.gz \ + --samples SAMPLE1 \ + --out_dir wasp_output + + # Step 1b: remap with the SAME aligner and parameters used originally + bwa mem -M -t 8 genome.fa \ + wasp_output/sample_swapped_alleles_r1.fq \ + wasp_output/sample_swapped_alleles_r2.fq \ + | samtools sort -o wasp_output/sample_remapped.bam - + samtools index wasp_output/sample_remapped.bam + + # Step 1c: drop reads whose swapped version mapped elsewhere + wasp2-map filter-remapped \ + wasp_output/sample_remapped.bam \ + --wasp_data_json wasp_output/sample_wasp_data_files.json \ + --out_bam wasp_output/sample_wasp_filtered.bam + + # Step 1d: merge with reads that never overlapped a SNP + samtools merge -f wasp_output/sample_final.bam \ + wasp_output/sample_wasp_filtered.bam \ + wasp_output/sample_keep.bam + samtools index wasp_output/sample_final.bam + +.. important:: + + The remap step **must** use the same aligner, version, and parameters as + the original alignment. A different scoring function invalidates the + filter contract. + +Filter rates are **approximately** 1–10% of SNP-overlapping reads dropped +(developer experience; varies by sequencing platform, aligner, and variant +density). Outlier rates should be investigated — they typically signal +CNVs, aligner-version mismatch, or variant-call errors rather than a WASP +failure. + +Step 2 — Count alleles +---------------------- + +**RNA-seq at genes** (GTF): + +.. code-block:: bash + + wasp2-count count-variants \ + wasp_output/sample_final.bam \ + variants.vcf.gz \ + --samples SAMPLE1 \ + --region genes.gtf \ + --out_file allele_counts.tsv + +**ATAC-seq at peaks** (BED): + +.. code-block:: bash + + wasp2-count count-variants \ + wasp_output/sample_final.bam \ + variants.vcf.gz \ + --samples SAMPLE1 \ + --region peaks.bed \ + --out_file allele_counts.tsv + +Output columns: ``chrom``, ``pos``, ``ref``, ``alt``, ``ref_count``, +``alt_count``, ``other_count``, plus ``region`` (peak name) or +``gene_id``/``gene_name`` (GTF). See :doc:`/user_guide/counting` for region +filtering and multi-sample options. + +Step 3 — Test for allelic imbalance +----------------------------------- + +.. 
code-block:: bash
+
+    wasp2-analyze find-imbalance \
+        allele_counts.tsv \
+        --min 10 \
+        --pseudocount 1 \
+        --phased \
+        --out_file ai_results.tsv
+
+The beta-binomial LRT tests :math:`\mu = 0.5` against :math:`\mu \neq 0.5`
+per region, correcting for overdispersion and multiple testing via
+Benjamini–Hochberg (:doc:`/methods/statistical_models`,
+:doc:`/methods/fdr_correction`).
+
+``--phased`` applies only when the VCF GT field uses ``0|1`` / ``1|0``; leave
+it off for unphased data.
+
+Output columns: ``region``, ``num_snps``, aggregated ``ref_count`` /
+``alt_count``, ``pval``, ``fdr_pval``, ``mu``, ``effect_size``
+(:math:`\log_2(\text{ref/alt})`).
+
+Filtering and interpretation
+----------------------------
+
+.. code-block:: bash
+
+    # FDR < 0.05 and |log2FC| > 1
+    # ($6 = fdr_pval, $8 = effect_size, per the column order listed above)
+    awk -F'\t' 'NR==1 || ($6 < 0.05 && ($8 > 1 || $8 < -1))' \
+        ai_results.tsv > significant_ase.tsv
+
+Monoallelic expression (ref fraction > 0.9 or < 0.1) is a common hallmark of
+genomic imprinting in RNA-seq and of strong cis-regulatory variants in
+ATAC-seq.
+
+eQTL integration (RNA-seq only)
+-------------------------------
+
+.. code-block:: python
+
+    import pandas as pd
+
+    ase = pd.read_csv('ai_results.tsv', sep='\t')
+    eqtl = pd.read_csv('gtex_eqtl.tsv', sep='\t')
+
+    merged = ase.merge(eqtl, left_on='region', right_on='gene_id')
+
+    # ASE effect_size > 0 => REF allele more expressed.
+    # eQTL slope > 0 => ALT allele increases expression.
+    # Concordance = opposite signs.
+    merged['concordant'] = (merged['effect_size'] > 0) != (merged['slope'] > 0)
+
+Troubleshooting
+---------------
+
+**Few SNPs found.** Confirm the VCF sample column matches ``--samples``, the
+VCF has heterozygous genotypes for this donor, and BAM+VCF use the same
+reference build.
+
+**Low power / few significant results.** Increase depth, lower ``--min``
+(with caution), or aggregate at a coarser region level.
+
+**Too many significant results.** Confirm the WASP filter was applied, check
+for batch effects, CNVs, or aligner-version drift, and tighten FDR.
+
+See Also
+--------
+
+- :doc:`/user_guide/mapping` — full mapping CLI reference
+- :doc:`/user_guide/counting` — counting CLI reference
+- :doc:`/user_guide/analysis` — analysis CLI reference
+- :doc:`/methods/mapping_filter` — canonical WASP filter contract
+- :doc:`/methods/statistical_models` — the LRT and beta-binomial model
+- :doc:`/tutorials/scatac_workflow` — single-cell ATAC-seq
+- :doc:`/tutorials/scrna_seq` — single-cell RNA-seq
+- :doc:`/tutorials/comparative_imbalance` — comparing groups
diff --git a/docs/source/tutorials/comparative_imbalance.rst b/docs/source/tutorials/comparative_imbalance.rst
index d8d65f2f..4324ee22 100644
--- a/docs/source/tutorials/comparative_imbalance.rst
+++ b/docs/source/tutorials/comparative_imbalance.rst
@@ -1,212 +1,85 @@
-Comparative Imbalance Analysis Tutorial
-=======================================
+Comparative Imbalance Analysis
+==============================
 
-This tutorial provides a comprehensive guide to detecting **differential allelic imbalance**
-between cell types, conditions, or biological groups using WASP2's comparative analysis module.
+Compare allelic imbalance between biological groups — cell types, conditions,
+sex, treatment status — using ``wasp2-analyze compare-imbalance``.
-Overview --------- - -**What is Comparative Imbalance Analysis?** - -While standard allelic imbalance (AI) analysis detects whether a genomic region shows -preferential expression of one allele, comparative imbalance analysis asks a different -question: **Does the degree of imbalance differ between groups?** - -This is powerful for identifying: - -* **Cell-type-specific regulatory variation** - Regions where different cell types show - distinct allelic preferences -* **Condition-dependent effects** - Treatment-induced changes in allelic regulation -* **Sex differences** - Chromatin regions with sex-biased allelic accessibility -* **Developmental dynamics** - Stage-specific changes in allelic regulation - -**Statistical Approach** - -WASP2 uses a **likelihood ratio test (LRT)** to compare two hypotheses: - -.. code-block:: text - - Null Hypothesis (H0): Both groups share the same allelic imbalance (μ_combined) - Alternative Hypothesis (H1): Groups have different imbalance (μ₁ ≠ μ₂) - - Test Statistic: LRT = -2 × (log L_null - log L_alt) - P-value: P(χ²(df=1) > LRT) - -The test accounts for overdispersion using beta-binomial modeling and applies -Benjamini-Hochberg FDR correction for multiple testing. - -Prerequisites -------------- - -**Software:** - -* WASP2 (``pip install wasp2``) -* Python with scanpy/pandas for visualization (optional) -* R with Seurat for cell type annotation (optional) - -**Data Requirements:** +What this tests +--------------- -* AnnData count matrix (``.h5ad``) with allele counts per cell per SNP -* Barcode-to-group mapping file (TSV) -* Groups can be: cell types, conditions, sex, treatment status, etc. - -Input Data Format ------------------ - -Count Matrix (.h5ad) -~~~~~~~~~~~~~~~~~~~~ - -Your AnnData object should have this structure: +Standard allelic-imbalance analysis asks whether a region shows a preference +for one allele. Comparative analysis asks whether the *degree* of imbalance +differs between groups: .. code-block:: text - AnnData object (n_snps × n_cells) - ├── .obs # SNP metadata (rows) - │ ├── index # SNP identifiers - │ └── [sample_name] # Genotypes: '0|1', '1|0', '0/1', '1/0' - │ - ├── .var # Cell metadata (columns) - │ └── group # Cell type/group assignment - │ - ├── .layers - │ ├── "ref" # Reference allele counts (sparse matrix) - │ └── "alt" # Alternate allele counts (sparse matrix) - │ - └── .uns - ├── feature # DataFrame: SNP → region mapping - └── samples # List of sample names - -**Create counts from BAM + VCF:** - -.. code-block:: bash + H0: Both groups share the same allelic imbalance (mu_combined) + H1: Groups have different imbalance (mu_1 != mu_2) - wasp2-count count-variants-sc \ - aligned.bam \ - phased_variants.vcf.gz \ - barcodes.txt \ - --samples SAMPLE_ID \ - --feature peaks.bed \ - --out_file allele_counts.h5ad - -Barcode Map (TSV) -~~~~~~~~~~~~~~~~~ - -A two-column tab-separated file (no header) mapping cell barcodes to groups: - -.. code-block:: text + Test statistic: LRT = -2 * (log L_null - log L_alt) + P-value: P(chi^2(df=1) > LRT) - AAACGAACAGTCAGTT-1 excitatory_neurons - AAACGAAGTCGCTCTA-1 inhibitory_neurons - AAACGAAGTGAACCTA-1 excitatory_neurons - AAAGGATCATCGATGT-1 astrocytes - AAAGGATGTGCAACGA-1 microglia +See :doc:`/methods/statistical_models` for the full model and +:doc:`/user_guide/single_cell` for input-data formats. -**Important:** Barcodes must exactly match those in the count matrix (including any ``-1`` suffix). 
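+The statistic-to-p-value step is small enough to reproduce by hand. A
+minimal sketch with made-up log-likelihoods (WASP2 fits these internally
+from the beta-binomial model and reports them per region as ``null_ll``
+and ``alt_ll``):
+
+.. code-block:: python
+
+   from scipy.stats import chi2
+
+   # Hypothetical log-likelihoods for one region (illustrative numbers).
+   null_ll = -142.7  # shared mu for both groups (H0)
+   alt_ll = -135.2   # separate mu1 and mu2 (H1)
+
+   lrt = -2 * (null_ll - alt_ll)  # = 15.0
+   pval = chi2.sf(lrt, df=1)      # survival function of chi^2(1), ~1.1e-4
+   print(f"LRT = {lrt:.1f}, p = {pval:.2g}")
+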
+Inputs
+------
 
-Tutorial 1: Cell Type Comparison
---------------------------------
+Two files are required:
 
-This tutorial demonstrates comparing allelic imbalance between neuronal subtypes.
+1. **AnnData count matrix** (``.h5ad``) with ``ref`` and ``alt`` layers, a
+   genotype column in ``.obs`` for your sample, and a ``group`` column in
+   ``.var`` identifying the per-cell group assignment. Produced by
+   ``wasp2-count count-variants-sc``.
 
-Step 1: Prepare Input Files
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+2. **Barcode-to-group TSV** — two columns, no header, tab-separated.
 
-**Export cell type annotations from Seurat:**
-
-.. code-block:: r
-
-   library(Seurat)
-
-   # Load your analyzed Seurat object
-   seurat_obj <- readRDS("brain_snATAC.rds")
-
-   # Create barcode-to-celltype mapping
-   barcode_df <- data.frame(
-     barcode = colnames(seurat_obj),
-     celltype = Idents(seurat_obj)
-   )
-
-   # Write without header
-   write.table(
-     barcode_df,
-     "barcode_celltype_map.tsv",
-     sep = "\t", quote = FALSE,
-     row.names = FALSE, col.names = FALSE
-   )
-
-**Verify the barcode file:**
-
-.. code-block:: bash
+   .. code-block:: text
 
-   # Check format
-   head barcode_celltype_map.tsv
-   # AAACGAACAGTCAGTT-1	excitatory_neurons
-   # AAACGAAGTCGCTCTA-1	inhibitory_neurons
+      AAACGAACAGTCAGTT-1	excitatory_neurons
+      AAACGAAGTCGCTCTA-1	inhibitory_neurons
+      AAAGGATCATCGATGT-1	astrocytes
 
-   # Count cells per type
-   cut -f2 barcode_celltype_map.tsv | sort | uniq -c | sort -rn
-   # 2500 excitatory_neurons
-   # 1800 inhibitory_neurons
-   # 1200 astrocytes
-   # 800 oligodendrocytes
+   Barcodes must match those in the count matrix exactly (including any
+   ``-1`` suffix). For exporting barcode maps from Seurat or Scanpy, see
+   :doc:`/user_guide/single_cell`.
 
-Step 2: Run Per-Group Imbalance Analysis
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Example: comparing cell types
+-----------------------------
 
-First, analyze imbalance within each cell type:
+With counts in ``allele_counts.h5ad`` and cell-type assignments in
+``barcode_celltype_map.tsv``, run per-group imbalance first, then compare:
 
 .. code-block:: bash
 
+   # Per-group imbalance within each cell type
    wasp2-analyze find-imbalance-sc \
      allele_counts.h5ad \
      barcode_celltype_map.tsv \
-     --sample SAMPLE_ID \
-     --phased \
-     --min 10 \
-     -z 3
-
-This produces per-celltype result files (e.g., ``ai_results_excitatory_neurons.tsv``).
-
-Step 3: Run Comparative Analysis
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+     --sample SAMPLE_ID --phased --min 10 -z 3
 
-Compare imbalance between specific cell types:
-
-.. code-block:: bash
-
-   # Compare excitatory vs inhibitory neurons
+   # Pairwise comparison between two named groups
    wasp2-analyze compare-imbalance \
      allele_counts.h5ad \
      barcode_celltype_map.tsv \
      --sample SAMPLE_ID \
      --groups "excitatory_neurons,inhibitory_neurons" \
-     --phased \
-     --min 15
-
-**Compare all pairwise combinations:**
-
-.. code-block:: bash
+     --phased --min 15
 
-   # Omit --groups to compare all cell types
+   # Omit --groups to compare all cell types pairwise
    wasp2-analyze compare-imbalance \
      allele_counts.h5ad \
      barcode_celltype_map.tsv \
-     --sample SAMPLE_ID \
-     --phased \
-     --min 15
-
-This produces output files for each pairwise comparison:
+     --sample SAMPLE_ID --phased --min 15
 
-* ``ai_results_excitatory_neurons_inhibitory_neurons.tsv``
-* ``ai_results_excitatory_neurons_astrocytes.tsv``
-* ``ai_results_inhibitory_neurons_astrocytes.tsv``
-* ...
+Pairwise output files are named ``ai_results_<group1>_<group2>.tsv``.
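+
+When running all pairwise combinations, a short collection loop saves
+manual bookkeeping. A sketch, assuming the naming pattern above (adjust
+the glob if other ``ai_results_*`` files share the directory):
+
+.. code-block:: python
+
+   import glob
+   import pandas as pd
+
+   # Gather every pairwise table into one frame, labelled by comparison;
+   # names follow the ai_results_<group1>_<group2>.tsv pattern above.
+   frames = []
+   for path in sorted(glob.glob('ai_results_*.tsv')):
+       df = pd.read_csv(path, sep='\t')
+       df['comparison'] = path[len('ai_results_'):-len('.tsv')]
+       frames.append(df)
+   all_results = pd.concat(frames, ignore_index=True)
+
+   # Significant regions per comparison (FDR < 0.05)
+   sig = all_results[all_results['fdr_pval'] < 0.05]
+   print(sig.groupby('comparison').size())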
-Step 4: Interpret Results -~~~~~~~~~~~~~~~~~~~~~~~~~ +Other common comparisons — sex, treatment, developmental stage — use the +same command with a different ``barcode_map.tsv``. The biology varies; the +command does not. -**Output columns explained:** +Output columns +-------------- .. list-table:: :header-rows: 1 @@ -215,331 +88,101 @@ Step 4: Interpret Results * - Column - Description * - ``region`` - - Genomic region (peak or gene) identifier + - Genomic region identifier * - ``num_snps`` - - Number of shared heterozygous SNPs used for comparison + - Shared heterozygous SNPs used for the comparison * - ``combined_mu`` - - Reference allele frequency under null hypothesis (shared between groups) - * - ``mu1`` - - Reference allele frequency in group 1 (e.g., excitatory neurons) - * - ``mu2`` - - Reference allele frequency in group 2 (e.g., inhibitory neurons) - * - ``null_ll`` - - Log-likelihood under null hypothesis (shared μ) - * - ``alt_ll`` - - Log-likelihood under alternative hypothesis (separate μ values) + - Reference-allele frequency under H0 (pooled across groups) + * - ``mu1``, ``mu2`` + - Per-group reference-allele frequencies + * - ``null_ll``, ``alt_ll`` + - Log-likelihoods under H0 / H1 * - ``pval`` - - Likelihood ratio test p-value + - Likelihood-ratio-test p-value * - ``fdr_pval`` - - FDR-corrected p-value (Benjamini-Hochberg) - -**Filtering significant results:** - -.. code-block:: bash - - # Significant differential imbalance (FDR < 0.05) - awk -F'\t' 'NR==1 || $9 < 0.05' ai_results_excitatory_neurons_inhibitory_neurons.tsv \ - > significant_differential_AI.tsv - - # Large effect size (>15% difference in allele frequency) - awk -F'\t' 'NR==1 || ($4 - $5 > 0.15 || $5 - $4 > 0.15)' significant_differential_AI.tsv \ - > large_effect_differential_AI.tsv + - Benjamini–Hochberg adjusted p-value -**Interpret μ values:** +Interpretation: ``|mu1 - mu2| > 0.1`` is a meaningful difference in allele +preference; ``fdr_pval < 0.05`` declares significance after multiple-testing +correction. -* ``mu < 0.5``: Alternate allele favored -* ``mu > 0.5``: Reference allele favored -* ``|mu1 - mu2| > 0.1``: Meaningful difference (~20% shift in allele preference) - -Tutorial 2: Sex Differences Analysis ------------------------------------- - -Identify regions with sex-biased allelic imbalance in chromatin accessibility. - -Step 1: Create Sex-Labeled Barcode Map -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: python - - import pandas as pd - import scanpy as sc - - # Load your annotated data - adata = sc.read_h5ad("processed_snATAC.h5ad") - - # Create barcode-to-sex mapping - barcode_df = pd.DataFrame({ - 'barcode': adata.obs_names, - 'sex': adata.obs['donor_sex'] # 'male' or 'female' - }) - - # Write without header - barcode_df.to_csv('barcode_sex_map.tsv', sep='\t', header=False, index=False) - -Step 2: Run Comparative Analysis -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - wasp2-analyze compare-imbalance \ - allele_counts.h5ad \ - barcode_sex_map.tsv \ - --sample SAMPLE_ID \ - --groups "male,female" \ - --phased \ - --min 20 \ - --out_file ai_results_sex_comparison.tsv - -Step 3: Identify Sex-Biased Regions -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: bash - - # Extract significant sex differences - awk -F'\t' 'NR==1 || $9 < 0.01' ai_results_sex_comparison.tsv > sex_biased_regions.tsv - - # Count by chromosome (expect enrichment on X) - cut -f1 sex_biased_regions.tsv | grep -E "^chr" | cut -d: -f1 | sort | uniq -c - -Tutorial 3: Treatment vs Control --------------------------------- - -Compare allelic imbalance before and after drug treatment. - -Step 1: Prepare Condition Labels -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: python - - import pandas as pd - - # Load metadata with treatment status - metadata = pd.read_csv("sample_metadata.csv") - - # Create barcode-to-condition mapping - barcode_df = pd.DataFrame({ - 'barcode': metadata['cell_barcode'], - 'condition': metadata['treatment_status'] # 'treated' or 'control' - }) - - barcode_df.to_csv('barcode_treatment_map.tsv', sep='\t', header=False, index=False) - -Step 2: Run Analysis -~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - wasp2-analyze compare-imbalance \ - allele_counts.h5ad \ - barcode_treatment_map.tsv \ - --sample SAMPLE_ID \ - --groups "treated,control" \ - --min 15 \ - --out_file ai_results_treatment.tsv - -Step 3: Identify Treatment-Responsive Regions -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: python - - import pandas as pd - - # Load results - results = pd.read_csv('ai_results_treated_control.tsv', sep='\t') - - # Significant treatment effects - significant = results[results['fdr_pval'] < 0.05] - print(f"Found {len(significant)} treatment-responsive regions") - - # Direction of change - treatment_gain = significant[significant['mu1'] > significant['mu2'] + 0.1] - treatment_loss = significant[significant['mu2'] > significant['mu1'] + 0.1] - - print(f"Regions with increased ref allele in treatment: {len(treatment_gain)}") - print(f"Regions with decreased ref allele in treatment: {len(treatment_loss)}") - -Visualization Examples ----------------------- +Visualization +------------- -Volcano Plot -~~~~~~~~~~~~ +A minimal volcano plot: .. 
code-block:: python import pandas as pd - import matplotlib.pyplot as plt import numpy as np + import matplotlib.pyplot as plt - # Load results - results = pd.read_csv('ai_results_excitatory_neurons_inhibitory_neurons.tsv', sep='\t') - - # Calculate effect size (difference in mu) - results['effect_size'] = results['mu1'] - results['mu2'] - results['-log10_pval'] = -np.log10(results['pval'] + 1e-300) - - # Create volcano plot - fig, ax = plt.subplots(figsize=(10, 8)) - - # Non-significant points - ns = results['fdr_pval'] >= 0.05 - ax.scatter(results.loc[ns, 'effect_size'], - results.loc[ns, '-log10_pval'], - c='gray', alpha=0.5, s=10, label='Not significant') + results = pd.read_csv('ai_results_excitatory_inhibitory.tsv', sep='\t') + results['effect'] = results['mu1'] - results['mu2'] + results['neglog10p'] = -np.log10(results['pval'].clip(lower=1e-300)) - # Significant points + fig, ax = plt.subplots(figsize=(8, 6)) sig = results['fdr_pval'] < 0.05 - ax.scatter(results.loc[sig, 'effect_size'], - results.loc[sig, '-log10_pval'], - c='red', alpha=0.7, s=20, label='FDR < 0.05') - - ax.axhline(-np.log10(0.05), color='black', linestyle='--', alpha=0.5) - ax.axvline(0, color='black', linestyle='-', alpha=0.3) - - ax.set_xlabel('Effect Size (μ₁ - μ₂)', fontsize=12) - ax.set_ylabel('-log₁₀(p-value)', fontsize=12) - ax.set_title('Differential Allelic Imbalance:\nExcitatory vs Inhibitory Neurons', fontsize=14) + ax.scatter(results.loc[~sig, 'effect'], results.loc[~sig, 'neglog10p'], + c='lightgrey', s=8, label='FDR >= 0.05') + ax.scatter(results.loc[sig, 'effect'], results.loc[sig, 'neglog10p'], + c='crimson', s=14, label='FDR < 0.05') + ax.axhline(-np.log10(0.05), ls='--', c='black', alpha=0.4) + ax.axvline(0, ls='-', c='black', alpha=0.3) + ax.set(xlabel='mu1 - mu2', ylabel='-log10(p)', + title='Excitatory vs. inhibitory neurons') ax.legend() - plt.tight_layout() - plt.savefig('differential_AI_volcano.png', dpi=150) - -Heatmap of Top Hits -~~~~~~~~~~~~~~~~~~~ - -.. code-block:: python - import pandas as pd - import seaborn as sns - import matplotlib.pyplot as plt - - # Load results from multiple comparisons - comparisons = [ - ('excitatory', 'inhibitory'), - ('excitatory', 'astrocyte'), - ('inhibitory', 'astrocyte'), - ] - - # Collect mu values for top regions - all_results = {} - for g1, g2 in comparisons: - df = pd.read_csv(f'ai_results_{g1}_{g2}.tsv', sep='\t') - all_results[(g1, g2)] = df.set_index('region') - - # Find regions significant in any comparison - sig_regions = set() - for df in all_results.values(): - sig_regions.update(df[df['fdr_pval'] < 0.05].index[:20]) # Top 20 each - - # Build heatmap matrix (mu values per cell type) - celltypes = ['excitatory', 'inhibitory', 'astrocyte'] - heatmap_data = pd.DataFrame(index=list(sig_regions), columns=celltypes) - - for region in sig_regions: - for g1, g2 in comparisons: - if region in all_results[(g1, g2)].index: - row = all_results[(g1, g2)].loc[region] - heatmap_data.loc[region, g1] = row['mu1'] - heatmap_data.loc[region, g2] = row['mu2'] - - # Plot heatmap - fig, ax = plt.subplots(figsize=(8, 12)) - sns.heatmap(heatmap_data.astype(float), cmap='RdBu_r', center=0.5, - vmin=0, vmax=1, ax=ax, cbar_kws={'label': 'Ref Allele Frequency (μ)'}) - ax.set_title('Cell-Type-Specific Allelic Imbalance', fontsize=14) - plt.tight_layout() - plt.savefig('differential_AI_heatmap.png', dpi=150) +For heatmaps across many cell types or overlap with eQTL/cis-QTL tables, see +the :doc:`/user_guide/analysis` guide. 
-Command-Line Reference +Command-line reference ---------------------- -Full Parameter List -~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - wasp2-analyze compare-imbalance --help +.. code-block:: text Usage: wasp2-analyze compare-imbalance [OPTIONS] ADATA BARCODE_MAP - Arguments: ADATA AnnData file with allele counts (.h5ad) BARCODE_MAP TSV file mapping barcodes to groups - Options: - --groups TEXT Comma-separated groups to compare (default: all) - --min INTEGER Minimum allele count per region per group (default: 10) - --pseudocount INT Pseudocount for zero counts (default: 1) + --groups TEXT Comma-separated groups (default: all pairwise) + --min INT Minimum allele count per region per group [10] + --pseudocount INT Pseudocount for zero counts [1] --sample TEXT Sample name for genotype filtering --phased Use phased genotype information - -z, --z_cutoff INT Remove outlier SNPs above z-score threshold - --out_file TEXT Output file path + -z, --z_cutoff INT Z-score outlier filter + --out_file TEXT Output path -Best Practices +Good practices -------------- -Data Quality -~~~~~~~~~~~~ - -* **Use WASP-filtered BAMs** to remove mapping bias artifacts -* **Require sufficient counts** (``--min 15`` or higher for robust estimates) -* **Apply z-score filtering** (``-z 3``) to remove outliers from CNVs or mapping artifacts - -Statistical Power -~~~~~~~~~~~~~~~~~ - -* **Merge similar groups** if individual populations have low cell counts -* **Use phased genotypes** when available for improved power -* **Focus on regions with multiple SNPs** for more reliable estimates - -Interpretation -~~~~~~~~~~~~~~ +- Run on WASP-filtered BAMs to remove mapping-bias artifacts + (:doc:`/methods/mapping_filter`). +- Use ``--min 15`` or higher for robust estimates when per-group coverage + is moderate. +- Apply ``-z 3`` to drop SNP outliers from CNVs or mapping artifacts. +- Check that groups have comparable sequencing depth; very uneven depth can + produce false-positive differences. +- Validate top hits in an independent cohort or with an orthogonal assay. -* **Biological replication** - Validate across independent samples -* **Effect size matters** - Consider the absolute difference between μ₁ and μ₂ alongside p-values -* **Integrate with eQTL data** - Connect to known regulatory variants -* **Orthogonal validation** - Confirm top hits with targeted methods - -Common Issues +Common issues ------------- -Low Power / Few Significant Results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -* Increase sequencing depth -* Merge similar cell types to increase counts per group -* Lower ``--min`` threshold (with caution) -* Use phased genotypes if available - -Too Many Significant Results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -* Check for batch effects between groups -* Verify WASP filtering was applied -* Use stricter FDR threshold (e.g., 0.01) -* Check that groups have similar sequencing depth - -Memory Issues -~~~~~~~~~~~~~ - -Process chromosomes separately: - -.. code-block:: bash +**Few significant results.** Increase coverage, merge similar populations to +increase per-group counts, or use phased genotypes if available. - for chr in chr{1..22}; do - wasp2-count count-variants-sc \ - sample.bam variants.vcf.gz barcodes.tsv \ - --region peaks_${chr}.bed \ - --out_file counts_${chr}.h5ad +**Too many significant results.** Check for batch effects between groups, +confirm WASP filtering was applied, and use a stricter FDR threshold. 
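+
+The stricter threshold is a one-line filter on the result table. A minimal
+sketch, assuming the output columns listed above (file name illustrative):
+
+.. code-block:: python
+
+   import pandas as pd
+
+   res = pd.read_csv('ai_results_excitatory_neurons_inhibitory_neurons.tsv',
+                     sep='\t')
+   strict = res[res['fdr_pval'] < 0.01]  # tighter than the usual 0.05
+   print(f"{len(strict)} regions pass FDR < 0.01")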
- wasp2-analyze compare-imbalance \ - counts_${chr}.h5ad \ - barcode_celltype_map.tsv \ - --out_file results_${chr}.tsv - done +**Memory.** For large cohorts, process chromosomes separately and concatenate +results. See Also -------- -* :doc:`/user_guide/analysis` - Statistical methods and parameters -* :doc:`/user_guide/single_cell` - Single-cell data formats -* :doc:`/tutorials/scrna_seq` - Basic scRNA-seq tutorial +- :doc:`/user_guide/analysis` — analysis-CLI reference and parameters +- :doc:`/user_guide/single_cell` — input data formats, barcode exports +- :doc:`/tutorials/scrna_seq` — basic single-cell workflow +- :doc:`/methods/statistical_models` — the LRT underlying this test diff --git a/docs/source/tutorials/quickstart_mapping.rst b/docs/source/tutorials/quickstart_mapping.rst deleted file mode 100644 index 41317a03..00000000 --- a/docs/source/tutorials/quickstart_mapping.rst +++ /dev/null @@ -1,257 +0,0 @@ -Quickstart: WASP Mapping Filter -================================ - -Learn WASP2's mapping bias correction in 5 minutes. - -.. contents:: Contents - :local: - :depth: 2 - -Overview --------- - -**Goal:** Understand and apply the WASP mapping filter to remove reference bias from your alignment data. - -**Time:** ~5 minutes to read, ~30 minutes to run on typical data - -**Prerequisites:** - -* WASP2 installed (``pip install wasp2``) -* Aligned BAM file (coordinate-sorted) -* VCF file with heterozygous variants - -The Problem: Reference Mapping Bias ------------------------------------ - -When reads are aligned to a reference genome, there's an inherent asymmetry: - -.. code-block:: text - - Reference: ...ACGT[A]CGTA... (reference allele: A) - Read (ref): ...ACGT[A]CGTA... → Perfect match (0 mismatches) - Read (alt): ...ACGT[G]CGTA... → 1 mismatch penalty - -**Result**: Reads carrying the alternate allele are more likely to: - -- Fail to map entirely -- Map with lower quality scores -- Map to incorrect locations - -This causes **inflated reference allele counts**, leading to false positive ASE signals. - -The Solution: WASP Remap-and-Filter ------------------------------------ - -WASP corrects this by testing whether each read would map identically -regardless of which allele it carries: - -1. **Identify**: Find reads overlapping heterozygous SNPs -2. **Swap**: Create versions with alleles swapped (ref→alt, alt→ref) -3. **Remap**: Align swapped reads with the same aligner -4. **Filter**: Keep only reads that map to the **same location** after swapping - -After filtering, the probability of mapping is equal for both alleles: - -.. math:: - - P(\text{map} | \text{ref allele}) = P(\text{map} | \text{alt allele}) - -Quick Workflow --------------- - -Step 1: Create Swapped Reads -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Identify reads overlapping heterozygous SNPs and generate allele-swapped versions: - -.. code-block:: bash - - wasp2-map make-reads sample.bam variants.vcf.gz \ - --samples SAMPLE1 \ - --out_dir wasp_output - -This produces (where ``sample`` is your BAM file prefix): - -* ``wasp_output/sample_to_remap.bam``: Original reads needing remapping -* ``wasp_output/sample_keep.bam``: Reads not overlapping variants (kept as-is) -* ``wasp_output/sample_swapped_alleles_r1.fq``: Allele-swapped read 1 -* ``wasp_output/sample_swapped_alleles_r2.fq``: Allele-swapped read 2 -* ``wasp_output/sample_wasp_data_files.json``: Metadata for filter step - -Step 2: Remap Swapped Reads -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Critical**: Use the **same aligner and parameters** as your original mapping! - -.. 
code-block:: bash - - # Example with BWA (replace 'sample' with your BAM file prefix) - bwa mem -M -t 8 genome.fa \ - wasp_output/sample_swapped_alleles_r1.fq \ - wasp_output/sample_swapped_alleles_r2.fq | \ - samtools sort -o wasp_output/sample_remapped.bam - - - samtools index wasp_output/sample_remapped.bam - -Using different alignment parameters will invalidate the WASP correction. - -Step 3: Filter Remapped Reads -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The WASP filter compares original and remapped positions: - -.. code-block:: bash - - wasp2-map filter-remapped \ - wasp_output/sample_remapped.bam \ - --wasp_data_json wasp_output/sample_wasp_data_files.json \ - --out_bam wasp_output/sample_wasp_filtered.bam - -Understanding Filter Statistics -------------------------------- - -The WASP filter reports three key metrics: - -.. table:: WASP Filter Metrics - :widths: 20 50 30 - - ================= ============================================ ============== - Metric Description Typical Value - ================= ============================================ ============== - **Kept reads** Reads that passed the filter 90-99% - **Removed (moved)** Reads that mapped to different locations 1-8% - **Removed (missing)** Reads that failed to remap <1% - ================= ============================================ ============== - -Interpreting Filter Rates -~~~~~~~~~~~~~~~~~~~~~~~~~ - -* **95-99% kept**: Good - typical for most data types -* **90-95% kept**: Acceptable - may indicate difficult regions -* **<90% kept**: Investigate - check data quality or variant calls - -Before/After Example --------------------- - -At a site with mapping bias: - -.. table:: Example: Before and After WASP - :widths: 30 35 35 - - =============== =============== =============== - Metric Before WASP After WASP - =============== =============== =============== - Reference reads 150 95 - Alternate reads 80 85 - Ref fraction 0.65 0.53 - =============== =============== =============== - -The biased site (0.65 ref fraction) is corrected to near-balanced (0.53). - -Complete Workflow Script ------------------------- - -.. code-block:: bash - - #!/bin/bash - set -e - - # Input files - BAM="sample.bam" - VCF="variants.vcf.gz" - SAMPLE="SAMPLE1" - GENOME="genome.fa" - OUTDIR="wasp_output" - - # Extract BAM prefix (filename without .bam extension) - PREFIX=$(basename $BAM .bam) - - mkdir -p $OUTDIR - - # Step 1: Create allele-swapped reads - echo "Step 1: Creating swapped reads..." - wasp2-map make-reads $BAM $VCF \ - --samples $SAMPLE \ - --out_dir $OUTDIR - - # Step 2: Remap with same aligner - echo "Step 2: Remapping swapped reads..." - bwa mem -M -t 8 $GENOME \ - $OUTDIR/${PREFIX}_swapped_alleles_r1.fq \ - $OUTDIR/${PREFIX}_swapped_alleles_r2.fq | \ - samtools sort -o $OUTDIR/${PREFIX}_remapped.bam - - samtools index $OUTDIR/${PREFIX}_remapped.bam - - # Step 3: Filter biased reads - echo "Step 3: Filtering biased reads..." - wasp2-map filter-remapped \ - $OUTDIR/${PREFIX}_remapped.bam \ - --wasp_data_json $OUTDIR/${PREFIX}_wasp_data_files.json \ - --out_bam $OUTDIR/${PREFIX}_wasp_filtered.bam - - # Step 4: Merge with non-overlapping reads - echo "Step 4: Merging final BAM..." - samtools merge -f $OUTDIR/${PREFIX}_final.bam \ - $OUTDIR/${PREFIX}_wasp_filtered.bam \ - $OUTDIR/${PREFIX}_keep.bam - samtools index $OUTDIR/${PREFIX}_final.bam - - echo "Done! WASP-filtered BAM: $OUTDIR/${PREFIX}_final.bam" - -Rust Acceleration ------------------ - -WASP2 includes a high-performance Rust backend that accelerates the filter step: - -.. 
table:: Performance Comparison - :widths: 30 35 35 - - ============= =============== =============== - Dataset Size Python Rust - ============= =============== =============== - 1M reads ~5 minutes ~30 seconds - 10M reads ~50 minutes ~5 minutes - 100M reads ~8 hours ~50 minutes - ============= =============== =============== - -The Rust backend is used automatically when available. - -Next Steps ----------- - -After WASP filtering: - -1. **Count alleles** on the filtered BAM: - - .. code-block:: bash - - wasp2-count count-variants wasp_filtered.bam variants.vcf - -2. **Analyze allelic imbalance**: - - .. code-block:: bash - - wasp2-analyze find-imbalance counts.tsv - -See Also --------- - -* :doc:`/user_guide/mapping` - Detailed mapping module documentation -* :doc:`/methods/mapping_filter` - Algorithm details and mathematics -* :doc:`/tutorials/scrna_seq` - Single-cell RNA-seq workflow - -Summary -------- - -.. table:: Key Takeaways - :widths: 25 75 - - ============ =============================================== - Concept Key Point - ============ =============================================== - **Problem** Reference bias inflates ref allele counts - **Solution** WASP remap-and-filter removes biased reads - **Workflow** make-reads → remap → filter-remapped - **Expected** 90-99% reads pass filter - **Result** Unbiased allele counts for ASE analysis - ============ =============================================== diff --git a/docs/source/tutorials/rna_seq.rst b/docs/source/tutorials/rna_seq.rst deleted file mode 100644 index c5fd0876..00000000 --- a/docs/source/tutorials/rna_seq.rst +++ /dev/null @@ -1,203 +0,0 @@ -RNA-seq Allelic Imbalance Tutorial -=================================== - -This tutorial demonstrates a complete workflow for detecting allele-specific expression (ASE) -in bulk RNA-seq data using WASP2. - -**Estimated time:** ~30 minutes - -Overview --------- - -The tutorial covers the complete RNA-seq allelic imbalance analysis pipeline: - -1. **Data Loading** - BAM, VCF, and gene annotations (GTF) -2. **Allele Counting** - Count reads at heterozygous SNPs within genes/exons -3. **Statistical Testing** - Beta-binomial model for allelic imbalance detection -4. **ASE Visualization** - Volcano plots and allele ratio distributions -5. **Imprinting Detection** - Identify monoallelic expression patterns -6. **eQTL Integration** - Connect ASE signals to regulatory variants - -Prerequisites -------------- - -**Software:** - -* WASP2 (``pip install wasp2``) -* Python packages: pandas, numpy, matplotlib, seaborn, scipy - -**Data:** - -* Aligned BAM file (coordinate-sorted, indexed) -* Phased VCF file with heterozygous variants -* Gene annotation file (GTF format, e.g., GENCODE) - -Workflow Summary ----------------- - -Step 1: Count Alleles at Genes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Use ``wasp2-count count-variants`` to count allele-specific reads at heterozygous SNPs: - -.. code-block:: bash - - wasp2-count count-variants \ - sample.bam \ - variants.vcf.gz \ - --samples SAMPLE_ID \ - --region genes.gtf \ - --out_file allele_counts.tsv - -This produces a TSV file with columns: - -* ``chr``, ``pos``: SNP location -* ``ref``, ``alt``: Alleles -* ``ref_count``, ``alt_count``: Read counts per allele -* ``other_count``: Reads supporting other alleles (non-ref, non-alt) -* ``gene_id``, ``gene_name``: Overlapping gene annotation -* ``feature``: Feature type (exon, intron, etc.) 
when using GTF - -Step 2: Test for Allelic Imbalance -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Use ``wasp2-analyze find-imbalance`` to detect significant ASE: - -.. code-block:: bash - - wasp2-analyze find-imbalance \ - allele_counts.tsv \ - --min 10 \ - --pseudocount 1 \ - --phased \ - --out_file ai_results.tsv - -The beta-binomial model tests for deviation from 50:50 allele ratios, accounting for -biological overdispersion. - -**Output columns:** - -* ``region``: Gene identifier -* ``ref_count``, ``alt_count``: Aggregated counts -* ``p_value``: Likelihood ratio test p-value -* ``fdr_pval``: FDR-corrected p-value -* ``effect_size``: Log2 fold change (ref/alt) - -Step 3: Identify Significant ASE -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Filter results by significance and effect size: - -.. code-block:: bash - - # Significant ASE (FDR < 0.05, |log2FC| > 1) - # Column indices: $5 = fdr_pval, $6 = effect_size - awk -F'\t' 'NR==1 || ($5 < 0.05 && ($6 > 1 || $6 < -1))' ai_results.tsv > significant_ase.tsv - -Step 4: Detect Imprinting Patterns -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Monoallelic expression (>90:10 allele ratio) may indicate genomic imprinting: - -.. code-block:: python - - import pandas as pd - - results = pd.read_csv('ai_results.tsv', sep='\t') - total = results['ref_count'] + results['alt_count'] - results['ref_ratio'] = results['ref_count'] / total - - # Monoallelic genes - monoallelic = results[ - (results['ref_ratio'] > 0.9) | (results['ref_ratio'] < 0.1) - ] - -Step 5: Integrate with eQTL Data -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Cross-reference ASE results with eQTL databases (e.g., GTEx): - -.. code-block:: python - - import pandas as pd - - ase = pd.read_csv('ai_results.tsv', sep='\t') - eqtl = pd.read_csv('gtex_eqtl.tsv', sep='\t') - - # Merge on gene ID - integrated = ase.merge(eqtl, left_on='region', right_on='gene_id') - - # Check direction concordance between ASE and eQTL - # ASE effect_size > 0 means REFERENCE allele is more expressed - # eQTL slope > 0 means ALTERNATE allele INCREASES expression - # Therefore, concordance = OPPOSITE signs (ref high in ASE = alt low in eQTL) - integrated['concordant'] = ( - (integrated['effect_size'] > 0) != (integrated['slope'] > 0) - ) - -Key Concepts ------------- - -Beta-Binomial Model -~~~~~~~~~~~~~~~~~~~ - -WASP2 uses a beta-binomial distribution to model allele counts: - -* Accounts for **overdispersion** (biological variation beyond binomial sampling) -* Models **technical noise** from PCR amplification and sequencing -* Aggregates information across **multiple SNPs** per gene - -The null hypothesis is equal expression from both alleles (p = 0.5). 
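-
-To make this concrete, the following minimal sketch runs the same style of
-test for a single gene, assuming a fixed, illustrative dispersion ``rho``
-and using ``scipy.stats.betabinom`` (the raw allele ratio stands in for the
-MLE of ``mu``; WASP2 estimates both parameters from the data):
-
-.. code-block:: python
-
-    from scipy.stats import betabinom, chi2
-
-    ref, total = 60, 100   # aggregated counts for one gene (illustrative)
-    rho = 0.05             # illustrative overdispersion value
-
-    def loglik(mu: float) -> float:
-        # (mu, rho) parameterization of the beta-binomial
-        alpha = mu * (1 - rho) / rho
-        beta = (1 - mu) * (1 - rho) / rho
-        return betabinom.logpmf(ref, total, alpha, beta)
-
-    # Likelihood ratio test against the balanced null (mu = 0.5)
-    lrt = -2 * (loglik(0.5) - loglik(ref / total))
-    p_value = chi2.sf(lrt, df=1)
-    print(f"LRT = {lrt:.2f}, p = {p_value:.3g}")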
- -Effect Size Interpretation -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -* **log2FC > 1**: Reference allele 2x more expressed (strong ASE) -* **log2FC > 2**: Reference allele 4x more expressed (very strong ASE) -* **log2FC near 0**: Balanced expression (no ASE) - -Significance Thresholds -~~~~~~~~~~~~~~~~~~~~~~~ - -* **FDR < 0.05**: Standard significance threshold -* **FDR < 0.01**: Stringent threshold for high-confidence hits -* Combine with effect size filters to focus on biologically meaningful results - -Troubleshooting ---------------- - -Low SNP Counts -~~~~~~~~~~~~~~ - -If few heterozygous SNPs are detected: - -* Verify VCF contains heterozygous genotypes: - - - **Phased format**: GT = ``0|1`` or ``1|0`` (pipe separator) - - **Unphased format**: GT = ``0/1`` or ``1/0`` (slash separator) - - Use ``--phased`` flag only with phased genotypes - -* Check sample ID matches VCF sample column -* Ensure BAM and VCF use the same reference genome - -No Significant Results -~~~~~~~~~~~~~~~~~~~~~~ - -* Increase sequencing depth (more reads = more power) -* Lower ``--min`` threshold (but interpret with caution) -* Check for batch effects or technical artifacts - -Too Many Significant Results -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -* Verify WASP mapping bias correction was applied -* Check for copy number variation (CNV) artifacts -* Use stricter FDR threshold (e.g., 0.01) - -See Also --------- - -* :doc:`/user_guide/counting` - Detailed counting options -* :doc:`/user_guide/analysis` - Statistical methods and parameters -* :doc:`/tutorials/scrna_seq` - Single-cell RNA-seq tutorial -* :doc:`/tutorials/comparative_imbalance` - Comparing ASE between groups diff --git a/docs/source/tutorials/scrna_seq.rst b/docs/source/tutorials/scrna_seq.rst index 1ee8d54b..35bdb071 100644 --- a/docs/source/tutorials/scrna_seq.rst +++ b/docs/source/tutorials/scrna_seq.rst @@ -1,160 +1,22 @@ 10X scRNA-seq Tutorial ====================== -This tutorial walks through a complete WASP2 workflow for detecting allele-specific expression in 10X Genomics single-cell RNA-seq data. +End-to-end allele-specific expression (ASE) workflow for 10X Chromium +scRNA-seq. Assumes a Cell Ranger ``possorted_genome_bam.bam`` with cell +barcodes in the ``CB`` tag, a phased VCF for the donor, and a cell-type +annotation from Seurat or Scanpy. -Overview --------- - -**Goal:** Identify genes with allele-specific expression (ASE) in different cell types from 10X Chromium scRNA-seq data. - -**Input Data:** - -* Cell Ranger output (BAM + filtered barcodes) -* Phased VCF file with heterozygous variants -* Cell type annotations from Seurat/Scanpy - -Prerequisites -------------- - -**Software:** - -* WASP2 (``pip install wasp2``) -* Cell Ranger output (v3+) -* R with Seurat or Python with Scanpy - -**Data:** - -* Aligned BAM file with cell barcodes (CB tag) -* Phased VCF for your sample -* Completed cell type annotation +Inputs +------ -Step 1: Prepare Input Data --------------------------- +- Cell Ranger BAM + index +- Phased VCF/BCF/PGEN for the sample +- A barcode-to-group TSV (see :doc:`/user_guide/single_cell` for Seurat / + Scanpy export code and exact format) +- GTF gene annotation -Cell Ranger Output Structure -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -After running Cell Ranger, your output directory contains: - -.. 
code-block:: text - - cellranger_output/ - └── outs/ - ├── possorted_genome_bam.bam - ├── possorted_genome_bam.bam.bai - └── filtered_feature_bc_matrix/ - ├── barcodes.tsv.gz - ├── features.tsv.gz - └── matrix.mtx.gz - -The BAM file contains cell barcodes in the ``CB`` tag: - -.. code-block:: bash - - # View CB tags in BAM - samtools view possorted_genome_bam.bam | head -1 | tr '\t' '\n' | grep CB - # Output: CB:Z:AAACCCAAGAAACACT-1 - -Step 2: Generate Barcode File ------------------------------ - -From Seurat Analysis -~~~~~~~~~~~~~~~~~~~~ - -After running Seurat clustering and annotation: - -.. code-block:: r - - library(Seurat) - - # Load your analyzed Seurat object - seurat_obj <- readRDS("seurat_analyzed.rds") - - # Check available metadata columns - head(seurat_obj@meta.data) - - # Extract barcodes and cell types - barcode_df <- data.frame( - barcode = colnames(seurat_obj), - cell_type = seurat_obj$celltype_annotation # Your annotation column - ) - - # Preview the data - head(barcode_df) - #> barcode cell_type - #> 1 AAACCCAAGAAACACT-1 B_cell - #> 2 AAACCCAAGAAACTGT-1 B_cell - #> 3 AAACCCAAGAAAGCGA-1 CD4_T_cell - - # Write barcode file (no header, tab-separated) - write.table( - barcode_df, - file = "barcodes_celltype.tsv", - sep = "\t", - quote = FALSE, - row.names = FALSE, - col.names = FALSE - ) - -From Scanpy Analysis -~~~~~~~~~~~~~~~~~~~~ - -After running Scanpy clustering and annotation: - -.. code-block:: python - - import scanpy as sc - import pandas as pd - - # Load your analyzed AnnData object - adata = sc.read_h5ad("scanpy_analyzed.h5ad") - - # Check available annotations - print(adata.obs.columns) - - # Extract barcodes and cell types - barcode_df = pd.DataFrame({ - 'barcode': adata.obs_names, - 'cell_type': adata.obs['leiden_annotation'] # Your annotation column - }) - - # Preview the data - print(barcode_df.head()) - # barcode cell_type - # 0 AAACCCAAGAAACACT-1 B_cell - # 1 AAACCCAAGAAACTGT-1 B_cell - # 2 AAACCCAAGAAAGCGA-1 CD4_T_cell - - # Write barcode file (no header, tab-separated) - barcode_df.to_csv( - 'barcodes_celltype.tsv', - sep='\t', - header=False, - index=False - ) - -Verify Barcode Format -~~~~~~~~~~~~~~~~~~~~~ - -.. code-block:: bash - - # Check barcode file format - head barcodes_celltype.tsv - # AAACCCAAGAAACACT-1 B_cell - # AAACCCAAGAAACTGT-1 B_cell - - # Count cells per type - cut -f2 barcodes_celltype.tsv | sort | uniq -c | sort -rn - # 1500 CD4_T_cell - # 1200 B_cell - # 800 Monocyte - # ... - -Step 3: Count Allele-Specific Reads ------------------------------------ - -Run the single-cell allele counting: +Step 1 — Count alleles per cell per gene +----------------------------------------- .. code-block:: bash @@ -166,20 +28,12 @@ Run the single-cell allele counting: --samples SAMPLE_ID \ --out_file allele_counts.h5ad -**Parameters:** +Output: an AnnData ``.h5ad`` with ``ref`` / ``alt`` / ``other`` layers, +genotype columns in ``.obs``, and cell-type assignments in ``.var``. See +:doc:`/user_guide/single_cell` for the full schema. -* ``barcodes_celltype.tsv``: Your barcode file with cell type annotations (positional) -* ``--region``: Gene annotation file (GTF/GFF) or peak file (BED) -* ``--samples``: Sample ID matching VCF sample column -* ``--out_file``: Output AnnData file - -Step 4: Analyze Allelic Imbalance ---------------------------------- - -Cell-Type-Specific Analysis -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Analyze imbalance within each cell type: +Step 2 — Per-cell-type imbalance +-------------------------------- .. 
code-block:: bash @@ -189,20 +43,11 @@ Analyze imbalance within each cell type: --sample SAMPLE_ID \ --out_file imbalance_by_celltype.tsv -**Output columns:** - -* ``region``: Gene or genomic region -* ``cell_type``: Cell type from barcode file -* ``ref_count``: Total reference allele counts -* ``alt_count``: Total alternate allele counts -* ``p_value``: Statistical significance -* ``fdr_pval``: FDR-corrected p-value -* ``effect_size``: Log2 fold change +Output columns: ``region``, ``cell_type``, aggregated ``ref_count`` / +``alt_count``, ``pval``, ``fdr_pval``, ``effect_size`` (log₂ ref/alt). -Compare Between Cell Types -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Find differential allelic imbalance between cell types: +Step 3 — Compare cell types (optional) +-------------------------------------- .. code-block:: bash @@ -212,122 +57,56 @@ Find differential allelic imbalance between cell types: --groups "CD4_T_cell,CD8_T_cell" \ --out_file differential_imbalance.tsv -Step 5: Interpret Results -------------------------- +For more on comparative analysis (multiple groups, all-pairs, volcano +plots), see :doc:`comparative_imbalance`. -Load and explore results in Python: +Interpreting results +-------------------- .. code-block:: python import pandas as pd - import matplotlib.pyplot as plt - # Load results results = pd.read_csv('imbalance_by_celltype.tsv', sep='\t') + sig = results[results['fdr_pval'] < 0.05] - # Filter significant results (FDR < 0.05) - significant = results[results['fdr_pval'] < 0.05] - print(f"Found {len(significant)} significant ASE events") - - # Top genes per cell type - top_genes = (significant - .groupby('cell_type') - .apply(lambda x: x.nsmallest(10, 'fdr_pval')) - .reset_index(drop=True)) - - print(top_genes[['region', 'cell_type', 'effect_size', 'fdr_pval']]) - - # Visualize effect sizes - fig, ax = plt.subplots(figsize=(10, 6)) - significant.boxplot(column='effect_size', by='cell_type', ax=ax) - plt.title('Allelic Imbalance by Cell Type') - plt.ylabel('Log2 Fold Change (Ref/Alt)') - plt.savefig('ase_by_celltype.png') + top = (sig.groupby('cell_type') + .apply(lambda x: x.nsmallest(10, 'fdr_pval')) + .reset_index(drop=True)) -Example Output --------------- - -.. code-block:: text - - region cell_type ref_count alt_count fdr_pval effect_size - ENSG00000123456 B_cell 245 89 0.001 1.46 - ENSG00000234567 CD4_T_cell 156 312 0.003 -1.00 - ENSG00000345678 Monocyte 423 198 0.012 1.09 + print(top[['region', 'cell_type', 'effect_size', 'fdr_pval']]) Troubleshooting --------------- -No Cells Matched -~~~~~~~~~~~~~~~~ - -If you see "0 barcodes matched": - -.. code-block:: bash - - # Check BAM barcode format - samtools view your.bam | head -1000 | grep -o 'CB:Z:[^\t]*' | head - - # Compare with your barcode file - head barcodes.tsv - - # Common issues: - # - Missing -1 suffix in barcode file - # - Barcode file has header (should not) - # - Different barcode versions (v2 vs v3) - -**Quick Diagnostic:** +**Zero barcodes matched.** Confirm barcode format in the BAM vs. the TSV — +the ``CB:Z:...`` tag often has a ``-1`` suffix that your export must match: .. 
code-block:: bash - # Compare BAM barcodes with file - samtools view your.bam | head -10000 | grep -o 'CB:Z:[^\t]*' | cut -d: -f3 | sort -u > bam_bc.txt + samtools view your.bam | head -10000 | grep -o 'CB:Z:[^[:space:]]*' \ + | cut -d: -f3 | sort -u > bam_bc.txt cut -f1 barcodes.tsv | sort -u > file_bc.txt - comm -12 bam_bc.txt file_bc.txt | wc -l # Should be > 0 + comm -12 bam_bc.txt file_bc.txt | wc -l # should be > 0 - # Fix suffix mismatch if needed - awk -F'\t' '{print $1"-1\t"$2}' barcodes_no_suffix.tsv > barcodes.tsv - -Low Read Counts -~~~~~~~~~~~~~~~ - -Single-cell data is sparse. Consider: - -* Using pseudo-bulk aggregation by cell type -* Lowering ``--min`` / ``--min_count`` threshold -* Focusing on highly expressed genes - -Memory Issues -~~~~~~~~~~~~~ - -For large datasets, process chromosomes separately by filtering your region file: +Fix a missing suffix with: .. code-block:: bash - # Process autosomes separately (add chrX, chrY if needed) - for chr in chr{1..22}; do - # Extract regions for this chromosome - grep "^${chr}\t" genes.bed > genes_${chr}.bed - - wasp2-count count-variants-sc \ - sample.bam \ - variants.vcf.gz \ - barcodes.tsv \ - --region genes_${chr}.bed \ - --out_file counts_${chr}.h5ad - done + awk -F'\t' '{print $1"-1\t"$2}' barcodes_no_suffix.tsv > barcodes.tsv -Next Steps ----------- +**Sparse counts.** Single-cell data is sparse. Consider pseudobulk +aggregation by cell type, lower ``--min`` / ``--min_count``, or focus on +highly expressed genes. -* Integrate with eQTL databases (GTEx, eQTLGen) -* Correlate ASE with gene expression levels -* Validate top hits with allele-specific primers -* Compare across conditions or timepoints +**Memory.** For large cohorts, split the region file by chromosome and +process chunks, then concatenate results. See Also -------- -* :doc:`/user_guide/single_cell` - Barcode file format reference -* :doc:`/user_guide/analysis` - Statistical methods -* `Seurat `_ - R toolkit for scRNA-seq -* `Scanpy `_ - Python toolkit for scRNA-seq +- :doc:`/user_guide/single_cell` — barcode file format, Seurat/Scanpy export +- :doc:`/user_guide/analysis` — analysis CLI reference +- :doc:`/methods/statistical_models` — beta-binomial LRT +- :doc:`scatac_workflow` — sibling tutorial for scATAC-seq +- :doc:`comparative_imbalance` — comparing groups diff --git a/docs/source/tutorials/statistical_methods_tutorial.ipynb b/docs/source/tutorials/statistical_methods_tutorial.ipynb deleted file mode 100644 index b7e6c8f0..00000000 --- a/docs/source/tutorials/statistical_methods_tutorial.ipynb +++ /dev/null @@ -1,1516 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "intro-header", - "metadata": {}, - "source": [ - "# Statistical Methods Tutorial: Understanding WASP2's Beta-Binomial Framework\n", - "\n", - "**Estimated time:** ~45 minutes\n", - "\n", - "This interactive tutorial provides a deep dive into the statistical methods used by WASP2 for detecting allelic imbalance. You will learn:\n", - "\n", - "1. **Why overdispersion matters** - Why the simple binomial model fails for sequencing data\n", - "2. **Beta-binomial distributions** - The statistical foundation of WASP2\n", - "3. **Dispersion estimation** - MLE vs Method of Moments approaches\n", - "4. **QQ plots** - Visualizing model fit and calibration\n", - "5. 
**FDR correction** - Benjamini-Hochberg and alternatives\n", - "\n", - "## Prerequisites\n", - "\n", - "This tutorial assumes familiarity with:\n", - "- Basic probability distributions (binomial)\n", - "- Hypothesis testing concepts (p-values)\n", - "- Python data analysis (numpy, pandas, matplotlib)\n", - "\n", - "No prior knowledge of beta-binomial distributions or overdispersion is required.\n", - "\n", - "## Relationship to WASP2 Source Code\n", - "\n", - "The functions in this tutorial mirror the implementations in:\n", - "- `src/analysis/as_analysis.py` - Core statistical functions (`clamp_rho`, `opt_prob`, `opt_linear`)\n", - "- `src/analysis/compare_ai.py` - Comparative analysis between groups\n", - "- `src/analysis/as_analysis_sc.py` - Single-cell extensions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "setup-imports", - "metadata": {}, - "outputs": [], - "source": [ - "import warnings\n", - "\n", - "import matplotlib.pyplot as plt\n", - "import numpy as np\n", - "import pandas as pd\n", - "from numpy.typing import NDArray\n", - "from scipy import stats\n", - "from scipy.optimize import minimize, minimize_scalar\n", - "from scipy.special import expit\n", - "from scipy.stats import betabinom, binom, chi2, false_discovery_control\n", - "\n", - "# Configure plotting\n", - "plt.style.use(\"seaborn-v0_8-whitegrid\")\n", - "plt.rcParams[\"figure.figsize\"] = (10, 6)\n", - "plt.rcParams[\"font.size\"] = 11\n", - "np.random.seed(42)\n", - "\n", - "print(\"Statistical Methods Tutorial\")\n", - "print(\"=\" * 40)\n", - "print(f\"NumPy version: {np.__version__}\")\n", - "print(f\"SciPy version: {stats.scipy.__version__ if hasattr(stats, 'scipy') else 'N/A'}\")" - ] - }, - { - "cell_type": "markdown", - "id": "constants-header", - "metadata": {}, - "source": [ - "## Critical Constants and Helper Functions\n", - "\n", - "WASP2 defines critical numerical constants to prevent division by zero and numerical overflow. These are defined in `src/analysis/as_analysis.py` (Issue #228).\n", - "\n", - "**Why this matters:** The beta-binomial parameterization uses $\\alpha = \\mu \\cdot \\frac{1-\\rho}{\\rho}$, which:\n", - "- Causes **division by zero** when $\\rho = 0$\n", - "- Produces **zero alpha/beta** when $\\rho = 1$\n", - "\n", - "WASP2 clamps $\\rho$ to $(\\epsilon, 1-\\epsilon)$ where $\\epsilon = 10^{-10}$." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "constants-definition", - "metadata": {}, - "outputs": [], - "source": [ - "# =============================================================================\n", - "# BETA-BINOMIAL RHO PARAMETER BOUNDS (Issue #228)\n", - "# =============================================================================\n", - "# Matches WASP2's src/analysis/as_analysis.py:33\n", - "# =============================================================================\n", - "\n", - "RHO_EPSILON: float = 1e-10\n", - "\n", - "\n", - "def clamp_rho(rho: float | NDArray[np.float64]) -> float | NDArray[np.float64]:\n", - " \"\"\"\n", - " Clamp dispersion parameter rho to safe range (epsilon, 1-epsilon).\n", - "\n", - " This function mirrors WASP2's src/analysis/as_analysis.py:36-50.\n", - "\n", - " The beta-binomial parameterization uses alpha = mu * (1-rho) / rho, which\n", - " causes division by zero when rho=0 and produces zero alpha/beta when rho=1.\n", - " This function prevents these boundary issues.\n", - "\n", - " Args:\n", - " rho: Dispersion parameter (scalar or array), expected in [0, 1]\n", - "\n", - " Returns:\n", - " Clamped rho in range (RHO_EPSILON, 1 - RHO_EPSILON)\n", - "\n", - " Example:\n", - " >>> clamp_rho(0.0) # Would cause division by zero\n", - " 1e-10\n", - " >>> clamp_rho(1.0) # Would produce zero alpha/beta\n", - " 0.9999999999\n", - " \"\"\"\n", - " return np.clip(rho, RHO_EPSILON, 1.0 - RHO_EPSILON)\n", - "\n", - "\n", - "# Demonstrate the importance of clamping\n", - "print(\"Demonstrating rho clamping (Issue #228):\")\n", - "print(f\" RHO_EPSILON = {RHO_EPSILON}\")\n", - "print(f\" clamp_rho(0.0) = {clamp_rho(0.0)}\")\n", - "print(f\" clamp_rho(1.0) = {clamp_rho(1.0)}\")\n", - "print(f\" clamp_rho(0.05) = {clamp_rho(0.05)} # No change needed\")" - ] - }, - { - "cell_type": "markdown", - "id": "section1-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Section 1: Understanding Overdispersion\n", - "\n", - "### 1.1 The Naive Binomial Model\n", - "\n", - "When counting alleles at a heterozygous SNP, the simplest model assumes each read is an independent coin flip with probability 0.5 of coming from each allele:\n", - "\n", - "$$X \\sim \\text{Binomial}(N, p=0.5)$$\n", - "\n", - "where $X$ is the reference allele count and $N$ is the total read count.\n", - "\n", - "**Expected variance:** $\\text{Var}(X) = N \\cdot p \\cdot (1-p) = N/4$\n", - "\n", - "Let's simulate what this looks like:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "binomial-simulation", - "metadata": {}, - "outputs": [], - "source": [ - "# Simulate ideal binomial data (balanced, no overdispersion)\n", - "n_snps = 1000\n", - "read_depth = 50 # Total reads per SNP\n", - "\n", - "# Input validation\n", - "assert n_snps > 0, \"n_snps must be positive\"\n", - "assert read_depth > 0, \"read_depth must be positive\"\n", - "\n", - "# Perfect binomial sampling\n", - "ideal_ref_counts = np.random.binomial(n=read_depth, p=0.5, size=n_snps)\n", - "ideal_ratios = ideal_ref_counts / read_depth\n", - "\n", - "# Calculate observed vs expected variance\n", - "expected_var = read_depth * 0.5 * 0.5\n", - "observed_var = np.var(ideal_ref_counts)\n", - "\n", - "print(f\"Binomial Model (N={read_depth}, p=0.5)\")\n", - "print(f\" Expected variance: {expected_var:.2f}\")\n", - "print(f\" Observed variance: {observed_var:.2f}\")\n", - "print(f\" Ratio (observed/expected): {observed_var / expected_var:.3f}\")" - ] - }, - { - "cell_type": 
"markdown", - "id": "overdispersion-intro", - "metadata": {}, - "source": [ - "### 1.2 The Problem: Real Data Shows Overdispersion\n", - "\n", - "In real sequencing data, the observed variance is typically **larger** than expected from a binomial model. This is called **overdispersion**.\n", - "\n", - "**Sources of overdispersion in allele counting:**\n", - "1. **PCR amplification bias** - Some fragments amplify better than others\n", - "2. **Library preparation effects** - Batch effects during sample prep\n", - "3. **Technical variability** - Across lanes, flowcells, and sequencing runs\n", - "4. **Mapping bias** - Reference allele may map slightly better (even after WASP correction)\n", - "\n", - "Let's simulate data with overdispersion:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "overdispersed-simulation", - "metadata": {}, - "outputs": [], - "source": [ - "def simulate_overdispersed(n_snps: int, read_depth: int, rho: float = 0.05) -> NDArray[np.int64]:\n", - " \"\"\"\n", - " Simulate overdispersed allele counts using beta-binomial.\n", - "\n", - " Args:\n", - " n_snps: Number of SNPs to simulate\n", - " read_depth: Total read depth per SNP\n", - " rho: Dispersion parameter (0 = binomial, higher = more overdispersed)\n", - "\n", - " Returns:\n", - " Array of reference allele counts\n", - "\n", - " Raises:\n", - " ValueError: If parameters are out of valid range\n", - " \"\"\"\n", - " # Input validation\n", - " if n_snps <= 0:\n", - " raise ValueError(f\"n_snps must be positive, got {n_snps}\")\n", - " if read_depth <= 0:\n", - " raise ValueError(f\"read_depth must be positive, got {read_depth}\")\n", - " if not 0 < rho < 1:\n", - " raise ValueError(f\"rho must be in (0, 1), got {rho}\")\n", - "\n", - " mu = 0.5 # Balanced (no true imbalance)\n", - " rho = clamp_rho(rho) # Apply WASP2's clamping\n", - "\n", - " alpha = mu * (1 - rho) / rho\n", - " beta = (1 - mu) * (1 - rho) / rho\n", - "\n", - " return betabinom.rvs(n=read_depth, a=alpha, b=beta, size=n_snps)\n", - "\n", - "\n", - "# Simulate with different dispersion levels\n", - "rho_values = [0.001, 0.02, 0.05, 0.10]\n", - "overdispersed_data = {rho: simulate_overdispersed(n_snps, read_depth, rho) for rho in rho_values}\n", - "\n", - "# Compare variances\n", - "print(f\"Expected binomial variance: {expected_var:.2f}\")\n", - "print(\"\\nObserved variances by dispersion (rho):\")\n", - "for rho, data in overdispersed_data.items():\n", - " obs_var = np.var(data)\n", - " print(f\" rho={rho:.3f}: variance={obs_var:.2f} (ratio={obs_var / expected_var:.2f}x)\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "overdispersion-visual", - "metadata": {}, - "outputs": [], - "source": [ - "# Visualize the effect of overdispersion\n", - "fig, axes = plt.subplots(2, 2, figsize=(12, 10))\n", - "\n", - "for idx, (rho, data) in enumerate(overdispersed_data.items()):\n", - " ax = axes.flat[idx]\n", - " ratios = data / read_depth\n", - "\n", - " # Histogram of observed ratios\n", - " ax.hist(\n", - " ratios,\n", - " bins=30,\n", - " density=True,\n", - " alpha=0.7,\n", - " color=\"steelblue\",\n", - " edgecolor=\"black\",\n", - " label=\"Observed\",\n", - " )\n", - "\n", - " # Overlay expected binomial distribution\n", - " x = np.arange(0, read_depth + 1)\n", - " binomial_pmf = binom.pmf(x, read_depth, 0.5)\n", - " ax.plot(x / read_depth, binomial_pmf * read_depth, \"r-\", lw=2, label=\"Expected (Binomial)\")\n", - "\n", - " obs_var = np.var(data)\n", - " ax.set_title(f\"rho = {rho:.3f}\\nVariance 
ratio: {obs_var / expected_var:.2f}x\", fontsize=11)\n", - " ax.set_xlabel(\"Reference Allele Frequency\")\n", - " ax.set_ylabel(\"Density\")\n", - " ax.legend(fontsize=9)\n", - " ax.set_xlim(0, 1)\n", - "\n", - "plt.suptitle(\"Effect of Overdispersion on Allele Ratio Distributions\", fontsize=14, y=1.02)\n", - "plt.tight_layout()\n", - "plt.show()\n", - "\n", - "print(\"\\nKey observation: As rho increases, the distribution becomes wider\")\n", - "print(\"than expected from binomial sampling alone.\")" - ] - }, - { - "cell_type": "markdown", - "id": "consequences-header", - "metadata": {}, - "source": [ - "### 1.3 Consequences of Ignoring Overdispersion\n", - "\n", - "If we use a binomial model on overdispersed data, we will:\n", - "\n", - "1. **Underestimate variance** → p-values too small\n", - "2. **Inflate false positive rate** → Many spurious \"significant\" results\n", - "3. **Poor calibration** → QQ plots show massive inflation\n", - "\n", - "Let's demonstrate this problem:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "false-positive-demo", - "metadata": {}, - "outputs": [], - "source": [ - "def binomial_test_pvalue(ref_count: int, total_count: int) -> float:\n", - " \"\"\"\n", - " Two-sided binomial test for deviation from 0.5.\n", - "\n", - " Args:\n", - " ref_count: Number of reference allele reads\n", - " total_count: Total number of reads\n", - "\n", - " Returns:\n", - " Two-sided p-value\n", - " \"\"\"\n", - " # Input validation\n", - " if ref_count < 0 or ref_count > total_count:\n", - " raise ValueError(f\"Invalid counts: ref={ref_count}, total={total_count}\")\n", - " if total_count <= 0:\n", - " raise ValueError(f\"total_count must be positive, got {total_count}\")\n", - "\n", - " result = stats.binomtest(ref_count, total_count, p=0.5, alternative=\"two-sided\")\n", - " return result.pvalue\n", - "\n", - "\n", - "# Test on overdispersed data (rho=0.05) - no TRUE imbalance!\n", - "test_data = overdispersed_data[0.05]\n", - "pvalues_binomial = [binomial_test_pvalue(int(k), read_depth) for k in test_data]\n", - "\n", - "# Count \"significant\" results at different thresholds\n", - "pvalues_binomial = np.array(pvalues_binomial)\n", - "\n", - "print(\"False positive rates using binomial test on overdispersed data:\")\n", - "print(\"(Remember: there is NO true imbalance in this data!)\\n\")\n", - "for alpha in [0.05, 0.01, 0.001]:\n", - " fp_rate = (pvalues_binomial < alpha).mean()\n", - " print(f\" Alpha = {alpha:.3f}: {fp_rate * 100:.1f}% significant (expected: {alpha * 100:.1f}%)\")\n", - "\n", - "print(f\"\\nThis represents a {(pvalues_binomial < 0.05).mean() / 0.05:.1f}x inflation!\")" - ] - }, - { - "cell_type": "markdown", - "id": "section2-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Section 2: The Beta-Binomial Distribution\n", - "\n", - "### 2.1 Mathematical Foundation\n", - "\n", - "The **beta-binomial distribution** extends the binomial by allowing the success probability $p$ to vary according to a Beta distribution:\n", - "\n", - "$$p \\sim \\text{Beta}(\\alpha, \\beta)$$\n", - "$$X | p \\sim \\text{Binomial}(N, p)$$\n", - "\n", - "The marginal distribution of $X$ is the beta-binomial.\n", - "\n", - "**WASP2's parameterization** uses mean ($\\mu$) and dispersion ($\\rho$):\n", - "\n", - "$$\\alpha = \\mu \\cdot \\frac{1 - \\rho}{\\rho}, \\quad \\beta = (1 - \\mu) \\cdot \\frac{1 - \\rho}{\\rho}$$\n", - "\n", - "**Key properties:**\n", - "- Mean: $E[X] = N \\cdot \\mu$ (same as binomial)\n", - "- Variance: 
$\\text{Var}(X) = N \\cdot \\mu \\cdot (1-\\mu) \\cdot [1 + (N-1) \\cdot \\rho]$ (inflated!)\n", - "\n", - "When $\\rho \\to 0$, the beta-binomial converges to the binomial." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "betabinom-params", - "metadata": {}, - "outputs": [], - "source": [ - "def mu_rho_to_alpha_beta(mu: float, rho: float) -> tuple[float, float]:\n", - " \"\"\"\n", - " Convert WASP2's (mu, rho) parameterization to (alpha, beta).\n", - "\n", - " This mirrors WASP2's parameterization in src/analysis/as_analysis.py:104-105.\n", - "\n", - " Args:\n", - " mu: Mean parameter (allele frequency), in (0, 1)\n", - " rho: Dispersion parameter, in (0, 1)\n", - "\n", - " Returns:\n", - " Tuple of (alpha, beta) parameters for scipy.stats.betabinom\n", - "\n", - " Warning:\n", - " When rho is near 0 or 1, numerical instability can occur.\n", - " Use clamp_rho() to ensure safe values.\n", - " \"\"\"\n", - " # Apply WASP2's clamping to prevent numerical issues\n", - " rho = clamp_rho(rho)\n", - "\n", - " # Validate mu\n", - " if not 0 < mu < 1:\n", - " raise ValueError(f\"mu must be in (0, 1), got {mu}\")\n", - "\n", - " alpha = mu * (1 - rho) / rho\n", - " beta = (1 - mu) * (1 - rho) / rho\n", - " return alpha, beta\n", - "\n", - "\n", - "def betabinom_variance(n: int, mu: float, rho: float) -> float:\n", - " \"\"\"\n", - " Compute beta-binomial variance.\n", - "\n", - " Args:\n", - " n: Number of trials (total read count)\n", - " mu: Mean parameter (allele frequency)\n", - " rho: Dispersion parameter\n", - "\n", - " Returns:\n", - " Variance of the beta-binomial distribution\n", - " \"\"\"\n", - " return n * mu * (1 - mu) * (1 + (n - 1) * rho)\n", - "\n", - "\n", - "# Demonstrate variance inflation\n", - "N = 50\n", - "mu = 0.5\n", - "binomial_var = N * mu * (1 - mu)\n", - "\n", - "print(f\"Variance comparison (N={N}, mu={mu}):\\n\")\n", - "print(f\"Binomial variance: {binomial_var:.2f}\")\n", - "print(\"\\nBeta-binomial variance by rho:\")\n", - "for rho in [0.001, 0.01, 0.05, 0.10, 0.20]:\n", - " bb_var = betabinom_variance(N, mu, rho)\n", - " inflation = bb_var / binomial_var\n", - " print(f\" rho={rho:.3f}: {bb_var:.2f} ({inflation:.2f}x inflation)\")" - ] - }, - { - "cell_type": "markdown", - "id": "edge-cases-header", - "metadata": {}, - "source": [ - "### 2.1.1 Edge Cases and Numerical Stability\n", - "\n", - "**Warning:** The beta-binomial parameterization has dangerous edge cases that can cause numerical errors. This section demonstrates why WASP2's `clamp_rho()` function is essential." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "edge-cases-demo", - "metadata": {}, - "outputs": [], - "source": [ - "# Demonstrate edge cases that clamp_rho() prevents\n", - "print(\"Edge Cases in Beta-Binomial Parameterization\")\n", - "print(\"=\" * 50)\n", - "\n", - "\n", - "def unsafe_mu_rho_to_alpha_beta(mu, rho):\n", - " \"\"\"Unsafe version WITHOUT clamping - for demonstration only.\"\"\"\n", - " alpha = mu * (1 - rho) / rho\n", - " beta = (1 - mu) * (1 - rho) / rho\n", - " return alpha, beta\n", - "\n", - "\n", - "# Case 1: rho = 0 causes division by zero\n", - "print(\"\\n1. 
rho = 0 (division by zero):\")\n", - "try:\n", - " with warnings.catch_warnings():\n", - " warnings.simplefilter(\"ignore\")\n", - " alpha, beta = unsafe_mu_rho_to_alpha_beta(0.5, 0.0)\n", - " print(f\" alpha = {alpha}, beta = {beta}\")\n", - " print(\" Result: inf values - causes downstream errors!\")\n", - "except Exception as e:\n", - " print(f\" Error: {e}\")\n", - "\n", - "# Case 2: rho = 1 produces zero alpha/beta\n", - "print(\"\\n2. rho = 1 (zero parameters):\")\n", - "alpha, beta = unsafe_mu_rho_to_alpha_beta(0.5, 1.0)\n", - "print(f\" alpha = {alpha}, beta = {beta}\")\n", - "print(\" Result: zero values - invalid for Beta distribution!\")\n", - "\n", - "# Case 3: Very small rho produces huge alpha/beta\n", - "print(\"\\n3. rho = 1e-15 (numerical overflow risk):\")\n", - "alpha, beta = unsafe_mu_rho_to_alpha_beta(0.5, 1e-15)\n", - "print(f\" alpha = {alpha:.2e}, beta = {beta:.2e}\")\n", - "print(\" Result: huge values - may overflow in gamma functions!\")\n", - "\n", - "# Safe version with clamping\n", - "print(\"\\n\" + \"=\" * 50)\n", - "print(\"With WASP2's clamp_rho():\")\n", - "print(\"=\" * 50)\n", - "\n", - "for rho_input in [0.0, 1.0, 1e-15]:\n", - " rho_safe = clamp_rho(rho_input)\n", - " alpha, beta = mu_rho_to_alpha_beta(0.5, rho_safe)\n", - " print(f\"\\nInput rho={rho_input} -> clamped to {rho_safe}\")\n", - " print(f\" alpha = {alpha:.4f}, beta = {beta:.4f}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "betabinom-visual", - "metadata": {}, - "outputs": [], - "source": [ - "# Visualize beta-binomial vs binomial PMFs\n", - "fig, axes = plt.subplots(1, 3, figsize=(14, 4))\n", - "\n", - "N = 50\n", - "x = np.arange(0, N + 1)\n", - "\n", - "# Panel 1: Different rho values (mu=0.5)\n", - "ax = axes[0]\n", - "ax.plot(x, binom.pmf(x, N, 0.5), \"k-\", lw=2, label=\"Binomial\")\n", - "for rho, color in [(0.02, \"blue\"), (0.05, \"orange\"), (0.10, \"red\")]:\n", - " alpha, beta = mu_rho_to_alpha_beta(0.5, rho)\n", - " ax.plot(x, betabinom.pmf(x, N, alpha, beta), \"--\", lw=2, color=color, label=f\"rho={rho}\")\n", - "ax.set_xlabel(\"Reference Count\")\n", - "ax.set_ylabel(\"Probability\")\n", - "ax.set_title(\"Effect of Dispersion (mu=0.5)\")\n", - "ax.legend()\n", - "\n", - "# Panel 2: Different mu values (rho=0.05)\n", - "ax = axes[1]\n", - "rho = 0.05\n", - "for mu, color in [(0.5, \"gray\"), (0.6, \"blue\"), (0.7, \"orange\"), (0.8, \"red\")]:\n", - " alpha, beta = mu_rho_to_alpha_beta(mu, rho)\n", - " ax.plot(x, betabinom.pmf(x, N, alpha, beta), \"-\", lw=2, color=color, label=f\"mu={mu}\")\n", - "ax.set_xlabel(\"Reference Count\")\n", - "ax.set_ylabel(\"Probability\")\n", - "ax.set_title(f\"Effect of Imbalance (rho={rho})\")\n", - "ax.legend()\n", - "\n", - "# Panel 3: Log-scale tails comparison\n", - "ax = axes[2]\n", - "ax.semilogy(x, binom.pmf(x, N, 0.5), \"k-\", lw=2, label=\"Binomial\")\n", - "for rho, color in [(0.02, \"blue\"), (0.10, \"red\")]:\n", - " alpha, beta = mu_rho_to_alpha_beta(0.5, rho)\n", - " ax.semilogy(x, betabinom.pmf(x, N, alpha, beta), \"--\", lw=2, color=color, label=f\"rho={rho}\")\n", - "ax.set_xlabel(\"Reference Count\")\n", - "ax.set_ylabel(\"Probability (log scale)\")\n", - "ax.set_title(\"Tail Behavior (log scale)\")\n", - "ax.legend()\n", - "ax.set_ylim(1e-15, 1)\n", - "\n", - "plt.tight_layout()\n", - "plt.show()\n", - "\n", - "print(\"Key insight: Beta-binomial has heavier tails than binomial,\")\n", - "print(\"making extreme counts more likely under the null hypothesis.\")" - ] - }, - { - "cell_type": 
"markdown", - "id": "lrt-header", - "metadata": {}, - "source": [ - "### 2.2 Likelihood Ratio Test for Imbalance\n", - "\n", - "WASP2 uses a **likelihood ratio test (LRT)** to detect allelic imbalance:\n", - "\n", - "- **Null hypothesis ($H_0$):** $\\mu = 0.5$ (balanced)\n", - "- **Alternative ($H_1$):** $\\mu \\neq 0.5$ (imbalanced)\n", - "\n", - "The test statistic is:\n", - "\n", - "$$\\Lambda = -2 \\left[ \\log L(H_0) - \\log L(H_1) \\right] \\sim \\chi^2_1$$\n", - "\n", - "This follows a chi-squared distribution with 1 degree of freedom under $H_0$.\n", - "\n", - "**WASP2 Implementation:** See `src/analysis/as_analysis.py:322-323`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "lrt-implementation", - "metadata": {}, - "outputs": [], - "source": [ - "def betabinom_loglik(ref_count: int, total_count: int, mu: float, rho: float) -> float:\n", - " \"\"\"\n", - " Compute beta-binomial log-likelihood.\n", - "\n", - " Mirrors WASP2's src/analysis/as_analysis.py:81-112 (opt_prob function).\n", - "\n", - " Args:\n", - " ref_count: Reference allele count\n", - " total_count: Total read count\n", - " mu: Mean parameter (allele frequency)\n", - " rho: Dispersion parameter\n", - "\n", - " Returns:\n", - " Log-likelihood value\n", - " \"\"\"\n", - " alpha, beta = mu_rho_to_alpha_beta(mu, rho)\n", - " return betabinom.logpmf(ref_count, total_count, alpha, beta)\n", - "\n", - "\n", - "def find_mle_mu(ref_count: int, total_count: int, rho: float) -> float:\n", - " \"\"\"\n", - " Find MLE of mu given fixed rho.\n", - "\n", - " Mirrors WASP2's src/analysis/as_analysis.py:184-248 (parse_opt function).\n", - "\n", - " Args:\n", - " ref_count: Reference allele count\n", - " total_count: Total read count\n", - " rho: Fixed dispersion parameter\n", - "\n", - " Returns:\n", - " Maximum likelihood estimate of mu\n", - " \"\"\"\n", - "\n", - " def neg_loglik(mu):\n", - " return -betabinom_loglik(ref_count, total_count, mu, rho)\n", - "\n", - " # Use bounded optimization with safe mu range\n", - " result = minimize_scalar(neg_loglik, bounds=(0.001, 0.999), method=\"bounded\")\n", - " return result.x\n", - "\n", - "\n", - "def likelihood_ratio_test(\n", - " ref_count: int, total_count: int, rho: float\n", - ") -> tuple[float, float, float]:\n", - " \"\"\"\n", - " Perform LRT for allelic imbalance.\n", - "\n", - " Mirrors WASP2's calculation in src/analysis/as_analysis.py:322-323.\n", - "\n", - " Args:\n", - " ref_count: Reference allele count\n", - " total_count: Total read count\n", - " rho: Dispersion parameter\n", - "\n", - " Returns:\n", - " Tuple of (lrt_statistic, p_value, mle_mu)\n", - " \"\"\"\n", - " # Input validation\n", - " if ref_count < 0 or ref_count > total_count:\n", - " raise ValueError(f\"Invalid: ref_count={ref_count}, total_count={total_count}\")\n", - " if total_count <= 0:\n", - " raise ValueError(f\"total_count must be positive, got {total_count}\")\n", - "\n", - " # Null likelihood (mu = 0.5)\n", - " null_ll = betabinom_loglik(ref_count, total_count, 0.5, rho)\n", - "\n", - " # Alternative likelihood (mu = MLE)\n", - " mle_mu = find_mle_mu(ref_count, total_count, rho)\n", - " alt_ll = betabinom_loglik(ref_count, total_count, mle_mu, rho)\n", - "\n", - " # LRT statistic (matches WASP2: -2 * (null_ll - alt_ll))\n", - " lrt = -2 * (null_ll - alt_ll)\n", - " lrt = max(0, lrt) # Ensure non-negative due to numerical precision\n", - "\n", - " # P-value from chi-squared distribution (df=1)\n", - " pvalue = chi2.sf(lrt, df=1)\n", - "\n", - " return lrt, pvalue, mle_mu\n", 
- "\n", - "\n", - "# Example calculation\n", - "ref, total = 35, 50 # Observed: 35 ref out of 50 total\n", - "rho = 0.05\n", - "\n", - "lrt, pval, mu_hat = likelihood_ratio_test(ref, total, rho)\n", - "\n", - "print(\"Example LRT calculation:\")\n", - "print(f\" Observed: {ref} ref / {total} total (ratio = {ref / total:.2f})\")\n", - "print(f\" Dispersion (rho): {rho}\")\n", - "print(f\" MLE of mu: {mu_hat:.3f}\")\n", - "print(f\" LRT statistic: {lrt:.3f}\")\n", - "print(f\" P-value: {pval:.4f}\")" - ] - }, - { - "cell_type": "markdown", - "id": "section3-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Section 3: Dispersion Estimation\n", - "\n", - "A critical step is estimating the dispersion parameter $\\rho$ from the data. WASP2 offers two approaches:\n", - "\n", - "1. **Single dispersion model** - One $\\rho$ for the entire dataset\n", - "2. **Linear dispersion model** - $\\rho$ varies with read depth: $\\text{logit}(\\rho) = \\beta_0 + \\beta_1 \\cdot N$\n", - "\n", - "### 3.1 Maximum Likelihood Estimation (MLE)\n", - "\n", - "MLE finds the $\\rho$ that maximizes the likelihood under $H_0$ (balanced).\n", - "\n", - "**WASP2 Implementation:** See `src/analysis/as_analysis.py:251-325` (single_model function)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "mle-dispersion", - "metadata": {}, - "outputs": [], - "source": [ - "def estimate_dispersion_mle(\n", - " ref_counts: NDArray[np.int64], total_counts: NDArray[np.int64]\n", - ") -> float:\n", - " \"\"\"\n", - " Estimate single dispersion parameter via MLE.\n", - "\n", - " Mirrors WASP2's single_model in src/analysis/as_analysis.py:251-325.\n", - " Assumes mu=0.5 (null hypothesis) for all observations.\n", - "\n", - " Args:\n", - " ref_counts: Array of reference allele counts\n", - " total_counts: Array of total read counts\n", - "\n", - " Returns:\n", - " MLE estimate of dispersion parameter rho\n", - "\n", - " Raises:\n", - " ValueError: If arrays have different lengths or invalid values\n", - " \"\"\"\n", - " # Input validation\n", - " ref_counts = np.asarray(ref_counts)\n", - " total_counts = np.asarray(total_counts)\n", - "\n", - " if len(ref_counts) != len(total_counts):\n", - " raise ValueError(\"ref_counts and total_counts must have same length\")\n", - " if len(ref_counts) == 0:\n", - " raise ValueError(\"Arrays must not be empty\")\n", - " if np.any(ref_counts < 0) or np.any(ref_counts > total_counts):\n", - " raise ValueError(\"Invalid counts: ref_count must be in [0, total_count]\")\n", - "\n", - " def neg_loglik(rho):\n", - " rho = clamp_rho(rho) # Apply WASP2's clamping\n", - " alpha, beta = mu_rho_to_alpha_beta(0.5, rho)\n", - " ll = np.sum(betabinom.logpmf(ref_counts, total_counts, alpha, beta))\n", - " return -ll\n", - "\n", - " result = minimize_scalar(\n", - " neg_loglik,\n", - " bounds=(RHO_EPSILON, 0.5), # Use RHO_EPSILON as lower bound\n", - " method=\"bounded\",\n", - " )\n", - " return result.x\n", - "\n", - "\n", - "# Estimate dispersion from our simulated data\n", - "true_rho = 0.05\n", - "test_data = overdispersed_data[true_rho]\n", - "total_counts = np.full(len(test_data), read_depth)\n", - "\n", - "estimated_rho = estimate_dispersion_mle(test_data, total_counts)\n", - "\n", - "print(\"Dispersion estimation (MLE):\")\n", - "print(f\" True rho: {true_rho:.4f}\")\n", - "print(f\" Estimated rho: {estimated_rho:.4f}\")\n", - "print(f\" Relative error: {abs(estimated_rho - true_rho) / true_rho * 100:.1f}%\")" - ] - }, - { - "cell_type": "markdown", - "id": "mom-header", - 
"metadata": {}, - "source": [ - "### 3.2 Method of Moments (MoM)\n", - "\n", - "An alternative is the **Method of Moments**, which matches the observed variance to the expected beta-binomial variance:\n", - "\n", - "$$\\hat{\\rho}_{\\text{MoM}} = \\frac{\\text{Var}(X/N) - \\mu(1-\\mu)/\\bar{N}}{\\mu(1-\\mu)(1 - 1/\\bar{N})}$$\n", - "\n", - "MoM is faster but may be less efficient than MLE." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "mom-dispersion", - "metadata": {}, - "outputs": [], - "source": [ - "def estimate_dispersion_mom(\n", - " ref_counts: NDArray[np.int64], total_counts: NDArray[np.int64]\n", - ") -> float:\n", - " \"\"\"\n", - " Estimate dispersion using Method of Moments.\n", - "\n", - " Args:\n", - " ref_counts: Array of reference allele counts\n", - " total_counts: Array of total read counts\n", - "\n", - " Returns:\n", - " MoM estimate of dispersion parameter rho\n", - " \"\"\"\n", - " # Input validation\n", - " ref_counts = np.asarray(ref_counts)\n", - " total_counts = np.asarray(total_counts)\n", - "\n", - " if len(ref_counts) != len(total_counts):\n", - " raise ValueError(\"Arrays must have same length\")\n", - "\n", - " ratios = ref_counts / total_counts\n", - " mu = 0.5 # Assume balanced under null\n", - "\n", - " # Observed variance of ratios\n", - " obs_var = np.var(ratios)\n", - "\n", - " # Expected binomial variance (if rho=0)\n", - " mean_n = np.mean(total_counts)\n", - " binom_var = mu * (1 - mu) / mean_n\n", - "\n", - " # Solve for rho with safeguards\n", - " numerator = obs_var - binom_var\n", - " denominator = mu * (1 - mu) * (1 - 1 / mean_n)\n", - "\n", - " # Handle edge case where denominator is near zero\n", - " if abs(denominator) < 1e-10:\n", - " return RHO_EPSILON\n", - "\n", - " rho_mom = numerator / denominator\n", - "\n", - " # Clamp to valid range using WASP2's constant\n", - " return float(clamp_rho(rho_mom))\n", - "\n", - "\n", - "# Compare MLE vs MoM\n", - "rho_mom = estimate_dispersion_mom(test_data, total_counts)\n", - "\n", - "print(\"Comparison of estimation methods:\")\n", - "print(f\" True rho: {true_rho:.4f}\")\n", - "print(f\" MLE estimate: {estimated_rho:.4f}\")\n", - "print(f\" MoM estimate: {rho_mom:.4f}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "mle-mom-comparison", - "metadata": {}, - "outputs": [], - "source": [ - "# Systematic comparison across different true rho values\n", - "comparison_results = []\n", - "\n", - "for true_rho in [0.01, 0.02, 0.05, 0.10, 0.15]:\n", - " # Simulate data\n", - " alpha, beta = mu_rho_to_alpha_beta(0.5, true_rho)\n", - " sim_data = betabinom.rvs(n=50, a=alpha, b=beta, size=1000)\n", - " sim_totals = np.full(1000, 50)\n", - "\n", - " # Estimate with both methods\n", - " mle_est = estimate_dispersion_mle(sim_data, sim_totals)\n", - " mom_est = estimate_dispersion_mom(sim_data, sim_totals)\n", - "\n", - " comparison_results.append(\n", - " {\n", - " \"true_rho\": true_rho,\n", - " \"mle_estimate\": mle_est,\n", - " \"mom_estimate\": mom_est,\n", - " \"mle_error\": abs(mle_est - true_rho) / true_rho * 100,\n", - " \"mom_error\": abs(mom_est - true_rho) / true_rho * 100,\n", - " }\n", - " )\n", - "\n", - "comparison_df = pd.DataFrame(comparison_results)\n", - "print(\"MLE vs MoM Comparison:\")\n", - "print(comparison_df.round(4).to_string(index=False))" - ] - }, - { - "cell_type": "markdown", - "id": "linear-model-header", - "metadata": {}, - "source": [ - "### 3.3 Linear Dispersion Model\n", - "\n", - "In some datasets, dispersion varies with read 
depth. The **linear model** parameterizes:\n", - "\n", - "$$\\text{logit}(\\rho_i) = \\beta_0 + \\beta_1 \\cdot N_i$$\n", - "\n", - "This allows higher-depth regions to have different dispersion than lower-depth regions.\n", - "\n", - "**WASP2 Implementation:** See `src/analysis/as_analysis.py:53-78` (opt_linear function)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "linear-model-demo", - "metadata": {}, - "outputs": [], - "source": [ - "def estimate_linear_dispersion(\n", - " ref_counts: NDArray[np.int64], total_counts: NDArray[np.int64]\n", - ") -> tuple[float, float]:\n", - " \"\"\"\n", - " Estimate depth-dependent dispersion: logit(rho) = beta0 + beta1 * N\n", - "\n", - " Mirrors WASP2's opt_linear in src/analysis/as_analysis.py:53-78.\n", - "\n", - " Args:\n", - " ref_counts: Array of reference allele counts\n", - " total_counts: Array of total read counts\n", - "\n", - " Returns:\n", - " Tuple of (beta0, beta1) parameters\n", - " \"\"\"\n", - "\n", - " def neg_loglik(params):\n", - " beta0, beta1 = params\n", - "\n", - " # Compute rho for each observation\n", - " linear_pred = beta0 + beta1 * total_counts\n", - " # Clip to prevent overflow (matches WASP2's approach)\n", - " linear_pred = np.clip(linear_pred, -10, 10)\n", - " rho = expit(linear_pred)\n", - " rho = clamp_rho(rho) # Apply WASP2's clamping\n", - "\n", - " # Compute likelihood\n", - " alpha = 0.5 * (1 - rho) / rho\n", - " beta = 0.5 * (1 - rho) / rho\n", - "\n", - " ll = np.sum(betabinom.logpmf(ref_counts, total_counts, alpha, beta))\n", - " return -ll\n", - "\n", - " result = minimize(neg_loglik, x0=[-3, 0], method=\"Nelder-Mead\")\n", - " return tuple(result.x)\n", - "\n", - "\n", - "# Simulate data with depth-dependent dispersion\n", - "np.random.seed(42)\n", - "n_snps = 2000\n", - "depths = np.random.choice([20, 50, 100, 200], size=n_snps)\n", - "\n", - "# True model: logit(rho) = -3 + 0.01 * N\n", - "true_beta0, true_beta1 = -3, 0.01\n", - "true_rhos = expit(true_beta0 + true_beta1 * depths)\n", - "\n", - "# Generate counts with proper clamping\n", - "ref_counts_linear = np.array(\n", - " [\n", - " betabinom.rvs(\n", - " n=n,\n", - " a=0.5 * (1 - clamp_rho(rho)) / clamp_rho(rho),\n", - " b=0.5 * (1 - clamp_rho(rho)) / clamp_rho(rho),\n", - " )\n", - " for n, rho in zip(depths, true_rhos)\n", - " ]\n", - ")\n", - "\n", - "# Estimate\n", - "est_beta0, est_beta1 = estimate_linear_dispersion(ref_counts_linear, depths)\n", - "\n", - "print(\"Linear dispersion model estimation:\")\n", - "print(f\" True: logit(rho) = {true_beta0:.2f} + {true_beta1:.4f} * N\")\n", - "print(f\" Est: logit(rho) = {est_beta0:.2f} + {est_beta1:.4f} * N\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "linear-model-visual", - "metadata": {}, - "outputs": [], - "source": [ - "# Visualize the linear dispersion model\n", - "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n", - "\n", - "# Panel 1: Rho vs depth\n", - "ax = axes[0]\n", - "depth_range = np.linspace(10, 250, 100)\n", - "true_curve = expit(true_beta0 + true_beta1 * depth_range)\n", - "est_curve = expit(est_beta0 + est_beta1 * depth_range)\n", - "\n", - "ax.plot(depth_range, true_curve, \"b-\", lw=2, label=\"True\")\n", - "ax.plot(depth_range, est_curve, \"r--\", lw=2, label=\"Estimated\")\n", - "ax.set_xlabel(\"Read Depth (N)\")\n", - "ax.set_ylabel(\"Dispersion (rho)\")\n", - "ax.set_title(\"Linear Dispersion Model\")\n", - "ax.legend()\n", - "\n", - "# Panel 2: Show effect on variance\n", - "ax = axes[1]\n", - "binom_var = depth_range 
* 0.5 * 0.5\n", - "bb_var_true = depth_range * 0.5 * 0.5 * (1 + (depth_range - 1) * true_curve)\n", - "bb_var_const = depth_range * 0.5 * 0.5 * (1 + (depth_range - 1) * 0.05)\n", - "\n", - "ax.plot(depth_range, binom_var, \"k-\", lw=2, label=\"Binomial\")\n", - "ax.plot(depth_range, bb_var_const, \"g--\", lw=2, label=\"Constant rho=0.05\")\n", - "ax.plot(depth_range, bb_var_true, \"b-\", lw=2, label=\"Linear model\")\n", - "ax.set_xlabel(\"Read Depth (N)\")\n", - "ax.set_ylabel(\"Variance of Ref Count\")\n", - "ax.set_title(\"Variance Scaling with Depth\")\n", - "ax.legend()\n", - "\n", - "plt.tight_layout()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "section4-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Section 4: QQ Plots for Model Calibration\n", - "\n", - "**Quantile-Quantile (QQ) plots** compare observed p-values to their expected distribution under the null. If the model is well-calibrated:\n", - "\n", - "- P-values should be uniformly distributed under $H_0$\n", - "- QQ plot should follow the diagonal line\n", - "\n", - "### 4.1 Constructing QQ Plots" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "qq-function", - "metadata": {}, - "outputs": [], - "source": [ - "def qq_plot(\n", - " pvalues: NDArray[np.float64], ax=None, label: str = \"Observed\", color: str = \"steelblue\"\n", - "):\n", - " \"\"\"\n", - " Create a QQ plot of -log10(p-values).\n", - "\n", - " Args:\n", - " pvalues: Array of p-values\n", - " ax: Matplotlib axes (created if None)\n", - " label: Label for the scatter plot\n", - " color: Color for the points\n", - "\n", - " Returns:\n", - " Matplotlib axes object\n", - " \"\"\"\n", - " if ax is None:\n", - " fig, ax = plt.subplots()\n", - "\n", - " # Input validation and cleaning\n", - " pvalues = np.asarray(pvalues, dtype=np.float64)\n", - " pvalues = pvalues[~np.isnan(pvalues)] # Remove NaN\n", - " pvalues = pvalues[(pvalues > 0) & (pvalues <= 1)] # Valid p-values only\n", - "\n", - " if len(pvalues) == 0:\n", - " raise ValueError(\"No valid p-values after filtering\")\n", - "\n", - " pvalues = np.sort(pvalues)\n", - "\n", - " # Expected p-values under uniform distribution\n", - " n = len(pvalues)\n", - " expected = (np.arange(1, n + 1) - 0.5) / n\n", - "\n", - " # Transform to -log10 scale with safe clipping\n", - " obs_log = -np.log10(np.clip(pvalues, 1e-300, 1))\n", - " exp_log = -np.log10(expected)\n", - "\n", - " # Plot\n", - " ax.scatter(exp_log, obs_log, s=10, alpha=0.6, color=color, label=label)\n", - "\n", - " # Diagonal line\n", - " max_val = max(exp_log.max(), obs_log.max())\n", - " ax.plot([0, max_val], [0, max_val], \"r--\", lw=1, label=\"Expected\")\n", - "\n", - " ax.set_xlabel(\"Expected -log10(p)\")\n", - " ax.set_ylabel(\"Observed -log10(p)\")\n", - "\n", - " return ax\n", - "\n", - "\n", - "def genomic_inflation_factor(pvalues: NDArray[np.float64]) -> float:\n", - " \"\"\"\n", - " Calculate genomic inflation factor (lambda).\n", - "\n", - " Lambda = 1 indicates perfect calibration.\n", - " Lambda > 1 indicates inflation (too many false positives).\n", - " Lambda < 1 indicates deflation (too conservative).\n", - "\n", - " Args:\n", - " pvalues: Array of p-values\n", - "\n", - " Returns:\n", - " Genomic inflation factor\n", - " \"\"\"\n", - " pvalues = np.asarray(pvalues, dtype=np.float64)\n", - " pvalues = pvalues[~np.isnan(pvalues)]\n", - " pvalues = pvalues[(pvalues > 0) & (pvalues <= 1)]\n", - "\n", - " if len(pvalues) == 0:\n", - " return np.nan\n", - "\n", - " chi2_stats = 
chi2.ppf(1 - pvalues, df=1)\n", - " return np.median(chi2_stats) / chi2.ppf(0.5, df=1)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "qq-comparison", - "metadata": {}, - "outputs": [], - "source": [ - "# Generate p-values using binomial vs beta-binomial tests on overdispersed data\n", - "true_rho = 0.05\n", - "test_data = overdispersed_data[true_rho]\n", - "n_total = read_depth\n", - "\n", - "# Binomial test (wrong model)\n", - "pvals_binomial = [stats.binomtest(int(k), n_total, 0.5).pvalue for k in test_data]\n", - "\n", - "# Beta-binomial test (correct model)\n", - "est_rho = estimate_dispersion_mle(test_data, np.full(len(test_data), n_total))\n", - "pvals_betabinom = [likelihood_ratio_test(int(k), n_total, est_rho)[1] for k in test_data]\n", - "\n", - "# Create QQ plots\n", - "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n", - "\n", - "ax = axes[0]\n", - "qq_plot(np.array(pvals_binomial), ax=ax, label=\"Binomial\", color=\"firebrick\")\n", - "lambda_binom = genomic_inflation_factor(np.array(pvals_binomial))\n", - "ax.set_title(f\"Binomial Test\\n(lambda = {lambda_binom:.2f})\")\n", - "ax.legend()\n", - "\n", - "ax = axes[1]\n", - "qq_plot(np.array(pvals_betabinom), ax=ax, label=\"Beta-binomial\", color=\"steelblue\")\n", - "lambda_bb = genomic_inflation_factor(np.array(pvals_betabinom))\n", - "ax.set_title(f\"Beta-Binomial Test\\n(lambda = {lambda_bb:.2f})\")\n", - "ax.legend()\n", - "\n", - "plt.suptitle(f\"QQ Plots on Null Data (true rho = {true_rho})\", fontsize=14, y=1.02)\n", - "plt.tight_layout()\n", - "plt.show()\n", - "\n", - "print(\"Genomic inflation factor (lambda):\")\n", - "print(f\" Binomial test: {lambda_binom:.3f} (expected: 1.0)\")\n", - "print(f\" Beta-binomial: {lambda_bb:.3f} (expected: 1.0)\")\n", - "print(\"\\nA lambda >> 1 indicates systematic inflation (too many false positives).\")" - ] - }, - { - "cell_type": "markdown", - "id": "qq-interpretation", - "metadata": {}, - "source": [ - "### 4.2 Interpreting QQ Plots\n", - "\n", - "| Pattern | Interpretation |\n", - "|---------|----------------|\n", - "| Points on diagonal | Well-calibrated (good!) 
|\n", - "| Points above diagonal | Inflation (too many small p-values) |\n", - "| Points below diagonal | Deflation (test too conservative) |\n", - "| Lift at the tail only | True signal mixed with null |\n", - "\n", - "The **genomic inflation factor (lambda)** quantifies deviation:\n", - "- $\\lambda = 1$: Perfect calibration\n", - "- $\\lambda > 1$: Inflation\n", - "- $\\lambda < 1$: Deflation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "qq-with-signal", - "metadata": {}, - "outputs": [], - "source": [ - "# Demonstrate QQ plot with true signal\n", - "np.random.seed(42)\n", - "\n", - "# Generate mixed data: 90% null + 10% truly imbalanced\n", - "n_snps = 1000\n", - "n_signal = 100\n", - "rho = 0.05\n", - "\n", - "# Null data (mu = 0.5)\n", - "alpha_null, beta_null = mu_rho_to_alpha_beta(0.5, rho)\n", - "null_counts = betabinom.rvs(n=50, a=alpha_null, b=beta_null, size=n_snps - n_signal)\n", - "\n", - "# Signal data (mu = 0.7, strong imbalance)\n", - "alpha_sig, beta_sig = mu_rho_to_alpha_beta(0.7, rho)\n", - "signal_counts = betabinom.rvs(n=50, a=alpha_sig, b=beta_sig, size=n_signal)\n", - "\n", - "# Combine\n", - "mixed_counts = np.concatenate([null_counts, signal_counts])\n", - "is_signal = np.concatenate([np.zeros(n_snps - n_signal), np.ones(n_signal)]).astype(bool)\n", - "\n", - "# Test all SNPs\n", - "est_rho = estimate_dispersion_mle(mixed_counts, np.full(len(mixed_counts), 50))\n", - "mixed_pvals = np.array([likelihood_ratio_test(int(k), 50, est_rho)[1] for k in mixed_counts])\n", - "\n", - "# QQ plot\n", - "fig, ax = plt.subplots(figsize=(8, 8))\n", - "qq_plot(mixed_pvals, ax=ax)\n", - "ax.set_title(f\"QQ Plot with {n_signal} True Signals\\n(10% imbalanced, mu=0.7)\")\n", - "ax.legend()\n", - "plt.show()\n", - "\n", - "# Check detection power\n", - "detected = mixed_pvals < 0.05\n", - "print(\"Detection results (alpha = 0.05):\")\n", - "print(f\" True positives: {detected[is_signal].sum()} / {n_signal}\")\n", - "print(f\" False positives: {detected[~is_signal].sum()} / {n_snps - n_signal}\")" - ] - }, - { - "cell_type": "markdown", - "id": "section5-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Section 5: False Discovery Rate (FDR) Correction\n", - "\n", - "When testing thousands of SNPs, multiple testing correction is essential. WASP2 uses **Benjamini-Hochberg (BH)** FDR correction via `scipy.stats.false_discovery_control()`.\n", - "\n", - "**WASP2 Implementation:** See `src/analysis/as_analysis.py:492` which uses `false_discovery_control(pvals, method=\"bh\")`\n", - "\n", - "### 5.1 The Multiple Testing Problem\n", - "\n", - "If we test 10,000 SNPs at $\\alpha = 0.05$, we expect 500 false positives even if there's no true signal!" 
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "fdr-motivation",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Demonstrate multiple testing problem\n",
-    "n_tests = 10000\n",
-    "alpha = 0.05\n",
-    "\n",
-    "# All null (no true signal)\n",
-    "null_pvalues = np.random.uniform(0, 1, n_tests)\n",
-    "\n",
-    "# Without correction\n",
-    "false_positives_raw = (null_pvalues < alpha).sum()\n",
-    "\n",
-    "print(f\"Multiple testing with {n_tests} tests (no true signal):\")\n",
-    "print(f\"  Expected false positives at alpha={alpha}: {n_tests * alpha:.0f}\")\n",
-    "print(f\"  Observed false positives: {false_positives_raw}\")\n",
-    "print(\"\\nThis is why we need FDR correction!\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "bh-header",
-   "metadata": {},
-   "source": [
-    "### 5.2 Benjamini-Hochberg Procedure\n",
-    "\n",
-    "The BH procedure controls the **false discovery rate** - the expected proportion of false positives among all discoveries.\n",
-    "\n",
-    "**Algorithm:**\n",
-    "1. Sort p-values: $p_{(1)} \\leq p_{(2)} \\leq \\ldots \\leq p_{(m)}$\n",
-    "2. Find the largest $k$ such that $p_{(k)} \\leq \\frac{k}{m} \\cdot q$, where $q$ is the target FDR\n",
-    "3. Reject all hypotheses with $p_{(i)} \\leq p_{(k)}$\n",
-    "\n",
-    "**Note:** WASP2 uses `scipy.stats.false_discovery_control(pvals, method=\"bh\")` for this."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "bh-implementation",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def benjamini_hochberg(pvalues: NDArray[np.float64]) -> NDArray[np.float64]:\n",
-    "    \"\"\"\n",
-    "    Apply Benjamini-Hochberg FDR correction.\n",
-    "\n",
-    "    Note: WASP2 uses scipy.stats.false_discovery_control(pvals, method=\"bh\"),\n",
-    "    which is equivalent. This implementation is for educational purposes.\n",
-    "\n",
-    "    Args:\n",
-    "        pvalues: Array of p-values\n",
-    "\n",
-    "    Returns:\n",
-    "        Array of adjusted p-values (q-values)\n",
-    "    \"\"\"\n",
-    "    pvalues = np.asarray(pvalues, dtype=np.float64)\n",
-    "    n = len(pvalues)\n",
-    "\n",
-    "    if n == 0:\n",
-    "        return np.array([])\n",
-    "\n",
-    "    # Sort indices\n",
-    "    sorted_idx = np.argsort(pvalues)\n",
-    "    sorted_pvals = pvalues[sorted_idx]\n",
-    "\n",
-    "    # BH adjustment: p_adj[i] = p[i] * n / rank[i]\n",
-    "    adjusted = sorted_pvals * n / np.arange(1, n + 1)\n",
-    "\n",
-    "    # Ensure monotonicity (cumulative minimum from the end)\n",
-    "    adjusted = np.minimum.accumulate(adjusted[::-1])[::-1]\n",
-    "\n",
-    "    # Cap at 1\n",
-    "    adjusted = np.minimum(adjusted, 1)\n",
-    "\n",
-    "    # Restore original order\n",
-    "    result = np.empty(n)\n",
-    "    result[sorted_idx] = adjusted\n",
-    "\n",
-    "    return result\n",
-    "\n",
-    "\n",
-    "# Compare our implementation with scipy's\n",
-    "print(\"Comparing BH implementations:\")\n",
-    "test_pvals = np.array([0.001, 0.01, 0.02, 0.05, 0.1, 0.5])\n",
-    "our_adjusted = benjamini_hochberg(test_pvals)\n",
-    "scipy_adjusted = false_discovery_control(test_pvals, method=\"bh\")\n",
-    "\n",
-    "print(f\"  Raw p-values:      {test_pvals}\")\n",
-    "print(f\"  Our BH adjusted:   {our_adjusted.round(4)}\")\n",
-    "print(f\"  SciPy BH adjusted: {scipy_adjusted.round(4)}\")\n",
-    "print(f\"  Match: {np.allclose(our_adjusted, scipy_adjusted)}\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "fdr-application",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Apply FDR correction to mixed data\n",
-    "# Using scipy's implementation (same as WASP2)\n",
-    "fdr_pvals = false_discovery_control(mixed_pvals, method=\"bh\")\n",
-    "\n",
-    "# Compare results\n",
-    "print(\"FDR correction comparison (alpha/FDR = 0.05):\")\n",
-    "print(\"\\nRaw p-values:\")\n",
-    "print(f\"  Significant: {(mixed_pvals < 0.05).sum()}\")\n",
-    "print(f\"  True positives: {((mixed_pvals < 0.05) & is_signal).sum()}\")\n",
-    "print(f\"  False positives: {((mixed_pvals < 0.05) & ~is_signal).sum()}\")\n",
-    "\n",
-    "print(\"\\nBH-adjusted p-values (scipy.stats.false_discovery_control):\")\n",
-    "print(f\"  Significant: {(fdr_pvals < 0.05).sum()}\")\n",
-    "print(f\"  True positives: {((fdr_pvals < 0.05) & is_signal).sum()}\")\n",
-    "print(f\"  False positives: {((fdr_pvals < 0.05) & ~is_signal).sum()}\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fdr-alternatives",
-   "metadata": {},
-   "source": [
-    "### 5.3 Alternative FDR Methods\n",
-    "\n",
-    "WASP2 primarily uses BH, but other options exist:\n",
-    "\n",
-    "| Method | Description | Use Case |\n",
-    "|--------|-------------|----------|\n",
-    "| **Benjamini-Hochberg (BH)** | Standard FDR control | Default choice |\n",
-    "| **Benjamini-Yekutieli (BY)** | Controls FDR under arbitrary dependence | Correlated tests |\n",
-    "| **Storey's q-value** | Estimates proportion of true nulls | Large-scale testing |\n",
-    "| **mid-p correction** | Less conservative binomial test | Discrete distributions |"
-   ]
-  },
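-  {
-   "cell_type": "markdown",
-   "id": "bh-vs-by-md",
-   "metadata": {},
-   "source": [
-    "As a quick illustration of the BH/BY trade-off (a sketch; both methods are available through `scipy.stats.false_discovery_control`), the cell below reruns the mixed-data correction with `method=\"by\"`. BY pays an extra $\\sum_{i=1}^{m} 1/i$ penalty to remain valid under arbitrary dependence, so it never reports more discoveries than BH."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "bh-vs-by",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# BH vs the more conservative BY correction on the same p-values\n",
-    "# (sketch; reuses mixed_pvals and fdr_pvals from the cells above)\n",
-    "by_pvals = false_discovery_control(mixed_pvals, method=\"by\")\n",
-    "\n",
-    "print(\"Discoveries at FDR = 0.05:\")\n",
-    "print(f\"  BH: {(fdr_pvals < 0.05).sum()}\")\n",
-    "print(f\"  BY: {(by_pvals < 0.05).sum()} (never more than BH, by construction)\")"
-   ]
-  },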
results\n", - "print(\"FDR correction comparison (alpha/FDR = 0.05):\")\n", - "print(\"\\nRaw p-values:\")\n", - "print(f\" Significant: {(mixed_pvals < 0.05).sum()}\")\n", - "print(f\" True positives: {((mixed_pvals < 0.05) & is_signal).sum()}\")\n", - "print(f\" False positives: {((mixed_pvals < 0.05) & ~is_signal).sum()}\")\n", - "\n", - "print(\"\\nBH-adjusted p-values (scipy.stats.false_discovery_control):\")\n", - "print(f\" Significant: {(fdr_pvals < 0.05).sum()}\")\n", - "print(f\" True positives: {((fdr_pvals < 0.05) & is_signal).sum()}\")\n", - "print(f\" False positives: {((fdr_pvals < 0.05) & ~is_signal).sum()}\")" - ] - }, - { - "cell_type": "markdown", - "id": "fdr-alternatives", - "metadata": {}, - "source": [ - "### 5.3 Alternative FDR Methods\n", - "\n", - "WASP2 primarily uses BH, but other options exist:\n", - "\n", - "| Method | Description | Use Case |\n", - "|--------|-------------|----------|\n", - "| **Benjamini-Hochberg (BH)** | Standard FDR control | Default choice |\n", - "| **Benjamini-Yekutieli (BY)** | Controls FDR under dependency | Correlated tests |\n", - "| **Storey's q-value** | Estimates proportion of true nulls | Large-scale testing |\n", - "| **mid-p correction** | Less conservative binomial test | Discrete distributions |" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fdr-visual", - "metadata": {}, - "outputs": [], - "source": [ - "# Visualize FDR correction effect\n", - "fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n", - "\n", - "# Panel 1: Raw vs adjusted p-values\n", - "ax = axes[0]\n", - "sorted_idx = np.argsort(mixed_pvals)\n", - "ax.plot(mixed_pvals[sorted_idx], label=\"Raw p-value\", lw=2)\n", - "ax.plot(fdr_pvals[sorted_idx], label=\"BH-adjusted\", lw=2)\n", - "ax.axhline(0.05, color=\"red\", linestyle=\"--\", label=\"Threshold (0.05)\")\n", - "ax.set_xlabel(\"Rank\")\n", - "ax.set_ylabel(\"P-value\")\n", - "ax.set_title(\"Raw vs FDR-Adjusted P-values\")\n", - "ax.set_xlim(0, 200) # Focus on top hits\n", - "ax.legend()\n", - "\n", - "# Panel 2: Number of discoveries at different thresholds\n", - "ax = axes[1]\n", - "thresholds = np.linspace(0.001, 0.2, 50)\n", - "raw_discoveries = [(mixed_pvals < t).sum() for t in thresholds]\n", - "fdr_discoveries = [(fdr_pvals < t).sum() for t in thresholds]\n", - "\n", - "ax.plot(thresholds, raw_discoveries, label=\"Raw p-value\", lw=2)\n", - "ax.plot(thresholds, fdr_discoveries, label=\"FDR-adjusted\", lw=2)\n", - "ax.axhline(n_signal, color=\"green\", linestyle=\":\", label=f\"True signals ({n_signal})\")\n", - "ax.axvline(0.05, color=\"red\", linestyle=\"--\", alpha=0.5)\n", - "ax.set_xlabel(\"Threshold\")\n", - "ax.set_ylabel(\"Number of Discoveries\")\n", - "ax.set_title(\"Discoveries at Different Thresholds\")\n", - "ax.legend()\n", - "\n", - "plt.tight_layout()\n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "id": "summary-header", - "metadata": {}, - "source": [ - "---\n", - "\n", - "## Summary\n", - "\n", - "In this tutorial, you learned:\n", - "\n", - "### Key Concepts\n", - "\n", - "1. **Overdispersion** - Sequencing data shows more variance than the binomial model predicts\n", - " - Sources: PCR bias, library prep effects, technical variability\n", - " - Consequence: Binomial test has inflated false positive rate\n", - "\n", - "2. 
-    "2. **Beta-binomial distribution** - Naturally handles overdispersion\n",
-    "   - Parameters: mean ($\\mu$) and dispersion ($\\rho$)\n",
-    "   - Variance: $N\\mu(1-\\mu)[1 + (N-1)\\rho]$\n",
-    "   - Reduces to binomial when $\\rho \\to 0$\n",
-    "   - **Critical:** Use `clamp_rho()` to prevent numerical instability at boundaries\n",
-    "\n",
-    "3. **Dispersion estimation**\n",
-    "   - **MLE**: More efficient, used by WASP2\n",
-    "   - **MoM**: Faster, closed-form solution\n",
-    "   - **Linear model**: Allows depth-dependent dispersion\n",
-    "\n",
-    "4. **QQ plots** - Diagnostic tool for model calibration\n",
-    "   - Points on diagonal = well-calibrated\n",
-    "   - Inflation factor ($\\lambda$) quantifies deviation\n",
-    "\n",
-    "5. **FDR correction** - Essential for multiple testing\n",
-    "   - Benjamini-Hochberg controls the false discovery rate\n",
-    "   - WASP2 uses `scipy.stats.false_discovery_control(pvals, method=\"bh\")`\n",
-    "\n",
-    "### WASP2 Implementation Reference\n",
-    "\n",
-    "| Tutorial Function | WASP2 Source |\n",
-    "|-------------------|--------------|\n",
-    "| `clamp_rho()` | `src/analysis/as_analysis.py:36-50` |\n",
-    "| `mu_rho_to_alpha_beta()` | `src/analysis/as_analysis.py:104-105` |\n",
-    "| `betabinom_loglik()` | `src/analysis/as_analysis.py:81-112` (opt_prob) |\n",
-    "| `likelihood_ratio_test()` | `src/analysis/as_analysis.py:322-323` |\n",
-    "| `estimate_dispersion_mle()` | `src/analysis/as_analysis.py:251-325` (single_model) |\n",
-    "| `estimate_linear_dispersion()` | `src/analysis/as_analysis.py:53-78` (opt_linear) |\n",
-    "| FDR correction | `src/analysis/as_analysis.py:492` |\n",
-    "\n",
-    "### Further Reading\n",
-    "\n",
-    "- [Statistical Models Documentation](../methods/statistical_models.rst)\n",
-    "- [Dispersion Estimation Methods](../methods/dispersion_estimation.rst)\n",
-    "- Skelly et al. (2011) - Beta-binomial framework for allelic imbalance\n",
-    "- Benjamini & Hochberg (1995) - Controlling the false discovery rate"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "final-cell",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print(\"Tutorial complete!\")\n",
-    "print(\"\\nYou now understand the statistical foundations of WASP2's\")\n",
-    "print(\"allelic imbalance detection framework.\")\n",
-    "print(\"\\nKey takeaways:\")\n",
-    "print(f\"  - Always use RHO_EPSILON ({RHO_EPSILON}) to prevent numerical instability\")\n",
-    "print(\"  - Beta-binomial handles overdispersion that binomial cannot\")\n",
-    "print(\"  - QQ plots diagnose model calibration\")\n",
-    "print(\"  - FDR correction is essential for genome-wide testing\")"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.0"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}