SPLIT: Spatial Purification of Layered Intracellular Transcripts

🚧 This package is under active development. ❗ Make sure you use the latest version (i.e., v0.2.0).

⚡ Use the Quick Start guide below to get up and running quickly.

🆕🔥 A comprehensive tutorial on running SPLIT on ATERA (full-transcriptome spatial) data, including a comparison to Xenium, is available as .Rmd and .html (SPLIT v0.2.0 runs in minutes on full-transcriptome data). ❗ Requires SPLIT v0.2.0 or later.

📖 A comprehensive tutorial on running SPLIT on Xenium data is available as .Rmd and .html (<30 min total runtime on a standard PC, including 4 min for SPLIT, with a peak memory usage of ~21 GB).

🆕🔥 A comprehensive tutorial on running SPLIT on VisiumHD data is available as .Rmd and .html (~30 min total runtime on a standard PC, including 10 min for SPLIT, with a peak memory usage of ~52 GB). ❗ Requires SPLIT v0.1.2 or later.

What's new in v0.2.0

Annotation-method agnostic: SPLIT no longer depends on RCTD output. It now accepts any deconvolution result — all you need is a cells x cell-types weight matrix, a genes x cell-types reference, and optionally a primary cell-type vector (otherwise inferred as the argmax of the weights).
Full-transcriptome compatible: SPLIT now scales to large full-transcriptome platforms such as ATERA (~18,000 genes) and runs to completion in minutes.
VisiumHD and Xenium support remains fully intact and backward compatible.
Faster and leaner: improved chunked sparse-matrix computation reduces peak memory usage and runtime across all platforms.
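
To make the input shapes concrete, here is a minimal base-R sketch (not SPLIT's internal code) that builds a toy cells x cell-types weight matrix and derives a primary cell-type vector as the per-cell argmax of the weights. The values, the cell and cell-type names, and the helper name infer_primary_cell_type are hypothetical and serve only as an illustration.

```r
# Toy deconvolution weights: rows are cells, columns are cell types.
# All names and values are made up for illustration.
weights <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6,
    0.4, 0.5, 0.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cell_1", "cell_2", "cell_3"),
                  c("Tcell", "Bcell", "Tumor"))
)

# Hypothetical helper: pick the cell type with the largest weight per cell.
# Ties are resolved by taking the first maximal column.
infer_primary_cell_type <- function(w) {
  stopifnot(is.matrix(w), !is.null(rownames(w)), !is.null(colnames(w)))
  idx <- max.col(w, ties.method = "first")
  setNames(colnames(w)[idx], rownames(w))
}

# Named character vector: one cell-type label per cell.
primary_cell_type <- infer_primary_cell_type(weights)
```

The result is a character vector named by cell, which matches the "named character vector" form that the Quick Start passes as primary_cell_type.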

📦 Installation

To install SPLIT from GitHub:

remotes::install_github("bdsc-tds/SPLIT")
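
Since several tutorials require a minimum SPLIT version, it can help to check the installed version before running them. The snippet below is a small sketch using only base R utilities; the helper name has_min_version is hypothetical and not part of SPLIT.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= package_version(min_version)
}

# Print a hint if SPLIT is missing or older than v0.2.0.
if (!has_min_version("SPLIT", "0.2.0")) {
  message("SPLIT >= 0.2.0 not found; reinstall with ",
          "remotes::install_github(\"bdsc-tds/SPLIT\")")
}
```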

🚀 Quick Start

RCTD-based (legacy, backward compatible)

If you already have your dataset as a Seurat object (xe) and RCTD results from doublet-mode decomposition, you can run SPLIT as before:

library(SPLIT)
library(Seurat)
library(ggplot2)  # provides theme() used in the plot below

# Post-process RCTD output
RCTD <- SPLIT::run_post_process_RCTD(RCTD)

# Run SPLIT purification
res_split <- SPLIT::purify(
  counts = GetAssayData(xe, assay = "Xenium", layer = "counts"),
  rctd   = RCTD,
  DO_purify_singlets = TRUE
)

# Create a purified Seurat object
xe_purified <- CreateSeuratObject(
  counts    = res_split$purified_counts,
  meta.data = res_split$cell_meta,
  assay     = "Xenium"
)

# Optional: filter, normalize and visualize
xe_purified <- subset(xe_purified, subset = nCount_Xenium > 5)
xe_purified <- xe_purified %>%
  SCTransform(assay = "Xenium") %>%
  RunPCA() %>%
  RunUMAP(dims = 1:20)

UMAPPlot(xe_purified, group.by = "first_type", label = TRUE, repel = TRUE) +
  theme(aspect.ratio = 1)

Annotation-method agnostic (v0.2.0+)

Provide deconvolution weights, a reference matrix, and primary cell-type labels from any annotation tool:

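library(SPLIT)
library(Seurat)
library(SingleCellExperiment)  # provides assay() and colData() for the SCE output
library(ggplot2)               # provides theme() used in the plot below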

# Extract deconvolution weights, primary cell-type vector and reference matrix
# from any deconvolution result (shown here with RCTD for illustration)
new_input <- SPLIT::convert_rctd_result_to_purify_input(rctd = RCTD)

# Run SPLIT purification — returns a SingleCellExperiment object
res_split <- SPLIT::purify(
  counts                = GetAssayData(xe, assay = "Xenium", layer = "counts"),
  reference             = t(new_input$reference),          # genes x cell-types
  primary_cell_type     = new_input$primary_cell_type,     # named character vector
  deconvolution_weights = new_input$deconvolution_weights, # cells x cell-types
  DO_output_sce         = TRUE   # set FALSE to get a plain list instead
)

# Create a purified Seurat object from the SCE output
xe_purified <- CreateSeuratObject(
  counts    = assay(res_split, "purified_counts"),
  meta.data = as.data.frame(colData(res_split)),
  assay     = "Xenium"
)

# Optional: filter, normalize and visualize
xe_purified <- subset(xe_purified, subset = nCount_Xenium > 5)
xe_purified <- xe_purified %>%
  SCTransform(assay = "Xenium") %>%
  RunPCA() %>%
  RunUMAP(dims = 1:20)

UMAPPlot(xe_purified, group.by = "first_type", label = TRUE, repel = TRUE) +
  theme(aspect.ratio = 1)
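
As a quick sanity check after purification, one might look at how much of each cell's signal was removed. The sketch below is not part of SPLIT; it assumes the raw and purified matrices are genes x cells and share cell names, and it uses toy matrices with made-up values. The helper name fraction_removed is hypothetical.

```r
library(Matrix)

# Toy raw and purified count matrices (genes x cells); values are made up.
raw <- Matrix(c(5, 0, 3,
                2, 4, 0,
                1, 1, 6),
              nrow = 3, byrow = TRUE, sparse = TRUE,
              dimnames = list(paste0("gene_", 1:3), paste0("cell_", 1:3)))
purified <- Matrix(c(5, 0, 3,
                     0, 4, 0,
                     1, 0, 3),
                   nrow = 3, byrow = TRUE, sparse = TRUE,
                   dimnames = dimnames(raw))

# Hypothetical helper: per-cell fraction of total counts removed.
# Only cells present in both matrices are compared; cells with zero
# raw counts yield NaN.
fraction_removed <- function(raw, purified) {
  cells <- intersect(colnames(raw), colnames(purified))
  raw_total <- Matrix::colSums(raw[, cells, drop = FALSE])
  purified_total <- Matrix::colSums(purified[, cells, drop = FALSE])
  1 - purified_total / raw_total
}

removed <- fraction_removed(raw, purified)
```

With real data, the same helper could be applied to the raw counts from GetAssayData() and the purified counts returned by SPLIT, provided both matrices carry matching cell names.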

Citation

If you use SPLIT in your work, please cite:

Resolving sensitivity, specificity and signal contamination in Xenium spatial transcriptomics
Mariia Bilous, Daria Buszta, Jonathan Bac, Senbai Kang, Yixing Dong, Stephanie Tissot, Sylvie Andre, Marina Alexandre-Gaveta, Christel Voize, Solange Peters, Krisztian Homicsko, Raphael Gottardo
Nature Methods (2026). https://doi.org/10.1038/s41592-026-03089-8

Contact

If you have any questions about the package, feel free to open an issue or contact Mariia Bilous at Mariia.Bilous@chuv.ch.
