xenium_analysis_pipeline

Reproducibility

This workflow is developed with reproducibility in mind. Please refer to the following sections for more details.

Note: The pipeline we used in the paper is available with tag v1.0.

Installation

Snakemake

We use Snakemake (v9+) as the backend of this workflow, so the first step is to create a conda environment for Snakemake. We recommend mamba as a replacement for conda for environment management.

Using reproducibility/environment.yml, we can create an environment for Snakemake:

# the current working directory is the root of this repo
# Alternative: `conda`
mamba env create --use-uv -y -f reproducibility/environment.yml

Note: If you are using mamba < 2.3.3 or conda, please drop --use-uv.
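
The version rule in the note above can be sketched as a small shell helper. This is only an illustration: version_ge and env_create_cmd are hypothetical names that are not part of this repo, and the comparison assumes a sort that supports -V (GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2 (needs `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the environment-creation command for a tool ($1: mamba or conda)
# and its version ($2). --use-uv is added only for mamba >= 2.3.3.
env_create_cmd() {
    local tool="$1" version="$2" uv_flag=""
    if [ "$tool" = "mamba" ] && version_ge "$version" "2.3.3"; then
        uv_flag="--use-uv "
    fi
    echo "$tool env create ${uv_flag}-y -f reproducibility/environment.yml"
}

# Example: print the command matching the locally installed mamba, if any.
if command -v mamba >/dev/null 2>&1; then
    env_create_cmd mamba "$(mamba --version | head -n 1 | awk '{print $NF}')"
fi
```

The helper only prints the command, so it can be inspected before being run from the root of this repo.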

Singularity containers

We use multiple Singularity containers for different methods and/or environments. To ensure reproducibility, please build these containers before executing the workflow.

R

The R version we use for this workflow is 4.4.2, and renv is used to track specific versions of packages. Please find files related to renv in reproducibility/r/metadata, and use r.def in reproducibility/r to build the corresponding container:

# the current working directory is the root of this repo
cd reproducibility/r
singularity build --fakeroot --force /path/to/the/built/container r.def

10X Xenium Ranger

The 10X Xenium Ranger version we use here is 4.0.0. The software is downloaded through a link from the 10X website. Since 10X regularly updates this link, users should replace it with the most recent one if the container fails to build:

# the current working directory is the root of this repo
singularity build --fakeroot --force /path/to/the/built/container reproducibility/10x.def

Baysor

The Baysor version we use here is 0.7.0.

# the current working directory is the root of this repo
singularity build --fakeroot --force /path/to/the/built/container reproducibility/baysor.def

Proseg

The Proseg version we use here is 3.0.10.

# the current working directory is the root of this repo
singularity build --fakeroot --force /path/to/the/built/container reproducibility/proseg.def

Segger

The Segger version we use here includes fixes of our own.

# the current working directory is the root of this repo
cd reproducibility/segger
singularity build --fakeroot --force /path/to/the/built/container segger.def
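
The five build commands above can be collected into one loop. The following is a minimal sketch, not part of this repo: CONTAINER_DIR and the .sif file names are assumptions chosen for illustration, and CONTAINER_DIR should be an absolute path because two of the builds run from a subdirectory.

```shell
#!/usr/bin/env bash
# Assumption: CONTAINER_DIR is a hypothetical absolute output directory.
CONTAINER_DIR="${CONTAINER_DIR:-/path/to/containers}"

# Print one build command per container, mirroring the commands above.
# Each input line is "name|working directory|definition file".
print_build_commands() {
    local name workdir def
    while IFS='|' read -r name workdir def; do
        echo "(cd $workdir && singularity build --fakeroot --force $CONTAINER_DIR/$name.sif $def)"
    done <<EOF
r|reproducibility/r|r.def
10x|.|reproducibility/10x.def
baysor|.|reproducibility/baysor.def
proseg|.|reproducibility/proseg.def
segger|reproducibility/segger|segger.def
EOF
}

print_build_commands
```

The function only prints the commands; run from the root of this repo, its output could be piped to bash once the paths have been checked.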

Configuration

Configuration for the workflow

Please edit config/config.yml to configure the workflow. A detailed guide can be found in config/README.md.

Configuration for execution

We provide a bash script, run.sh, to simplify execution. Before running it, users need to configure a few entries under its USER SETUP section:

MODULES: Modules to be loaded prior to execution on clusters.
CONDA_BIN: The name of or path to either mamba or conda.
ENV_NAME: The name of or path to the conda environment (by default xenium_analysis_pipeline).
LOCAL_PROFILE and CLUSTER_PROFILE: The execution of this workflow is controlled by profiles. Please refer to the Snakemake manual for details.

We provide two example profiles under profiles: one for local execution, located in profiles/local, and one for cluster execution (specifically slurm), located in profiles/slurm. Users can edit these profiles according to their needs.

Users can also define their own profiles, e.g., when they use a cluster other than slurm. In that case, they only need to specify the paths to their customised profiles in the corresponding entries.
SINGULARITY_BIND_DIRS: An array of directories to bind to containers. Each element should have the form LOCAL_DIR:SINGULARITY_DIR. Nonexistent local directories are filtered out.
SNAKEMAKE_CACHE_DIR: A directory used to store the Snakemake cache. Internally, it overwrites the XDG_CACHE_HOME environment variable if it is set by users.
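
To illustrate the LOCAL_DIR:SINGULARITY_DIR form and the filtering of nonexistent local directories, here is a minimal sketch. It is not the code in run.sh: filter_bind_dirs is a hypothetical function, and the array values are placeholders.

```shell
#!/usr/bin/env bash
# Illustrative values only; the actual entries in run.sh will differ.
SINGULARITY_BIND_DIRS=(
    "/tmp:/tmp"
    "/hypothetical/missing/dir:/data"
)

# Hypothetical filter: keep only the entries whose local side
# (the part before the first colon) is an existing directory.
filter_bind_dirs() {
    local entry
    for entry in "$@"; do
        if [ -d "${entry%%:*}" ]; then
            echo "$entry"
        fi
    done
}

filter_bind_dirs "${SINGULARITY_BIND_DIRS[@]}"
```

Entries are split on the first colon only, so everything after it is passed through unchanged as the path inside the container.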

Execution

The workflow should be executed from the root directory of this repo. To get a self-explanatory help message, type

# the current working directory is the root of this repo
./run.sh --help

which prints

Usage: [ -m | --mode MODE ] [ -c | --core CORE ] [ -j | --jobs JOBS ] [ --retries RETRIES ] [ -n | --dry-run ] [ -R | --forcerun RULE ] [ -U | --until RULE ] [ --dag OUTPUT ] [ --unlock ] [ -v | --verbose ] [ -h | --help ]
        -m,--mode MODE: the pipeline will be run on 'local' (default) or on 'cluster'.
        -c,--core CORE: the number of cores to be used when -m,--mode is unset or 'local' (default: 1); ignored when -m,--mode is 'cluster'.
        -j,--jobs JOBS: the number of jobs submitted to the cluster at the same time when -m,--mode is 'cluster'. (default: 500).
        --retries RETRIES: the number of retries for failed jobs. (default: 0).
        -n,--dry-run: dry run.
        -R,--forcerun RULE: force the re-execution or creation of the given rule or file. Repeat this option multile times for multiple rules or files.
        -U,--until RULE: runs the pipeline until it finishes the specified rule or generated the file. Repeat this option multile times for multiple rules or files.
        --dag OUTPUT: draw dag and save to OUTPUT.pdf.
        --unlock: unlock the working directory.
        -v,--verbose: print more information.
        -h,--help: print this message.

Note: Execution logging via snkmt is always enabled. The log database is stored at ${output_path}/snkmt.db (where output_path is read from config/config.yml). To monitor the pipeline execution, run:
mamba run -n xenium_analysis_pipeline snkmt console --db-path ${output_path}/snkmt.db
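
The monitoring command needs output_path to be set in the shell. A minimal sketch for reading it from the configuration file follows. It assumes that output_path is a top-level key in config/config.yml, written on a single line without a trailing comment; read_output_path and snkmt_console_cmd are hypothetical helpers, not part of this repo.

```shell
#!/usr/bin/env bash
# Assumption: the config contains a top-level line such as
#   output_path: "/some/dir"
# Print its value with surrounding whitespace and quotes removed.
read_output_path() {
    sed -n 's/^output_path:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" \
        | head -n 1 | tr -d "\"'"
}

# Print the monitoring command for the config file given as $1.
snkmt_console_cmd() {
    local output_path
    output_path="$(read_output_path "$1")"
    echo "mamba run -n xenium_analysis_pipeline snkmt console --db-path ${output_path}/snkmt.db"
}

# Example: only runs when called from the root of this repo.
if [ -f config/config.yml ]; then
    snkmt_console_cmd config/config.yml
fi
```

If the configuration file is structured differently, setting output_path by hand is the safer option.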

Typical commands

Draw the DAG

./run.sh --dag dag

Dry-run

./run.sh -n

Run on cluster

./run.sh -m cluster

Run until a specific rule

./run.sh -m cluster -U rule_name

Run on cluster with retries

./run.sh -m cluster --retries 2

Other files included in this repo

Other files related to Xenium data analysis reside in notebooks. Please refer to the documentation inside for more details.

Solutions to known problems

Snakemake fails to create conda environments due to a lack of write permission on /tmp/conda.

This issue occurs most often on multi-user machines, such as HPCs and servers. The reason is simply that /tmp/conda is owned by another user. A possible solution is to first create a folder in /tmp, such as /tmp/your_id, and then add /tmp/your_id:/tmp to SINGULARITY_BIND_DIRS inside run.sh. After this, rerun the command, and the environments should be created correctly.

Additionally, on HPCs, users might need to remove /tmp/your_id:/tmp from SINGULARITY_BIND_DIRS after creating the environments, as the directory /tmp/your_id is unlikely to exist on compute nodes.
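
The workaround for /tmp/conda can be sketched as follows. make_tmp_bind is a hypothetical helper, and using the login name from id -un as your_id is an assumption; any unique folder name works.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a per-user folder under a base directory
# (default: /tmp) and print the matching entry for SINGULARITY_BIND_DIRS.
make_tmp_bind() {
    local base="${1:-/tmp}"
    local dir="$base/$(id -un)"
    mkdir -p "$dir" && echo "$dir:/tmp"
}

# Example: prints an entry of the form /tmp/<user>:/tmp.
make_tmp_bind || echo "could not create the per-user folder" >&2
```

The printed entry still has to be added to SINGULARITY_BIND_DIRS in run.sh by hand.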

For steps involving 10X xeniumranger, I sometimes get the following error: "PermissionError: [Errno 13] Permission denied".

10X xeniumranger copies files from the raw data during processing. This error can occur when the user, as the owner of the raw data, has removed their own write permission to it. When 10X xeniumranger copies files, it also copies their modes, so the error arises when it needs to write to the copied files. Although removing write permission is a safe way to prevent accidental changes to the raw data, users who own the raw data must have write permission to it.
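
One way to check for this situation is to list the raw-data files whose owner write bit is unset and, if appropriate, restore it. This is a sketch under stated assumptions: list_owner_readonly is a hypothetical helper, RAW_DATA_DIR is a placeholder for the actual raw-data location, and the check looks only at mode bits, not at file ownership.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list everything under directory $1 that lacks
# the owner write permission bit.
list_owner_readonly() {
    find "$1" ! -perm -u+w -print
}

# Assumption: RAW_DATA_DIR is a placeholder for the raw Xenium data directory.
RAW_DATA_DIR="${RAW_DATA_DIR:-/path/to/raw/data}"
if [ -d "$RAW_DATA_DIR" ]; then
    list_owner_readonly "$RAW_DATA_DIR"
    # To restore owner write permission, uncomment the next line:
    # chmod -R u+w "$RAW_DATA_DIR"
fi
```

The chmod line is left commented out on purpose, since it changes the permissions of the raw data.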

Original manuscript analyses

Code to reproduce analyses from the original manuscript can be found at https://github.com/bdsc-tds/Bilous2025
