The discovery of rare and novel splice junctions in cancer can lead to new knowledge and treatments, yet the mere absence from a reference annotation does not ensure a splice junction is truly novel. Efficiently querying publicly available data for evidence of rare splice junctions is challenging due to the cost of processing raw data and the substantial technical and biological variability across samples.
LemonSplice is an interactive application that lets researchers easily answer questions about whether junctions are real, artefacts, novel or interesting by combining uniformly processed RNA-seq data (Recount and Snaptron) with an interface for rapid access and querying of splicing information (RangedSummarizedExperiment) and rich interactive visualizations (Shiny, ggtranscript).
Interactive features:
- Transcript and exon filtering
- Zoom to region of interest
- Filter and select junctions
- Apply labels and colours to exons
- Scale introns and junctions for emphasis
- Extract literature evidence for junctions via metadata
The Recount and Snaptron (Wilks et al.) packages allow convenient access to pre-processed RNA-seq data for all genes in a whole experiment, or for a single genomic region across many samples, respectively.
Reads are aligned to a reference genome with a splicing-aware aligner and summarised at the gene, exon and junction level. Critically for splicing analysis, junction calls do not depend on the annotation used, so novel isoforms absent from reference transcriptomes can still be analysed. Available resources include the Sequence Read Archive (SRA), Genotype-Tissue Expression (GTEx), The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE), totalling over 350K samples.
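As an illustration of why annotation-agnostic junction counts matter, the following minimal Python sketch separates two questions that are easy to conflate: whether a junction is present in a reference annotation, and how often it is actually observed across samples. It is not LemonSplice's implementation; the `Junction` type, the `classify_junction` function, the 1% `rare_fraction` threshold and the toy data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction identified purely by its genomic coordinates."""
    chrom: str
    start: int   # first intronic base
    end: int     # last intronic base
    strand: str


def classify_junction(junction, annotated, sample_counts, total_samples,
                      rare_fraction=0.01):
    """Summarise annotation status and cohort-wide support for one junction.

    annotated      -- set of Junction objects taken from a reference annotation
    sample_counts  -- dict mapping sample ID to read count for this junction
    total_samples  -- number of samples that were queried
    rare_fraction  -- hypothetical cut-off below which a junction is "rare"
    """
    supporting = sum(1 for count in sample_counts.values() if count > 0)
    fraction = supporting / total_samples if total_samples else 0.0

    if supporting == 0:
        prevalence = "unobserved"
    elif fraction < rare_fraction:
        prevalence = "rare"
    else:
        prevalence = "common"

    return {
        "annotated": junction in annotated,
        "supporting_samples": supporting,
        "fraction": fraction,
        "prevalence": prevalence,
    }


# Toy data: one annotated intron and one unannotated query junction.
annotated = {Junction("chr1", 1000, 2000, "+")}
query = Junction("chr1", 1000, 2500, "+")
counts = {"S1": 3, "S2": 0, "S3": 12}  # read counts per sample (made up)

result = classify_junction(query, annotated, counts, total_samples=1000)
print(result["annotated"], result["prevalence"])
```

A junction that is unannotated yet supported in a large fraction of public samples is unlikely to be truly novel, whereas one that is unannotated and rare is a stronger candidate for follow-up.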
- Tsai et al. Outlier Expression of Isoforms by Targeted or Total RNA Sequencing Identifies Clinically Significant Genomic Variants in Hematolymphoid Tumors, DOI: 10.1016/j.jmoldx.2023.06.007
- Lonsdale et al. Toblerone: detecting exon deletion events in cancer using RNA-seq [version 1; peer review: 2 approved]. DOI: 10.12688/f1000research.129490.1
- Wilks et al. Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples, Bioinformatics, 2017, DOI: 10.1093/bioinformatics/btx547
- Wilks et al. recount3: summaries and queries for large-scale RNA-seq expression and splicing, Genome Biol, 2021, DOI: 10.1186/s13059-021-02533-6