An R tool for automated probe design for the 10x Genomics Flex platform. Flexify supports two probe design workflows:
- Fusion probes — 50 bp probes spanning a fusion gene junction, scored and ranked across all possible junction offsets.
- Non-fusion probes — 50 bp antisense probes tiled along a wild-type transcript sequence (e.g. GFP, CRISPR reporter, or any exogenous/custom gene).
Both workflows support Chromium Flex (v1) and GEM-X Flex (v2) assay formats, automated BLAST-based off-target screening, Flex probeset competition checking, and generate full synthesis-ready LHS and RHS probe sequences including the required 10x Genomics handle sequences.
This section walks you through launching the Flexify app with no command-line experience required.
Download and install R from https://cran.r-project.org. Choose the version for your operating system (Windows, macOS, or Linux) and follow the installer prompts.
RStudio is a user-friendly interface for running R. Download the free Desktop version from https://posit.co/download/rstudio-desktop and install it.
Open RStudio. In the Console panel at the bottom, paste the following and press Enter:
```r
install.packages(c("shiny", "tidyverse", "DT", "stringr", "optparse"))
```
This only needs to be done once. It may take a few minutes to complete.
Click the green Code button at the top of this page and select Download ZIP. Unzip the downloaded folder somewhere convenient on your computer.
- In RStudio, go to File → Open File and navigate to the unzipped Flexify folder.
- Open `flexify_app.R`.
- Click the Run App button that appears in the top-right corner of the editor panel.
The Flexify app will open in a new window (or in your browser). You can now upload your input file and start designing probes.
Note: The off-target BLAST check (Tab 2) is optional but recommended. It requires additional software (BLAST+) and a pre-built reference transcriptome database — see the Installation section below. Alternatively, probe sequences can be screened manually using NCBI BLAST or any other BLAST interface, and any off-target hits should be taken into account when selecting probes in Tab 3.
```r
install.packages(c("shiny", "tidyverse", "DT", "stringr", "optparse"))
```
The easiest way to install BLAST+ is via conda or mamba, which handles all platforms automatically:
```bash
# With conda:
conda install -c bioconda blast
# With mamba (faster):
mamba install -c bioconda blast
```
Alternatively, platform-specific binaries are available from NCBI: https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
Verify installation:
```bash
blastn -version
```
Build a nucleotide database from your reference transcriptome FASTA:
```bash
makeblastdb -in transcriptome.fa -dbtype nucl -out transcriptome_db
```
```
Flexify/
├── flexify_app.R          # Shiny app (main entry point for interactive use)
├── flexify_cli.R          # Command-line interface
├── flexify_core.R         # Core fusion probe design functions
├── flexify_nonfusion.R    # Non-fusion probe design and competition check functions
├── flexify_offtarget.R    # BLAST off-target checking functions (fusion and non-fusion)
├── flexify_handles.R      # Handle/barcode appending functions (fusion and non-fusion)
└── README.md
```
All .R files must be in the same directory.
Flexify supports two fusion input formats in both the Shiny app and the CLI.
Upload the TSV file produced directly by Arriba. Flexify parses the fusion_transcript column automatically, extracting gene names and splitting the sequence at the | breakpoint marker. Rows where the fusion transcript is absent or lacks a | separator are skipped with a warning.
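As a rough illustration of the splitting step described above, the following base R sketch separates a fusion transcript string at the `|` marker and skips entries without one. The function name `split_fusion_transcript` is hypothetical and is not taken from Flexify's source; the sketch assumes a single `|` per string and does not reproduce any additional handling Flexify may apply to Arriba output.

```r
# Hypothetical helper (not Flexify's own code): split a fusion transcript
# string at the "|" breakpoint marker. Assumes exactly one "|" per string.
split_fusion_transcript <- function(x) {
  # Entries that are missing or lack a "|" cannot be split and are skipped.
  if (is.na(x) || !grepl("|", x, fixed = TRUE)) {
    warning("No '|' breakpoint marker found; skipping row")
    return(NULL)
  }
  parts <- strsplit(x, "|", fixed = TRUE)[[1]]
  list(gene1_transcript = parts[1], gene2_transcript = parts[2])
}

res <- split_fusion_transcript("ACGTACGT|TTGGCCAA")
res$gene1_transcript
res$gene2_transcript
```

The first element is the sequence upstream of the breakpoint (gene1 side) and the second is the sequence downstream (gene2 side).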
For output from any other fusion caller (e.g. STAR-Fusion, FusionCatcher), prepare a CSV with the following four columns:
| Column | Description |
|---|---|
| `gene1` | Name of the first fusion partner gene |
| `gene2` | Name of the second fusion partner gene |
| `gene1_transcript` | mRNA sequence of gene1 ending at the breakpoint (5'→3', at least 25 bp) |
| `gene2_transcript` | mRNA sequence of gene2 starting at the breakpoint (5'→3', at least 25 bp) |
Column names are case-insensitive. Sequences should be in DNA alphabet (A/T/G/C). Both sequences must contribute at least 25 nucleotides to allow probe enumeration across the full junction offset range.
Example:
```
gene1,gene2,gene1_transcript,gene2_transcript
BCR,ABL1,ATGCGT...CCAGTA,TTAGCC...GAATTC
EML4,ALK,CCGTAA...TTAGCA,AACGGT...CCTGAA
```
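Before uploading, it can help to check a generic fusion table against the requirements listed above. The sketch below is one possible pre-check in base R. The function `validate_fusion_input` and the `valid` column are hypothetical names introduced here, and the example sequences are made up; Flexify's own validation may differ.

```r
# Hypothetical pre-check for the generic 4-column fusion CSV (not Flexify code).
validate_fusion_input <- function(df, min_len = 25) {
  names(df) <- tolower(names(df))  # column names are case-insensitive
  required <- c("gene1", "gene2", "gene1_transcript", "gene2_transcript")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  ok <- rep(TRUE, nrow(df))
  for (col in c("gene1_transcript", "gene2_transcript")) {
    s <- toupper(df[[col]])
    # Each side needs at least min_len bases, all from the DNA alphabet.
    ok <- ok & nchar(s) >= min_len & grepl("^[ACGT]+$", s)
  }
  df$valid <- ok
  df
}

# Made-up sequences for illustration only.
example <- data.frame(
  Gene1 = c("BCR", "EML4"),
  Gene2 = c("ABL1", "ALK"),
  Gene1_Transcript = c(strrep("ACGT", 10), "ACGTN"),
  Gene2_Transcript = c(strrep("TTGCA", 6), strrep("GGCAT", 6)),
  stringsAsFactors = FALSE
)
checked <- validate_fusion_input(example)
checked$valid
```

In practice the data frame would come from `read.csv()` on the prepared file; the second example row fails because its gene1 sequence is too short and contains an `N`.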
Prepare a CSV with the following two columns:
| Column | Description |
|---|---|
| `gene` | Gene or construct name (used to label outputs) |
| `sequence` | Full mRNA/transcript sequence in DNA alphabet (A/T/G/C); at least 50 bp |
Column names are case-insensitive. Probes are generated as 50 bp reverse-complement windows tiled across the sequence. Windows with GC content outside 44–72% on either half are excluded.
Example:
```
gene,sequence
GFP,ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAAC...
TagBFP,ATGGTGAGCGTGATCAAGCCCGACATGCCCATCGTGGAGGGCGGCATGGACGGCTACGTGCTGGAGCCCTTC...
```
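The tiling and GC filter described above can be sketched in base R as follows. This is a minimal illustration, not Flexify's implementation: the function names are hypothetical, and the step size of 1 base between windows is an assumption, since the tiling step is not stated here.

```r
# Reverse complement of a DNA string.
rev_comp <- function(s) {
  comp <- chartr("ACGT", "TGCA", toupper(s))
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Percentage of G/C bases in a DNA string.
gc_percent <- function(s) {
  bases <- strsplit(s, "")[[1]]
  100 * sum(bases %in% c("G", "C")) / length(bases)
}

# Hypothetical tiling helper; step = 1 is an assumption, not a documented setting.
tile_probes <- function(sequence, probe_len = 50, step = 1,
                        gc_min = 44, gc_max = 72) {
  sequence <- toupper(sequence)
  n <- nchar(sequence)
  if (n < probe_len) return(data.frame())
  half <- probe_len / 2
  starts <- seq(1, n - probe_len + 1, by = step)
  rows <- lapply(starts, function(i) {
    # The probe is the reverse complement of the transcript window.
    probe <- rev_comp(substr(sequence, i, i + probe_len - 1))
    data.frame(start = i, probe = probe,
               gc_lhs = gc_percent(substr(probe, 1, half)),
               gc_rhs = gc_percent(substr(probe, half + 1, probe_len)),
               stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  # Keep only windows where both halves fall inside the GC range.
  keep <- out$gc_lhs >= gc_min & out$gc_lhs <= gc_max &
          out$gc_rhs >= gc_min & out$gc_rhs <= gc_max
  out[keep, , drop = FALSE]
}

probes <- tile_probes(strrep("ACGT", 15))  # 60 nt toy sequence
nrow(probes)
```

Each row of the result gives the window start position on the transcript, the antisense probe, and the GC percentage of each 25-base half.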
Launch the interactive app from RStudio or the command line:
```bash
# From RStudio: open flexify_app.R and click 'Run App'
# From the command line:
Rscript -e "shiny::runApp('flexify_app.R')"
```
The app has three tabs:
Select the probe design mode at the top of the sidebar:
- Fusion probes: Upload an Arriba TSV or a generic 4-column CSV and configure design parameters (restraint, asterisk/halves/mRNA markers, penalise left-half junction).
- Non-fusion probes: Upload a 2-column gene/sequence CSV and configure output options (halves marker, mRNA sequence).
Click Design Probes to run. Results are shown as a summary box (top-ranked probe per fusion or per gene) and a full sortable/filterable table. Two downloads are available:
- Download All Probes: complete ranked output as CSV.
- Download Selection Template: same output with empty `Selected` (FALSE) and `Barcode` (NA) columns; fill these in externally and re-upload in Tab 3.
Fusion probe options:
- Min. bases per probe half: the minimum number of bases each fusion partner must contribute to a candidate probe (default: 5).
- Mark fusion point with *: inserts an asterisk at the exact fusion breakpoint position.
- Mark probe halves with |: inserts a pipe character at the LHS/RHS boundary (position 25|26).
- Include mRNA target sequence: appends the reverse complement of the probe.
- Penalise left-half junction probes: applies a 0.7× score multiplier to probes where the junction falls in the left half.
Non-fusion probe options:
- Mark probe halves with |: inserts a pipe character at the LHS/RHS boundary.
- Include mRNA target sequence: appends the mRNA window the probe targets.
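To make the "Min. bases per probe half" setting concrete, the sketch below enumerates 50-base target windows across a fusion junction so that each partner contributes at least `restraint` bases. It is an illustration under stated assumptions rather than Flexify's own enumeration: the function name is hypothetical, the range is clamped to the sequence actually supplied, and the windows are in mRNA sense (the probe would be the reverse complement of each window).

```r
# Hypothetical enumeration of junction-spanning windows (not Flexify's code).
enumerate_junction_windows <- function(gene1_tx, gene2_tx,
                                       probe_len = 50, restraint = 5) {
  n1 <- nchar(gene1_tx)
  n2 <- nchar(gene2_tx)
  # k = number of bases taken from the 3' end of gene1; the rest come from
  # the 5' end of gene2. Clamp k to what the supplied sequences allow.
  k_min <- max(restraint, probe_len - n2)
  k_max <- min(probe_len - restraint, n1)
  if (k_min > k_max) return(data.frame())
  k <- seq(k_min, k_max)
  left  <- substring(gene1_tx, n1 - k + 1, n1)
  right <- substring(gene2_tx, 1, probe_len - k)
  data.frame(gene1_bases = k, gene2_bases = probe_len - k,
             target = paste0(left, right), stringsAsFactors = FALSE)
}

# Toy sequences: 30 nt on each side of the breakpoint.
windows <- enumerate_junction_windows(strrep("A", 30), strrep("C", 30))
range(windows$gene1_bases)
```

With longer flanking sequences, the same call would cover every split from `restraint` bases of gene1 up to `probe_len - restraint` bases of gene1.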
Both checks are optional and can be run in either order after Tab 1.
Screens probe sequences against a reference transcriptome database using BLAST.
Fusion probes: only the junction-spanning half (the half crossing the fusion breakpoint) is queried. The non-junction half binds a wild-type sequence by design and is not screened. A probe is flagged as off-target if its junction half has a hit with fewer than min_mismatches effective mismatches to any transcript.
Non-fusion probes: both the LHS (bases 1–25) and RHS (bases 26–50) halves are queried independently. Because each half will always have a perfect match to its intended target gene, a half is flagged as off-target only if close hits (fewer than min_mismatches effective mismatches) are found to more than one unique transcript. This flags probes where either half may bind unintended targets while accepting the expected on-target hit.
Effective mismatches = aligned_mismatches + (query_length − alignment_length). Provide the BLAST database path (the path prefix used with makeblastdb, without file extension).
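A short worked example of the effective-mismatch formula and the two flagging rules, written in base R. The hit table is invented for illustration (the transcript IDs are placeholders), and the threshold of 5 matches the documented default of `min_mismatches`; how Flexify parses BLAST output internally is not shown here.

```r
effective_mismatches <- function(aligned_mismatches, query_length, alignment_length) {
  # Unaligned query bases count as mismatches.
  aligned_mismatches + (query_length - alignment_length)
}

# Invented BLAST hits for one 25-base probe half (IDs are placeholders).
hits <- data.frame(
  transcript = c("TX_A", "TX_B", "TX_C"),
  mismatch   = c(0, 1, 2),     # mismatches within the aligned region
  length     = c(25, 22, 18),  # alignment length
  stringsAsFactors = FALSE
)
hits$effective <- effective_mismatches(hits$mismatch, 25, hits$length)

min_mismatches <- 5
close_hits <- hits[hits$effective < min_mismatches, ]

# Fusion rule: any close hit flags the junction half.
fusion_flag <- nrow(close_hits) > 0
# Non-fusion rule: flag only if close hits span more than one unique transcript.
nonfusion_flag <- length(unique(close_hits$transcript)) > 1
```

Here the effective mismatches are 0, 4 and 9, so two transcripts count as close hits and the half would be flagged under either rule.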
Checks whether any probe half closely matches a sequence already present in the standard 10x Genomics Flex whole-transcriptome probeset, which could compete for the same target and reduce signal.
Fusion probes: only the non-junction (wild-type) half is checked, since the junction half spans a novel sequence not present in the standard probeset.
Non-fusion probes: both halves are checked independently, since both bind wild-type sequence.
A probe half is flagged as competing if its Hamming distance to any standard probeset sequence is ≤ max_mismatches (default: 2). Provide the bundled probeset CSV or select from the pre-loaded v1/v2 probesets.
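The Hamming-distance rule can be illustrated with a few lines of base R. This is a sketch only: `hamming_distance` and `is_competing` are hypothetical names, the probeset is represented as a plain character vector, and comparing only equal-length sequences is an assumption about how the check is applied.

```r
# Number of positions at which two equal-length strings differ.
hamming_distance <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Hypothetical check of one probe half against a vector of probeset sequences.
is_competing <- function(probe_half, probeset, max_mismatches = 2) {
  same_len <- probeset[nchar(probeset) == nchar(probe_half)]
  d <- vapply(same_len, hamming_distance, integer(1), b = probe_half)
  any(d <= max_mismatches)
}

half     <- "ACGTACGTACGTACGTACGTACGTA"
near     <- "ACGTACGTACGTACGTACGTACGTT"  # differs at the last base only
distant  <- strrep("T", 25)

is_competing(half, c(near, distant))
is_competing(half, distant)
```

The first call reports a competing sequence because `near` is within two mismatches; the second does not.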
Results from both checks are carried forward to Tab 3.
Select your assay version (v1 or v2) in the sidebar before generating final probes.
Fusion probes: each fusion (GENE1::GENE2) is shown as a panel with radio buttons listing all ranked candidate probes and a barcode dropdown (v1 only). Select one probe per fusion, assign barcodes (v1), then click Generate Final Probes.
Non-fusion probes: each gene is shown as a panel with radio buttons. Select one probe per gene, assign barcodes (v1 only), then click Generate Final Probes.
CSV upload path: upload a CSV with columns GENE1/GENE2 (fusion) or GENE (non-fusion), probe, and Barcode (v1 only, integer 1–16 or string BC001–BC016). If a Selected column is present, only rows with Selected == TRUE are processed.
The output table contains the full LHS and RHS probe sequences ready for submission to an oligonucleotide synthesis provider.
Chromium Flex (v1): barcode-embedded format. Each RHS probe encodes one of 16 Probe Barcodes (BC001–BC016). The barcode must match the corresponding whole transcriptome probe in the hybridisation mix for that sample.
GEM-X Flex (v2): barcoding is handled by kit reagents and is not embedded in the custom probe sequence. Select the RHS configuration:
- Multiplex (CCCATATAAGAAA): standard v2 tail for multiplexed experiments.
- Singleplex (CGGTCCTAGCAA): tail for the 4-sample singleplex kit.
```bash
# Fusion probes from generic CSV:
Rscript flexify_cli.R --input fusions.csv --output probes.csv
# Fusion probes from Arriba TSV:
Rscript flexify_cli.R --arriba --input fusions.tsv --output probes.csv
# With optional flags:
Rscript flexify_cli.R \
  --input fusions.csv \
  --output probes.csv \
  --restraint 5 \
  --mrna \
  --prioritise-rhs
```
Flags:
| Flag | Default | Description |
|---|---|---|
| `--input` / `-i` | required | Input fusion CSV or Arriba TSV (with `--arriba`) |
| `--output` / `-o` | required | Output ranked probe CSV |
| `--arriba` | FALSE | Parse input as an Arriba TSV file |
| `--restraint` | 5 | Minimum bases per probe half from each gene |
| `--mrna` | FALSE | Include mRNA target sequence column |
| `--prioritise-rhs` | FALSE | Penalise left-half junction probes |
| `--no-asterix` | FALSE | Omit fusion breakpoint marker (*) |
| `--no-halves` | FALSE | Omit probe half boundary marker (\|) |
```bash
Rscript flexify_cli.R \
  --mode blast \
  --input probes.csv \
  --blast-db /path/to/transcriptome_db \
  --output probes_filtered.csv
# Keep failed probes in output:
Rscript flexify_cli.R --mode blast -i probes.csv \
  --blast-db /path/db --output probes_flagged.csv --keep-fails
```
Additional flags:
| Flag | Default | Description |
|---|---|---|
| `--blast-db` | required | Path to BLAST database (no extension) |
| `--min-mismatches` | 5 | Effective-mismatch threshold; hits with fewer effective mismatches than this are treated as off-target |
| `--threads` | 1 | Number of BLAST threads |
| `--keep-fails` | FALSE | Retain failed probes with flag columns rather than removing them |
```bash
# Chromium Flex v1 (default), barcode embedded in probe:
Rscript flexify_cli.R \
  --mode finalise \
  --input selected_probes.csv \
  --output final_probes.csv
# GEM-X Flex v2 multiplex, no barcode in probe:
Rscript flexify_cli.R \
  --mode finalise \
  --assay-version v2 \
  --input selected_probes.csv \
  --output final_probes.csv
# GEM-X Flex v2 singleplex (4-sample kit):
Rscript flexify_cli.R \
  --mode finalise \
  --assay-version v2 --rhs-mode singleplex \
  --input selected_probes.csv \
  --output final_probes.csv
```
v1: Input CSV must contain GENE1, GENE2, probe, and Barcode (integer 1–16 or string BC001–BC016).
v2: Input CSV requires only GENE1, GENE2, and probe — no Barcode column.
If a Selected column is present in either case, only rows marked TRUE are processed.
Additional flags:
| Flag | Default | Description |
|---|---|---|
| `--assay-version` | v1 | Assay version: v1 (Chromium Flex) or v2 (GEM-X Flex) |
| `--rhs-mode` | multiplex | v2 RHS tail: multiplex (CCCATATAAGAAA) or singleplex (CGGTCCTAGCAA) |
Each candidate probe is scored on the criteria listed below. The composite score is the product of all component scores; any zero score excludes the probe from the output.
| Criterion | Method | Score range |
|---|---|---|
| GC content (LHS half) | Tiered, based on 10x guidelines | 0–5 (0 if outside 44–72%) |
| GC content (RHS half) | Tiered, based on 10x guidelines | 0–5 (0 if outside 44–72%) |
| Junction position | Gaussian, optimum at 12.5 bp from centre of each half | 0–5 |
| Ligation dinucleotide | Preferred set: AT, CA, CT, TA, TC, TG, TT | 1 or 3 |
| Homopolymer runs | Penalised for runs of 4+ identical bases | 1–2 |
Non-fusion probes are scored on GC content and homopolymer content only (there is no junction position score, since all probes tile uniformly across the transcript). Probes with GC content outside 44–72% on either half receive a score of zero and are excluded.
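Two parts of the scoring description lend themselves to a small base R illustration: detecting homopolymer runs and combining component scores by multiplication. The sketch below does not reproduce Flexify's tier values; the function names are hypothetical and the component scores passed to `composite_score` are made-up numbers chosen only to show the zero-exclusion behaviour.

```r
# Length of the longest run of identical bases in a sequence.
longest_homopolymer <- function(s) {
  max(rle(strsplit(toupper(s), "")[[1]])$lengths)
}

# Composite score as the product of component scores. A zero in any
# component makes the product zero, which excludes the probe.
composite_score <- function(components) prod(components)

run_len  <- longest_homopolymer("ACGTTTTTGCA")
excluded <- composite_score(c(gc_lhs = 5, gc_rhs = 0, homopolymer = 2))
kept     <- composite_score(c(gc_lhs = 5, gc_rhs = 3, homopolymer = 2))
```

In this example `run_len` is 5 (the run of T bases), `excluded` is 0 because one GC component is zero, and `kept` is 30.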
The LHS handle sequence is identical for both assay versions and both probe types. The RHS structure depends on assay version.
LHS probe (both assay versions):
```
CCTTGGCACCCGAGAATTCCA          [21 bp constant handle]
+ [bases 1–25 of the 50 bp probe]
```
Total length: 46 bp.
RHS probe, Chromium Flex (v1):
```
/5Phos/                        [5-prime phosphorylation]
+ [bases 26–50 of the 50 bp probe]
+ ACGCGGTTAGCACGTA             [16 bp linker / Constant Sequence]
+ NN                           [2 bp spacer]
+ [8 bp Probe Barcode]         [BC001–BC016, unique per sample pool]
+ CGGTCCTAGCAA                 [12 bp constant tail]
```
Total length: 63 characters (excluding /5Phos/).
RHS probe, GEM-X Flex (v2) multiplex:
```
/5Phos/                        [5-prime phosphorylation]
+ [bases 26–50 of the 50 bp probe]
+ CCCATATAAGAAA                [13 bp constant tail]
```
Total length: 38 characters (excluding /5Phos/). No barcode in probe.
RHS probe, GEM-X Flex (v2) singleplex:
```
/5Phos/                        [5-prime phosphorylation]
+ [bases 26–50 of the 50 bp probe]
+ CGGTCCTAGCAA                 [12 bp constant tail]
```
Total length: 37 characters (excluding /5Phos/). No barcode in probe.
| Barcode | Sequence | Pool |
|---|---|---|
| BC001 | ACTTTAGG | poolOne |
| BC002 | AACGGGAA | poolTwo |
| BC003 | AGTAGGCT | poolThree |
| BC004 | ATGTTGAC | poolFour |
| BC005 | ACAGACCT | poolFive |
| BC006 | ATCCCAAC | poolSix |
| BC007 | AAGTAGAG | poolSeven |
| BC008 | AGCTGTGA | poolEight |
| BC009 | ACAGTCTG | poolNine |
| BC010 | AGTGAGTG | poolTen |
| BC011 | AGAGGCAA | poolEleven |
| BC012 | ACTACTCA | poolTwelve |
| BC013 | ATACGTCA | poolThirteen |
| BC014 | ATCATGTG | poolFourteen |
| BC015 | AACGCCGA | poolFifteen |
| BC016 | ATTCGGTT | poolSixteen |
Each v1 RHS probe must use the same barcode as the corresponding whole transcriptome probe in the hybridisation mix for that sample.
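The probe structures and barcode table above can be combined in a short base R sketch that assembles LHS and RHS sequences from a 50-base probe. The constant sequences are copied from this document; the function `assemble_probe`, its arguments, and the toy probe are hypothetical and are not Flexify's own API. The sketch assumes the input probe is a plain 50-base string with no `*` or `|` markers.

```r
LHS_HANDLE   <- "CCTTGGCACCCGAGAATTCCA"  # constant LHS handle
V1_LINKER    <- "ACGCGGTTAGCACGTA"       # v1 constant sequence
V1_SPACER    <- "NN"
V1_TAIL      <- "CGGTCCTAGCAA"           # v1 tail; also the v2 singleplex tail
V2_MULTIPLEX <- "CCCATATAAGAAA"          # v2 multiplex tail

# Hypothetical assembler; barcode_seq is the 8-base sequence from the table.
assemble_probe <- function(probe, version = c("v1", "v2"),
                           rhs_mode = c("multiplex", "singleplex"),
                           barcode_seq = NULL) {
  version <- match.arg(version)
  rhs_mode <- match.arg(rhs_mode)
  stopifnot(nchar(probe) == 50)
  lhs <- paste0(LHS_HANDLE, substr(probe, 1, 25))
  rhs_body <- substr(probe, 26, 50)
  rhs <- if (version == "v1") {
    stopifnot(!is.null(barcode_seq), nchar(barcode_seq) == 8)
    paste0(rhs_body, V1_LINKER, V1_SPACER, barcode_seq, V1_TAIL)
  } else if (rhs_mode == "multiplex") {
    paste0(rhs_body, V2_MULTIPLEX)
  } else {
    paste0(rhs_body, V1_TAIL)
  }
  # rhs_length counts the bases after the /5Phos/ prefix.
  list(LHS = lhs, RHS = paste0("/5Phos/", rhs), rhs_length = nchar(rhs))
}

toy_probe <- strrep("ACGTA", 10)  # made-up 50-base probe
v1  <- assemble_probe(toy_probe, "v1", barcode_seq = "ACTTTAGG")  # BC001
v2m <- assemble_probe(toy_probe, "v2", "multiplex")
v2s <- assemble_probe(toy_probe, "v2", "singleplex")
```

The `rhs_length` field makes it easy to compare each assembled RHS against the component lengths listed in the structure descriptions.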
Off-target specificity is assessed by BLAST alignment against the reference transcriptome. The strategy differs by probe type.
Fusion probes: only the junction-spanning half is screened. The non-junction half binds a wild-type sequence by design and is expected to have a perfect match to one of the fusion partner genes. A probe is flagged as off-target if the junction half has a hit with fewer than min_mismatches effective mismatches to any transcript.
Non-fusion probes: both halves (LHS: bases 1–25; RHS: bases 26–50) are screened independently. Because each half will always have a 0-mismatch hit to its intended target gene, a half is only flagged if close hits are found to more than one unique transcript. A probe passes if both halves pass. Note: multiple isoforms of the same gene may generate multiple hits that are each acceptable; the check is intentionally conservative and targets truly off-target binding to unrelated transcripts.
Effective mismatches = aligned_mismatches + (query_length − alignment_length).
This can be run:
- In the app: using Tab 2 with a locally installed BLAST+ and pre-built database.
- Via the CLI: using `--mode blast`.
- Manually: by exporting the probe CSV from Tab 1, running BLAST externally, and filtering before re-importing in Tab 3.
If you use Flexify in your research, please cite:
[Citation to be added upon publication]
For issues or questions, please open an issue on GitHub: https://github.com/Oshlack/flexify