Description
- Ensembl all exons from protein-coding genes and isoforms for all human builds in GTF format
- Ensembl coding exons from protein-coding genes and isoforms for all human builds in GTF format
- Ensembl introns from protein-coding genes and isoforms for all human builds in GTF format
- Microsatellites (excluding GRCh38 because of mapping file)
- Cytobands (excluding GRCh38 because of mapping file)
- chromosome sizes
- CCRs
- gnomAD SVs
- CCDG SVs
- ClinVar
- Recombination Maps
- Vista Enhancers
- Scores used in pathoscore (see the recipes there)
  - Truth Sets: https://github.com/quinlan-lab/pathoscore/tree/master/truth-sets/GRCh37
  - Gene Sets: https://github.com/quinlan-lab/pathoscore/tree/master/gene-sets/GRCh37
- Conservation tracks
- ENCODE datasets
- GTEx datasets
- HCA datasets
- pext gnomAD
- reference transcriptome
- reference proteome
- ClinVar
- GENCODE
- COSMIC
- dbSNP
- dbscSNV
- dbNSFP
- HapMap
- RADAR
- GIAB (?)
- CADD
- BRAVO/TOPMed
- UCSC Tracks
- Ensembl
- dbNSFP (with postprocessing as suggested by Ensembl: https://www.ensembl.org/info/docs/tools/vep/script/vep_example.html)
- NCBI pipeline genomes, in particular the gold standard reference as suggested by Heng Li: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz
- GTF annotation track for each reference genome
- BWA, Bowtie and STAR indexes for each reference genome
- Reference transcriptomes for use with kallisto/salmon, preferably from Ensembl
- MIG (Medically Interpretable Genes)
- ACMG regions
- The Global Alliance for Genomics and Health genomic file (?)
- Mills indels
  - ftp://gsapubftp-anonymous:[email protected]/bundle/hg19/
- small RNA-seq annotation
  - http://mirbase.org/
  - https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgRna
  - https://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=genes&hgta_track=wgRna&hgta_table=wgRna&hgta_doSchema=describe+table+schema
  - wgRna.txt.gz: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/wgRna.txt.gz
  - tRNAs.txt.gz: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/tRNAs.txt.gz
- Platinum Genomes NA12878
- RNA-seq data
- Ribosome profiling data
- ChIP-seq
- SpliceAI
- GeneSplicer
- qsignature
- CCDS
- Exome Sequencing Project (ESP) liftover to hg38
- Genotype2Phenotype (G2P) annotations
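
For the exon and intron items at the top of this list, a rough sketch of how such tracks might be derived from an Ensembl GTF is given below. It is only an illustration, not an existing recipe: the function names are hypothetical, and it assumes the Ensembl-style `gene_biotype` and `transcript_id` attributes (GENCODE files use `gene_type` instead) and the GTF convention of 1-based, inclusive coordinates.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def protein_coding_exons(gtf_lines):
    """Collect exons of protein-coding genes, grouped per transcript.

    Assumes Ensembl-style attributes (gene_biotype, transcript_id).
    Returns {(chrom, strand, transcript_id): [(start, end), ...]} with
    exons sorted by start coordinate.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = parse_attributes(cols[8])
        if attrs.get("gene_biotype") != "protein_coding":
            continue
        key = (cols[0], cols[6], attrs["transcript_id"])
        exons[key].append((int(cols[3]), int(cols[4])))
    return {key: sorted(spans) for key, spans in exons.items()}


def introns(exons_by_transcript):
    """Derive introns as the gaps between consecutive exons of a transcript."""
    result = {}
    for key, spans in exons_by_transcript.items():
        result[key] = [
            (prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
            if next_start - prev_end > 1  # ignore abutting or overlapping exons
        ]
    return result
```

To use it on a real file, the lines could be read with `gzip.open(path, "rt")` and passed to `protein_coding_exons`. Coding exons would need the `CDS` features rather than `exon`, which this sketch does not handle.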