Data Annotations in FAVOR

FAVOR provides a comprehensive set of functional annotations and annotation PCs (aPCs) for genomic variants, including clinical significance, gene information, and various functional categories.


Functional annotations and annotation PCs (aPCs)

The functional annotations provided in the FAVOR web portal are as follows:

Block Name | Annotation Name | Explanation | Type | Source
BasicVariantThe unique identifier of the given variant. Reported in chr-pos-ref-alt format.String
BasicrsIDThe rsID of the given variant (if it exists).String
BasicTOPMed DepthTOPMed depth of the given variant.String
BasicTOPMed QC StatusTOPMed QC status of the given variant.String
ClinVarClinical SignificanceClinical significance for this single variant. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarClinical significance (genotype includes)Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarDisease NameClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarDisease Name (included variant)For included variant: ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarReview StatusClinVar review status for the Variation ID. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarAllele OriginAllele origin: 0 - unknown; 1 - germline; 2 - somatic; 4 - inherited; 8 - paternal; 16 - maternal; 32 - de-novo; 64 - biparental; 128 - uniparental; 256 - not-tested; 512 - tested-inconclusive. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarDisease Database IDTag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarDisease Database ID (included variant)For included variant: Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN. [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
ClinVarGene ReportedGene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|). [@landrum2013clinvar; @landrum2017clinvar]String[Source][Ref1,2]
Variant CategoryGencode Comprehensive InfoIdentifies whether the variant causes protein-coding changes under the Gencode gene definitions and labels the gene that the variant affects. If the variant lies in an intergenic region, the nearby gene names are labeled instead. [@harrow2012gencode; @frankish2018gencode]String[Source1,2][Ref1,2]
Variant CategoryGencode Comprehensive CategoryIdentifies whether the variant causes protein-coding changes under the Gencode gene definitions and labels the gene that the variant affects. If the variant lies in an intergenic region, the nearby gene names are labeled instead. [@harrow2012gencode; @frankish2018gencode]String[Source1,2][Ref1,2]
Variant CategoryDisruptive MissenseIdentify whether the variant is a disruptive missense variant, defined as "disruptive" by the ensemble MetaSVM annotation. [@dong2014comparison]Factor[Source1,2][Ref]
Variant CategoryCAGE PromoterCAGE defined promoter sites from Fantom 5. [@forrest2014promoter]String[Source][Ref]
Variant CategoryCAGE EnhancerCAGE defined permissive Enhancer sites from Fantom 5. [@andersson2014atlas]String[Source][Ref]
Variant CategoryGeneHancerPredicted human enhancer sites from the GeneHancer database. [@fishilevich2017genehancer]String[Ref]
Variant CategorySuperEnhancerPredicted super-enhancer sites and targets in a range of human cell types. [@hnisz2013super]String[Source][Ref]
Variant CategoryGencode Comprehensive Exonic CategoryIdentifies the variant's impact under the Gencode exonic definitions and labels only exonic categories such as synonymous, non-synonymous, and frameshift indels. [@harrow2012gencode; @frankish2018gencode]String[Source1,2][Ref1,2]
Variant CategoryGencode Comprehensive Exonic InfoIdentifies variants that cause protein-coding changes under the Gencode gene definitions and gives detailed annotation of which exons the variant affects and how it changes the amino acid sequence. [@harrow2012gencode; @frankish2018gencode]String[Source1,2][Ref1,2]
Variant CategoryUCSC InfoIdentifies whether the variant causes protein-coding changes under the UCSC gene definitions and labels the gene that the variant affects. If the variant lies in an intergenic region, the nearby gene names are labeled instead.String[Source]
Variant CategoryUCSC Exonic InfoIdentifies variants that cause protein-coding changes under the UCSC gene definitions and gives detailed annotation of which exons the variant affects and how it changes the amino acid sequence.String[Source]
Variant CategoryRefSeq InfoIdentifies whether the variant causes protein-coding changes under the RefSeq gene definitions and labels the gene that the variant affects. If the variant lies in an intergenic region, the nearby gene names are labeled instead.String[Source]
Variant CategoryRefSeq Exonic InfoIdentifies variants that cause protein-coding changes under the RefSeq gene definitions and gives detailed annotation of which exons the variant affects and how it changes the amino acid sequence.String[Source]
Allele FrequenciesTOPMed Bravo AFTOPMed Bravo Genome Allele Frequency. [@taliun2019sequencing; @nhlbi2018bravo]num[Source][Ref]
Allele FrequenciesGNOMAD Total AFGNOMAD v3 Genome Allele Frequency using all the samples. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAFR GNOMAD AFGNOMAD v3 Genome African population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMR GNOMAD AFGNOMAD v3 Genome Admixed American population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesEAS GNOMAD AFGNOMAD v3 Genome East Asian population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesNFE GNOMAD AFGNOMAD v3 Genome Non-Finnish European population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesFIN GNOMAD AFGNOMAD v3 Genome Finnish European population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesSAS GNOMAD AFGNOMAD v3 Genome South Asian population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMI GNOMAD AFGNOMAD v3 Genome Amish population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesASJ GNOMAD AFGNOMAD v3 Genome Ashkenazi Jewish population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesOTH GNOMAD AFGNOMAD v3 Genome Other (population not assigned) frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesMale GNOMAD AFGNOMAD v3 Genome Male Allele Frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAFR Male GNOMAD AFGNOMAD v3 Genome African Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMI Male GNOMAD AFGNOMAD v3 Genome Amish Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMR Male GNOMAD AFGNOMAD v3 Genome Admixed American Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesASJ Male GNOMAD AFGNOMAD v3 Genome Ashkenazi Jewish Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesEAS Male GNOMAD AFGNOMAD v3 Genome East Asian Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesFIN Male GNOMAD AFGNOMAD v3 Genome Finnish European Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesNFE Male GNOMAD AFGNOMAD v3 Genome Non-Finnish European Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesOTH Male GNOMAD AFGNOMAD v3 Genome Other (population not assigned) Male frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesSAS Male GNOMAD AFGNOMAD v3 Genome South Asian Male population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesFemale GNOMAD AFGNOMAD v3 Genome Female Allele Frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAFR Female GNOMAD AFGNOMAD v3 Genome African Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMI Female GNOMAD AFGNOMAD v3 Genome Amish Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesAMR Female GNOMAD AFGNOMAD v3 Genome Admixed American Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesASJ Female GNOMAD AFGNOMAD v3 Genome Ashkenazi Jewish Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesEAS Female GNOMAD AFGNOMAD v3 Genome East Asian Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesFIN Female GNOMAD AFGNOMAD v3 Genome Finnish European Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesNFE Female GNOMAD AFGNOMAD v3 Genome Non-Finnish European Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesOTH Female GNOMAD AFGNOMAD v3 Genome Other (population not assigned) Female frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesSAS Female GNOMAD AFGNOMAD v3 Genome South Asian Female population frequency. [@karczewski2020mutational; @gnomad2019browser]num[Source][Ref]
Allele FrequenciesALL 1000G AF1000 Genomes allele frequency (whole-genome allele frequencies from the 1000 Genomes Project phase 3 data).num[Source]
Allele FrequenciesAFR 1000G AF1000 Genomes African population frequency.num[Source]
Allele FrequenciesAMR 1000G AF1000 Genomes Admixed American population frequency.num[Source]
Allele FrequenciesEAS 1000G AF1000 Genomes East Asian population frequency.num[Source]
Allele FrequenciesEUR 1000G AF1000 Genomes European population frequency.num[Source]
Allele FrequenciesSAS 1000G AF1000 Genomes South Asian population frequency.num[Source]
Integrative ScoreaPC-Protein-FunctionProtein function annotation PC: the first PC of the standardized scores of "SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score" in PHRED scale. Range: [2.970, 97.690]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-ConservationConservation annotation PC: the first PC of the standardized scores of "GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP" in PHRED scale. Range: [1.478E-09, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-Epigenetics-ActiveActive Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K4me1.max, EncodeH3K4me2.max, EncodeH3K4me3.max, EncodeH3K9ac.max, EncodeH3K27ac.max, EncodeH4K20me1.max, EncodeH2AFZ.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-Epigenetics-RepressedRepressed Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K9me3.max, EncodeH3K27me3.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-Epigenetics-TranscriptionTranscription Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K36me3.max, EncodeH3K79me2.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-Local-Nucleotide-DiversityLocal nucleotide diversity annotation PC: the first PC of the standardized scores of "bStatistic, RecombinationRate, NuclearDiversity" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]numIndividual annotation channels in the FAVOR database.
Integrative ScoreaPC-Mutation-DensityMutation density annotation PC: the first PC of the standardized scores of "Common100bp, Rare100bp, Sngl100bp, Common1000bp, Rare1000bp, Sngl1000bp, Common10000bp, Rare10000bp, Sngl10000bp" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]numIndividual annotation channels in the FAVOR database.
Integrative ScoreaPC-Transcription-FactorTranscription factor annotation PC: the first PC of the standardized scores of "RemapOverlapTF, RemapOverlapCL" in PHRED scale. Range: [1.185, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-MappabilityMappability annotation PC: the first PC of the standardized scores of "umap_k100, bismap_k100, umap_k50, bismap_k50, umap_k36, bismap_k36, umap_k24, bismap_k24" in PHRED scale. Range: [0.185, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreaPC-Proximity-To-TSS-TESProximity to TSS (Transcription Start Site) and TES (Transcription End Site) annotation PC: the first PC of "minDistTSS, minDistTSE" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]num (+)Individual annotation channels in the FAVOR database.
Integrative ScoreCADD RawScoreThe CADD raw score (integrative score). A higher CADD score indicates a more deleterious variant. Range: [-237.102, 22.763]. [@kircher2014general; @rentzsch2018cadd]num (+)[Source][Ref1,2]
Integrative ScoreCADD PHREDThe CADD score in PHRED scale (integrative score). A higher CADD score indicates a more deleterious variant. Range: [0, 99]. [@kircher2014general; @rentzsch2018cadd]num (+)[Source][Ref1,2]
Integrative ScoreLINSIGHTThe LINSIGHT score (integrative score). A higher LINSIGHT score indicates more functionality. Range: [0.215, 0.995]. [@huang2017fast]num (+)[Source][Ref]
Integrative ScoreFATHMM-XFThe FATHMM-XF score (integrative score). A higher FATHMM-XF score indicates more functionality. Range: [0.405, 99.451]. [@rogers2017fathmm]num (+)[Source][Ref]
Integrative ScoreFunseq Value (impact score)A flexible framework to prioritize regulatory mutations from cancer genome sequencing (integrative score). [@fu2014funseq2]num (+)[Source][Ref]
Integrative ScoreFunseq Description (annotation)Funseq annotation points out whether a given mutation falls in a coding or non-coding region (integrative score). [@fu2014funseq2]String[Source][Ref]
Integrative ScoreAloft Value (impact score)ALoFT provides extensive annotations to putative loss-of-function variants (LoF) in protein-coding genes including functional, evolutionary and network features (integrative score). [@balasubramanian2017using]num (+)[Source][Ref]
Integrative ScoreAloft Description (annotation)ALoFT annotation can predict the impact of premature stop variants and classify them as dominant disease-causing, recessive disease-causing and benign variants (integrative score). [@balasubramanian2017using]String[Source][Ref]
Protein FunctionPolyPhenCatPolyPhen category of change. [@adzhubei2010method]Factor[Source][Ref]
Protein FunctionPolyPhenValPolyPhen score: It predicts the functional significance of an allele replacement from its individual features. Range: [0, 1] (default: 0). [@adzhubei2010method]num (+)[Source][Ref]
Protein FunctionPolyphen2_HDIVPredicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumDiv is Mendelian disease variants vs. divergence from close mammalian homologs of human proteins (>=95% sequence identity). Range: [0, 1] (default: 0). [@adzhubei2010method]num (+)[Source1,2,3][Ref]
Protein FunctionPolyphen2_HVARPredicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumVar is all human variants associated with some disease (except cancer mutations) or loss of activity/function vs. common (minor allele frequency >1%) human polymorphism with no reported association with a disease or other effect. Range: [0, 1] (default: 0). [@adzhubei2010method]num (+)[Source1,2,3][Ref]
Protein FunctionGranthamGrantham score: oAA, nAA. It attempts to predict the distance between two amino acids, in an evolutionary sense. A lower Grantham score reflects less evolutionary distance. A higher Grantham score reflects a greater evolutionary distance, and is considered more deleterious. Range: [0, 215] (default: 0). [@grantham1974amino]num (+)[Source1,2][Ref]
Protein FunctionMutationTasterMutationTaster is a free web-based application to evaluate DNA sequence variants for their disease-causing potential. The software performs a battery of in silico tests to estimate the impact of the variant on the gene product/protein. Range: [0, 1] (default: 0). [@schwarz2014mutationtaster2]num (+)[Source1,2,3][Ref]
Protein FunctionMutationAssessorPredicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. Range: [-5.135, 6.490] (default: -5.545). [@reva2011predicting]num (+)[Source1,2,3][Ref]
Protein FunctionSIFTcatSIFT category of change. [@ng2003sift]Factor[Source][Ref]
Protein FunctionSIFTvalSIFT score, ranges from 0.0 (deleterious) to 1.0 (tolerated). Range: [0, 1] (default: 1). [@ng2003sift]num (-)[Source][Ref]
ConservationpriPhConsPrimate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 0.999] (default: 0.0). [@siepel2005evolutionarily]num (+)[Source][Ref]
ConservationmamPhConsMammalian phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). [@siepel2005evolutionarily]num (+)[Source][Ref]
ConservationverPhConsVertebrate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). [@siepel2005evolutionarily]num (+)[Source][Ref]
ConservationpriPhyloPPrimate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-10.761, 0.595] (default: -0.029). [@pollard2010detection]num (+)[Source][Ref]
ConservationmamPhyloPMammalian phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 4.494] (default: -0.005). [@pollard2010detection]num (+)[Source][Ref]
ConservationverPhyloPVertebrate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 11.295] (default: 0.042). [@pollard2010detection]num (+)[Source][Ref]
ConservationGerpNNeutral evolution score defined by GERP++. A higher score means the region is more conserved. Range: [0, 19.8] (default: 3.0). [@davydov2010identifying]num (+)[Source][Ref]
ConservationGerpSRejected Substitution score defined by GERP++. A higher score means the region is more conserved. GERP (Genomic Evolutionary Rate Profiling) identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. These deficits are referred to as "Rejected Substitutions". Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. Positive scores (fewer than expected) indicate that a site is under evolutionary constraint. Negative scores may be weak evidence of accelerated rates of evolution. Range: [-39.5, 19.8] (default: -0.2). [@davydov2010identifying]num (+)[Source][Ref]
EpigeneticsEncodeDNaseMaximum Encode DNase-seq level over 12 cell lines. Range: [0, 118672] (default: 0.0). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K27acMaximum Encode H3K27ac level over 14 cell lines. Range: [0.010, 1442.690] (default: 0.36). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K4me1Maximum Encode H3K4me1 level over 13 cell lines. Range: [0.010, 227.81] (default: 0.37). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K4me2Maximum Encode H3K4me2 level over 14 cell lines. Range: [0.010, 774.99] (default: 0.37). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K4me3Maximum Encode H3K4me3 level over 14 cell lines. Range: [0.010, 1093.75] (default: 0.38). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K9acMaximum Encode H3K9ac level over 13 cell lines. Range: [0.010, 1340.42] (default: 0.41). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH4K20me1Maximum Encode H4K20me1 level over 11 cell lines. Range: [0.010, 226.64] (default: 0.47). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH2AFZMaximum Encode H2AFZ level over 13 cell lines. Range: [0.020, 468.98] (default: 0.42). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K9me3Maximum Encode H3K9me3 level over 14 cell lines. Range: [0.010, 226.64] (default: 0.38). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K27me3Maximum Encode H3K27me3 level over 14 cell lines. Range: [0.010, 193.38] (default: 0.47). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K36me3Maximum Encode H3K36me3 level over 10 cell lines. Range: [0.020, 246.88] (default: 0.39). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodeH3K79me2Maximum Encode H3K79me2 level over 13 cell lines. Range: [0.020, 553.06] (default: 0.34). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsEncodetotalRNAMaximum Encode totalRNA-seq level over 10 cell lines (minus and plus strand separately). Range: [0, 385096] (default: 0.0). [@encode2012integrated]num (+)[Source][Ref]
EpigeneticsGCPercent GC in a window of +/- 75bp. Range: [0, 1] (default: 0.42).num (+)[Source]
EpigeneticsCpGPercent CpG in a window of +/- 75bp. Range: [0, 0.604] (default: 0.02).num (+)[Source]
Transcription FactorsRemapOverlapTFRemap number of different transcription factors binding. Range: [1, 350] (default: -0.5).int (+)[Source]
Transcription FactorsRemapOverlapCLRemap number of different transcription factor - cell line combinations binding. Range: [1, 1068] (default: -0.5).int (+)[Source]
Chromatin StatescHmm E1Number of 48 cell types in chromHMM state E1_poised. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E2Number of 48 cell types in chromHMM state E2_repressed. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E3Number of 48 cell types in chromHMM state E3_dead. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E4Number of 48 cell types in chromHMM state E4_dead. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E5Number of 48 cell types in chromHMM state E5_repressed. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E6Number of 48 cell types in chromHMM state E6_repressed. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E7Number of 48 cell types in chromHMM state E7_weak. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E8Number of 48 cell types in chromHMM state E8_gene. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E9Number of 48 cell types in chromHMM state E9_gene. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E10Number of 48 cell types in chromHMM state E10_gene. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E11Number of 48 cell types in chromHMM state E11_gene. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E12Number of 48 cell types in chromHMM state E12_distal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E13Number of 48 cell types in chromHMM state E13_distal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E14Number of 48 cell types in chromHMM state E14_distal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E15Number of 48 cell types in chromHMM state E15_weak. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E16Number of 48 cell types in chromHMM state E16_tss. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E17Number of 48 cell types in chromHMM state E17_proximal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E18Number of 48 cell types in chromHMM state E18_proximal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E19Number of 48 cell types in chromHMM state E19_tss. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E20Number of 48 cell types in chromHMM state E20_poised. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E21Number of 48 cell types in chromHMM state E21_dead. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E22Number of 48 cell types in chromHMM state E22_repressed. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E23Number of 48 cell types in chromHMM state E23_weak. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E24Number of 48 cell types in chromHMM state E24_distal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Chromatin StatescHmm E25Number of 48 cell types in chromHMM state E25_distal. (default: 1.92). [@ernst2015large]num[Source][Ref]
Local Nucleotide DiversityRecombinationRateRecombination rate measures how likely the region is to undergo recombination. Range: [0, 54.96] (default: 0). [@gazal2017linkage]num (+)[Ref]
Local Nucleotide DiversityNuclearDiversityNuclear diversity measures how likely the region is to diversify. Range: [0.05, 60.25] (default: 0). [@gazal2017linkage]num (+)[Ref]
Local Nucleotide DiversitybStatisticBackground selection score. A background selection (B) value for each position in the genome. B indicates the expected fraction of neutral diversity that is present at a site, with values close to 0 representing near complete removal of diversity as a result of selection and values near 1000 indicating little effect of selection. Range: [0, 1000] (default: 800). [@mcvicker2009widespread]int (+)[Source][Ref]
Mutation DensityCommon100bpNumber of common (MAF > 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 14] (default: 0).int (+)[Source]
Mutation DensityRare100bpNumber of rare (MAF < 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 31] (default: 0).int (+)[Source]
Mutation DensitySngl100bpNumber of single-occurrence BRAVO SNVs in the nearby 100 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 99] (default: 0).int (+)[Source]
Mutation DensityCommon1000bpNumber of common (MAF > 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 73] (default: 0).int (+)[Source]
Mutation DensityRare1000bpNumber of rare (MAF < 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 74] (default: 0).int (+)[Source]
Mutation DensitySngl1000bpNumber of single-occurrence BRAVO SNVs in the nearby 1000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 658] (default: 0).int (+)[Source]
Mutation DensityCommon10000bpNumber of common (MAF > 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 443] (default: 0).int (+)[Source]
Mutation DensityRare10000bpNumber of rare (MAF < 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 355] (default: 0).int (+)[Source]
Mutation DensitySngl10000bpNumber of single-occurrence BRAVO SNVs in the nearby 10000 bp window. A higher value indicates more mutations in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 4750] (default: 0).int (+)[Source]
MappabilityUmap (k100, k50, k36, k24)Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). [@karimzadeh2018umap]num (+)[Source][Ref]
MappabilityBismap (k100, k50, k36, k24)Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). [@karimzadeh2018umap]num (+)[Source][Ref]
Proximity TableminDistTSSDistance to closest Transcribed Sequence Start (TSS). Range: [1, 3604063] (default: 1e7).num (-)[Source]
Proximity TableminDistTSEDistance to closest Transcribed Sequence End (TSE). Range: [1, 3608885] (default: 1e7).num (-)[Source]
Alphamissenseprotein_variantAmino acid change induced by the alternative allele, in the format [Reference amino acid][POS_aa][Alternative amino acid] (e.g. V2L). POS_aa is the 1-based position of the residue within the protein amino acid sequence.String[Source]
AlphamissenseAM_pathogenicityCalibrated AlphaMissense pathogenicity scores (ranging between 0 and 1), which can be interpreted as the predicted probability of a variant being clinically pathogenic.String[Source]
AlphamissenseAM_classClassification of the protein_variant into one of three discrete categories: 'likely_benign', 'likely_pathogenic', or 'ambiguous'. These are derived using the following thresholds: 'likely_benign' if alphamissense_pathogenicity < 0.34; 'likely_pathogenic' if alphamissense_pathogenicity > 0.564; and 'ambiguous' otherwise.String[Source]
Mutation RatefilterLow: Low-quality regions as determined by gnomAD sequencing metrics (Mappability < 0.5; overlap with 50 nt simple repeat; ReadPosRankSum > 1; 0 SNVs in 100 bp window). SFS_bump: Pentamer context with abnormal SFS; the fraction of high-frequency SNVs (range [0.0005, 0.2]) is greater than 1.5x the mutation-rate-controlled average. Tends to be repetitive contexts. TFBS: Transcription factor binding site as determined by overlap with ChIP-seq peaks.String[Source]
Mutation RatePNPentanucleotide contextnum (+)[Source]
Mutation RateMRRoulette mutation rate estimatenum (+)[Source]
Mutation RateMGgnomAD mutation rate estimate (Karczewski et al. 2020)num (+)[Source]
Mutation RateMCCarlson mutation rate estimate (Carlson et al. 2018)num (+)[Source]
cCREsaccessionAccession number of the cCREString[Source]
cCREsannotationPromoter-like (PLS)String[Source]
cCREsannotationAll Candidate Enhancers (pELS & dELS)String[Source]
cCREsannotationProximal enhancer-like (pELS)String[Source]
cCREsannotationDistal enhancer-like (dELS)String[Source]
cCREsannotationChromatin Accessible with CTCF (CA-CTCF)String[Source]
cCREsannotationChromatin Accessible with H3K4me3 (CA-H3K4me3)String[Source]
cCREsannotationChromatin Accessible with TF (CA-TF)String[Source]
cCREsannotationChromatin Accessible Only (CA)String[Source]
cCREsannotationTF Only (TF)String[Source]
cCREsannotationCTCF-Bound cCREsString[Source]
CATlasSignal_ValueActivity signal strength measured in the tissuenum[Source]
CATlasP_valueP-value of the signal significancenum[Source]
CATlasQ_valueQ-value (FDR adjusted P-value)num[Source]
CATlasPeakPeak ID or rank associated with signalnum[Source]
CATlasTissueTissue type in which the signal or linkage is observedString[Source]
CATlascCREs_RegionLinked candidate cis-regulatory element (cCRE) regionString[Source]
CATlasPromoter_RegionPromoter region linked to cCRE via ABC modelString[Source]
CATlasABC_ScoreABC score estimating enhancer–promoter interaction strengthnum[Source]
CATlasLinked GeneGene name linked to the cCRE region via promoterString[Source]
CATlasDistanceGenomic distance between cCRE and linked promoternum[Source]
EpiMapBSSIDUnique biosample state identifierString[Source]
EpiMapStateFull chromatin state name (e.g., EnhA1, TssA), describing regulatory roleString[Source]
EpiMapGroupBroad category grouping the sample (e.g., cancer, normal)String[Source]
EpiMapExtended_InfoExtended tissue/cell line description (e.g., CANCER PROSTATE)String[Source]
EpiMapSample_NameSpecific sample name with treatment condition (e.g., A549, 22Rv1 treated with 10 nM 17b-hydroxy)String[Source]
pgBoostLinked GeneGene symbol linked to the variantString[Source]
pgBoostpg_boostProbabilistic score of SNP-gene link from pgBoost (gradient boosting model trained on multiome fine-mapping data using SCENT, Signac, Cicero, distance)num (+)[Source]
pgBoostpg_boost_percentilePercentile ranking of the pgBoost score across all SNP-gene pairsnum (+)[Source]
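
Two of the annotations above are defined by simple rules that can be reproduced when checking downloaded values: AM_class (thresholds on the AlphaMissense pathogenicity score) and the Mutation Density counts (number of SNVs in a window around a position). The following Python sketch illustrates both. It is not FAVOR's own code: the function names `am_class` and `count_snvs_in_window` are hypothetical, and the window is assumed to be centered on the variant (position ± half the window size, inclusive), which the table does not specify.

```python
from bisect import bisect_left, bisect_right

# Thresholds as quoted in the AM_class row of the table.
BENIGN_BELOW = 0.34
PATHOGENIC_ABOVE = 0.564


def am_class(am_pathogenicity: float) -> str:
    """Map an AlphaMissense pathogenicity score in [0, 1] to its class label."""
    if not 0.0 <= am_pathogenicity <= 1.0:
        raise ValueError("AM_pathogenicity is expected to lie in [0, 1]")
    if am_pathogenicity < BENIGN_BELOW:
        return "likely_benign"
    if am_pathogenicity > PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    return "ambiguous"


def count_snvs_in_window(sorted_positions, pos, window_bp):
    """Count SNV positions within pos +/- window_bp // 2 (inclusive).

    ASSUMPTION: a centered window; the exact window definition used for
    the Mutation Density annotations is not given in the table.
    `sorted_positions` must be sorted in ascending order.
    """
    half = window_bp // 2
    lo = bisect_left(sorted_positions, pos - half)
    hi = bisect_right(sorted_positions, pos + half)
    return hi - lo


if __name__ == "__main__":
    # Made-up positions for illustration only.
    snvs = [100, 120, 149, 150, 151, 200, 260]
    print(am_class(0.12), am_class(0.45), am_class(0.91))
    print(count_snvs_in_window(snvs, pos=150, window_bp=100))
```

Scores equal to either threshold fall into 'ambiguous', because both comparisons in the table are strict. To mirror the Common, Rare, and Sngl variants of the density annotations, the position list would first be filtered by allele frequency or allele count before counting.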