Data Annotations in FAVOR

FAVOR provides a comprehensive set of functional annotations and annotation PCs (aPCs) for genomic variants, including clinical significance, gene information, and various functional categories.

Functional annotations and annotation PCs (aPCs)

The functional annotations provided in the FAVOR web portal are as follows:

Detailed descriptions of selected functional annotations and annotation PCs in the FAVOR database. For numeric annotations marked (+), a higher value indicates increased functionality according to that annotation. For numeric annotations marked (-), a lower value indicates increased functionality according to that annotation.

Block Name	Annotation Name	Explanation	Type	Source
Basic	Variant	The unique identifier of the given variant. Reported as chr-pos-ref-alt format.	String
Basic	rsID	The rsID of the given variant (if exists).	String
Basic	TOPMed Depth	TOPMed depth of the given variant.	String
Basic	TOPMed QC Status	TOPMed QC status of the given variant.	String
ClinVar	Clinical Significance	Clinical significance for this single variant. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Clinical significance (genotype includes)	Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Disease Name	ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Disease Name (included variant)	For included variant: ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Review Status	ClinVar review status for the Variation ID. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Allele Origin	Allele origin: 0 - unknown; 1 - germline; 2 - somatic; 4 - inherited; 8 - paternal; 16 - maternal; 32 - de-novo; 64 - biparental; 128 - uniparental; 256 - not-tested; 512 - tested-inconclusive. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Disease Database ID	Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Disease Database ID (included variant)	For included variant: tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN. [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
ClinVar	Gene Reported	Gene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|). [@landrum2013clinvar; @landrum2017clinvar]	String	[Source][Ref1,2]
Variant Category	Gencode Comprehensive Info	Identifies whether the variant causes protein-coding changes under the Gencode gene definitions and labels the name of the gene the variant impacts. If the variant lies in an intergenic region, the nearby gene name is labeled instead. [@harrow2012gencode; @frankish2018gencode]	String	[Source1,2][Ref1,2]
Variant Category	Gencode Comprehensive Category	Identifies whether the variant causes protein-coding changes under the Gencode gene definitions and labels the name of the gene the variant impacts. If the variant lies in an intergenic region, the nearby gene name is labeled instead. [@harrow2012gencode; @frankish2018gencode]	String	[Source1,2][Ref1,2]
Variant Category	Disruptive Missense	Identify whether the variant is a disruptive missense variant, defined as "disruptive" by the ensemble MetaSVM annotation. [@dong2014comparison]	Factor	[Source1,2][Ref]
Variant Category	CAGE Promoter	CAGE defined promoter sites from Fantom 5. [@forrest2014promoter]	String	[Source][Ref]
Variant Category	CAGE Enhancer	CAGE defined permissive Enhancer sites from Fantom 5. [@andersson2014atlas]	String	[Source][Ref]
Variant Category	GeneHancer	Predicted human enhancer sites from the GeneHancer database. [@fishilevich2017genehancer]	String	[Ref]
Variant Category	SuperEnhancer	Predicted super-enhancer sites and targets in a range of human cell types. [@hnisz2013super]	String	[Source][Ref]
Variant Category	Gencode Comprehensive Exonic Category	Identifies the variant's impact using the Gencode exonic definition and labels only exonic categories, such as synonymous, non-synonymous, and frameshift indels. [@harrow2012gencode; @frankish2018gencode]	String	[Source1,2][Ref1,2]
Variant Category	Gencode Comprehensive Exonic Info	Identifies whether the variant causes protein-coding changes under the Gencode gene definitions and gives detailed annotation of which exons the variant impacts and how those impacts change the amino acids. [@harrow2012gencode; @frankish2018gencode]	String	[Source1,2][Ref1,2]
Variant Category	UCSC Info	Identifies whether the variant causes protein-coding changes under the UCSC gene definitions and labels the name of the gene the variant impacts. If the variant lies in an intergenic region, the nearby gene name is labeled instead.	String	[Source]
Variant Category	UCSC Exonic Info	Identifies whether the variant causes protein-coding changes under the UCSC gene definitions and gives detailed annotation of which exons the variant impacts and how those impacts change the amino acids.	String	[Source]
Variant Category	RefSeq Info	Identifies whether the variant causes protein-coding changes under the RefSeq gene definitions and labels the name of the gene the variant impacts. If the variant lies in an intergenic region, the nearby gene name is labeled instead.	String	[Source]
Variant Category	RefSeq Exonic Info	Identifies whether the variant causes protein-coding changes under the RefSeq gene definitions and gives detailed annotation of which exons the variant impacts and how those impacts change the amino acids.	String	[Source]
Allele Frequencies	TOPMed Bravo AF	TOPMed Bravo Genome Allele Frequency. [@taliun2019sequencing; @nhlbi2018bravo]	num	[Source][Ref]
Allele Frequencies	GNOMAD Total AF	GNOMAD v3 Genome Allele Frequency using all the samples. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AFR GNOMAD AF	GNOMAD v3 Genome African population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMR GNOMAD AF	GNOMAD v3 Genome Admixed American population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	EAS GNOMAD AF	GNOMAD v3 Genome East Asian population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	NFE GNOMAD AF	GNOMAD v3 Genome Non-Finnish European population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	FIN GNOMAD AF	GNOMAD v3 Genome Finnish European population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	SAS GNOMAD AF	GNOMAD v3 Genome South Asian population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMI GNOMAD AF	GNOMAD v3 Genome Amish population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	ASJ GNOMAD AF	GNOMAD v3 Genome Ashkenazi Jewish population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	OTH GNOMAD AF	GNOMAD v3 Genome Other (population not assigned) frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	Male GNOMAD AF	GNOMAD v3 Genome Male Allele Frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AFR Male GNOMAD AF	GNOMAD v3 Genome African Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMI Male GNOMAD AF	GNOMAD v3 Genome Amish Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMR Male GNOMAD AF	GNOMAD v3 Genome Admixed American Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	ASJ Male GNOMAD AF	GNOMAD v3 Genome Ashkenazi Jewish Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	EAS Male GNOMAD AF	GNOMAD v3 Genome East Asian Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	FIN Male GNOMAD AF	GNOMAD v3 Genome Finnish European Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	NFE Male GNOMAD AF	GNOMAD v3 Genome Non-Finnish European Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	OTH Male GNOMAD AF	GNOMAD v3 Genome Other (population not assigned) Male frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	SAS Male GNOMAD AF	GNOMAD v3 Genome South Asian Male population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	Female GNOMAD AF	GNOMAD v3 Genome Female Allele Frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AFR Female GNOMAD AF	GNOMAD v3 Genome African Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMI Female GNOMAD AF	GNOMAD v3 Genome Amish Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	AMR Female GNOMAD AF	GNOMAD v3 Genome Admixed American Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	ASJ Female GNOMAD AF	GNOMAD v3 Genome Ashkenazi Jewish Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	EAS Female GNOMAD AF	GNOMAD v3 Genome East Asian Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	FIN Female GNOMAD AF	GNOMAD v3 Genome Finnish European Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	NFE Female GNOMAD AF	GNOMAD v3 Genome Non-Finnish European Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	OTH Female GNOMAD AF	GNOMAD v3 Genome Other (population not assigned) Female frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	SAS Female GNOMAD AF	GNOMAD v3 Genome South Asian Female population frequency. [@karczewski2020mutational; @gnomad2019browser]	num	[Source][Ref]
Allele Frequencies	ALL 1000G AF	1000 Genomes allele frequency (whole-genome allele frequencies from the 1000 Genomes Project phase 3 data).	num	[Source]
Allele Frequencies	AFR 1000G AF	1000 Genomes African population frequency.	num	[Source]
Allele Frequencies	AMR 1000G AF	1000 Genomes Admixed American population frequency.	num	[Source]
Allele Frequencies	EAS 1000G AF	1000 Genomes East Asian population frequency.	num	[Source]
Allele Frequencies	EUR 1000G AF	1000 Genomes European population frequency.	num	[Source]
Allele Frequencies	SAS 1000G AF	1000 Genomes South Asian population frequency.	num	[Source]
Integrative Score	aPC-Protein-Function	Protein function annotation PC: the first PC of the standardized scores of "SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score" in PHRED scale. Range: [2.970, 97.690]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Conservation	Conservation annotation PC: the first PC of the standardized scores of "GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP" in PHRED scale. Range: [1.478E-09, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Epigenetics-Active	Active Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K4me1.max, EncodeH3K4me2.max, EncodeH3K4me3.max, EncodeH3K9ac.max, EncodeH3K27ac.max, EncodeH4K20me1.max, EncodeH2AFZ.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Epigenetics-Repressed	Repressed Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K9me3.max, EncodeH3K27me3.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Epigenetics-Transcription	Transcription Epigenetic annotation PC: the first PC of the standardized scores of "EncodeH3K36me3.max, EncodeH3K79me2.max" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Local-Nucleotide-Diversity	Local nucleotide diversity annotation PC: the first PC of the standardized scores of "bStatistic, RecombinationRate, NuclearDiversity" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Mutation-Density	Mutation density annotation PC: the first PC of the standardized scores of "Common100bp, Rare100bp, Sngl100bp, Common1000bp, Rare1000bp, Sngl1000bp, Common10000bp, Rare10000bp, Sngl10000bp" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Transcription-Factor	Transcription factor annotation PC: the first PC of the standardized scores of "RemapOverlapTF, RemapOverlapCL" in PHRED scale. Range: [1.185, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Mappability	Mappability annotation PC: the first PC of the standardized scores of "umap_k100, bismap_k100, umap_k50, bismap_k50, umap_k36, bismap_k36, umap_k24, bismap_k24" in PHRED scale. Range: [0.185, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	aPC-Proximity-To-TSS-TES	Proximity to TSS (Transcription Start Site) and TES (Transcription End Site) annotation PC: the first PC of "minDistTSS, minDistTSE" in PHRED scale. Range: [0, 99.451]. [@li2020dynamic]	num (+)	Individual annotation channels in the FAVOR database.
Integrative Score	CADD RawScore	The CADD raw score (integrative score). A higher CADD score indicates a more deleterious variant. Range: [-237.102, 22.763]. [@kircher2014general; @rentzsch2018cadd]	num (+)	[Source][Ref1,2]
Integrative Score	CADD PHRED	The CADD score in PHRED scale (integrative score). A higher CADD score indicates a more deleterious variant. Range: [0, 99]. [@kircher2014general; @rentzsch2018cadd]	num (+)	[Source][Ref1,2]
Integrative Score	LINSIGHT	The LINSIGHT score (integrative score). A higher LINSIGHT score indicates more functionality. Range: [0.215, 0.995]. [@huang2017fast]	num (+)	[Source][Ref]
Integrative Score	FATHMM-XF	The FATHMM-XF score (integrative score). A higher FATHMM-XF score indicates more functionality. Range: [0.405, 99.451]. [@rogers2017fathmm]	num (+)	[Source][Ref]
Integrative Score	Funseq Value (impact score)	A flexible framework to prioritize regulatory mutations from cancer genome sequencing (integrative score). [@fu2014funseq2]	num (+)	[Source][Ref]
Integrative Score	Funseq Description (annotation)	Funseq annotation points out whether a given mutation falls in a coding or non-coding region (integrative score). [@fu2014funseq2]	String	[Source][Ref]
Integrative Score	Aloft Value (impact score)	ALoFT provides extensive annotations to putative loss-of-function variants (LoF) in protein-coding genes including functional, evolutionary and network features (integrative score). [@balasubramanian2017using]	num (+)	[Source][Ref]
Integrative Score	Aloft Description (annotation)	ALoFT annotation can predict the impact of premature stop variants and classify them as dominant disease-causing, recessive disease-causing and benign variants (integrative score). [@balasubramanian2017using]	String	[Source][Ref]
Protein Function	PolyPhenCat	PolyPhen category of change. [@adzhubei2010method]	Factor	[Source][Ref]
Protein Function	PolyPhenVal	PolyPhen score: It predicts the functional significance of an allele replacement from its individual features. Range: [0, 1] (default: 0). [@adzhubei2010method]	num (+)	[Source][Ref]
Protein Function	Polyphen2_HDIV	Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumDiv is Mendelian disease variants vs. divergence from close mammalian homologs of human proteins (>=95% sequence identity). Range: [0, 1] (default: 0). [@adzhubei2010method]	num (+)	[Source1,2,3][Ref]
Protein Function	Polyphen2_HVAR	Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumVar is all human variants associated with some disease (except cancer mutations) or loss of activity/function vs. common (minor allele frequency >1%) human polymorphisms with no reported association with a disease or other effect. Range: [0, 1] (default: 0). [@adzhubei2010method]	num (+)	[Source1,2,3][Ref]
Protein Function	Grantham	Grantham score: oAA, nAA. It attempts to predict the distance between two amino acids, in an evolutionary sense. A lower Grantham score reflects less evolutionary distance. A higher Grantham score reflects a greater evolutionary distance, and is considered more deleterious. Range: [0, 215] (default: 0). [@grantham1974amino]	num (+)	[Source1,2][Ref]
Protein Function	MutationTaster	MutationTaster is a free web-based application to evaluate DNA sequence variants for their disease-causing potential. The software performs a battery of in silico tests to estimate the impact of the variant on the gene product/protein. Range: [0, 1] (default: 0). [@schwarz2014mutationtaster2]	num (+)	[Source1,2,3][Ref]
Protein Function	MutationAssessor	Predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. Range: [-5.135, 6.490] (default: -5.545). [@reva2011predicting]	num (+)	[Source1,2,3][Ref]
Protein Function	SIFTcat	SIFT category of change. [@ng2003sift]	Factor	[Source][Ref]
Protein Function	SIFTval	SIFT score, ranges from 0.0 (deleterious) to 1.0 (tolerated). Range: [0, 1] (default: 1). [@ng2003sift]	num (-)	[Source][Ref]
Conservation	priPhCons	Primate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It accounts for the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 0.999] (default: 0.0). [@siepel2005evolutionarily]	num (+)	[Source][Ref]
Conservation	mamPhCons	Mammalian phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It accounts for the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). [@siepel2005evolutionarily]	num (+)	[Source][Ref]
Conservation	verPhCons	Vertebrate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It accounts for the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). [@siepel2005evolutionarily]	num (+)	[Source][Ref]
Conservation	priPhyloP	Primate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-10.761, 0.595] (default: -0.029). [@pollard2010detection]	num (+)	[Source][Ref]
Conservation	mamPhyloP	Mammalian phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 4.494] (default: -0.005). [@pollard2010detection]	num (+)	[Source][Ref]
Conservation	verPhyloP	Vertebrate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 11.295] (default: 0.042). [@pollard2010detection]	num (+)	[Source][Ref]
Conservation	GerpN	Neutral evolution score defined by GERP++. A higher score means the region is more conserved. Range: [0, 19.8] (default: 3.0). [@davydov2010identifying]	num (+)	[Source][Ref]
Conservation	GerpS	Rejected Substitution score defined by GERP++. A higher score means the region is more conserved. GERP (Genomic Evolutionary Rate Profiling) identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. These deficits are referred to as "Rejected Substitutions". Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. Positive scores (fewer than expected) indicate that a site is under evolutionary constraint. Negative scores may be weak evidence of accelerated rates of evolution. Range: [-39.5, 19.8] (default: -0.2). [@davydov2010identifying]	num (+)	[Source][Ref]
Epigenetics	EncodeDNase	Maximum Encode DNase-seq level over 12 cell lines. Range: [0, 118672] (default: 0.0). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K27ac	Maximum Encode H3K27ac level over 14 cell lines. Range: [0.010, 1442.690] (default: 0.36). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K4me1	Maximum Encode H3K4me1 level over 13 cell lines. Range: [0.010, 227.81] (default: 0.37). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K4me2	Maximum Encode H3K4me2 level over 14 cell lines. Range: [0.010, 774.99] (default: 0.37). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K4me3	Maximum Encode H3K4me3 level over 14 cell lines. Range: [0.010, 1093.75] (default: 0.38). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K9ac	Maximum Encode H3K9ac level over 13 cell lines. Range: [0.010, 1340.42] (default: 0.41). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH4K20me1	Maximum Encode H4K20me1 level over 11 cell lines. Range: [0.010, 226.64] (default: 0.47). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH2AFZ	Maximum Encode H2AFZ level over 13 cell lines. Range: [0.020, 468.98] (default: 0.42). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K9me3	Maximum Encode H3K9me3 level over 14 cell lines. Range: [0.010, 226.64] (default: 0.38). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K27me3	Maximum Encode H3K27me3 level over 14 cell lines. Range: [0.010, 193.38] (default: 0.47). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K36me3	Maximum Encode H3K36me3 level over 10 cell lines. Range: [0.020, 246.88] (default: 0.39). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodeH3K79me2	Maximum Encode H3K79me2 level over 13 cell lines. Range: [0.020, 553.06] (default: 0.34). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	EncodetotalRNA	Maximum Encode totalRNA-seq level over 10 cell lines (minus and plus strand separately). Range: [0, 385096] (default: 0.0). [@encode2012integrated]	num (+)	[Source][Ref]
Epigenetics	GC	Percent GC in a window of +/- 75bp. Range: [0, 1] (default: 0.42).	num (+)	[Source]
Epigenetics	CpG	Percent CpG in a window of +/- 75bp. Range: [0, 0.604] (default: 0.02).	num (+)	[Source]
Transcription Factors	RemapOverlapTF	Remap number of different transcription factors binding. Range: [1, 350] (default: -0.5).	int (+)	[Source]
Transcription Factors	RemapOverlapCL	Remap number of different transcription factor - cell line combinations binding. Range: [1, 1068] (default: -0.5).	int (+)	[Source]
Chromatin States	cHmm E1	Number of 48 cell types in chromHMM state E1_poised. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E2	Number of 48 cell types in chromHMM state E2_repressed. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E3	Number of 48 cell types in chromHMM state E3_dead. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E4	Number of 48 cell types in chromHMM state E4_dead. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E5	Number of 48 cell types in chromHMM state E5_repressed. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E6	Number of 48 cell types in chromHMM state E6_repressed. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E7	Number of 48 cell types in chromHMM state E7_weak. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E8	Number of 48 cell types in chromHMM state E8_gene. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E9	Number of 48 cell types in chromHMM state E9_gene. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E10	Number of 48 cell types in chromHMM state E10_gene. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E11	Number of 48 cell types in chromHMM state E11_gene. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E12	Number of 48 cell types in chromHMM state E12_distal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E13	Number of 48 cell types in chromHMM state E13_distal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E14	Number of 48 cell types in chromHMM state E14_distal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E15	Number of 48 cell types in chromHMM state E15_weak. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E16	Number of 48 cell types in chromHMM state E16_tss. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E17	Number of 48 cell types in chromHMM state E17_proximal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E18	Number of 48 cell types in chromHMM state E18_proximal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E19	Number of 48 cell types in chromHMM state E19_tss. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E20	Number of 48 cell types in chromHMM state E20_poised. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E21	Number of 48 cell types in chromHMM state E21_dead. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E22	Number of 48 cell types in chromHMM state E22_repressed. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E23	Number of 48 cell types in chromHMM state E23_weak. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E24	Number of 48 cell types in chromHMM state E24_distal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Chromatin States	cHmm E25	Number of 48 cell types in chromHMM state E25_distal. (default: 1.92). [@ernst2015large]	num	[Source][Ref]
Local Nucleotide Diversity	RecombinationRate	Recombination rate measures how likely the region is to undergo recombination. Range: [0, 54.96] (default: 0). [@gazal2017linkage]	num (+)	[Ref]
Local Nucleotide Diversity	NuclearDiversity	Nuclear diversity measures how likely the region is to diversify. Range: [0.05, 60.25] (default: 0). [@gazal2017linkage]	num (+)	[Ref]
Local Nucleotide Diversity	bStatistic	Background selection score. A background selection (B) value for each position in the genome. B indicates the expected fraction of neutral diversity that is present at a site, with values close to 0 representing near complete removal of diversity as a result of selection and values near 1000 indicating little effect of selection. Range: [0, 1000] (default: 800). [@mcvicker2009widespread]	int (+)	[Source][Ref]
Mutation Density	Common100bp	Number of common (MAF > 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 14] (default: 0).	int (+)	[Source]
Mutation Density	Rare100bp	Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 31] (default: 0).	int (+)	[Source]
Mutation Density	Sngl100bp	Number of single-occurrence BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 99] (default: 0).	int (+)	[Source]
Mutation Density	Common1000bp	Number of common (MAF > 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 73] (default: 0).	int (+)	[Source]
Mutation Density	Rare1000bp	Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 74] (default: 0).	int (+)	[Source]
Mutation Density	Sngl1000bp	Number of single-occurrence BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 658] (default: 0).	int (+)	[Source]
Mutation Density	Common10000bp	Number of common (MAF > 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 443] (default: 0).	int (+)	[Source]
Mutation Density	Rare10000bp	Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 355] (default: 0).	int (+)	[Source]
Mutation Density	Sngl10000bp	Number of single-occurrence BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 4750] (default: 0).	int (+)	[Source]
Mappability	Umap (k100, k50, k36, k24)	Mappability of the unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means that estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and that the region is more susceptible to spurious mapping of reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). [@karimzadeh2018umap]	num (+)	[Source][Ref]
Mappability	Bismap (k100, k50, k36, k24)	Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). [@karimzadeh2018umap]	num (+)	[Source][Ref]
Proximity Table	minDistTSS	Distance to closest Transcribed Sequence Start (TSS). Range: [1, 3604063] (default: 1e7).	num (-)	[Source]
Proximity Table	minDistTSE	Distance to closest Transcribed Sequence End (TSE). Range: [1, 3608885] (default: 1e7).	num (-)	[Source]
Alphamissense	protein_variant	Amino acid change induced by the alternative allele, in the format [reference amino acid][POS_aa][alternative amino acid] (e.g. V2L). POS_aa is the 1-based position of the residue within the protein amino acid sequence.	String	[Source]
Alphamissense	AM_pathogenicity	Calibrated AlphaMissense pathogenicity scores (ranging between 0 and 1), which can be interpreted as the predicted probability of a variant being clinically pathogenic.	String	[Source]
Alphamissense	AM_class	Classification of the protein_variant into one of three discrete categories: 'likely_benign', 'likely_pathogenic', or 'ambiguous'. These are derived using the following thresholds: 'likely_benign' if alphamissense_pathogenicity < 0.34; 'likely_pathogenic' if alphamissense_pathogenicity > 0.564; and 'ambiguous' otherwise.	String	[Source]
Mutation Rate	filter	Low: low-quality regions as determined by gnomAD sequencing metrics (Mappability < 0.5; overlap with 50nt simple repeat; ReadPosRankSum > 1; 0 SNVs in 100bp window). SFS_bump: pentamer context with abnormal SFS; the fraction of high-frequency SNVs (Range [0.0005, 0.2]) is greater than 1.5x the mutation-rate-controlled average. Tends to be repetitive contexts. TFBS: transcription factor binding site as determined by overlap with ChIP-seq peaks.	String	[Source]
Mutation Rate	PN	Pentanucleotide context	num (+)	[Source]
Mutation Rate	MR	Roulette mutation rate estimate	num (+)	[Source]
Mutation Rate	MG	gnomAD mutation rate estimate (Karczewski et al. 2020)	num (+)	[Source]
Mutation Rate	MC	Carlson mutation rate estimate (Carlson et al. 2018)	num (+)	[Source]
cCREs	accession	Accession number of the cCRE	String	[Source]
cCREs	annotation	Promoter-like (PLS)	String	[Source]
cCREs	annotation	All Candidate Enhancers (pELS & dELS)	String	[Source]
cCREs	annotation	Proximal enhancer-like (pELS)	String	[Source]
cCREs	annotation	Distal enhancer-like (dELS)	String	[Source]
cCREs	annotation	Chromatin Accessible with CTCF (CA-CTCF)	String	[Source]
cCREs	annotation	Chromatin Accessible with H3K4me3 (CA-H3K4me3)	String	[Source]
cCREs	annotation	Chromatin Accessible with TF (CA-TF)	String	[Source]
cCREs	annotation	Chromatin Accessible Only (CA)	String	[Source]
cCREs	annotation	TF Only (TF)	String	[Source]
cCREs	annotation	CTCF-Bound cCREs	String	[Source]
CATlas	Signal_Value	Activity signal strength measured in the tissue	num	[Source]
CATlas	P_value	P-value of the signal significance	num	[Source]
CATlas	Q_value	Q-value (FDR adjusted P-value)	num	[Source]
CATlas	Peak	Peak ID or rank associated with signal	num	[Source]
CATlas	Tissue	Tissue type in which the signal or linkage is observed	String	[Source]
CATlas	cCREs_Region	Linked candidate cis-regulatory element (cCRE) region	String	[Source]
CATlas	Promoter_Region	Promoter region linked to cCRE via ABC model	String	[Source]
CATlas	ABC_Score	ABC score estimating enhancer–promoter interaction strength	num	[Source]
CATlas	Linked Gene	Gene name linked to the cCRE region via promoter	String	[Source]
CATlas	Distance	Genomic distance between cCRE and linked promoter	num	[Source]
EpiMap	BSSID	Unique biosample state identifier	String	[Source]
EpiMap	State	Full chromatin state name (e.g., EnhA1, TssA), describing regulatory role	String	[Source]
EpiMap	Group	Broad category grouping the sample (e.g., cancer, normal)	String	[Source]
EpiMap	Extended_Info	Extended tissue/cell line description (e.g., CANCER PROSTATE)	String	[Source]
EpiMap	Sample_Name	Specific sample name with treatment condition (e.g., A549, 22Rv1 treated with 10 nM 17b-hydroxy)	String	[Source]
pgBoost	Linked Gene	Gene symbol linked to the variant	String	[Source]
pgBoost	pg_boost	Probabilistic score of SNP-gene link from pgBoost (gradient boosting model trained on multiome fine-mapping data using SCENT, Signac, Cicero, distance)	num (+)	[Source]
pgBoost	pg_boost_percentile	Percentile ranking of the pgBoost score across all SNP-gene pairs	num (+)	[Source]
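
The AM_class thresholds listed in the Alphamissense rows above can be written out as a short rule. The following is a minimal Python sketch, not FAVOR's or AlphaMissense's own code: the function name `am_class` is hypothetical, and accepting the score as a string (because the table types AM_pathogenicity as String) is an assumption about how the value might be delivered.

```python
def am_class(am_pathogenicity):
    """Map an AlphaMissense pathogenicity score to its discrete class.

    Hypothetical helper: accepts a float or a numeric string, since the
    table above lists AM_pathogenicity with type String.
    """
    score = float(am_pathogenicity)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score < 0.34:       # strictly below the lower threshold
        return "likely_benign"
    if score > 0.564:      # strictly above the upper threshold
        return "likely_pathogenic"
    return "ambiguous"     # everything in [0.34, 0.564]


if __name__ == "__main__":
    for value in ("0.12", 0.34, 0.5, 0.564, 0.91):
        print(value, am_class(value))
```

Both comparisons are strict, so scores exactly equal to 0.34 or 0.564 fall into the 'ambiguous' category.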