Variants
On this page, we'll dive into the /v1/variants
endpoint you can use to fetch variants programmatically. We'll look at how to retrieve variants by VCF-style identifier, by rsID, and in bulk.
Retrieve a variant
This endpoint allows you to retrieve a single variant. The variant_vcf
must be specified in
chromosome-position-ref-alt
format, e.g. 1-1000-A-T
. Refer to the list at the bottom of
this page to see which properties are included with variant objects.
Path parameters
- Name
variant_vcf
- Type
- string
- Description
The variant in
chromosome-position-ref-alt
format, e.g. 1-1000-A-T
.
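A client may want to check the identifier before sending a request. The following is a minimal sketch in Python, not part of the API itself: the accepted chromosome names and allele alphabet in `VARIANT_RE` are assumptions, and `parse_variant_vcf` and `variant_url` are hypothetical helper names.

```python
import re
from typing import Tuple
from urllib.parse import quote

# Base URL taken from the request examples on this page.
BASE_URL = "https://api.genohub.org/v1"

# Assumption: chromosomes 1-22, X, Y, M/MT; a positive integer position;
# ref and alt alleles written with A/C/G/T only. The API may accept more.
VARIANT_RE = re.compile(r"^([0-9]{1,2}|X|Y|MT?)-([1-9][0-9]*)-([ACGT]+)-([ACGT]+)$")


def parse_variant_vcf(variant_vcf: str) -> Tuple[str, int, str, str]:
    """Split a chromosome-position-ref-alt string into its four parts."""
    match = VARIANT_RE.match(variant_vcf)
    if match is None:
        raise ValueError(f"not in chromosome-position-ref-alt format: {variant_vcf!r}")
    chrom, pos, ref, alt = match.groups()
    return chrom, int(pos), ref, alt


def variant_url(variant_vcf: str) -> str:
    """Build the request URL for a single variant after validating its format."""
    parse_variant_vcf(variant_vcf)  # raises ValueError on malformed input
    return f"{BASE_URL}/variants/{quote(variant_vcf, safe='')}"
```

Validating locally keeps malformed identifiers such as `19:44908822:C:T` from reaching the server at all.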
Request
curl -G https://api.genohub.org/v1/variants/19-44908822-C-T
Response
{
"variant_vcf": "19-44908822-C-T",
"chromosome": "19",
"position": "44908822",
"bravo_an": 264690,
"bravo_ac": 20678,
"bravo_af": 0.0781216,
"filter_status": "PASS",
"rsid": "rs7412",
"genecode_comprehensive_category": "exonic",
"genecode_comprehensive_info": "APOE",
"genecode_comprehensive_exonic_category": "nonsynonymous SNV",
"genecode_comprehensive_exonic_info":
"APOE:ENST00000446996.5:exon4:c.C526T:p.R176C,APOE:ENST00000434152.5:exon4:c.C604T:p.R202C,APOE:ENST00000252486.9:exon4:c.C526T:p.R176C,APOE:ENST00000425718.1:exon3:c.C526T:p.R176C,",
"ucsc_info": "ENST00000252486.8,ENST00000425718.1,ENST00000434152.5,ENST00000446996.5",
"ucsc_exonic_info":
"ENST00000446996.5:ENST00000446996.5:exon4:c.C526T:p.R176C,ENST00000434152.5:ENST00000434152.5:exon4:c.C604T:p.R202C,ENST00000425718.1:ENST00000425718.1:exon3:c.C526T:p.R176C,ENST00000252486.8:ENST00000252486.8:exon4:c.C526T:p.R176C,",
"polyphen2_hdiv_score": 1,
// ...
}
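As a quick sanity check on the response above, an allele frequency is by definition the allele count divided by the allele number, so `bravo_af` should agree with `bravo_ac / bravo_an` up to rounding. The sketch below uses only the three values shown in the example response; `allele_frequency` is a hypothetical helper, and the handling of missing values is an assumption based on the nullable types listed in the variant model.

```python
import math
from typing import Optional


def allele_frequency(ac: Optional[int], an: Optional[int]) -> Optional[float]:
    """Return ac / an, or None when a value is missing or an is zero."""
    if ac is None or an is None or an == 0:
        return None
    return ac / an


# Values copied from the example response above.
record = {"bravo_an": 264690, "bravo_ac": 20678, "bravo_af": 0.0781216}

computed = allele_frequency(record["bravo_ac"], record["bravo_an"])
# The reported frequency is rounded, so compare with a tolerance.
matches = computed is not None and math.isclose(computed, record["bravo_af"], rel_tol=1e-5)
```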
Retrieve a variant using rsID
This endpoint allows you to retrieve variants using rsID. The rsid
must be specified in rs
format, e.g.
rs7412
. Refer to the list at the bottom of this page to see which properties are
included with variant objects.
Not every variant has an rsID. If you cannot find a variant using its rsID, try the Retrieve a variant endpoint instead.
Path parameters
- Name
rsid
- Type
- string
- Description
The rsID, e.g.
rs7412
.
Request
curl -G https://api.genohub.org/v1/rsids/rs7412
Response
[{
"variant_vcf": "19-44908822-C-T",
"chromosome": "19",
"position": "44908822",
"bravo_an": 264690,
"bravo_ac": 20678,
"bravo_af": 0.0781216,
"filter_status": "PASS",
"rsid": "rs7412",
"genecode_comprehensive_category": "exonic",
"genecode_comprehensive_info": "APOE",
"genecode_comprehensive_exonic_category": "nonsynonymous SNV",
// ...
}]
The response is a list of variants. Pagination is not supported for this endpoint, so all matching variants are returned in a single response.
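Because this endpoint returns a list rather than a single object, client code has to handle zero, one, or several records. The sketch below is one possible way to do that in Python; `is_rsid` and `index_by_variant` are hypothetical helpers, the rsID pattern (`rs` followed by digits) is an assumption, and the sample payload is an abbreviated copy of the response above.

```python
import json
import re
from typing import Any, Dict

# Assumption: an rsID is the literal prefix "rs" followed by digits.
RSID_RE = re.compile(r"^rs[0-9]+$")


def is_rsid(value: str) -> bool:
    """Return True when the value looks like an rsID such as rs7412."""
    return RSID_RE.match(value) is not None


def index_by_variant(payload: str) -> Dict[str, Dict[str, Any]]:
    """Parse the JSON list response and key each record by its variant_vcf."""
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON list of variants")
    return {record["variant_vcf"]: record for record in records}


# Abbreviated sample of the response shown above.
sample = '[{"variant_vcf": "19-44908822-C-T", "rsid": "rs7412"}]'
variants = index_by_variant(sample)
```

Keying by `variant_vcf` makes it easy to follow up with the Retrieve a variant endpoint for any individual record.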
Retrieve multiple variants
This endpoint allows you to retrieve multiple variants.
We are currently in beta and only support up to 100,000 variants per request.
Request body
- Name
email
- Type
- string
- Description
Your email address.
- Name
organization
- Type
- string
- Description
Your organization.
- Name
file-upload
- Type
- text
- Description
A file containing a list of variants in
chromosome-position-ref-alt
format, e.g. 1-1000-A-T
. The file must be in .txt
or .txt.gz
format and each variant must be on a separate line.
variants.txt
1-1000-A-T
1-1001-A-T
1-1002-A-T
- Name
file-upload-type
- Type
- text
- Description
The content type of the file. It can be
text/plain
, application/gzip
, or application/x-gzip
.
- Name
coordinate-system
- Type
- string
- Description
The coordinate system of the variant. It can be either
1-base
or 0-base
.
- Name
left-normalization
- Type
- boolean
- Description
Whether to left-normalize the variant. It can be either
true
or false
.
Request
curl --location 'https://api.genohub.org/v1/variants' \
--form 'email="your_email"' \
--form 'organization="your_organization"' \
--form 'file-upload=@"filename.txt"' \
--form 'file-upload-type="text/plain"' \
--form 'coordinate-system="1-base"' \
--form 'left-normalization="false"'
You will receive an email with a link to download the results in CSV format.
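The upload file for the request above can be prepared with a few lines of Python. This is a minimal sketch under stated assumptions: `write_variant_file` is a hypothetical helper, the 100,000 limit is the beta limit mentioned above, and choosing gzip output from a `.gz` suffix is a convention of this example rather than an API requirement.

```python
import gzip
from pathlib import Path
from typing import Iterable

# Beta limit stated in this section.
MAX_VARIANTS = 100_000


def write_variant_file(variants: Iterable[str], path: str) -> int:
    """Write one variant per line to a .txt or .txt.gz file; return the count."""
    # Drop blank entries and surrounding whitespace.
    lines = [v.strip() for v in variants if v.strip()]
    if len(lines) > MAX_VARIANTS:
        raise ValueError(f"{len(lines)} variants exceeds the limit of {MAX_VARIANTS}")
    data = "\n".join(lines) + "\n"
    if path.endswith(".gz"):
        # Send with file-upload-type application/gzip.
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write(data)
    else:
        # Send with file-upload-type text/plain.
        Path(path).write_text(data, encoding="utf-8")
    return len(lines)
```

For example, `write_variant_file(["1-1000-A-T", "1-1001-A-T"], "variants.txt")` produces a file that can be passed to the `file-upload` form field.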
The variant model
The variant model contains all the information about the variant, such as functional scores and GENCODE comprehensive annotations.
Properties
- Name
variant_vcf
- Type
- string
- Description
The unique identifier of the given variant, reported in chr-pos-ref-alt format.
- Name
chromosome
- Type
- string
- Description
The chromosome where the variant is located.
- Name
position
- Type
- string
- Description
The position where the variant is located.
- Name
bravo_an
- Type
- null.Int
- Description
TOPMed Bravo Genome Allele Number. (NHLBI TOPMed Consortium, 2018; Taliun et al., 2019)
- Name
bravo_ac
- Type
- null.Int
- Description
TOPMed Bravo Genome Allele Count.
- Name
bravo_af
- Type
- null.Float
- Description
TOPMed Bravo Genome Allele Frequency. (NHLBI TOPMed Consortium, 2018; Taliun et al., 2019)
- Name
filter_status
- Type
- string
- Description
TOPMed QC status of the given variant.
- Name
rsid
- Type
- string
- Description
The rsID of the given variant (if one exists).
- Name
genecode_comprehensive_category
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the GENCODE gene definition. The annotation labels the gene that the variant affects; if the variant lies in an intergenic region, the nearby gene names are labeled instead.
- Name
genecode_comprehensive_info
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the GENCODE gene definition. The annotation labels the gene that the variant affects; if the variant lies in an intergenic region, the nearby gene names are labeled instead.
- Name
genecode_comprehensive_exonic_category
- Type
- string
- Description
Identifies the variant's impact using the GENCODE exonic definition, labeling only exonic categories such as synonymous, non-synonymous, and frameshift indels.
- Name
genecode_comprehensive_exonic_info
- Type
- string
- Description
Identifies the variant's impact using the GENCODE exonic definition, labeling only exonic categories such as synonymous, non-synonymous, and frameshift indels.
- Name
ucsc_info
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the UCSC gene definition. The annotation labels the gene that the variant affects; if the variant lies in an intergenic region, the nearby gene names are labeled instead.
- Name
ucsc_exonic_info
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the UCSC gene definition, with detailed annotation of which exons the variant affects and how it changes the amino acid sequence.
- Name
polyphen2_hdiv_score
- Type
- null.Float
- Description
Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumDiv is Mendelian disease variants vs. divergence from close mammalian homologs of human proteins (>=95% sequence identity). Range: [0, 1] (default: 0).
- Name
polyphen2_hvar_score
- Type
- null.Float
- Description
Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumVar is all human variants associated with some disease (except cancer mutations) or loss of activity/function vs. common (minor allele frequency >1%) human polymorphism with no reported association with a disease or other effect. Range: [0, 1] (default: 0).
- Name
mutation_taster_score
- Type
- null.Float
- Description
MutationTaster is a free web-based application to evaluate DNA sequence variants for their disease-causing potential. The software performs a battery of in silico tests to estimate the impact of the variant on the gene product/protein. Range: [0, 1] (default: 0).
- Name
mutation_assessor_score
- Type
- null.Float
- Description
Predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. Range: [-5.135, 6.490] (default: -5.545).
- Name
metasvm_pred
- Type
- string
- Description
Description for MetasvmPred
- Name
refseq_info
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the RefSeq gene definition. The annotation labels the gene that the variant affects; if the variant lies in an intergenic region, the nearby gene names are labeled instead.
- Name
refseq_exonic_info
- Type
- string
- Description
Identifies whether the variant causes protein-coding changes under the RefSeq gene definition, with detailed annotation of which exons the variant affects and how it changes the amino acid sequence.
- Name
cage_enhancer
- Type
- string
- Description
CAGE defined permissive Enhancer sites from Fantom 5.
- Name
cage_promoter
- Type
- string
- Description
CAGE defined promoter sites from Fantom 5.
- Name
genehancer
- Type
- string
- Description
Predicted human enhancer sites from the GeneHancer database.
- Name
super_enhancer
- Type
- string
- Description
Predicted super-enhancer sites and targets in a range of human cell types.
- Name
clnsig
- Type
- string
- Description
Clinical significance for this single variant. (Landrum et al., 2017, 2013)
- Name
clnsigincl
- Type
- string
- Description
Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance. (Landrum et al., 2017, 2013)
- Name
clndn
- Type
- string
- Description
Clinical disease name
- Name
clndnincl
- Type
- string
- Description
Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance.
- Name
clnrevstat
- Type
- string
- Description
ClinVar review status for the Variation ID.
- Name
origin
- Type
- string
- Description
Allele origin. One or more of the following values may be added: 0 - unknown; 1 - germline; 2 - somatic; 4 - inherited; 8 - paternal; 16 - maternal; 32 - de-novo; 64 - biparental; 128 - uniparental; 256 - not-tested; 512 - tested-inconclusive.
- Name
clndisdb
- Type
- string
- Description
Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN.
- Name
clndisdbincl
- Type
- string
- Description
For included variant: Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN.
- Name
geneinfo
- Type
- string
- Description
Gene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|).
- Name
linsight
- Type
- null.Float
- Description
The LINSIGHT score (integrative score). A higher LINSIGHT score indicates more functionality. Range: [0.215, 0.995].
- Name
fathmm_xf
- Type
- null.Float
- Description
The FATHMM-XF score (integrative score). A higher FATHMM-XF score indicates more functionality. Range: [0.405, 99.451].
- Name
gc
- Type
- null.Float
- Description
Percent GC in a window of +/- 75bp. Range: [0, 1] (default: 0.42).
- Name
cpg
- Type
- null.Float
- Description
Percent CpG in a window of +/- 75bp. Range: [0, 0.6] (default: 0.02).
- Name
min_dist_tss
- Type
- null.Int
- Description
Distance to closest Transcribed Sequence Start (TSS). Range: [1, 3604058] (default: 1e7).
- Name
min_dist_tse
- Type
- null.Int
- Description
Distance to closest Transcribed Sequence End (TSE). Range: [1, 3610636] (default: 1e7).
- Name
sift_cat
- Type
- string
- Description
SIFT category of change.
- Name
sift_val
- Type
- null.Float
- Description
SIFT score, ranges from 0.0 (deleterious) to 1.0 (tolerated). Range: [0, 1] (default: 1).
- Name
polyphen_cat
- Type
- string
- Description
PolyPhen category of change.
- Name
polyphen_val
- Type
- null.Float
- Description
PolyPhen score: It predicts the functional significance of an allele replacement from its individual features. Range: [0, 1] (default: 0).
- Name
priphcons
- Type
- null.Float
- Description
Primate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related, and instead of measuring similarity/divergence simply in terms of percent identity. It uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 0.999] (default: 0.0).
- Name
mamphcons
- Type
- null.Float
- Description
Mammalian phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related, and instead of measuring similarity/divergence simply in terms of percent identity. It uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0).
- Name
verphcons
- Type
- null.Float
- Description
Vertebrate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It considers the phylogeny by which these species are related, and instead of measuring similarity/divergence simply in terms of percent identity. It uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0).
- Name
priphylop
- Type
- null.Float
- Description
Primate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-10.761, 0.595] (default: -0.029)
- Name
mamphylop
- Type
- null.Float
- Description
Mammalian phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 4.494] (default: -0.005).
- Name
verphylop
- Type
- null.Float
- Description
Vertebrate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 11.295] (default: 0.042).
- Name
bstatistic
- Type
- null.Float
- Description
Background selection score. A background selection (B) value for each position in the genome. B indicates the expected fraction of neutral diversity that is present at a site, with values close to 0 representing near complete removal of diversity as a result of selection and values near 1000 indicating little effect of selection. Range: [0, 1000] (default: 800).
- Name
chmm_e1
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E1_poised. (default: 1.92).
- Name
chmm_e2
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E2_repressed. (default: 1.92).
- Name
chmm_e3
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E3_dead. (default: 1.92).
- Name
chmm_e4
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E4_dead. (default: 1.92).
- Name
chmm_e5
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E5_repressed. (default: 1.92).
- Name
chmm_e6
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E6_repressed. (default: 1.92).
- Name
chmm_e7
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E7_weak. (default: 1.92).
- Name
chmm_e8
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E8_gene. (default: 1.92).
- Name
chmm_e9
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E9_gene. (default: 1.92).
- Name
chmm_e10
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E10_gene. (default: 1.92).
- Name
chmm_e11
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E11_gene. (default: 1.92).
- Name
chmm_e12
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E12_distal. (default: 1.92).
- Name
chmm_e13
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E13_distal. (default: 1.92).
- Name
chmm_e14
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E14_distal. (default: 1.92).
- Name
chmm_e15
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E15_weak. (default: 1.92).
- Name
chmm_e16
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E16_tss. (default: 1.92).
- Name
chmm_e17
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E17_proximal. (default: 1.92).
- Name
chmm_e18
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E18_proximal. (default: 1.92).
- Name
chmm_e19
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E19_tss. (default: 1.92).
- Name
chmm_e20
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E20_poised. (default: 1.92).
- Name
chmm_e21
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E21_dead. (default: 1.92).
- Name
chmm_e22
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E22_repressed. (default: 1.92).
- Name
chmm_e23
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E23_weak. (default: 1.92).
- Name
chmm_e24
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E24_distal. (default: 1.92).
- Name
chmm_e25
- Type
- null.Float
- Description
Number of 48 cell types in chromHMM state E25_distal. (default: 1.92).
- Name
gerp_n
- Type
- null.Float
- Description
Neutral evolution score defined by GERP++. A higher score means the region is more conserved. Range: [0, 19.8] (default: 3.0).
- Name
gerp_s
- Type
- null.Float
- Description
Rejected Substitution score defined by GERP++. A higher score means the region is more conserved. GERP (Genomic Evolutionary Rate Profiling) identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. These deficits are referred to as "Rejected Substitutions". Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. Positive scores (fewer than expected) indicate that a site is under evolutionary constraint. Negative scores may be weak evidence of accelerated rates of evolution. Range: [-39.5, 19.8] (default: -0.2).
- Name
encodeh3k4me1_sum
- Type
- null.Float
- Description
Maximum Encode H3K4me1 level over 13 cell lines. Range: [0.015, 91.954] (default: 0.37).
- Name
encodeh3k4me2_sum
- Type
- null.Float
- Description
Maximum Encode H3K4me2 level over 14 cell lines. Range: [0.024, 148.887] (default: 0.37).
- Name
encodeh3k4me3_sum
- Type
- null.Float
- Description
Maximum Encode H3K4me3 level over 14 cell lines. Range: [0.012, 239.512] (default: 0.38).
- Name
encodeh3k9ac_sum
- Type
- null.Float
- Description
Maximum Encode H3K9ac level over 13 cell lines. Range: [0.019, 281.187] (default: 0.41).
- Name
encodeh3k9me3_sum
- Type
- null.Float
- Description
Maximum Encode H3K9me3 level over 14 cell lines. Range: [0.011, 58.712] (default: 0.38).
- Name
encodeh3k27ac_sum
- Type
- null.Float
- Description
Maximum Encode H3K27ac level over 14 cell lines. Range: [0.013, 288.608] (default: 0.36).
- Name
encodeh3k27me3_sum
- Type
- null.Float
- Description
Maximum Encode H3K27me3 level over 14 cell lines. Range: [0.014, 87.122] (default: 0.47).
- Name
encodeh3k36me3_sum
- Type
- null.Float
- Description
Maximum Encode H3K36me3 level over 10 cell lines. Range: [0.009, 56.176] (default: 0.39).
- Name
encodeh3k79me2_sum
- Type
- null.Float
- Description
Maximum Encode H3K79me2 level over 13 cell lines. Range: [0.015, 118.706] (default: 0.34).
- Name
encodeh4k20me1_sum
- Type
- null.Float
- Description
Maximum Encode H4K20me1 level over 11 cell lines. Range: [0.054, 73.230] (default: 0.47).
- Name
encodeh2afz_sum
- Type
- null.Float
- Description
Maximum Encode H2AFZ level over 13 cell lines. Range: [0.031, 96.072] (default: 0.42).
- Name
encode_dnase_sum
- Type
- null.Float
- Description
Maximum Encode DNase-seq level over 12 cell lines. Range: [0.001, 118672] (default: 0.0).
- Name
encodetotal_rna_sum
- Type
- null.Float
- Description
Maximum Encode totalRNA-seq level over 10 cell lines (minus and plus strand separately). Range: [0, 92282.7]
- Name
grantham
- Type
- null.Float
- Description
Grantham score: oAA, nAA. It attempts to predict the distance between two amino acids, in an evolutionary sense. A lower Grantham score reflects less evolutionary distance. A higher Grantham score reflects a greater evolutionary distance, and is considered more deleterious. Range: [0, 215] (default: 0).
- Name
freq100bp
- Type
- null.Float
- Description
Number of common (MAF > 0.05) BRAVO SNVs in the nearby 100 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 100. Range: [0, 13]
- Name
rare100bp
- Type
- null.Float
- Description
Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 100 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 100. Range: [0, 31] (default: 0).
- Name
sngl100bp
- Type
- null.Float
- Description
Number of single occurrence of BRAVO SNVs in the nearby 100 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutation. Scores range from 0 to 100. Range: [0, 99] (default: 0).
- Name
freq1000bp
- Type
- null.Float
- Description
Number of common (MAF > 0.05) BRAVO SNVs in the nearby 1000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 1000. Range: [0, 73] (default: 0).
- Name
rare1000bp
- Type
- null.Float
- Description
Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 1000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 1000. Range: [0, 74] (default: 0).
- Name
sngl1000bp
- Type
- null.Float
- Description
Number of single occurrence of BRAVO SNVs in the nearby 1000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutation. Scores range from 0 to 1000. Range: [0, 658] (default: 0).
- Name
freq10000bp
- Type
- null.Float
- Description
Number of common (MAF > 0.05) BRAVO SNVs in the nearby 10000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 10000. Range: [0, 443] (default: 0).
- Name
rare10000bp
- Type
- null.Float
- Description
Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 10000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutations. Scores range from 0 to 10000. Range: [0, 355] (default: 0).
- Name
sngl10000bp
- Type
- null.Float
- Description
Number of single occurrence of BRAVO SNVs in the nearby 10000 bp window (default: 0). A higher value indicates more mutations happen in the region and a higher likelihood of mutation. Scores range from 0 to 10000. Range: [0, 4750] (default: 0).
- Name
remap_overlap_tf
- Type
- null.Float
- Description
Remap number of different transcription factors binding. Range: [1, 350] (default: -0.5).
- Name
remap_overlap_cl
- Type
- null.Float
- Description
Remap number of different transcription factor - cell line combinations binding. Range: [1, 1068] (default: -0.5).
- Name
cadd_rawscore
- Type
- null.Float
- Description
The CADD raw score (integrative score). A higher CADD score indicates more deleterious. Range: [-237.102, 22.763].
- Name
cadd_phred
- Type
- null.Float
- Description
The CADD score in PHRED scale (integrative score). A higher CADD score indicates more deleterious. Range: [0, 99].
- Name
apc_conservation_v2
- Type
- null.Float
- Description
Conservation annotation PC: the first PC of the standardized scores of “GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP” in PHRED scale. Range: [0, 75.824].
- Name
apc_epigenetics_active
- Type
- null.Float
- Description
Active Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K4me1.max, EncodeH3K4me2.max, EncodeH3K4me3.max, EncodeH3K9ac.max, EncodeH3K27ac.max, EncodeH4K20me1.max, EncodeH2AFZ.max” in PHRED scale. Range: [0, 86.238].
- Name
apc_epigenetics_repressed
- Type
- null.Float
- Description
Repressed Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K9me3.max, EncodeH3K27me3.max” in PHRED scale. Range: [0, 86.238].
- Name
apc_epigenetics_transcription
- Type
- null.Float
- Description
Transcription Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K36me3.max, EncodeH3K79me2.max” in PHRED scale. Range: [0, 86.238].
- Name
apc_local_nucleotide_diversity_v3
- Type
- null.Float
- Description
Local nucleotide diversity annotation PC: the first PC of the standardized scores of “bStatistic, RecombinationRate, NuclearDiversity” in PHRED scale. Range: [0, 86.238].
- Name
apc_mappability
- Type
- null.Float
- Description
Mappability annotation PC: the first PC of the standardized scores of “umap_k100, bismap_k100, umap_k50, bismap_k50, umap_k36, bismap_k36, umap_k24, bismap_k24” in PHRED scale. Range: [0.007, 22.966].
- Name
apc_mutation_density
- Type
- null.Float
- Description
Mutation density annotation PC: the first PC of the standardized scores of “Common100bp, Rare100bp, Sngl100bp, Common1000bp, Rare1000bp, Sngl1000bp, Common10000bp, Rare10000bp, Sngl10000bp” in PHRED scale. Range: [0, 84.477].
- Name
apc_protein_function_v3
- Type
- null.Float
- Description
Protein function annotation PC: the first PC of the standardized scores of “SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score” in PHRED scale. Range: [2.974, 86.238].
- Name
apc_transcription_factor
- Type
- null.Float
- Description
Transcription factor annotation PC: the first PC of the standardized scores of “RemapOverlapTF, RemapOverlapCL” in PHRED scale. Range: [1.185, 86.238].
- Name
tg_afr
- Type
- null.Float
- Description
1000 Genomes African population frequency.
- Name
tg_all
- Type
- null.Float
- Description
1000 Genomes all-population frequency.
- Name
tg_amr
- Type
- null.Float
- Description
1000 Genomes Ad Mixed American population frequency.
- Name
tg_eas
- Type
- null.Float
- Description
1000 Genomes East Asian population frequency.
- Name
tg_eur
- Type
- null.Float
- Description
1000 Genomes European population frequency.
- Name
tg_sas
- Type
- null.Float
- Description
1000 Genomes South Asian population frequency.
- Name
af_total
- Type
- null.Float
- Description
GNOMAD v3 Genome Allele Frequency using all the samples.
- Name
af_asj_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Ashkenazi Jewish Female population frequency.
- Name
af_eas_female
- Type
- null.Float
- Description
GNOMAD v3 Genome East Asian Female population frequency.
- Name
af_afr_male
- Type
- null.Float
- Description
GNOMAD v3 Genome African Male population frequency.
- Name
af_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Female Allele Frequency.
- Name
af_fin_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Finnish European Male population frequency.
- Name
af_oth_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Other (population not assigned) Female frequency.
- Name
af_ami
- Type
- null.Float
- Description
GNOMAD v3 Genome Amish population frequency.
- Name
af_oth
- Type
- null.Float
- Description
GNOMAD v3 Genome Other (population not assigned) frequency.
- Name
af_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Male Allele Frequency.
- Name
af_ami_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Amish Female population frequency.
- Name
af_afr
- Type
- null.Float
- Description
GNOMAD v3 Genome African population frequency.
- Name
af_eas_male
- Type
- null.Float
- Description
GNOMAD v3 Genome East Asian Male population frequency.
- Name
af_sas
- Type
- null.Float
- Description
GNOMAD v3 Genome South Asian population frequency.
- Name
af_nfe_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Non-Finnish European Female population frequency.
- Name
af_asj_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Ashkenazi Jewish Male population frequency.
- Name
af_oth_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Other (population not assigned) Male frequency.
- Name
af_nfe_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Non-Finnish European Male population frequency.
- Name
af_asj
- Type
- null.Float
- Description
GNOMAD v3 Genome Ashkenazi Jewish population frequency.
- Name
af_amr_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Ad Mixed American Male population frequency.
- Name
af_amr_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Ad Mixed American Female population frequency.
- Name
af_sas_female
- Type
- null.Float
- Description
GNOMAD v3 Genome South Asian Female population frequency.
- Name
af_fin
- Type
- null.Float
- Description
GNOMAD v3 Genome Finnish European population frequency.
- Name
af_afr_female
- Type
- null.Float
- Description
GNOMAD v3 Genome African Female population frequency.
- Name
af_sas_male
- Type
- null.Float
- Description
GNOMAD v3 Genome South Asian Male population frequency.
- Name
af_amr
- Type
- null.Float
- Description
GNOMAD v3 Genome Ad Mixed American population frequency.
- Name
af_nfe
- Type
- null.Float
- Description
GNOMAD v3 Genome Non-Finnish European population frequency.
- Name
af_eas
- Type
- null.Float
- Description
GNOMAD v3 Genome East Asian population frequency.
- Name
af_ami_male
- Type
- null.Float
- Description
GNOMAD v3 Genome Amish Male population frequency.
- Name
af_fin_female
- Type
- null.Float
- Description
GNOMAD v3 Genome Finnish European Female population frequency.
- Name
Bismap (k100, k50, k36, k24)
- Type
- null.Float
- Description
Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0).
- Name
Umap (k100, k50, k36, k24)
- Type
- null.Float
- Description
Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0).
- Name
recombination_rate
- Type
- null.Float
- Description
Recombination rate measures how likely the region is to undergo recombination. Range: [0, 54.96] (default: 0).
- Name
nucdiv
- Type
- null.Float
- Description
Nuclear diversity measures how likely the region is to diversify. Range: [0.05, 60.25] (default: 0).
- Name
aloft_value
- Type
- string
- Description
ALoFT provides extensive annotations to putative loss-of-function variants (LoF) in protein-coding genes including functional, evolutionary and network features (integrative score).
- Name
aloft_description
- Type
- string
- Description
ALoFT annotation can predict the impact of premature stop variants and classify them as dominant disease-causing, recessive disease-causing and benign variants (integrative score).
- Name
funseq_value
- Type
- string
- Description
A flexible framework to prioritize regulatory mutations from cancer genome sequencing (integrative score).
- Name
funseq_description
- Type
- string
- Description
Funseq annotation points out whether the given mutation falls in a coding or non-coding region (integrative score).
- Name
filter_value
- Type
- string
- Description
Filter value. Low: Low quality regions as determined by gnomAD sequencing metrics. Mappability 0.5; overlap with 50nt simple repeat; ReadPosRankSum>1; 0 SNVs in 100bp window. SFS_bump: Pentamer context with abnormal SFS. The fraction of high-frequency SNVs (MAF between 0.2 and 0.0005) is greater than 1.5x the mutation-rate-controlled average. Tends to be repetitive contexts. TFBS: Transcription factor binding site as determined by overlap with ChIP-seq peaks.
- Name
pn
- Type
- string
- Description
Pentanucleotide context.
- Name
mr
- Type
- null.Float
- Description
Roulette mutation rate estimate.
- Name
ar
- Type
- null.Float
- Description
Adjusted Roulette mutation rate estimate.
- Name
mg
- Type
- null.Float
- Description
gnomAD mutation rate estimate (Karczewski et al. 2020).
- Name
mc
- Type
- null.Float
- Description
Carlson mutation rate estimate (Carlson et al. 2018).
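Two of the string properties above pack several values into one field: origin is a sum of flag values, and geneinfo is a list of symbol:id pairs separated by vertical bars. The following Python sketch shows one way to unpack them. It assumes that origin arrives as a decimal string; `decode_origin` and `parse_geneinfo` are hypothetical helper names, and the gene symbols used to illustrate geneinfo are placeholders, not real data.

```python
from typing import List, Tuple

# Flag values as listed in the origin property description.
ORIGIN_FLAGS = {
    1: "germline",
    2: "somatic",
    4: "inherited",
    8: "paternal",
    16: "maternal",
    32: "de-novo",
    64: "biparental",
    128: "uniparental",
    256: "not-tested",
    512: "tested-inconclusive",
}


def decode_origin(origin: str) -> List[str]:
    """Expand a summed origin value into the names of its flags."""
    value = int(origin)  # assumption: the field holds a decimal integer string
    if value == 0:
        return ["unknown"]
    return [name for bit, name in ORIGIN_FLAGS.items() if value & bit]


def parse_geneinfo(geneinfo: str) -> List[Tuple[str, str]]:
    """Split 'SYMBOL:ID|SYMBOL:ID' into (symbol, id) pairs."""
    pairs = []
    for item in geneinfo.split("|"):
        if not item:
            continue  # skip empty segments, e.g. from a trailing bar
        symbol, _, gene_id = item.partition(":")
        pairs.append((symbol, gene_id))
    return pairs
```

With these helpers, an origin of `"3"` expands to germline plus somatic, and a placeholder geneinfo such as `"GENE1:111|GENE2:222"` yields two (symbol, id) pairs.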