Regions

This page covers the /v1/regions/ endpoints, which let you query a genomic region for the variants it contains. Each response is a list of variants, and pagination is supported.


GET /v1/regions/:region

Retrieve a region

This endpoint retrieves the variants in a region, specified in chr-startPosition-endPosition format (for example, 19-44851820-44908922). The response is a list of variants, and pagination is supported. Refer to the region model for more information on the properties returned.

Path parameters

  • Name
    region
    Type
    string
    Description

    The region to retrieve. The region should be specified in chr-startPosition-endPosition format.
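
The chr-startPosition-endPosition format can be built and checked before a request is sent. The following is a minimal sketch, not part of the API: the helper names format_region and parse_region are hypothetical, and the regular expression is an assumption about which chromosome labels are accepted.

```python
import re
from typing import Tuple

# Assumed shape: an alphanumeric chromosome label, then start and end
# positions, separated by hyphens. The API may accept a narrower set.
_REGION_RE = re.compile(r"^([0-9A-Za-z]+)-(\d+)-(\d+)$")


def format_region(chromosome: str, start: int, end: int) -> str:
    """Build a region string such as '19-44851820-44908922'."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return f"{chromosome}-{start}-{end}"


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split a region string back into (chromosome, start, end)."""
    match = _REGION_RE.match(region)
    if match is None:
        raise ValueError(f"not in chr-start-end format: {region!r}")
    chromosome, start, end = match.groups()
    return chromosome, int(start), int(end)


region = format_region("19", 44851820, 44908922)
print(region)
print(parse_region(region))
```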

Query parameters

  • Name
    page
    Type
    integer
    Description

    The page number to browse.

  • Name
    limit
    Type
    integer
    Description

    The number of objects to return per page.
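
As an illustration of how the page and limit query parameters attach to the path, here is a small sketch using only the Python standard library. The helper name build_region_url is hypothetical; the base URL is taken from the curl example on this page.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Base URL as it appears in the curl example on this page.
BASE_URL = "https://api.genohub.org/v1/regions"


def build_region_url(
    region: str, page: Optional[int] = None, limit: Optional[int] = None
) -> str:
    """Return the request URL, adding only the query parameters that are set."""
    params = {}
    if page is not None:
        params["page"] = page
    if limit is not None:
        params["limit"] = limit
    url = f"{BASE_URL}/{quote(region, safe='-')}"
    return f"{url}?{urlencode(params)}" if params else url


print(build_region_url("19-44851820-44908922"))
print(build_region_url("19-44851820-44908922", page=2, limit=50))
```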

Attributes

  • Name
    data
    Type
    list
    Description

    The list of objects returned by the query.

  • Name
    has_more
    Type
    bool
    Description

    Indicates whether there are more pages to browse.

  • Name
    count
    Type
    integer
    Description

    The total number of objects that match the query.
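
The has_more attribute is enough to drive a pagination loop. The sketch below assumes that page numbering starts at 1, which this page does not state. It takes the HTTP call as a parameter so that the loop can be shown without network access; iter_variants and fake_fetch are hypothetical names, and fake_fetch stands in for a real request.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_variants(
    fetch_page: Callable[[int], Page], first_page: int = 1
) -> Iterator[Dict[str, Any]]:
    """Yield every object in "data", page by page, until has_more is false."""
    page = first_page  # assumption: the first page is numbered 1
    while True:
        body = fetch_page(page)
        yield from body.get("data", [])
        if not body.get("has_more"):
            break
        page += 1


def fake_fetch(page: int) -> Page:
    """Stand-in for an HTTP request; returns canned two-page data."""
    pages = {
        1: {"count": 3, "has_more": True, "data": [
            {"variant_vcf": "19-44908922-G-A"},
            {"variant_vcf": "19-44908921-C-T"},
        ]},
        2: {"count": 3, "has_more": False, "data": [
            {"variant_vcf": "19-44908840-GC-G"},
        ]},
    }
    return pages[page]


variants = list(iter_variants(fake_fetch))
print(len(variants))
```

In real use, fetch_page would request the region URL with the given page number and return the decoded JSON body.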

Request

GET
/v1/regions/:region
curl -G https://api.genohub.org/v1/regions/19-44851820-44908922

Paginated response

{
    "count": 16400,
    "has_more": true,
    "data": [
        {
            "variant_vcf": "19-44908922-G-A",
            "chromosome": "19",
            "position": "44908922",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "2",
            "bravo_af": "0.00000755601",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "nonsynonymous SNV",
            "metasvm_pred": "T",
            // ...
        },
        {
            "variant_vcf": "19-44908921-C-T",
            "chromosome": "19",
            "position": "44908921",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "6",
            "bravo_af": "0.000022668",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "nonsynonymous SNV",
            "metasvm_pred": "T",
            // ...
        },
        // ...
    ]
}
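
In the sample response, numeric values such as position and bravo_af arrive as quoted strings, so a client will usually convert them before doing arithmetic. The sketch below uses a trimmed copy of the first record (the "// ..." elisions in the sample are not valid JSON and are left out); the helper name to_numeric is hypothetical. In this sample, bravo_ac divided by bravo_an agrees with bravo_af.

```python
import json

# Trimmed copy of the first record from the sample response.
sample = json.loads("""
{
    "variant_vcf": "19-44908922-G-A",
    "chromosome": "19",
    "position": "44908922",
    "bravo_an": "264690",
    "bravo_ac": "2",
    "bravo_af": "0.00000755601"
}
""")


def to_numeric(record: dict) -> dict:
    """Convert the string-encoded numeric fields used in this example."""
    return {
        "variant_vcf": record["variant_vcf"],
        "position": int(record["position"]),
        "bravo_an": int(record["bravo_an"]),
        "bravo_ac": int(record["bravo_ac"]),
        "bravo_af": float(record["bravo_af"]),
    }


parsed = to_numeric(sample)
# Allele count over allele number, for comparison with the reported frequency.
ratio = parsed["bravo_ac"] / parsed["bravo_an"]
print(parsed["bravo_af"], ratio)
```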

GET /v1/regions/:region/snv

Retrieve a region's SNV variants

This endpoint retrieves a region's SNV variants, with the region specified in chr-startPosition-endPosition format. The response is a list of SNV variants, and pagination is supported. Refer to the region model for more information on the properties returned.

Path parameters

  • Name
    region
    Type
    string
    Description

    The region to retrieve. The region should be specified in chr-startPosition-endPosition format.

Query parameters

  • Name
    page
    Type
    integer
    Description

    The page number to browse.

  • Name
    limit
    Type
    integer
    Description

    The number of objects to return per page.

Attributes

  • Name
    data
    Type
    list
    Description

    The list of objects returned by the query.

  • Name
    has_more
    Type
    bool
    Description

    Indicates whether there are more pages to browse.

  • Name
    count
    Type
    integer
    Description

    The total number of objects that match the query.

Request

GET
/v1/regions/:region/snv
curl -G https://api.genohub.org/v1/regions/19-44851820-44908922/snv

Paginated response

{
    "count": 15104,
    "has_more": true,
    "data": [
        {
            "variant_vcf": "19-44908922-G-A",
            "chromosome": "19",
            "position": "44908922",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "2",
            "bravo_af": "0.00000755601",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "nonsynonymous SNV",
            "metasvm_pred": "T",
            // ...
        },
        {
            "variant_vcf": "19-44908921-C-T",
            "chromosome": "19",
            "position": "44908921",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "6",
            "bravo_af": "0.000022668",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "nonsynonymous SNV",
            "metasvm_pred": "T",
            // ...
        },
        // ...
    ]
}

GET /v1/regions/:region/indel

Retrieve a region's InDEL variants

This endpoint retrieves a region's InDEL variants, with the region specified in chr-startPosition-endPosition format. The response is a list of InDEL variants, and pagination is supported. Refer to the region model for more information on the properties returned.

Path parameters

  • Name
    region
    Type
    string
    Description

    The region to retrieve. The region should be specified in chr-startPosition-endPosition format.

Query parameters

  • Name
    page
    Type
    integer
    Description

    The page number to browse.

  • Name
    limit
    Type
    integer
    Description

    The number of objects to return per page.

Attributes

  • Name
    data
    Type
    list
    Description

    The list of objects returned by the query.

  • Name
    has_more
    Type
    bool
    Description

    Indicates whether there are more pages to browse.

  • Name
    count
    Type
    integer
    Description

    The total number of objects that match the query.

Request

GET
/v1/regions/:region/indel
curl -G https://api.genohub.org/v1/regions/19-44851820-44908922/indel

Paginated response

{
    "count": 1296,
    "has_more": true,
    "data": [
        {
            "variant_vcf": "19-44908859-C-CCGAGCGCGGCCTCAGCGCCATCCG",
            "chromosome": "19",
            "position": "44908859",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "1",
            "bravo_af": "0.000003778",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "frameshift deletion",
            "metasvm_pred": "",
            // ...
        },
        {
            "variant_vcf": "19-44908840-GC-G",
            "chromosome": "19",
            "position": "44908840",
            "genecode_comprehensive_info": "APOE",
            "bravo_an": "264690",
            "bravo_ac": "1",
            "bravo_af": "0.000003778",
            "filter_status": "PASS",
            "genecode_comprehensive_category": "exonic",
            "genecode_comprehensive_exonic_category": "frameshift deletion",
            "metasvm_pred": "",
            // ...
        },
        // ...
    ]
}
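
The SNV and InDEL samples differ in the lengths of the reference and alternate alleles inside variant_vcf. The sketch below applies the conventional length-based rule to that string. It is an assumption that the API classifies variants the same way, and the helper names are hypothetical; the code also assumes that the allele strings contain no hyphens.

```python
from typing import Tuple


def split_variant_vcf(variant_vcf: str) -> Tuple[str, int, str, str]:
    """Split a chr-pos-ref-alt identifier into its four parts."""
    chromosome, position, ref, alt = variant_vcf.split("-")
    return chromosome, int(position), ref, alt


def variant_kind(variant_vcf: str) -> str:
    """Classify by allele length: single-base swap, length change, or other."""
    _, _, ref, alt = split_variant_vcf(variant_vcf)
    if len(ref) == 1 and len(alt) == 1:
        return "snv"
    if len(ref) != len(alt):
        return "indel"
    # Equal-length multi-base substitutions are left unclassified here.
    return "other"


for vcf in (
    "19-44908922-G-A",
    "19-44908859-C-CCGAGCGCGGCCTCAGCGCCATCCG",
    "19-44908840-GC-G",
):
    print(vcf, variant_kind(vcf))
```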

The region model

The region model is similar to the gene model, and a subset of the variant model. The region model contains the following properties:

Properties

  • Name
    variant_vcf
    Type
    string
    Description

    The unique identifier of the given variant, reported in chr-pos-ref-alt format.

  • Name
    chromosome
    Type
    string
    Description

    The chromosome where the variant is located.

  • Name
    position
    Type
    string
    Description

    The position of the variant.

  • Name
    genecode_comprehensive_info
    Type
    string
    Description

    Identifies whether a variant causes protein-coding changes, using the Gencode gene definition system. The annotation gives the name of the gene that the variant affects; for a variant in an intergenic region, it gives the nearby gene name instead. (Frankish et al., 2018; Harrow et al., 2012)

  • Name
    bravo_an
    Type
    string
    Description

    TOPMed Bravo Genome Allele Number. (NHLBI TOPMed Consortium, 2018; Taliun et al., 2019)

  • Name
    bravo_ac
    Type
    string
    Description

    TOPMed Bravo Genome Allele Count.

  • Name
    bravo_af
    Type
    string
    Description

    TOPMed Bravo Genome Allele Frequency. (NHLBI TOPMed Consortium, 2018; Taliun et al., 2019)

  • Name
    filter_status
    Type
    string
    Description

    TOPMed QC status of the given variant.

  • Name
    genecode_comprehensive_category
    Type
    string
    Description

    Comprehensive category from Gencode (for example, exonic).

  • Name
    genecode_comprehensive_exonic_category
    Type
    string
    Description

    Identifies whether a variant causes protein-coding changes, using the Gencode gene definition system. The annotation gives the name of the gene that the variant affects; for a variant in an intergenic region, it gives the nearby gene name instead.

  • Name
    metasvm_pred
    Type
    string
    Description

    The MetaSVM prediction for the variant (for example, T). The value may be an empty string.

  • Name
    rsid
    Type
    string
    Description

    The rsID of the given variant, if one exists.

  • Name
    cage_enhancer
    Type
    string
    Description

    CAGE defined permissive Enhancer sites from Fantom 5.

  • Name
    cage_promoter
    Type
    string
    Description

    CAGE defined promoter sites from Fantom 5.

  • Name
    sift_cat
    Type
    string
    Description

    SIFT category of change.

  • Name
    polyphen_cat
    Type
    string
    Description

    PolyPhen category of change.

  • Name
    cadd_phred
    Type
    string
    Description

    The CADD score in PHRED scale (integrative score). A higher CADD score indicates a more deleterious variant. Range: [0, 99].

  • Name
    genehancer
    Type
    string
    Description

    Predicted human enhancer sites from the GeneHancer database.

  • Name
    clnsig
    Type
    string
    Description

    Clinical significance for this single variant. (Landrum et al., 2017, 2013)

  • Name
    clnsigincl
    Type
    string
    Description

    Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance. (Landrum et al., 2017, 2013)

  • Name
    clndn
    Type
    string
    Description

    Clinical disease name.

  • Name
    clndnincl
    Type
    string
    Description

    For included variant: ClinVar's preferred disease name for the concept specified by the disease identifiers in clndisdb.

  • Name
    clnrevstat
    Type
    string
    Description

    ClinVar review status for the Variation ID.

  • Name
    origin
    Type
    string
    Description

    Allele origin. One or more of the following values may be added: 0 - unknown; 1 - germline; 2 - somatic; 4 - inherited; 8 - paternal; 16 - maternal; 32 - de-novo; 64 - biparental; 128 - uniparental; 256 - not-tested; 512 - tested-inconclusive.

  • Name
    clndisdb
    Type
    string
    Description

    Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN.

  • Name
    clndisdbincl
    Type
    string
    Description

    For included variant: Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN.

  • Name
    geneinfo
    Type
    string
    Description

    Gene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|).

  • Name
    linsight
    Type
    string
    Description

    The LINSIGHT score (integrative score). A higher LINSIGHT score indicates more functionality. Range: [0.215, 0.995].

  • Name
    fathmm_xf
    Type
    string
    Description

    The FATHMM-XF score (integrative score). A higher FATHMM-XF score indicates more functionality. Range: [0.405, 99.451].

  • Name
    apc_conservation_v2
    Type
    null.Float
    Description

    Conservation annotation PC: the first PC of the standardized scores of “GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP” in PHRED scale. Range: [0, 75.824].

  • Name
    apc_epigenetics_active
    Type
    null.Float
    Description

    Active Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K4me1.max, EncodeH3K4me2.max, EncodeH3K4me3.max, EncodeH3K9ac.max, EncodeH3K27ac.max, EncodeH4K20me1.max, EncodeH2AFZ.max” in PHRED scale. Range: [0, 86.238].

  • Name
    apc_epigenetics_repressed
    Type
    null.Float
    Description

    Repressed Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K9me3.max, EncodeH3K27me3.max” in PHRED scale. Range: [0, 86.238].

  • Name
    apc_epigenetics_transcription
    Type
    null.Float
    Description

    Transcription Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K36me3.max, EncodeH3K79me2.max” in PHRED scale. Range: [0, 86.238].

  • Name
    apc_local_nucleotide_diversity_v3
    Type
    null.Float
    Description

    Local nucleotide diversity annotation PC: the first PC of the standardized scores of “bStatistic, RecombinationRate, NuclearDiversity” in PHRED scale. Range: [0, 86.238].

  • Name
    apc_mappability
    Type
    null.Float
    Description

    Mappability annotation PC: the first PC of the standardized scores of “umap_k100, bismap_k100, umap_k50, bismap_k50, umap_k36, bismap_k36, umap_k24, bismap_k24” in PHRED scale. Range: [0.007, 22.966].

  • Name
    apc_mutation_density
    Type
    null.Float
    Description

    Mutation density annotation PC: the first PC of the standardized scores of “Common100bp, Rare100bp, Sngl100bp, Common1000bp, Rare1000bp, Sngl1000bp, Common10000bp, Rare10000bp, Sngl10000bp” in PHRED scale. Range: [0, 84.477].

  • Name
    apc_protein_function_v3
    Type
    null.Float
    Description

    Protein function annotation PC: the first PC of the standardized scores of “SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score” in PHRED scale. Range: [2.974, 86.238].

  • Name
    apc_transcription_factor
    Type
    null.Float
    Description

    Transcription factor annotation PC: the first PC of the standardized scores of “RemapOverlapTF, RemapOverlapCL” in PHRED scale. Range: [1.185, 86.238].

  • Name
    af_total
    Type
    string
    Description

    GNOMAD v3 Genome Allele Frequency using all the samples.

  • Name
    tg_afr
    Type
    string
    Description

    1000 Genomes African population frequency.

  • Name
    tg_all
    Type
    string
    Description

    1000 Genomes frequency across all populations.

  • Name
    tg_amr
    Type
    string
    Description

    1000 Genomes Admixed American population frequency.

  • Name
    tg_eas
    Type
    string
    Description

    1000 Genomes East Asian population frequency.

  • Name
    tg_eur
    Type
    string
    Description

    1000 Genomes European population frequency.

  • Name
    tg_sas
    Type
    string
    Description

    1000 Genomes South Asian population frequency.
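
Two of the properties above pack several values into one string: origin is a sum of flag values, and geneinfo is a list of symbol:id pairs delimited by vertical bars. The sketch below decodes both. The helper names decode_origin and parse_geneinfo are hypothetical, the input values are illustrative rather than taken from an API response, and the code assumes that origin arrives as a decimal string, like the other numeric fields in the sample responses.

```python
from typing import Dict, List

# Flag values as listed in the origin description.
ORIGIN_FLAGS = {
    1: "germline",
    2: "somatic",
    4: "inherited",
    8: "paternal",
    16: "maternal",
    32: "de-novo",
    64: "biparental",
    128: "uniparental",
    256: "not-tested",
    512: "tested-inconclusive",
}


def decode_origin(origin: str) -> List[str]:
    """Expand a summed origin value into the names of the flags it contains."""
    if not origin:
        return []  # empty string: no origin reported
    value = int(origin)
    if value == 0:
        return ["unknown"]
    return [name for bit, name in ORIGIN_FLAGS.items() if value & bit]


def parse_geneinfo(geneinfo: str) -> Dict[str, str]:
    """Turn 'SYMBOL:ID|SYMBOL:ID' into a mapping of symbol to gene id."""
    result = {}
    for item in geneinfo.split("|"):
        if item:
            symbol, _, gene_id = item.partition(":")
            result[symbol] = gene_id
    return result


print(decode_origin("3"))
print(parse_geneinfo("APOE:348|TOMM40:10452"))
```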