Location of Genes on Chromosome
4.8 years ago
sp29 ▴ 50

I have a list of genes (Ensembl IDs) and need to find their locations on the human chromosomes. The information I need for each gene: start location, stop location, and length.

I have tried ShinyGO, but it only gave me a graph of the gene locations on the chromosomes, not the exact coordinates.

Chromosome Mapping Gene Chromosomal-Mapping • 1.6k views
4.8 years ago

If your list of Ensembl IDs isn't too long (500 max), probably the easiest way to get the start and stop coordinates is from BioMart. Select the "Ensembl Genes" database and the "Human genes" dataset. Enter your gene IDs under Filters > Gene > Input external references ID list, and under Attributes > Gene select Gene ID, Chromosome, Gene Start and Gene End.

Note that "Gene start" is the earliest start coordinate of any transcript associated with that gene, and "Gene end" is the last end coordinate of any transcript. The "length" of the gene (end minus start, plus one) is therefore at least as long as, and often longer than, the length of any individual transcript.
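To make the point about gene length concrete, here is a minimal sketch in Python. The transcript coordinates are made up for illustration (they are not from a real gene), and `gene_span` is a hypothetical helper name; coordinates are treated as 1-based and inclusive, as Ensembl reports them.

```python
def gene_span(transcripts):
    """Return (start, end, length) of a gene from its transcripts.

    `transcripts` is a list of (start, end) pairs, 1-based inclusive.
    The gene start is the smallest transcript start and the gene end
    is the largest transcript end.
    """
    start = min(s for s, _ in transcripts)
    end = max(e for _, e in transcripts)
    return start, end, end - start + 1


# Hypothetical transcript coordinates, for illustration only.
transcripts = [(100, 500), (300, 900)]

span = gene_span(transcripts)
transcript_spans = [e - s + 1 for s, e in transcripts]

print(span)              # gene start, gene end, gene length
print(transcript_spans)  # genomic span of each transcript
```

Here the gene spans more bases than either transcript does, because the two transcripts only partly overlap. The spliced length of a transcript (exons only) would be shorter still.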

4.8 years ago
mark.ziemann ★ 1.9k

Gene location information can be found in the GTF or GFF files on the Ensembl FTP site. You just need to make sure that the version of the GTF/GFF file you use is the same as the annotation of the gene list you received. Older versions of Ensembl can be found at the archive.

Here are the first few lines of a GTF file. Lines that have "gene" in the 3rd column show the gene's start and end coordinates in columns 4 and 5.

#!genome-build GRCh38.p5
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession NCBI:GCA_000001405.20
#!genebuild-last-updated 2015-10
1   havana  gene    11869   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2";
1   havana  transcript  11869   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; tag "basic"; transcript_support_level "1";
1   havana  exon    11869   12227   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "1"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; exon_id "ENSE00002234944"; exon_version "1"; tag "basic"; transcript_support_level "1";
1   havana  exon    12613   12721   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; exon_id "ENSE00003582793"; exon_version "1"; tag "basic"; transcript_support_level "1";
1   havana  exon    13221   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "3"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; transcript_name "DDX11L1-002"; transcript_source "havana"; transcript_biotype "processed_transcript"; havana_transcript "OTTHUMT00000362751"; havana_transcript_version "1"; exon_id "ENSE00002312635"; exon_version "1"; tag "basic"; transcript_support_level "1";

You can use your favorite scripting language to extract the coordinate information for your genes of interest.
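As one possible way to do this, here is a minimal sketch in Python. It assumes a standard tab-separated GTF with the `gene_id "..."` attribute in column 9, as in the excerpt above; the function name `gene_coordinates` and the shortened example lines are illustrative, not part of any Ensembl tool. Length is computed as end - start + 1 because GTF coordinates are 1-based and inclusive.

```python
import re

# Matches the gene_id attribute in column 9 of an Ensembl GTF line.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gene_coordinates(gtf_lines, wanted_ids):
    """Return {gene_id: (chromosome, start, end, length)} for wanted genes."""
    wanted = set(wanted_ids)
    coords = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header lines such as "#!genome-build ..."
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only rows whose 3rd column is "gene"
        match = GENE_ID_RE.search(fields[8])
        if match is None or match.group(1) not in wanted:
            continue
        start, end = int(fields[3]), int(fields[4])
        coords[match.group(1)] = (fields[0], start, end, end - start + 1)
    return coords


# Shortened lines based on the excerpt above (attributes trimmed).
example = [
    "#!genome-build GRCh38.p5\n",
    '1\thavana\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972"; gene_name "DDX11L1";\n',
    '1\thavana\ttranscript\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972"; transcript_id "ENST00000456328";\n',
]

result = gene_coordinates(example, ["ENSG00000223972"])
print(result)
```

For a real file you would pass an open file handle instead of the `example` list, for instance `gene_coordinates(open(path), my_ids)`, where `path` and `my_ids` are your own GTF path and list of Ensembl gene IDs.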
