17 months ago
Mahan
I have a list of gene IDs. I would like to know if there is a way to find the gene regions (START-END) on GRCh37 build? TIA
A hacky answer: build a table of coordinates for every gene, then pull the genes you need out of it.
Get the GRCh37 GTF file (GENCODE release 43 lifted to GRCh37) here: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_43/GRCh37_mapping/gencode.v43lift37.basic.annotation.gtf.gz
$ zcat gencode.v43lift37.basic.annotation.gtf.gz | awk -F "\t|;" '{OFS="\t"}{if ($3 == "gene") print $11,$4,$5}' | sed -e 's/gene_name//' -e 's/"//g' > genes_37
$ head genes_37
DDX11L1 12010 13670
WASH7P 14404 29570
MIR1302-2HG 29554 31109
FAM138A 34554 36081
OR4G4P 52473 53312
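The table above has no chromosome column, and the one-liner relies on gene_name being the third attribute in column 9. As a rough sketch of a more defensive variant, the function below searches the attribute column for gene_name by key and also prints the chromosome and strand. The function name gtf_genes is made up for illustration, and the sketch assumes the standard nine-column GTF layout with `key "value";` attributes.

```shell
# Hypothetical helper (not part of any tool): reads a GTF on stdin and
# writes one tab-separated line per gene: name, chrom, start, end, strand.
gtf_genes() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    $3 == "gene" {
      name = ""
      # Column 9 holds the attributes, separated by ";".
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        # Find gene_name by key rather than by position.
        if (attrs[i] ~ /^ *gene_name "/) {
          name = attrs[i]
          sub(/^ *gene_name "/, "", name)   # drop the key and opening quote
          sub(/".*$/, "", name)             # drop the closing quote
        }
      }
      if (name != "") print name, $1, $4, $5, $7
    }'
}
```

It could be run as `gzip -dc gencode.v43lift37.basic.annotation.gtf.gz | gtf_genes > genes_37_chr`, where genes_37_chr is just an example output name.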
Put your genes of interest in a file called id.
$ more id
FAM138A
OR4G4P
Grab the start-stop from the genes_37 file.
$ grep -f id -w genes_37
FAM138A 34554 36081
OR4G4P 52473 53312
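One caveat with grep -w: a hyphen counts as a word boundary, so a name such as A1BG would also pull in A1BG-AS1, and every line of id is treated as a regular expression. A minimal sketch of an exact-match lookup with awk follows. The function name lookup_genes is hypothetical, and it assumes the gene name is the first tab-separated column of the table.

```shell
# Hypothetical helper: print rows of a tab-separated table whose first
# column exactly equals one of the names listed in an ID file.
#   $1: file with one gene name per line
#   $2: table with the gene name in column 1 (e.g. genes_37)
lookup_genes() {
  awk -F '\t' '
    FILENAME == ARGV[1] {
      sub(/\r$/, "")                 # tolerate Windows line endings
      gsub(/^ +| +$/, "")            # trim stray spaces around the name
      if ($0 != "") want[$0] = 1
      next
    }
    {
      key = $1
      gsub(/^ +| +$/, "", key)       # the name column may carry padding
    }
    key in want' "$1" "$2"
}
```

Usage would be `lookup_genes id genes_37`. The whole first column is compared as a string, so partial and regex matches cannot occur.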
What have you tried?