Question

Convert Refseq Id To Gene Name

4

12.0 years ago

hicsuntdrac0nis ▴ 250

I'm trying to convert a list of RefSeq IDs to the Gene Symbol. I can do it for Ensembl using http://genome.ucsc.edu/cgi-bin/hgTables [Track -> Ensembl Genes : Table -> ensemblToGeneName]

I can import a list like:

ENSMUST00000000219
ENSMUST00000000450
ENSMUST00000001156
ENSMUST00000001319
ENSMUST00000001559

and get a table that looks like this:

ENSMUST00000000219    Th
ENSMUST00000000450    Pparg
ENSMUST00000001156    Cfp
ENSMUST00000001319    Efnb2
ENSMUST00000001559    Itfg2

and life is super easy. I can't figure out how to do something similar with RefSeq IDs!

I tried http://idconverter.bioinfo.cnio.es/IDconverter.php, which worked best of all the converters suggested by http://www.shodhaka.com/cgi-bin/startbioinfo/simpleresources.pl?tn=Gene%20ID%20conversion&sort=Rank%20by%20usage%20frequency, but it did not recognize some of the transcripts, which is really annoying.

Does anyone know how to import a list of RefSeq IDs:

NM_001081045
NM_027801
NM_001267620
NM_028121
NM_001167748

and get out the Gene Symbols:

Kansl1
2610015P09Rik
Ankzf1
Adpgk
Egfem1

refseq gene • 51k views

ADD COMMENT • link updated 22 months ago by Ram 45k • written 12.0 years ago by hicsuntdrac0nis ▴ 250

1

Please search this site for the many similar questions and answers, which explain how to use BioMart.

ADD REPLY • link 12.0 years ago by Neilfws 49k

0

Gene ID conversion tool

ADD REPLY • link 12.0 years ago by Michael 55k

Ashutosh Pandey · Answer 1 · 2013-05-31

5

12.0 years ago

Ashutosh Pandey 12k

Try:

$ mysql --user=genome -N --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e "select name,name2 from refGene" > Refseq2Gene.txt

This will give you the mapping file. For mouse, replace hg19 with mm10.

ADD COMMENT • link updated 12.0 years ago by Alex Reynolds 36k • written 12.0 years ago by Ashutosh Pandey 12k
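
Once the two-column mapping file from the query above exists, the lookup itself can be scripted instead of done by hand. The following is a minimal sketch, not part of the original answer: the function names `load_mapping` and `to_symbols` are hypothetical, the sample rows simply mirror the ID/symbol pairs listed in the question, and stripping a trailing version suffix (such as `.2`) is an assumption about how the input IDs may be written.

```python
def load_mapping(lines):
    """Build {refseq_id: gene_symbol} from tab-separated 'name<TAB>name2' lines."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip malformed rows
        # The same accession can appear on several rows; keep the first symbol seen.
        mapping.setdefault(fields[0], fields[1])
    return mapping


def to_symbols(ids, mapping, missing="NA"):
    """Return (accession, symbol) pairs in input order; unknown IDs get `missing`."""
    pairs = []
    for raw in ids:
        accession = raw.strip().split(".")[0]  # assumption: drop a version suffix
        pairs.append((accession, mapping.get(accession, missing)))
    return pairs


# Sample rows mirroring the pairs in the question (stand-in for Refseq2Gene.txt).
sample_rows = [
    "NM_001081045\tKansl1",
    "NM_027801\t2610015P09Rik",
    "NM_001267620\tAnkzf1",
    "NM_028121\tAdpgk",
    "NM_001167748\tEgfem1",
]
mapping = load_mapping(sample_rows)

for accession, symbol in to_symbols(["NM_001081045", "NM_027801"], mapping):
    print(accession, symbol, sep="\t")
```

With a real file, `load_mapping(open("Refseq2Gene.txt"))` would replace the sample rows, since a file handle also yields lines one at a time.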

0

Copy and paste it into Excel and use VLOOKUP for your genes.

ADD REPLY • link 12.0 years ago by Ashutosh Pandey 12k

10

10 out of 10 scientists agree: don't use Excel!

ADD REPLY • link 12.0 years ago by Steve Lianoglou 5.2k

1

Huh! It's funny that they even have a paper about it.

ADD REPLY • link 12.0 years ago by Ashutosh Pandey 12k

0

Do biologists not classify as scientists then?

ADD REPLY • link 12.0 years ago by Michael 55k

4

Of course we do, which is why we don't use Excel :).

ADD REPLY • link 12.0 years ago by terdon ▴ 430

0

You know the old adage ... if it doesn't have the word "science" in the title, then it's not a real science.

Computer Science all the way, baby ... wooo hoooo!

Oh ... no, wait ...

ADD REPLY • link 12.0 years ago by Steve Lianoglou 5.2k

0

I'm sorry, but I get really confused when trying to use open-source databases through Terminal on my Mac. Can you direct me to somewhere I can learn?

ADD REPLY • link 12.0 years ago by hicsuntdrac0nis ▴ 250

0

Hi Ashutosh, I don't know if you will get this message, but I have a question for you. Your code for retrieving the mapping file from the human refGene database only gives the locus and gene symbol. How do I get the GI number and RefSeq protein number from it? Thank you.

ADD REPLY • link 10.0 years ago by grayapply2009 ▴ 300

0

How can I do that for UCSC Genes instead of RefSeq Genes?

Thank you

ADD REPLY • link 8.8 years ago by silas008 ▴ 180

score 3 · Answer 2 · 2013-06-03

3

12.0 years ago

vaskin90 ▴ 290

You could try bioDBnet converter: http://biodbnet.abcc.ncifcrf.gov/db/db2db.php

ADD COMMENT • link 12.0 years ago by vaskin90 ▴ 290

Ming Tommy Tang · Answer 3 · 2015-06-09

3

9.9 years ago

Ming Tommy Tang ★ 4.6k

You can use BioMart: http://crazyhottommy.blogspot.com/2014/09/converting-gene-ids-using-bioconductor.html

ADD COMMENT • link updated 2.4 years ago by Ram 45k • written 9.9 years ago by Ming Tommy Tang ★ 4.6k

score 2 · Answer 4 · 2013-05-31

2

12.0 years ago

Ashutosh Pandey 12k

An alternative way would be to:

1) Go to http://genome.ucsc.edu/cgi-bin/hgTables?command=start

2) Select your genome and assembly, and select Genes and Gene Prediction Tracks as the group.

3) Select RefSeq Genes as the track.

4) Select refGene as the table and then output the file.

Then you can use a script or Excel to map your RefSeq IDs to gene names. Make sure you follow what Steve mentioned in the comment section. Also, have you ever used DAVID (http://david.abcc.ncifcrf.gov/conversion.jsp)?

ADD COMMENT • link 12.0 years ago by Ashutosh Pandey 12k
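
For the "use a script" route, here is a rough sketch of how the downloaded refGene table could be turned into an ID-to-symbol mapping. It assumes the file was saved with all fields, tab-separated, with a header line beginning `#bin` in which the accession column is called `name` and the gene symbol column is called `name2`. The function name `refgene_to_mapping` is hypothetical, and the coordinates in the sample rows are placeholders, not real annotations.

```python
import csv
import io


def refgene_to_mapping(handle):
    """Map RefSeq accession (name) to gene symbol (name2) from a refGene dump."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    header[0] = header[0].lstrip("#")  # the header line starts with '#bin'
    name_col = header.index("name")
    symbol_col = header.index("name2")
    mapping = {}
    for row in reader:
        if len(row) <= max(name_col, symbol_col):
            continue  # skip short or empty rows
        # An accession may be listed more than once; keep the first symbol seen.
        mapping.setdefault(row[name_col], row[symbol_col])
    return mapping


# Placeholder rows: only the name and name2 values come from the question.
SAMPLE = (
    "#bin\tname\tchrom\tstrand\ttxStart\ttxEnd\tcdsStart\tcdsEnd\texonCount\t"
    "exonStarts\texonEnds\tscore\tname2\tcdsStartStat\tcdsEndStat\texonFrames\n"
    "0\tNM_028121\tchrN\t+\t0\t100\t10\t90\t1\t0,\t100,\t0\tAdpgk\tcmpl\tcmpl\t0,\n"
    "0\tNM_001167748\tchrN\t+\t0\t100\t10\t90\t1\t0,\t100,\t0\tEgfem1\tcmpl\tcmpl\t0,\n"
)

symbols = refgene_to_mapping(io.StringIO(SAMPLE))
for accession, symbol in symbols.items():
    print(accession, symbol, sep="\t")
```

Looking the columns up by header name rather than by position keeps the sketch working if only a subset of fields is exported, as long as `name` and `name2` are among them.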

0

Upvote for this answer. Using the Table Browser is better than BioMart if you have a huge number of IDs to convert.

ADD REPLY • link 8.2 years ago by anniepyim • 0

score 1 · Answer 5 · 2013-06-05

1

12.0 years ago

plaschkej ▴ 10

Here is a BioPerl script:

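#!/usr/bin/env perl
use warnings;
use strict;
use Bio::Perl;          # provides get_sequence()
use Bio::DB::RefSeq;    # needed for the $db object created below
$| = 1;                 # unbuffered output so the prompt shows immediately

my $db = Bio::DB::RefSeq->new;

print "Input RefSeq ID: ";
my $refseq = <STDIN>;
chomp($refseq);

# fetch the record from the remote RefSeq database
my $seq = get_sequence('refseq', $refseq);
die "Could not retrieve $refseq\n" unless defined $seq;

# most of the time RefSeq_ID eq RefSeq acc
#my $seq = $db->get_Seq_by_id($refseq); # RefSeq ID
#print "accession is ", $seq->accession_number, "\n";

# the gene symbol is taken from the first parenthesised word in the definition
if ($seq->desc =~ /\((\w+)\)/) {
    print "found: $1\n";
    print $seq->desc, "\n";
}
else
{
    print "definition is ", $seq->desc, "\n";
}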

ADD COMMENT • link 12.0 years ago by plaschkej ▴ 10

0

Hi, I am not a Perl person. Would it be possible to change this script to accept a pasted list of NM_ numbers instead of typing each one to STDIN on the command line?

ADD REPLY • link 8.2 years ago by Paul ★ 1.5k
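
One way to think about the list version is to separate the two steps the Perl script performs: fetching a definition line for an accession, and pulling the parenthesised symbol out of it. The sketch below is in Python rather than Perl and only covers the second step for a whole list. It reuses the same pattern as the script, `\((\w+)\)`. The names `symbol_from_definition` and `symbols_for_ids` are hypothetical, `fetch_definition` stands for whatever lookup is available (it is passed in, not provided here), and the definition strings in `DEMO_DEFINITIONS` are made-up stand-ins, not actual RefSeq records.

```python
import re

# Same pattern as the Perl script: the first parenthesised word.
SYMBOL_IN_PARENS = re.compile(r"\((\w+)\)")


def symbol_from_definition(definition):
    """Return the first parenthesised word in a definition line, or None."""
    match = SYMBOL_IN_PARENS.search(definition)
    return match.group(1) if match else None


def symbols_for_ids(accessions, fetch_definition):
    """Process a list of accessions; `fetch_definition` returns a string or None."""
    results = {}
    for raw in accessions:
        accession = raw.strip()
        if not accession:
            continue  # ignore blank lines in a pasted list
        definition = fetch_definition(accession)
        results[accession] = symbol_from_definition(definition) if definition else None
    return results


# Made-up definition lines, used only to exercise the parsing step.
DEMO_DEFINITIONS = {
    "NM_028121": "Mus musculus ADP-dependent glucokinase (Adpgk), mRNA",
    "NM_027801": "Mus musculus RIKEN cDNA 2610015P09 gene (2610015P09Rik), mRNA",
}

pasted = """NM_028121
NM_027801
"""
for accession, symbol in symbols_for_ids(pasted.splitlines(), DEMO_DEFINITIONS.get).items():
    print(accession, symbol, sep="\t")
```

Like the original script, this misses symbols containing characters outside `\w` (a hyphen, for example), so a mapping table is likely the more robust route for large lists.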

score 0 · Answer 6 · 2013-06-05

Cistrome/Galaxy can do it very easily: http://cistrome.org/ap/. In the toolbox on the left, look under Liftover/Others:

Convert between RefSeq, Gene Symbols to Entrez IDs using Bioconductor
Liftover Wig Files: Liftover wig files
[Galaxy] Convert genome coordinates between assemblies and genomes
Standardize wig file: Standardize a wig file to a given span
Extract data from Wiggle: Extract data for certain chromosome from a wiggle file
Extract data from Bed: Extract data for certain chromosome from a BED file