Convert Gene Name to RefSeq ID
7.1 years ago
tlorin ▴ 370

Dear all,

I have seen this post, which converts RefSeq IDs to gene names. What I would like is the reverse: a tool (command line or web-based) that:

  1. takes as input a list of gene names (short symbol or full gene name) AND a query species
  2. outputs the list of RefSeq IDs

In my case

  1. the species would be Stegastes partitus
  2. the gene names would be:

    LOC103370819
    prtfdc1
    LOC103367872
    tfec
    colony stimulating factor 1 receptor (csf1r)
    

I first thought of using blastdbcmd but it seems that blastdbcmd does not take gene names as input.

    $ blastdbcmd -db nt_Spar -dbtype nucl -entry prtfdc1   # nt_Spar is a subset of nt with only S. partitus sequences
    Error: prtfdc1: OID not found

I have tried using Batch Entrez, but it does not accept gene names as input either.

Many thanks for your help!

RNA-Seq ncbi sequence blast • 4.4k views

I didn't know this tool! But I cannot make it work B-)

7.1 years ago
GenoMax 147k

You can get accession numbers for those genes by using NCBI eUtils. Here is an example:

    esearch -db nuccore -query "prtfdc1 [Gene] AND Stegastes partitus [ORGN]" | efetch -format docsum | xtract -pattern Caption -element Caption

This produces

    XM_008294096
    NW_007578669

You would want the NW* numbers. In that case, add a pipe to grep NW at the end of the command above.
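
For a whole list of gene symbols, the same pipeline could be wrapped in a loop. This is only a sketch, assuming the EDirect tools are on the PATH, network access is available, and every name is a symbol that the [Gene] field recognises. The function names build_query and lookup_gene are made up for illustration and are not part of EDirect.

```shell
#!/usr/bin/env bash
# Sketch: run the esearch | efetch | xtract pipeline from this answer
# for several gene symbols. build_query and lookup_gene are illustrative
# names, not EDirect commands.

# Compose the Entrez query string for one gene symbol and one species.
build_query() {
    printf '%s [Gene] AND %s [ORGN]' "$1" "$2"
}

# Print one accession per line for a gene/species pair.
# Needs esearch, efetch and xtract on the PATH, plus network access.
lookup_gene() {
    esearch -db nuccore -query "$(build_query "$1" "$2")" < /dev/null |
        efetch -format docsum |
        xtract -pattern Caption -element Caption
}

species="Stegastes partitus"

if command -v esearch >/dev/null 2>&1; then
    for gene in LOC103370819 prtfdc1 LOC103367872 tfec csf1r; do
        # Prefix each accession with the gene symbol it came from.
        lookup_gene "$gene" "$species" |
            while read -r acc; do
                printf '%s\t%s\n' "$gene" "$acc"
            done
    done
else
    echo "esearch not found; install EDirect and add it to the PATH" >&2
fi
```

The output is tab-separated (gene symbol, accession), so a gene that returns nothing simply contributes no lines.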


It didn't work at first: all the commands (esearch, efetch, xtract) need to be on the PATH (obviously). Works perfectly now, thanks!
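
For anyone hitting the same problem, a quick check along these lines may help. It is a sketch only: the install location $HOME/edirect is an assumption (adjust it to wherever EDirect actually lives), and check_tools is a made-up helper name.

```shell
#!/usr/bin/env bash
# Report which of the given commands are missing from the PATH.
# Returns 0 if all are found, 1 otherwise. check_tools is an illustrative name.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: EDirect was installed into $HOME/edirect; change this if not.
export PATH="$HOME/edirect:$PATH"

check_tools esearch efetch xtract ||
    echo "add the EDirect directory to the PATH first" >&2
```

With the tools on the PATH, the commands can be called as esearch, efetch and xtract from any directory, without a ./ prefix.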


@genomax: how would you do this with a full gene name instead of the short symbol? For instance, Stegastes partitus phosphoribosyl transferase domain containing 1.

This command does not output anything:

    ./esearch -db nuccore -query "Stegastes partitus phosphoribosyl transferase domain [Gene] AND Stegastes partitus [ORGN]" | ./efetch -format docsum | ./xtract -pattern Caption -element Caption

esearch -db nuccore -query "Stegastes partitus phosphoribosyl transferase domain containing 1 AND Stegastes partitus [ORGN]" | efetch -format docsum | xtract -pattern Caption -element Caption | grep NW

NW_007577984
NW_007578669

While that seems to generate a result, those accessions are for the genomic entries. I guess you may not be able to make some of them work.
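
Since the searches in this thread return a mix of accession types, it may help to filter on the RefSeq prefix: XM_ accessions are predicted mRNA models, while NW_ accessions are genomic scaffolds. A minimal sketch, where filter_prefix is a hypothetical helper name and the input is the pair of accessions reported earlier for prtfdc1:

```shell
#!/usr/bin/env bash
# Keep only the accessions that start with the given RefSeq prefix.
# filter_prefix is an illustrative helper, not an EDirect command.
filter_prefix() {
    grep "^${1}_"
}

# Accessions reported for prtfdc1 earlier in this thread.
printf 'XM_008294096\nNW_007578669\n' | filter_prefix XM   # prints XM_008294096
printf 'XM_008294096\nNW_007578669\n' | filter_prefix NW   # prints NW_007578669
```

Anchoring the pattern at the start of the line avoids matching lines that merely contain the letters somewhere else.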
