NCBI GeneID vs ProteinID database
10.2 years ago
biolab ★ 1.4k

Hi everyone

I am looking for a database of NCBI protein ID to Gene ID pairs, as well as a gene sequence database. I have a comprehensive list of protein IDs that I need to convert to Gene IDs, and then I need to retrieve the corresponding sequences.

I should note that I tried an online conversion tool, but it ran too slowly because my list is huge.

I logged on to the NCBI Gene FTP site and found a couple of databases available for download. Is that the correct place to download from? I cannot find FASTA data there.

THANKS A LOT in advance for your advice and help!

NCBI • 5.6k views

This needs clarification please. Proteins and genes are different concepts. Also there are many protein ID types. What exactly do you want to map to what?


Hi cdsouthan,

Thanks a lot for your reply. I finally sorted out the problem: I found the databases I needed on the NCBI FTP site. My protein list actually only includes NP_... XP_... and YP_... accessions. Could you or others briefly explain the abbreviations (NP_, XP_, YP_, XM_)? Thanks!
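The protein-to-gene lookup described above can be sketched roughly as follows. This is only a minimal sketch under assumptions: it assumes a tab-delimited mapping file such as gene2refseq.gz from the NCBI Gene FTP site, with a header line naming a `GeneID` column and a `protein_accession.version` column. The thread does not say which file was used, so check the header of the file you actually download. The function name `build_protein_to_gene` is made up for illustration.

```python
import gzip


def build_protein_to_gene(lines, wanted):
    """Map versionless protein accessions to Gene IDs.

    `lines` is any iterable of tab-delimited text lines whose first line
    is a header (assumed gene2refseq-style layout, not confirmed here).
    `wanted` is a collection of protein accessions, with or without a
    version suffix such as ".3".
    """
    wanted_bases = {w.strip().split(".")[0] for w in wanted}
    lines = iter(lines)
    # The header is assumed to start with '#', e.g. "#tax_id\tGeneID\t..."
    header = next(lines).rstrip("\n").lstrip("#").split("\t")
    gene_col = header.index("GeneID")
    prot_col = header.index("protein_accession.version")

    mapping = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(gene_col, prot_col):
            continue  # skip truncated or malformed rows
        accession = fields[prot_col]
        if accession == "-":
            continue  # row has no protein product
        base = accession.split(".")[0]
        if base in wanted_bases:
            mapping[base] = fields[gene_col]
    return mapping


def map_from_gzip(path, wanted):
    """Stream a gzipped mapping file from disk; `path` is hypothetical."""
    with gzip.open(path, "rt") as handle:
        return build_protein_to_gene(handle, wanted)
```

Streaming the file line by line and keeping only the wanted accessions keeps memory use small even when the mapping file is very large, which is the main advantage over an online conversion tool for a huge list.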

10.2 years ago
rwn ▴ 610

From the "About RefSeq" page on the NCBI website:

Definitions:

  • Model RefSeq: RNA and protein products that are generated by the eukaryotic genome annotation pipeline. These records use accession prefixes XM_, XR_, and XP_.
  • Known RefSeq: RNA and protein products that are mainly derived from GenBank cDNA and EST data and are supported by the RefSeq eukaryotic curation group. These records use accession prefixes NM_, NR_, and NP_.
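As a small illustration of the two definitions above, a list of accessions can be sorted into "model" and "known" records by prefix. This is just a sketch: the function name `classify_refseq` is made up, and any prefix not listed in the definitions (YP_, for example) is simply reported as "other" rather than interpreted.

```python
# Prefixes taken directly from the two RefSeq definitions quoted above.
MODEL_PREFIXES = ("XM_", "XR_", "XP_")
KNOWN_PREFIXES = ("NM_", "NR_", "NP_")


def classify_refseq(accession):
    """Return 'model', 'known', or 'other' based on the accession prefix."""
    acc = accession.strip().upper()
    if acc.startswith(MODEL_PREFIXES):
        return "model"
    if acc.startswith(KNOWN_PREFIXES):
        return "known"
    return "other"  # prefixes not covered by the quoted definitions
```

Calling `classify_refseq` on each entry of a protein list gives a quick count of how many records come from the annotation pipeline versus the curated set.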

Hope that helps a bit.


Your comments are really helpful. Thanks a lot!
