9.2 years ago
aaredav
My aim is to get the GO terms for a large list of genes (~250,000).
The genes belong to bacterial genomes downloaded from RefSeq, so they have identifiers of this kind: NP_953938.1.
I guess the solution would be to use BioMart. However, RefSeq is not among the databases you can query with it.
I then tried to retrieve the Entrez IDs from this file: ftp://ftp.ncbi.nih.gov/gene/DATA/gene2refseq
But some of the identifiers (I think most of them) do not appear in that file, e.g. WP_013258072.1
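The lookup described above can be sketched roughly as follows. This is only a minimal illustration, assuming the file is tab-separated with the GeneID in the second column and the protein accession in the sixth. The function name `map_refseq_to_gene_ids` and the sample rows are hypothetical, and the GeneID values in the sample are placeholders, not real records.

```python
import io

# Assumed column layout of a gene2refseq-style file (0-based indices).
GENE_ID_COL = 1   # GeneID
PROTEIN_COL = 5   # protein_accession.version


def map_refseq_to_gene_ids(handle, wanted):
    """Return ({accession: GeneID}, set of accessions not found).

    `handle` is any iterable of text lines; `wanted` is an iterable of
    RefSeq protein accessions such as "NP_953938.1".
    """
    wanted = set(wanted)
    found = {}
    for line in handle:
        # Skip the header/comment line and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= PROTEIN_COL:
            continue  # malformed or truncated row
        accession = fields[PROTEIN_COL]
        if accession in wanted:
            found[accession] = fields[GENE_ID_COL]
    missing = wanted - set(found)
    return found, missing


# Tiny in-memory stand-in for the real file; values are placeholders.
sample = io.StringIO(
    "#tax_id\tGeneID\tstatus\tRNA_acc\tRNA_gi\tprotein_acc\n"
    "0\t1111111\tPROVISIONAL\t-\t-\tNP_953938.1\n"
    "0\t2222222\tPROVISIONAL\t-\t-\tNP_000000.1\n"
)

found, missing = map_refseq_to_gene_ids(
    sample, ["NP_953938.1", "WP_013258072.1"]
)
print("mapped:", found)
print("not in file:", sorted(missing))
```

For the real file, the same function could be given an open file handle instead of the `io.StringIO` sample. The `missing` set makes it easy to count how many accessions have no Entrez ID in the file.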
Any other ideas? How can I get the GO terms from the RefSeq IDs?
What bacteria are we talking about?
Returns no results, so I am guessing it is not part of Ensembl.
Can you reformulate this sentence?