Build a file using accession numbers (NC_######)
8.5 years ago
kelvinfrog75 ▴ 10

Hey, I have a list of accession numbers, all in the format NC_###### (e.g. NC_002935). I tried to use read.GenBank in R to fetch the sequences, but I think it fails to recognize this format, since it gave the error message: Error in FI[i]:LA[i] : NA/NaN argument. Does anyone know how to use R or another program to retrieve multiple gene sequences using accession numbers in the format I describe here?

sequence • 1.9k views
8.5 years ago
piet ★ 1.9k

Accessions in the format NC_###### belong to the NCBI Reference Sequence (RefSeq) database, which is NOT part of GenBank (or, more precisely, of the International Nucleotide Sequence Database Collaboration). Every entry in the reference sequence database is derived from an existing sequence entry in GenBank. For example, NC_002935 is derived from BX248353.1. You may have to use the GenBank accessions in your R script.
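If it helps to sort a mixed list before feeding it to a tool that only accepts GenBank accessions, here is a minimal Python sketch. It assumes RefSeq accessions can be recognized by a two-letter prefix followed by an underscore (NC_, NM_, NZ_, ...), which GenBank accessions do not use. The names `is_refseq` and `split_accessions` are made up for this illustration, and the pattern is a rough heuristic, not an official NCBI validator.

```python
import re

# Heuristic (assumption): two uppercase letters, an underscore, then
# letters/digits, with an optional ".version" suffix, e.g. NC_002935 or
# NC_002935.2. GenBank/INSDC accessions such as BX248353.1 have no underscore.
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")


def is_refseq(accession):
    """Return True if the accession looks like a RefSeq identifier."""
    return REFSEQ_PATTERN.match(accession.strip()) is not None


def split_accessions(accessions):
    """Split a list of accessions into (refseq_like, everything_else)."""
    refseq, other = [], []
    for acc in accessions:
        acc = acc.strip()
        if is_refseq(acc):
            refseq.append(acc)
        else:
            other.append(acc)
    return refseq, other
```

For example, `split_accessions(["NC_002935", "BX248353.1"])` puts the first entry in the RefSeq-like list and the second in the other list.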

8.5 years ago
estebanpw ▴ 30

You can use NCBI Batch Entrez to retrieve a set of genome sequences, GenBank summaries, etc. via their accession numbers. I think this is what you are looking for, unless you explicitly want to do it in R.

See http://www.ncbi.nlm.nih.gov/sites/batchentrez. You only need to upload a text file containing the accession numbers (one per line). It will redirect you to your personal query, and you will be able to retrieve all the files in one go.
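If you would rather script this than use the web form, a rough Python equivalent is to call the NCBI E-utilities `efetch` endpoint with the same one-per-line accession list. This is only a sketch: it assumes the accessions resolve in the `nuccore` database, and the function names (`read_accessions`, `build_efetch_url`, `fetch_sequences`) are my own, not part of any NCBI library. For long lists you would want to split the request into batches and respect NCBI's usage limits.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_accessions(text):
    """Parse one-accession-per-line text, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_efetch_url(accessions, rettype="fasta"):
    """Build an efetch URL; rettype 'fasta' gives FASTA, 'gb' gives GenBank flat files."""
    params = {
        "db": "nuccore",  # assumption: nucleotide records
        "id": ",".join(accessions),
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_sequences(accessions, out_path, rettype="fasta"):
    """Download all records in one request and write them to a single file."""
    with urlopen(build_efetch_url(accessions, rettype)) as response:
        data = response.read()
    with open(out_path, "wb") as out:
        out.write(data)
```

Usage would look like `fetch_sequences(read_accessions(open("accessions.txt").read()), "sequences.fasta")`, where `accessions.txt` is a hypothetical file name.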


Thanks. I tried it with NCBI reference sequence database numbers (NC_xxxxxx); it reported a character error, but it works with GI numbers. So is it also unable to take the NCBI reference sequence numbers? Do you know if there is a way to convert an NCBI reference sequence number to a GenBank accession number?
