Is it possible to extract metadata for a list of sequence accession numbers?
6.1 years ago

I am relatively new to the field of bioinformatics. In a recent project, I came across the need to extract metadata (information about a sequence, e.g. its description and the location of the experiment or, less ideally, the accession of the BioProject the sequence comes from) for a list of sequence accessions. Is there any way to accomplish this without searching for each accession and navigating NCBI manually, one by one? Thank you very much.

metadata • 2.8k views

Search for "NCBI E-utilities".

6.1 years ago
Mark ★ 1.6k

You might consider that information as metadata, but it's actually just... data. Without it, the sequence information is meaningless.

As Pierre said, NCBI provides a set of tools called E-utilities. Wrapper packages exist for most languages you might already know: for Python there's Biopython, and for R there's rentrez.
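A rough sketch of the Biopython route, assuming your accessions are nucleotide (GenBank) accessions and that the fields you care about sit in each record's `source` feature. The function names, the batch size, and the particular qualifiers picked out here are my own illustrative choices, so check them against a few of your real records before relying on them.

```python
import csv
import sys


def summarize(record):
    """Pull a few descriptive fields out of a parsed GenBank record."""
    # The "source" feature usually carries the sample-level qualifiers.
    qualifiers = {}
    for feature in record.features:
        if feature.type == "source":
            qualifiers = feature.qualifiers
            break

    def first(key):
        # Qualifier values are lists of strings; take the first if present.
        values = qualifiers.get(key, [])
        return values[0] if values else ""

    # dbxrefs holds strings such as "BioProject:PRJNA12345".
    bioproject = ""
    for xref in record.dbxrefs:
        if xref.startswith("BioProject:"):
            bioproject = xref.split(":", 1)[1]
            break

    return {
        "accession": record.id,  # includes the version suffix, e.g. ".1"
        "description": record.description,
        "organism": first("organism"),
        # Assumption: location is stored under "geo_loc_name" or "country".
        "location": first("geo_loc_name") or first("country"),
        "collection_date": first("collection_date"),
        "bioproject": bioproject,
    }


def fetch_records(accessions, email, batch_size=100):
    """Yield parsed GenBank records for a list of accessions, in batches."""
    # Imported here so summarize() stays usable without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    for start in range(0, len(accessions), batch_size):
        batch = accessions[start:start + batch_size]
        handle = Entrez.efetch(db="nucleotide", id=",".join(batch),
                               rettype="gb", retmode="text")
        try:
            yield from SeqIO.parse(handle, "genbank")
        finally:
            handle.close()


def write_table(accessions, email, out=sys.stdout):
    """Write one tab-separated row of metadata per record."""
    fields = ["accession", "description", "organism",
              "location", "collection_date", "bioproject"]
    writer = csv.DictWriter(out, fieldnames=fields, delimiter="\t")
    writer.writeheader()
    for record in fetch_records(accessions, email):
        writer.writerow(summarize(record))
```

To use it, read your accessions into a list (one per line from a text file, say) and call `write_table(accessions, "you@example.org")` with your own address. Records that lack a BioProject link or a location qualifier simply get empty cells.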

If you want to interface directly with Entrez, you can do that too. NCBI documents everything here: https://www.ncbi.nlm.nih.gov/books/NBK25501/
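If you'd rather skip the wrapper packages, something along these lines talks to the ESummary endpoint using only the Python standard library. Treat it as a sketch: the function names are made up for illustration, and the JSON keys I read (`accessionversion`, `title`, `organism`) are what I would expect in a `nuccore` summary, so look at a real response and adjust.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esummary_url(accessions, db="nuccore"):
    """Build an ESummary request URL for a batch of accessions."""
    query = urllib.parse.urlencode({
        "db": db,
        "id": ",".join(accessions),
        "retmode": "json",
    })
    return f"{EUTILS}/esummary.fcgi?{query}"


def parse_summaries(payload):
    """Turn an ESummary JSON payload into a list of small dicts."""
    result = payload.get("result", {})
    rows = []
    # "uids" lists the record identifiers; each one keys its own summary.
    for uid in result.get("uids", []):
        doc = result.get(uid, {})
        rows.append({
            # Assumed field names; missing keys fall back to "".
            "accession": doc.get("accessionversion", ""),
            "title": doc.get("title", ""),
            "organism": doc.get("organism", ""),
        })
    return rows


def fetch_summaries(accessions):
    """Request document summaries for the accessions and parse them."""
    with urllib.request.urlopen(esummary_url(accessions)) as response:
        return parse_summaries(json.load(response))
```

Send the accessions in batches rather than one request each, and keep to NCBI's limit of three requests per second unless you have an API key.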


Thanks all. I will take a look.
