4.0 years ago
kelvinfrog75
I want to extract the country of origin for a set of NCBI sequences using rentrez. I have their accession numbers, but I am having trouble getting the country information. For example, for accession number MH939154 I need to extract "Romania" using rentrez. The source feature of its GenBank record is:
source          1..10976
                /organism="West Nile virus"
                /mol_type="genomic RNA"
                /strain="DD84c"
                /host="Culex pipiens s.l."
                /db_xref="taxon:11082"
                /country="Romania"
                /collection_date="2014"
                /note="lineage 2"
I tried the code below, but it only seems to return countries associated with publications. Is there a way to get the country listed under the source feature?
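library(rentrez)
library(xml2)

id <- "MH939154.1"
# This queries PubMed, so only publication-related fields come back.
db <- entrez_fetch(db = "pubmed", id = id, rettype = "xml")
xml <- read_xml(db)
recs <- xml_find_all(xml, "//Country")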
This seems to work fine; I can run this command inside R. I just wonder how you got this link: "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=MH939154&rettype=gb&retmode=xml"? Thanks.
https://www.ncbi.nlm.nih.gov/books/NBK25500/
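The link quoted in the question is an efetch endpoint followed by four query parameters (db, id, rettype, retmode). As a rough sketch of how such a URL can be assembled in R, assuming only those four parameters are needed; build_efetch_url() is a hypothetical helper written for illustration, not part of rentrez or any other package:

```r
# Base URL of the E-utilities efetch endpoint, taken from the link in this thread.
base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Hypothetical helper (not from any package): joins the query parameters
# onto the endpoint. Defaults mirror the values used in the quoted link.
build_efetch_url <- function(id, db = "nuccore", rettype = "gb", retmode = "xml") {
  params <- c(db = db, id = id, rettype = rettype, retmode = retmode)
  # Percent-encode each value so unusual characters in an id stay valid in a URL.
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(base, "?", query)
}

url <- build_efetch_url("MH939154")
print(url)
```

Changing db or rettype in the call should give the corresponding URL for other databases or formats, as far as the E-utilities documentation linked above allows.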
Great. I was able to integrate the command into my script and get the country info. Thanks!
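For anyone finding this later, here is a minimal sketch of what the rentrez version might look like. It assumes the GenBank XML stores qualifiers as GBQualifier_name / GBQualifier_value pairs inside the source GBFeature. The XPath also accepts geo_loc_name, on the assumption that newer records may use that qualifier name instead of country. extract_country() is a hypothetical helper, and sample_xml is a hand-written fragment that imitates that layout with the values from the question, so the parsing step can be tried without network access:

```r
library(xml2)

# Hypothetical helper: pull the country qualifier out of GenBank XML text.
# Assumes the GBSeq layout (GBFeature > GBFeature_quals > GBQualifier).
extract_country <- function(gb_xml) {
  doc <- read_xml(gb_xml)
  xpath <- paste0(
    "//GBFeature[GBFeature_key='source']",
    # Accept either qualifier name; 'geo_loc_name' is an assumption for newer records.
    "//GBQualifier[GBQualifier_name='country' or GBQualifier_name='geo_loc_name']",
    "/GBQualifier_value"
  )
  xml_text(xml_find_all(doc, xpath))
}

# Hand-written stand-in for a fetched record (not real NCBI output).
sample_xml <- '<GBSet><GBSeq><GBSeq_feature-table><GBFeature>
  <GBFeature_key>source</GBFeature_key>
  <GBFeature_quals>
    <GBQualifier><GBQualifier_name>organism</GBQualifier_name><GBQualifier_value>West Nile virus</GBQualifier_value></GBQualifier>
    <GBQualifier><GBQualifier_name>country</GBQualifier_name><GBQualifier_value>Romania</GBQualifier_value></GBQualifier>
  </GBFeature_quals>
</GBFeature></GBSeq_feature-table></GBSeq></GBSet>'

country <- extract_country(sample_xml)
print(country)

# Live usage (needs network access and the rentrez package):
# library(rentrez)
# gb <- entrez_fetch(db = "nuccore", id = "MH939154", rettype = "gb", retmode = "xml")
# extract_country(gb)
```

The key difference from the code in the question is db = "nuccore" rather than "pubmed", so the record fetched is the sequence entry itself and not a publication.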