Query bacterial genus on NCBI from a list of accessions
5.7 years ago

Hey guys,

I have a list of several accessions, such as:

NZ_JRTV01000009.1 NZ_CEWL01000009.1 NZ_CP013481.2 NZ_CP009553.3 NZ_CBYD010000018.1 NZ_CBYE010000015.1 NZ_CP016370.1 ...

I would like to fetch the genus for each accession. Note that some of the accessions are old and have since been replaced by newer accessions on NCBI. Thanks!

genome • 918 views

You could adapt the approach from my code at https://github.com/jrjhealey/PYlogeny/tree/master/PYlogeny

Use Entrez to query the accessions and get their TaxIDs, then use ETE3's NCBITaxa module to extract the associated genus, etc.

This does nothing about converting your obsolete IDs, though, so you'll need to tackle that yourself.
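For the obsolete-ID part, one thing that may be worth trying first, as a rough sketch only: as far as I know, NCBI resolves an accession given without its version suffix to the current version of that record, so stripping the ".N" suffix before querying could cover accessions that were merely bumped to a newer version. It would not help for records that were withdrawn or replaced under a different accession. The helper name strip_version is made up for illustration.

```shell
# strip_version (hypothetical helper): drop a trailing ".<digits>" version
# suffix from each accession read on stdin, e.g. NZ_CP013481.2 -> NZ_CP013481.
strip_version() {
  sed 's/\.[0-9][0-9]*$//'
}

# Two accessions taken from the question.
printf 'NZ_CP013481.2\nNZ_CP009553.3\n' | strip_version
# prints:
# NZ_CP013481
# NZ_CP009553
```

The unversioned list can then be fed to whichever lookup you use; check a few results by hand before trusting it for the whole list.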

5.7 years ago
 echo "NZ_JRTV01000009.1 NZ_CEWL01000009.1 NZ_CP013481.2 NZ_CP009553.3 NZ_CBYD010000018.1 NZ_CBYE010000015.1 NZ_CP016370.1" |
   tr " " "\n" |
   while read A ; do
     echo -n "$A " &&
     wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nuccore&id=${A}" |
     xmllint --xpath '//Item[@Name="TaxId"]/text()' - |
     xargs -I % wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=taxonomy&id=%" |
     xmllint --xpath '//Item[@Name="Genus"]/text()' - &&
     echo
   done

NZ_JRTV01000009.1 Klebsiella
NZ_CEWL01000009.1 Pandoraea
NZ_CP013481.2 Pandoraea
NZ_CP009553.3 Pandoraea
NZ_CBYD010000018.1 Elizabethkingia
NZ_CBYE010000015.1 Elizabethkingia
NZ_CP016370.1 Elizabethkingia
5.7 years ago
GenoMax 147k

Using Entrez Direct (EDirect):

$ more acc
NZ_JRTV01000009.1
NZ_CEWL01000009.1
NZ_CP013481.2
NZ_CP009553.3
NZ_CBYD010000018.1
NZ_CBYE010000015.1
NZ_CP016370.1

$ epost -db nuccore -input acc | esummary | xtract -pattern DocumentSummary -element Caption,Organism
NZ_JRTV01000009     Klebsiella variicola
NZ_CP013481         Pandoraea apista
NZ_CEWL01000009     Pandoraea apista
NZ_CP009553         Pandoraea pnomenusa
NZ_CBYD010000018    Elizabethkingia anophelis PW2806
NZ_CBYE010000015    Elizabethkingia anophelis PW2809
NZ_CP016370         Elizabethkingia anophelis

If you only want the genus name, append | awk -F ' ' '{OFS="\t"}{print $1,$2}' to the end of the command above.
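To see what that awk step does without hitting NCBI, here is a small offline sketch that runs it on two lines copied from the output above. xtract separates fields with tabs by default, so the sample input is tab-separated. The function name genus_only is hypothetical, and the approach assumes the organism name starts with the genus, which will not hold for names such as "Candidatus ...".

```shell
# genus_only (hypothetical helper): keep the accession (field 1) and the first
# word of the organism name (field 2), joined by a tab. awk's default field
# splitting treats tabs and spaces alike, so no -F option is needed.
genus_only() {
  awk 'BEGIN { OFS = "\t" } { print $1, $2 }'
}

# Sample lines copied from the esummary | xtract output above.
printf 'NZ_JRTV01000009\tKlebsiella variicola\nNZ_CBYD010000018\tElizabethkingia anophelis PW2806\n' |
  genus_only
# prints (tab-separated):
# NZ_JRTV01000009	Klebsiella
# NZ_CBYD010000018	Elizabethkingia
```

In real use the function would simply go on the end of the epost | esummary | xtract pipeline in place of the inline awk.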


I'll learn those NCBI commands one day :-)
