Retrieve genus name by species name
10.2 years ago
Phil S. ▴ 700

Hi there,

There may be a rather simple solution to this, but I checked NCBI for a script (or something similar) and had no luck. So here is my problem:

I have a list of species names, say 'Gardnerella vaginalis' is one of them, and for each name I need to determine the genus the species belongs to. In most cases this is fairly easy, since by naming convention the first part is the genus and the second part specifies the species. However, since a species may have one or more synonyms (especially in the fungal world), I need some kind of lookup against the NCBI taxonomy that returns the genus name for a given species name. Ideally this should be scriptable.
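For reference, the naive rule mentioned here (take the first word of the binomial as the genus) might be sketched in shell as follows. The function name `genus_from_binomial` is hypothetical and not part of any NCBI tool. This is only the baseline: it does not resolve synonyms, which is the reason a taxonomy lookup is needed.

```shell
#!/bin/sh
# Naive baseline: treat the first whitespace-separated word of a
# binomial name as the genus. No taxonomy lookup, so synonyms are
# NOT resolved. genus_from_binomial is a hypothetical helper name.
genus_from_binomial() {
    printf '%s\n' "$1" | awk '{print $1}'
}

genus_from_binomial "Gardnerella vaginalis"
```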

Any ideas are highly appreciated!

Thanks,

phil

taxonomy • 3.4k views
10.2 years ago
5heikki 11k

With Entrez Direct:

esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format docsum | xtract -element Genus
Gardnerella

Alternatively, e.g.:

esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format xml | xtract -element Lineage
cellular organisms; Bacteria; Actinobacteria; Actinobacteria; Actinobacteridae; Bifidobacteriales; Bifidobacteriaceae; Gardnerella
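If only the Lineage string is available, one possible way to pull a genus candidate out of it is to take the last semicolon-separated entry. This is a sketch under an assumption: it is only correct when the genus is the immediate parent of the species, as in the output above. For taxa with intermediate ranks (a subgenus or species group, for example) the last entry would not be the genus. The helper name `last_lineage_entry` is hypothetical.

```shell
#!/bin/sh
# Print the last "; "-separated entry of each lineage line on stdin.
# Assumption: the genus is the direct parent of the species, so it is
# the final entry. last_lineage_entry is a hypothetical helper name.
last_lineage_entry() {
    awk -F'; ' '{print $NF}'
}

lineage='cellular organisms; Bacteria; Actinobacteria; Actinobacteria; Actinobacteridae; Bifidobacteriales; Bifidobacteriaceae; Gardnerella'
printf '%s\n' "$lineage" | last_lineage_entry
```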

Unfortunately the last command

xtract -element Genus

does not work...

What I get up to that point is:

esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format docsum
1. Gardnerella vaginalis
    species, high GC Gram+

But when I add the xtract command, I get an empty result. Any idea why?


Well that's odd:

esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format docsum
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE eSummaryResult PUBLIC "-//NLM//DTD esummary taxonomy 20130523//EN" "http://eutils.ncbi.nlm.nih.gov/eutils/dtd/20130523/esummary_taxonomy.dtd">
<eSummaryResult>
<DocumentSummarySet status="OK">
<DbBuild>Build140924-0720.1</DbBuild>

<DocumentSummary><Id>2702</Id>
    <Status>active</Status>
    <Rank>species</Rank>
    <Division>high GC Gram+</Division>
    <ScientificName>Gardnerella vaginalis</ScientificName>
    <CommonName></CommonName>
    <TaxId>2702</TaxId>
    <AkaTaxId>0</AkaTaxId>
    <Genus>Gardnerella</Genus>
    <Species>vaginalis</Species>
    <Subsp></Subsp>
    <ModificationDate>2012/10/24 00:00</ModificationDate>
</DocumentSummary>

</DocumentSummarySet>
</eSummaryResult>

vs

esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format almost_any_text_other_than_docsum_or_xml
1. Gardnerella vaginalis
    species, high GC Gram+
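The difference between the two outputs explains the empty xtract result: xtract needs XML, and the plain-text summary contains no Genus element. A minimal sketch of a check that could be run on the efetch output before piping it into xtract is shown below. The helper name `looks_like_xml` is hypothetical, and it only inspects the first line for an XML declaration.

```shell
#!/bin/sh
# Succeed (exit 0) only if the first line on stdin starts with an XML
# declaration. looks_like_xml is a hypothetical helper name.
looks_like_xml() {
    IFS= read -r first_line
    case "$first_line" in
        '<?xml'*) return 0 ;;
        *)        return 1 ;;
    esac
}

# The two kinds of output seen in this thread:
printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>' '<eSummaryResult>' \
    | looks_like_xml && echo "XML docsum: xtract can read this"
printf '%s\n' '1. Gardnerella vaginalis' '    species, high GC Gram+' \
    | looks_like_xml || echo "plain text: xtract will find no Genus element"
```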

Hmm, it gives me the same result with both docsum and DocumentSummary:

$ esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format docsum
1. Gardnerella vaginalis
    species, high GC Gram+

and

$ esearch -db taxonomy -query "Gardnerella vaginalis" | efetch -format DocumentSummary
1. Gardnerella vaginalis
    species, high GC Gram+

It also gives me this output when I just run:

$ esearch -db taxonomy -query "Gardnerella vaginalis" | efetch
1. Gardnerella vaginalis
    species, high GC Gram+

This is really strange. efetch -format xml gives me a huge XML tree, though.


No idea; -format docsum is supposed to return what I pasted above. I'm using version 1.90.


Well, that's odd...

Do you happen to know how I can check which version of eUtils I am using, and where I can get the newest one (if mine is old)?


e.g.

esearch -version

ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/

Latest is actually 2.0 but I haven't bothered updating.
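To turn the version check into something scriptable, a comparison along these lines might help. This is a sketch under assumptions: it assumes the version is a plain `major.minor` number (like the 1.00, 1.90 and 2.0 mentioned in this thread) and that `esearch -version` prints just that number. The helper name `version_lt` is hypothetical.

```shell
#!/bin/sh
# Succeed (exit 0) if version $1 is strictly older than version $2.
# Assumes plain "major.minor" version strings.
# version_lt is a hypothetical helper name.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        # Compare major numbers first, then minor numbers.
        if (x[1] + 0 != y[1] + 0) exit !(x[1] + 0 < y[1] + 0)
        exit !(x[2] + 0 < y[2] + 0)
    }'
}

# Hard-coded here for illustration; in practice this might be
# installed=$(esearch -version), assuming it prints a bare number.
installed="1.00"
if version_lt "$installed" "1.90"; then
    echo "$installed is older than 1.90; consider updating"
fi
```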


I just updated and now everything works fine. I had version 1.00 installed, and I guess some things have changed (I really need to keep track of all the versions ;)!).

Anyway, thank you so much for your patience and the follow-ups!
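Since the original question asks for something scriptable over a whole list of species, the Entrez Direct pipeline from this thread could be wrapped in a loop roughly like the one below. This is a sketch, not a tested production script. The function names `lookup_genus` and `genera_for_list` and the file name `species.txt` are hypothetical, and the loop assumes one species name per line. Redirecting stdin of the lookup from /dev/null is a precaution so that the pipeline cannot consume the rest of the list.

```shell
#!/bin/sh
# Wraps the esearch | efetch | xtract pipeline shown earlier in the thread.
# lookup_genus and genera_for_list are hypothetical helper names.
lookup_genus() {
    esearch -db taxonomy -query "$1" | efetch -format docsum | xtract -element Genus
}

# Read species names (one per line) from stdin and print
# "species<TAB>genus" for each non-empty line.
genera_for_list() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null keeps the lookup from reading the remaining names.
        printf '%s\t%s\n' "$name" "$(lookup_genus "$name" </dev/null)"
    done
}

# Usage (needs Entrez Direct installed and network access):
#   genera_for_list < species.txt
```

An empty second column would indicate that no Genus element came back for that name, which is worth checking by hand.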

10.2 years ago

Using the good old method: curl | xmllint+xpath

curl -s 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term="Gardnerella+vaginalis"%5BSCIN%5D' |\
xmllint --xpath '/eSearchResult/IdList/Id[1]/text()' - |\
awk '{printf("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=%s\n",$1);}' |\
xargs curl -s |\
xmllint --xpath '/TaxaSet/Taxon/LineageEx/Taxon[Rank="genus"]/ScientificName/text()' -

Gardnerella