You could also use the REST URLs at the ENA. The value after Taxon: can be a scientific name or an ID, for example
http://www.ebi.ac.uk/ena/data/view/Taxon:Homo%20sapiens&display=xml
You can also pass a comma-separated list of names, for example
http://www.ebi.ac.uk/ena/data/view/Taxon:Homo%20sapiens,Taxon:Caenorhabditis%20elegans&display=xml
You'll have to test how many names you can pass in a single URL. In R you could try this:
library(XML)  # provides xmlParse, xpathSApply, getNodeSet and xmlGetAttr
tax <- c("Homo sapiens", "Caenorhabditis elegans")
x <- paste(paste("Taxon:", tax, sep=""), collapse=",")
x
[1] "Taxon:Homo sapiens,Taxon:Caenorhabditis elegans"
url <- paste( "http://www.ebi.ac.uk/ena/data/view/", x, "&display=xml", sep="")
doc <- xmlParse(url)
xpathSApply(doc, "/ROOT/taxon", xmlGetAttr, "scientificName")
[1] "Caenorhabditis elegans" "Homo sapiens"
x <- getNodeSet(doc, "/ROOT/taxon")
# full lineage, reversed so that it reads from the root down
sapply(x, function(y) paste(rev(xpathSApply(y, ".//lineage/taxon", xmlGetAttr, "scientificName")), collapse="; ") )
# only the lineage nodes that have a rank attribute
sapply(x, function(y) paste(rev(xpathSApply(y, ".//lineage/taxon[@rank]", xmlGetAttr, "scientificName")), collapse="; ") )
[1] "Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis"
[2] "Eukaryota; Metazoa; Chordata; Craniata; Mammalia; Euarchontoglires; Primates; Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae; Homininae; Homo"
That's very helpful! But can I get multiple lineages at the same time? I have 3000 organisms to look up, so it's impossible to run this command one by one!
You can use a loop: http://www.linuxquestions.org/questions/programming-9/bash-read-entire-file-line-in-for-loop-240016/
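If you would rather stay in R, one option is to split the names into batches and build one URL per batch. This is only a rough sketch: build_ena_urls is a made-up helper name, and the batch size of 50 is a guess, since nobody in this thread has checked how many names a single URL accepts.

```r
# Hypothetical helper (not part of the XML package or of ENA):
# builds one ENA URL per batch of taxon names.
# batch_size = 50 is an assumption; the real limit has to be found by testing.
build_ena_urls <- function(tax, batch_size = 50) {
  # batch number for each name: 1, 1, ..., 2, 2, ...
  batch_id <- ceiling(seq_along(tax) / batch_size)
  batches <- split(tax, batch_id)
  urls <- vapply(batches, function(b) {
    x <- paste(paste("Taxon:", b, sep = ""), collapse = ",")
    # URLencode turns the spaces into %20, as in the example URLs above
    paste("http://www.ebi.ac.uk/ena/data/view/", utils::URLencode(x),
          "&display=xml", sep = "")
  }, character(1))
  unname(urls)
}

tax <- c("Homo sapiens", "Caenorhabditis elegans", "Mus musculus")
urls <- build_ena_urls(tax, batch_size = 2)
urls
```

Each element of urls can then go through xmlParse and the same xpathSApply calls as in the answer above, for example inside lapply, with the results combined by unlist. Note that the records do not come back in the order you asked for them (see the scientificName output above), so match the lineages to your list by scientificName rather than by position.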