The new tool appears to do exactly what I want, and I was keen to try it out, but I'm having some difficulty.
I am attempting to pull out the sequences within a taxon that match a single gene, and to produce a table of Accession, Author(s), Affiliation, Title. The table is for collaborators, so that they can authenticate trusted sources; I will pull out the chosen FASTA sequences at a later date.
For parsing PubMed records the documentation is quite clear, and I can confirm that this example works for me:
esearch -db pubmed -query "Garber ED [AUTH] AND PNAS [JOUR]" | elink -related | efilter -query "mouse" | efetch -format docsum | xtract -pattern DocumentSummary -element Id SortFirstAuthor Title
I am now trying to search the nucleotide database, but I cannot get 'Authors' or other details back with the 'xtract' command, and I can't find any examples of how to do so.
My best attempt is as follows, but it only gives the Id:
esearch -db nucleotide -query "txid2836[Organism:exp] AND rbcl[GENE]" | efetch -format docsum | xtract -pattern DocumentSummary -element Id Authors
Alternatively, I have been attempting to use efetch -format xml and xtract the information from there, but I can't understand how to select the correct hierarchy level (documentation):
The xtract function is used for processing XML data:
Exploration Argument Hierarchy
-pattern (Highest Rank)
-division
-group
-branch
-block
-section
-subset
-unit (Lowest Rank)
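As far as I can tell, these levels behave like nested loops: -pattern picks the element that becomes one output row, and a lower-ranked argument such as -block visits child elements inside that row. Below is a rough Python sketch of that reading, not of xtract itself. The sample XML and its element names (Record, Reference, Authors, Name) are made up for illustration and are not NCBI's actual schema, and the function name rows is my own.

```python
import xml.etree.ElementTree as ET

# Made-up sample; element names are hypothetical, not NCBI's real schema.
SAMPLE = """
<Set>
  <Record>
    <Accession>AB000001</Accession>
    <Reference>
      <Title>First paper</Title>
      <Authors><Name>Smith,J.</Name><Name>Lee,K.</Name></Authors>
    </Reference>
  </Record>
  <Record>
    <Accession>AB000002</Accession>
    <Reference>
      <Title>Second paper</Title>
      <Authors><Name>Garcia,M.</Name></Authors>
    </Reference>
  </Record>
</Set>
"""


def rows(xml_text):
    """Return one tab-separated row per Record, visiting each Reference inside it."""
    out = []
    root = ET.fromstring(xml_text)
    for record in root.iter("Record"):          # outer loop: plays the role of -pattern
        fields = [record.findtext("Accession", default="")]
        for ref in record.iter("Reference"):    # inner loop: plays the role of -block
            names = [n.text or "" for n in ref.iter("Name")]
            fields.append("; ".join(names))     # all authors of one reference in one cell
            fields.append(ref.findtext("Title", default=""))
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    for line in rows(SAMPLE):
        print(line)
```

If this reading is right, my difficulty is in working out which real element names in the nucleotide XML correspond to the outer and inner levels.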
One such attempt looks like this:
esearch -db nucleotide -query "txid2836[Organism:exp] AND rbcl[GENE]" | efetch -format xml | xtract -division Authors -unit Name