I'm trying to use Entrez Direct to extract the "Gene-commentary_accession" information from an XML file using:
esearch -db gene -query XP_003399880.1 | efetch -format xml | xtract -pattern Gene-commentary -match Gene-commentary_type:1 -element Gene-commentary_accession > Bter_FAR_genome_shotgun_sequences2.txt
An example of the XML file (shortened):
<Entrezgene_locus>
<Gene-commentary>
<Gene-commentary_type value="genomic">1</Gene-commentary_type>
<Gene-commentary_heading>Reference Bter_1.0</Gene-commentary_heading>
<Gene-commentary_label>Chromosome LG B12 Reference Bter_1.0</Gene-commentary_label>
<Gene-commentary_accession>NC_015773</Gene-commentary_accession>
<Gene-commentary_version>1</Gene-commentary_version>
<Gene-commentary_seqs>
<Seq-loc>
<Seq-loc_int>
<Seq-interval>
<Seq-interval_from>7277254</Seq-interval_from>
<Seq-interval_to>7286174</Seq-interval_to>
<Seq-interval_strand>
<Na-strand value="minus"/>
</Seq-interval_strand>
<Seq-interval_id>
<Seq-id>
<Seq-id_gi>339751241</Seq-id_gi>
</Seq-id>
</Seq-interval_id>
</Seq-interval>
</Seq-loc_int>
</Seq-loc>
</Gene-commentary_seqs>
<Gene-commentary_products>
<Gene-commentary>
<Gene-commentary_type value="mRNA">3</Gene-commentary_type>
<Gene-commentary_heading>Reference</Gene-commentary_heading>
<Gene-commentary_label>transcript variant X1</Gene-commentary_label>
<Gene-commentary_accession>XM_003399832</Gene-commentary_accession>
<Gene-commentary_version>2</Gene-commentary_version>
<Gene-commentary_genomic-coords>
I'd like to retrieve only the genomic accession using the -match argument, but I keep extracting the other Gene-commentary_accession values as well, such as the "mRNA" one nested inside the genomic record. Could you help me with the correct syntax?
(I find it quite difficult to understand the use of -match from NCBI's documentation on this topic (https://www.ncbi.nlm.nih.gov/books/NBK179288/), so another example on Biostars might also help others with a similar question.)
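To make the expected result concrete, here is a minimal Python sketch of the filtering I am after, using only the standard library rather than xtract. It is not the Entrez Direct syntax I am asking about. The `SAMPLE` string is a trimmed, hand-closed stand-in for the XML above (the closing tags are my assumption, since the excerpt is shortened), and `genomic_accessions` is a hypothetical helper name. The idea is to visit every Gene-commentary element but read the type and accession only from its direct children, so nested mRNA records are not picked up.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the efetch output above; closing tags were added by
# hand so the fragment is well-formed (assumption, the original is shortened).
SAMPLE = """<Entrezgene_locus>
  <Gene-commentary>
    <Gene-commentary_type value="genomic">1</Gene-commentary_type>
    <Gene-commentary_accession>NC_015773</Gene-commentary_accession>
    <Gene-commentary_version>1</Gene-commentary_version>
    <Gene-commentary_products>
      <Gene-commentary>
        <Gene-commentary_type value="mRNA">3</Gene-commentary_type>
        <Gene-commentary_accession>XM_003399832</Gene-commentary_accession>
        <Gene-commentary_version>2</Gene-commentary_version>
      </Gene-commentary>
    </Gene-commentary_products>
  </Gene-commentary>
</Entrezgene_locus>"""


def genomic_accessions(xml_text):
    """Return accessions of Gene-commentary elements whose type is 'genomic'."""
    root = ET.fromstring(xml_text)
    found = []
    # iter() walks all Gene-commentary elements, including nested ones.
    for node in root.iter("Gene-commentary"):
        # find() with a bare tag name looks at direct children only, so the
        # accession of a nested mRNA record is never attributed to its parent.
        kind = node.find("Gene-commentary_type")
        acc = node.find("Gene-commentary_accession")
        if kind is None or acc is None:
            continue
        if kind.get("value") == "genomic":
            found.append(acc.text)
    return found


if __name__ == "__main__":
    print(genomic_accessions(SAMPLE))
```

On the trimmed sample this returns only NC_015773 and skips XM_003399832. A full efetch record may contain genomic Gene-commentary elements outside Entrezgene_locus, so the search would probably need to be restricted to that section.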