I know similar questions have been posted before:
How To Retrieve All Sequences, From Ncbi, That Belong To A Specific Txid And Its Sub Txids?
C: Refseq Proteins For A Given Taxid
But, I am having trouble retrieving the children sequences of a given taxID.
For instance,
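from Bio import Entrez
Entrez.email = "you@example.org"  # placeholder; NCBI asks for a contact address
# Search the protein database for everything filed under taxon 1392
record = Entrez.read(Entrez.esearch(db='protein', term="txid1392[Organism]"))
record['IdList']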
This returns just one list of protein UIDs for the species Bacillus anthracis (taxid 1392), not a list for each organism below this taxon. As a result, Entrez.efetch only returns one set of protein sequences.
Dropping the [Organism] doesn't change this behavior. Am I missing something?
Unfortunately, I think you might have to list all the child taxon identifiers explicitly - but try exploring the web interface for building an advanced query first in case that shows a better solution.
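Actually, I think it may have been an issue with the default retmax of 20: esearch returns at most 20 UIDs unless retmax is raised, so the list was probably truncated rather than missing the child taxa.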
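For anyone hitting the same thing, here is a rough sketch of paging through the full result set instead of relying on the default retmax. The helper names collect_ids and entrez_protein_search, the batch size of 500, and the email address are my own placeholders, not anything from Biopython itself. It also assumes that [Organism] already includes child taxa, which as far as I know is the default behaviour.

```python
def collect_ids(search, term, batch_size=500):
    """Page through an esearch-style result set and return every UID.

    `search(term, retstart, retmax)` must return a mapping with "Count" and
    "IdList" keys, the shape Entrez.read() gives for an esearch response.
    """
    ids = []
    while True:
        # Ask for the next page, starting where the previous one ended.
        result = search(term, len(ids), batch_size)
        batch = list(result["IdList"])
        ids.extend(batch)
        # Stop on an empty page or once the reported total has been reached.
        if not batch or len(ids) >= int(result["Count"]):
            return ids


def entrez_protein_search(term, retstart, retmax):
    """Real esearch call; needs Biopython and network access."""
    # Imported lazily so the paging logic above can be exercised offline.
    from Bio import Entrez

    Entrez.email = "you@example.org"  # placeholder: use your own address
    handle = Entrez.esearch(
        db="protein", term=term, retstart=retstart, retmax=retmax
    )
    try:
        return Entrez.read(handle)
    finally:
        handle.close()
```

Calling collect_ids(entrez_protein_search, "txid1392[Organism]") should then give the complete UID list rather than the first 20. For very large result sets it may be worth looking at usehistory="y" on esearch instead of paging by hand.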
Oh good. I should have tried the example myself really to confirm my hunch. Thanks!
Hi, I am currently learning Biopython. Does it have influence and importance in industry these days?
Hi, this comment is not appropriate to this (very old) thread.
If you wish to ask a question, please create your own thread. If you do, I strongly encourage you to search the forum first (since questions like this are asked often - and are of dubious usefulness). If you cannot find something that satisfies you, ask a question, but please add much more information and detail and make the question as specific as possible.