5.1 years ago
harmadikemil
Dear All!
I'm new to bioinformatics and I'm working on an archaeogenetics project. My first task is to search part of a genome for contaminating, non-human segments. The data set is ~500 shotgun sequences.
I have two questions:
- How can I print only the first hit for each query from the XML?
- How can I write a counter that lists, for the non-human hits, the names of the organisms and how often each one occurs?
I work in Biopython.
Thank you in advance!
Are you bound to the XML output? Tabular output (-outfmt 6) is much easier to parse: https://www.ncbi.nlm.nih.gov/books/NBK279684/
If you really need to use XML files, you could use something like ElementTree or the parser described in the Biopython tutorial: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc95
ElementTree: https://stackoverflow.com/questions/1912434/how-do-i-parse-xml-in-python
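For the tabular route, a minimal sketch of picking the first hit per query could look like the following. It assumes the default -outfmt 6 column order and that BLAST lists the hits of each query best-first, so the first row seen for a query is its top hit. The function name first_hits_tabular and the example rows are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ `-outfmt 6` (no custom column list given).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def first_hits_tabular(handle):
    """Return {query id: row dict}, keeping only the first row seen per query."""
    first = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines (as written by -outfmt 7)
        record = dict(zip(COLUMNS, row))
        # setdefault keeps the earlier row, so later hits of the same query are ignored
        first.setdefault(record["qseqid"], record)
    return first


# Made-up rows standing in for a real results file.
example = io.StringIO(
    "read1\tsubjA\t99.0\t100\t1\t0\t1\t100\t500\t599\t1e-40\t180\n"
    "read1\tsubjB\t95.0\t100\t5\t0\t1\t100\t700\t799\t1e-30\t150\n"
    "read2\tsubjC\t98.0\t90\t2\t0\t1\t90\t10\t99\t1e-35\t160\n"
)
tabular_hits = first_hits_tabular(example)
for query, hit in tabular_hits.items():
    print(query, hit["sseqid"], hit["evalue"])
```

With a real file, pass an open handle instead of the StringIO object, e.g. `with open("results.tsv") as handle:` (the file name is hypothetical).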
Thank you for the fast answer! I wrote a short script, but my problem is that I don't know how to reach the hit_num part of the XML. My code is:
So basically I just want to print the query title and the title of the first alignment.
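One way to get just that, sketched with the standard-library ElementTree mentioned above rather than the Biopython parser. It assumes the classic BLAST XML layout (-outfmt 5), where each query is an Iteration element and its hits are Hit elements in best-first order. The example XML and the function name first_hit_titles are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up document using the tag names of classic BLAST XML output.
EXAMPLE_XML = b"""<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>read1</Iteration_query-def>
      <Iteration_hits>
        <Hit><Hit_num>1</Hit_num><Hit_def>Homo sapiens chromosome 1</Hit_def></Hit>
        <Hit><Hit_num>2</Hit_num><Hit_def>Pan troglodytes chromosome 1</Hit_def></Hit>
      </Iteration_hits>
    </Iteration>
    <Iteration>
      <Iteration_query-def>read2</Iteration_query-def>
      <Iteration_hits></Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def first_hit_titles(source):
    """Yield (query title, title of first hit or None) for every query."""
    root = ET.parse(source).getroot()
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        # find() returns the first matching Hit in document order, or None
        hit = iteration.find("Iteration_hits/Hit")
        title = hit.findtext("Hit_def") if hit is not None else None
        yield query, title


results = list(first_hit_titles(io.BytesIO(EXAMPLE_XML)))
for query, title in results:
    print(query, "->", title if title is not None else "no hits")
```

ET.parse also accepts a file name, so a real results file can be passed directly in place of the BytesIO object.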
I can't help you much further; I have never used the parser. It helps to just print everything out, or to look at what is inside record.
So start with:
Or if you already know that hit_num is inside record.alignments:
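A minimal sketch of that idea with Biopython's NCBIXML parser, extended with a counter for the non-human first hits. As far as I know the parser does not expose hit_num as an attribute, but the alignments are stored in file order, so record.alignments[0] is the first hit. The organism name is guessed from the first two words of hit_def, which is only a heuristic and an assumption about how the database titles look. The helper names and the stand-in records below are made up so the example runs without a results file.

```python
from collections import Counter
from types import SimpleNamespace


def load_records(path):
    """Yield BLAST records from an XML file (Biopython is imported only when called)."""
    from Bio.Blast import NCBIXML
    with open(path) as handle:
        yield from NCBIXML.parse(handle)


def guess_organism(hit_def):
    """Heuristic, not a real taxonomy lookup: take the first two words as the name."""
    return " ".join(hit_def.split()[:2])


def first_alignments(records):
    """Yield (query title, first alignment) for every record that has hits."""
    for record in records:
        if record.alignments:
            yield record.query, record.alignments[0]


def count_non_human(records, human="Homo sapiens"):
    """Count the organisms of the first hits, leaving out the human ones."""
    counts = Counter()
    for _, alignment in first_alignments(records):
        organism = guess_organism(alignment.hit_def)
        if organism != human:
            counts[organism] += 1
    return counts


# Stand-in objects with the same attributes (query, alignments, hit_def)
# that the parsed records provide; replace with load_records("my_blast.xml").
def fake(query, *hit_defs):
    return SimpleNamespace(
        query=query,
        alignments=[SimpleNamespace(hit_def=d) for d in hit_defs],
    )


records = [
    fake("read1", "Homo sapiens chromosome 1", "Pan troglodytes chromosome 1"),
    fake("read2", "Escherichia coli strain K-12 genome"),
    fake("read3", "Escherichia coli O157:H7 plasmid"),
    fake("read4", "Pseudomonas fluorescens genome assembly"),
    fake("read5"),  # a query without any hit
]

for query, alignment in first_alignments(records):
    print(query, "->", alignment.hit_def)

organism_counts = count_non_human(records)
for organism, n in organism_counts.most_common():
    print(n, organism)
```

If exact species names matter, the scientific-name column of the tabular output is probably more reliable than splitting the hit title.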