I need to write a script that can do a blastp with an input XML file, then find and display the PDB codes of the sequences that are homologous. The definition of homology I'm using is: 1. the highest e-value among the HSPs in the alignment is less than 0.001; and 2. the total alignment length (not counting gaps) is 60% or more of the query length.
I'm still new to programming, and I'm quite stumped as to how to do this.
I'd really appreciate your help. Thanks!
Thanks for responding! I am looking through Biopython. Would you be able to point me to something specific?
I just did :) I linked you to the tutorial section on parsing BLAST XML results. Here are two more links, found with a Google search for 'parse blast xml biopython':
http://stackoverflow.com/questions/13835912/parse-only-top-3-hits-from-blast-output-with-ncbixml
http://www.biotnet.org/sites/biotnet.org/files/documents/25/biopython_blast.pdf
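To make the two homology criteria from the question concrete, here is a rough sketch of how the filtering might look once the BLAST XML output is parsed with Biopython's `NCBIXML`. It is only a sketch and rests on several assumptions: the XML file is blastp output (not the query itself); "total alignment length (not counting gaps)" is read as the number of gap-free columns summed over all HSPs of a hit, which can double count overlapping HSPs; the query length is available as `record.query_letters`; and hit titles contain an identifier of the form `pdb|1ABC|A`, as they typically do for searches against the PDB database. The helper names (`ungapped_length`, `is_homologous`, `pdb_code`, `homologous_pdb_codes`) are made up for illustration.

```python
import re
from types import SimpleNamespace

E_VALUE_CUTOFF = 0.001  # criterion 1: worst HSP e-value must be below this
MIN_COVERAGE = 0.60     # criterion 2: aligned fraction of the query length


def ungapped_length(hsp):
    """Count alignment columns in which neither sequence has a gap."""
    return sum(1 for q, s in zip(hsp.query, hsp.sbjct) if q != "-" and s != "-")


def is_homologous(alignment, query_length):
    """Apply both criteria to one hit (assumed interpretation, see lead-in)."""
    hsps = alignment.hsps
    if not hsps or query_length <= 0:
        return False
    worst_e_value = max(hsp.expect for hsp in hsps)
    aligned = sum(ungapped_length(hsp) for hsp in hsps)
    return worst_e_value < E_VALUE_CUTOFF and aligned >= MIN_COVERAGE * query_length


def pdb_code(title):
    """Pull a 4-character PDB code out of a title like 'pdb|1ABC|A ...'."""
    match = re.search(r"pdb\|(\w{4})\|", title)
    return match.group(1) if match else None


def homologous_pdb_codes(xml_path):
    """Parse BLAST XML output and return PDB codes of hits passing both criteria."""
    from Bio.Blast import NCBIXML  # requires Biopython to be installed

    codes = []
    with open(xml_path) as handle:
        for record in NCBIXML.parse(handle):
            for alignment in record.alignments:
                if is_homologous(alignment, record.query_letters):
                    code = pdb_code(alignment.title)
                    if code and code not in codes:
                        codes.append(code)
    return codes


# Stand-in objects with the same attribute names as Biopython's HSP and
# alignment records, so the filter can be tried without a BLAST run.
fake_hsp = SimpleNamespace(expect=1e-20, query="ACDEFG-HIK", sbjct="ACDE-GGHIK")
fake_hit = SimpleNamespace(title="pdb|1ABC|A Chain A, example", hsps=[fake_hsp])
```

With real data you would call something like `homologous_pdb_codes("my_blast_output.xml")`, where the file name is a placeholder. If your definition of alignment length differs (for example, only the best HSP, or query residues rather than gap-free columns), `ungapped_length` and the sum in `is_homologous` are the places to change.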