Hello,
I am running an all-vs-all blastp search using BLAST+ version 2.10.1+ via the subprocess module as part of a reciprical best hit python software. I am aggregating all of the proteins from various genomes, writing them to a fasta that is used as both the query and to make a db, then finding the best hit within each genome (as opposed to the overall best hit) for each query.
Interestingly some of the queries do not have hits in the genome from which they were sourced (1 query from a 7038-protein genome) and others do not have themselves as a top hit for their host genome (5-10 proteins per genome for 4 genomes of a similar size). Does anyone know why this is?
The blast call parameters are:
blastp_exe_path,
'-out', fr'{results_out_path}', #filepath
'-query', fr'{query_user}', #filepath
'-db', fr'{db_user}', #filepath
'-evalue', fr'{evalue_user}', #0.001
'-outfmt', '5', #xml
'-num_threads', fr'{thread_num}', #4
'-parse_deflines',
#Gabriel Moreno-Hagelsieb, Kristen Latimer, Choosing BLAST options for better detection of
#orthologs as reciprocal best hits, Bioinformatics, Volume 24, Issue 3, February 2008, Pages
#319–324, https://doi.org/10.1093/bioinformatics/btm585
#https://microbiome.wordpress.com/research/orthologs/
'-seg', 'yes',
'-soft_masking', 'true',
'-use_sw_tback',
#biopython defaults
'-num_alignments', '200'
The parameters are equivalent to those on the command line - but if it would help for me to write it in command line format please let me know.
One thing I had considered was that the proteins in question could be low complexity, and seg could be filtering them to the extent that they are no longer recognisable as the best hit. Has anyone seen this before, or can think of another reason?
Thanks, Tim
Hi Mensur, thanks for your reply. I was parsing the results in the xml file myself (get the top hsp for each alignment via coveage, blast score and evalue, then pick the best alignment) so I'm not sure that the ordering would be an issue. But the other points are definitely something I'll look into, many thanks!