7.2 years ago
hodayabeer
Hi, I am BLASTing the yeast proteome against a large database of Archaea. Something is wrong with the results: I get a very high percent identity for far too many proteins. I assume the problem is that BLAST scores small parts of the query sequence once it finds an alignment to them, and therefore gives good scores even though most of the query sequence doesn't match. How can I prevent BLAST from reporting such short alignments? Maybe by setting the window_size parameter? Or by normalizing the bit score by the query length? Do HSPs have something to do with this? Thanks so much!
Can you give an example?
You can filter the results by E-value, coverage, or identity thresholds (using the tabular output format).
But I want to get only alignments that cover the full query sequence.
If you are getting the BLAST output in tabular format, you can have it display the alignment length and then parse the output for your preferred alignment length.
How can I extract only the results where the alignment length is equal or close to the query / subject length?
That is called glocal (global-local) alignment; unfortunately, it is not implemented in BLAST.
Output the results in tabular mode (-outfmt 6) and then filter using awk/excel/R etc.
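If you would rather do the filtering step in a script, here is a minimal Python sketch. It assumes the search was run with custom tabular columns, e.g. `-outfmt "6 qseqid sseqid pident length qlen slen qstart qend evalue bitscore"`, so that the query length (`qlen`) appears on every line. The column order, the function name `filter_by_query_coverage`, and the 0.9 cutoff are my own choices for illustration, not anything BLAST prescribes; adjust them to your actual output.

```python
import csv
import io

# Assumed column order; it must match the -outfmt string used for the search.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "qlen", "slen",
           "qstart", "qend", "evalue", "bitscore"]


def filter_by_query_coverage(handle, min_qcov=0.9):
    """Yield (row, coverage) for hits whose HSP spans at least min_qcov of the query.

    Coverage is computed per HSP as (qend - qstart + 1) / qlen.
    The 0.9 default is an arbitrary illustrative cutoff.
    """
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        qlen = int(row["qlen"])
        aligned = int(row["qend"]) - int(row["qstart"]) + 1
        qcov = aligned / qlen
        if qcov >= min_qcov:
            yield row, qcov


if __name__ == "__main__":
    # Made-up rows for illustration only: the first hit covers 40 of 400
    # query residues, the second covers 380 of 400.
    demo = (
        "query_A\tsubject_1\t85.0\t40\t400\t390\t10\t49\t1e-10\t60.1\n"
        "query_B\tsubject_2\t42.5\t380\t400\t410\t1\t380\t1e-50\t250.0\n"
    )
    for row, qcov in filter_by_query_coverage(io.StringIO(demo)):
        print(row["qseqid"], row["sseqid"], row["pident"], f"{qcov:.2f}")
```

With the made-up rows above, the short high-identity hit is dropped and only the hit spanning most of the query is kept. To require coverage of the subject as well, you would also need `sstart` and `send` in the output and the same check against `slen`. As far as I know, recent BLAST+ versions also have a `-qcov_hsp_perc` option that applies a per-HSP query coverage cutoff during the search itself; check `blastp -help` for your version.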