5.1 years ago
mschmid
I am performing the following workflow:
- I start with a bunch of nucleotide sequences as a FASTA
- I blast them against a custom BLAST database using standalone blastn (`-task blastn`)
- Then I cut out all the blastn hits from the initial FASTA nucleotide sequences. This leaves me with supposedly "clean" nucleotide sequences with respect to the BLAST database used in step 2.
- I check whether the sequences are now "clean" with respect to the BLAST database used in step 2
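The cutting step above could be sketched roughly as follows. This is only an illustration, not the poster's actual script: it assumes the first run was written with the default `-outfmt 6` column order (query start and end in columns 7 and 8), and the helper names `parse_outfmt6`, `merge_intervals` and `cut_hits` are made up for this example.

```python
from collections import defaultdict


def parse_outfmt6(lines):
    """Collect 0-based, half-open query intervals per query ID.

    Assumes the default -outfmt 6 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        qstart, qend = int(cols[6]), int(cols[7])
        lo, hi = min(qstart, qend), max(qstart, qend)
        # BLAST coordinates are 1-based and inclusive.
        hits[cols[0]].append((lo - 1, hi))
    return hits


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def cut_hits(seq, intervals, mask_char=None):
    """Remove the hit intervals from seq, or overwrite them if mask_char is set."""
    pieces = []
    pos = 0
    for start, end in merge_intervals(intervals):
        pieces.append(seq[pos:start])
        if mask_char is not None:
            pieces.append(mask_char * (end - start))
        pos = end
    pieces.append(seq[pos:])
    return "".join(pieces)


# Two overlapping hits on a toy query (positions 5-8 and 7-10).
rows = [
    "q1\ts1\t100.0\t4\t0\t0\t5\t8\t1\t4\t1e-5\t8.0",
    "q1\ts2\t100.0\t4\t0\t0\t7\t10\t1\t4\t1e-5\t8.0",
]
hits = parse_outfmt6(rows)
print(cut_hits("AAAACCCCGGGGTTTT", hits["q1"]))
print(cut_hits("AAAACCCCGGGGTTTT", hits["q1"], mask_char="N"))
```

Note that deleting a hit joins the two flanking regions into sequence that did not exist in the original query. Passing `mask_char="N"` keeps the coordinates and the flanks separate, which may be worth trying when comparing the two runs.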
The thing is that I get hits in the second Blast run which did not pop up in the first run.
I tried to tweak my settings so that all hits are found in the first run, with limited success. I tried:
- Setting `-max_target_seqs` extremely high, so this should not be the problem
- Setting the e-value threshold (`-evalue`) extremely high, so this should also not be limiting
- Disabling dust and soft masking of low-complexity regions
- Lowering the parameter `-best_hit_overhang`
- And finally increasing the value of `-best_hit_score_edge`
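For reference, the settings listed above could be put together into a single call along these lines. This is a sketch only: the file names, the database name and every numeric value are placeholders chosen for illustration, not the values actually used in the post, and `build_blastn_command` is a hypothetical helper.

```python
import shlex


def build_blastn_command(query, db, out,
                         max_target_seqs=100000,
                         evalue=1000.0,
                         best_hit_overhang=0.05,
                         best_hit_score_edge=0.4):
    """Return a blastn argument list with the settings described in the post.

    All default values here are placeholders, not recommendations.
    """
    return [
        "blastn",
        "-task", "blastn",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",
        "-max_target_seqs", str(max_target_seqs),
        "-evalue", str(evalue),
        "-dust", "no",             # disable DUST low-complexity filtering
        "-soft_masking", "false",  # do not apply filtering as soft masks
        "-best_hit_overhang", str(best_hit_overhang),
        "-best_hit_score_edge", str(best_hit_score_edge),
    ]


cmd = build_blastn_command("input.fasta", "custom_db", "run1.tsv")
print(shlex.join(cmd))
# To actually run it (requires BLAST+ on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list makes it easy to rerun exactly the same settings on the trimmed sequences, so that any difference between the two runs comes from the input and not from the parameters.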
But I still get hits in the second Blast run after cutting out the hits from the first run. What am I missing?
Did you also try setting `num_descriptions` and `num_alignments` to an equally high number? I think the number of hits returned by default is capped at 500. I don't remember whether these apply if you are using `-outfmt 6`.
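On the question of which limit applies: as far as I can tell from the `blastn -help` text, `-num_descriptions` and `-num_alignments` are listed as incompatible with `-max_target_seqs`, and `-num_descriptions` is marked as not applicable for `-outfmt` values above 4, so for tabular output `-max_target_seqs` would be the one to raise. Please check this against the help output of your own BLAST+ version. Under that assumption, a small hypothetical helper (`hit_limit_flags` is not part of BLAST) could pick the flags like this:

```python
def hit_limit_flags(outfmt, limit):
    """Pick the hit-limit flags for a given -outfmt number.

    Assumption (verify with `blastn -help`): formats above 4 use
    -max_target_seqs, while the report formats 0-4 use
    -num_descriptions / -num_alignments.
    """
    if outfmt > 4:
        return ["-max_target_seqs", str(limit)]
    return ["-num_descriptions", str(limit), "-num_alignments", str(limit)]


print(hit_limit_flags(6, 100000))
print(hit_limit_flags(0, 100000))
```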