Hi all,
I used to use the old version of BLAST (blastall), but because of this problem I am trying to switch to BLAST+.
However, the same query (short nucleotide sequences, 15-30 bp) in BLAST+ (task blastn) returns many more hits than blastall, and most of them are dust or low-complexity sequences.
I'm using these parameters:
"-task", "blastn",
"-db", database, // human database
"-query", file.getPath(), // file with short sequences in FASTA format
"-gapopen", "5",
"-gapextend", "2",
"-penalty", "-3",
"-reward", "2",
"-dust", "yes",
"-word_size", "7",
"-num_alignments", "100",
"-num_descriptions", "50",
"-max_target_seqs", "50",
"-evalue", "250"
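
In case it helps to see how these are passed, here is a minimal sketch of the call. It assumes the arguments are collected into a list and handed to ProcessBuilder; the class name BlastCommand, the build helper, the "blastn" executable being on the PATH, and the example database and file names are all placeholders for illustration, not the exact code I run.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlastCommand {

    // Assembles the blastn command line from the parameters listed above.
    // Assumption: the BLAST+ "blastn" binary is reachable on the PATH.
    static List<String> build(String database, File file) {
        List<String> cmd = new ArrayList<>();
        cmd.add("blastn");
        cmd.addAll(Arrays.asList(
            "-task", "blastn",
            "-db", database,            // human database
            "-query", file.getPath(),   // file with short sequences in FASTA format
            "-gapopen", "5",
            "-gapextend", "2",
            "-penalty", "-3",
            "-reward", "2",
            "-dust", "yes",
            "-word_size", "7",
            "-num_alignments", "100",
            "-num_descriptions", "50",
            "-max_target_seqs", "50",
            "-evalue", "250"
        ));
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder names; substitute the real database and query file.
        List<String> cmd = build("human_db", new File("queries.fasta"));

        // Print the command instead of running it, so the sketch works
        // without BLAST+ installed.
        System.out.println(String.join(" ", cmd));

        // To actually run it (requires BLAST+ and a formatted database):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Printing the assembled command makes it easy to paste the same line into a shell and compare the BLAST+ output with blastall directly.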
The question is: do you know how to set the dust parameter (or other parameters) to get results similar to the old version of BLAST? The most important thing is to keep word_size equal to 7.
At the moment I get a really huge output file. I tried different values, but the file is still too big.
Thanks,
Adam