Is there a way to tell blast to output the unmapped reads into a separate file or at least the name of the unmapped sequences?
I assume you have used blast to align some reads ("reads.fa") onto a reference ("ref.fa") and have a blast report in "out.bls":
% formatdb -i ref.fa -p F
% blastall -p blastn -i reads.fa -d ./ref.fa -o out.bls
As RM answered, you can (hackily) get the IDs of the reads that got no hits. I have added extra code so that only a clean list of IDs is written to "nohits.ids":
% grep -F -B5 "***** No hits" out.bls | grep '^Query=' | sed 's/^Query= //' > nohits.ids
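The -B5 window only works while the "Query=" line sits within five lines of the "No hits" line, which may not hold for every report layout (long, wrapped query descriptions, for example). A minimal sketch of an alternative, assuming the report uses "Query= " lines and a "No hits found" marker as above: remember the most recent query and print it when the marker appears. The function name `nohit_ids` and the demo file names are made up for illustration, and the demo report is hand-written, not real blast output.

```shell
# Hypothetical helper: print the ID of every query whose section of the
# report contains a "No hits found" line, however far apart the two lines are.
nohit_ids() {
    awk '
        /^Query= /      { id = $2 }                           # first word after "Query= " is the read ID
        /No hits found/ { if (id != "") print id; id = "" }   # report each query at most once
    ' "$1"
}

# Tiny hand-written report, only to demonstrate the function
cat > demo.bls <<'EOF'
Query= read1 some description
         (50 letters)

Sequences producing significant alignments:
ref1   80   1e-20

Query= read2
         (50 letters)

 ***** No hits found ******

Query= read3 another description
         (48 letters)

 ***** No hits found ******
EOF

nohit_ids demo.bls > demo_nohits.ids
```

Unlike the sed version, this keeps only the first word of the query line, so any description after the ID is dropped.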
Now if you want to get back the sequences listed in nohits.ids from reads.fa, you can use this trick:
% formatdb -i reads.fa -p F
% fastacmd -d ./reads.fa -i nohits.ids -D 1 -o nohits.fa
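If fastacmd is not installed (it belongs to the legacy blast package), the same extraction can be sketched with plain awk. This assumes that each ID in the list matches the first word of a FASTA header, without the ">". The function name `fasta_subset` and the demo file names are hypothetical.

```shell
# Hypothetical helper: fasta_subset IDS_FILE FASTA_FILE
# Prints every FASTA record whose ID (first word of the header) is in IDS_FILE.
fasta_subset() {
    awk '
        FILENAME == ARGV[1] { want[$1] = 1; next }                      # first file: load the ID list
        /^>/                { id = substr($1, 2); keep = (id in want) } # header line: decide keep/skip
        keep                                                            # print header and sequence lines of kept records
    ' "$1" "$2"
}

# Tiny made-up inputs, only to demonstrate the function
printf '>read1 desc\nACGT\n>read2\nGGCC\nTTAA\n>read3\nCCCC\n' > demo_reads.fa
printf 'read2\nread3\n' > demo_ids.txt

fasta_subset demo_ids.txt demo_reads.fa > demo_nohits.fa
```

Multi-line sequences are kept whole because the keep flag only changes on header lines.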
Good luck!
If I understand your question correctly, this will list the queries without hits (the unmapped sequences) from the blast (BLASTN 2.2.25+) output:
grep -F -B5 "***** No hits found *****" blast.output.txt | grep Query=
How do you map your reads with BLAST? To what? I don't understand your question.