Blastn on Illumina reads for contamination detection: before or after quality filtering?
4.0 years ago

Should you blastn Illumina reads for contamination before or after you have quality-filtered them by Phred score?

What effect would each order be expected to have?

Assembly • 1.6k views

you should not blast(n) any reads whenever ...

No, seriously now: BLAST should not be your go-to tool when dealing with NGS data; there is far better and more efficient software for BLAST-like tasks on NGS data.

Since you are asking about contamination, have a look at tools like Kraken (and/or search for "NGS data and contamination" or similar).

And running it before or after quality filtering will not make much difference: if real contamination is present, Q-filtering will not remove it.


Thank you for this.

Why should one not blastn reads?


Efficiency/speed, for one. Also, BLAST is less sensitive on these rather short sequences than dedicated read mappers. (Perhaps less of an issue here, but BLAST also has no notion of "paired-end", which is an important concept in NGS data.)


So should you merge paired-end reads when using blastn in this way, or only use the forward reads? I will give Kraken a look :)


Remember to use -task blastn-short when you run the BLAST searches. BLAST would be sensitive to contamination from adapter sequences, so if you want to do this, you should merge the reads and then scan/trim them prior to the BLAST searches.
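For what it's worth, here is a minimal sketch of how such a search could be assembled from Python. The file names and database name are placeholders I made up, and the e-value cutoff and thread count are arbitrary assumptions, not recommendations. Note that blastn expects FASTA input, so merged/trimmed FASTQ reads would need converting first.

```python
import shlex


def build_blastn_short_cmd(query_fasta, db, out_tsv, evalue=1e-5, threads=4):
    """Return the argv list for a blastn search using the short-query task.

    query_fasta, db and out_tsv are hypothetical paths/names supplied by
    the caller; evalue and threads are illustrative defaults only.
    """
    return [
        "blastn",
        "-task", "blastn-short",   # parameter set tuned for short queries
        "-query", query_fasta,     # FASTA, not FASTQ
        "-db", db,
        "-outfmt", "6",            # tab-separated hit table
        "-evalue", str(evalue),
        "-num_threads", str(threads),
        "-out", out_tsv,
    ]


if __name__ == "__main__":
    cmd = build_blastn_short_cmd(
        "merged_trimmed.fasta", "contaminants_db", "hits.tsv"
    )
    # Print the command for inspection; to run it for real, pass the list
    # to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than one string avoids shell-quoting problems if a path contains spaces.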


Blastn is very slow; you can use bwa or bowtie2 to map the reads against the genomes of known possible contaminants.

Removing contamination AFTER QC is better, because QC is the faster step. Running the slower process on the smaller dataset takes less time.
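Once the reads have been mapped against a candidate contaminant genome, one rough way to quantify contamination is the fraction of reads that aligned. The sketch below is only an illustration: it assumes the mapper output is plain-text SAM, and the function name is my own. It relies on the SAM flag bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary).

```python
def mapped_fraction(sam_lines):
    """Fraction of primary SAM records that are mapped.

    sam_lines is any iterable of SAM text lines, e.g. an open file handle.
    Header lines (starting with '@') are ignored, and secondary and
    supplementary alignments are skipped so each read is counted once per
    mate.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:          # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x900:              # 0x100 secondary or 0x800 supplementary
            continue
        total += 1
        if not flag & 0x4:            # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t0\tcontam\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    print(mapped_fraction(example))
```

A high mapped fraction against an unexpected genome is a reason to look closer, not proof on its own, since conserved regions can attract reads from the target organism too.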


You should have no contamination in well-prepared libraries. Do you know for certain there is contamination?


In the short reads I found no contamination, but when blasting the assembled contigs I found 2 mapping to the incorrect species. The culture was pure, but the DNA was extracted by the sequencing company. Is it likely that the long reads (it is a hybrid assembly) are contaminated but the short reads are not?


blasting the assembled contigs I found 2 mapping to the incorrect species

That does not seem like strong evidence of contamination. Since BLAST does local alignments, it is possible that you got those alignments by chance. You would want to investigate carefully before drawing a conclusion.
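One simple check, sketched here under assumptions, is how much of each contig the hits actually cover: a short local hit on a long contig is much weaker evidence than a hit spanning most of it. The sketch assumes the default 12-column tabular output (-outfmt 6), rows already restricted to hits against the suspect species, and a caller-supplied dictionary of contig lengths. The function name is hypothetical.

```python
from collections import defaultdict


def query_coverage(blast_rows, contig_lengths):
    """Fraction of each query contig covered by at least one hit.

    blast_rows: iterable of tab-separated lines in default -outfmt 6 order
      (qseqid sseqid pident length mismatch gapopen qstart qend sstart
       send evalue bitscore).
    contig_lengths: dict mapping qseqid to contig length in bases.
    """
    intervals = defaultdict(list)
    for row in blast_rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        intervals[fields[0]].append((qstart, qend))

    coverage = {}
    for qseqid, ivs in intervals.items():
        ivs.sort()
        covered = 0
        cur_start, cur_end = ivs[0]
        for start, end in ivs[1:]:
            if start <= cur_end + 1:          # overlapping or adjacent: merge
                cur_end = max(cur_end, end)
            else:                             # gap: close the current block
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        coverage[qseqid] = covered / contig_lengths[qseqid]
    return coverage
```

Overlapping hits are merged before summing so that the same stretch of a contig is not counted twice. Percent identity and which subject regions are hit are worth checking alongside coverage.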


So, due to how the BLAST algorithm works, it is not the best tool for contamination detection unless paired with other information?
