Need advice on quality control of Illumina reads
8.2 years ago

Hi,

I have a dataset generated on an Illumina MiSeq. FastQC reports a failure for the per base sequence quality and sequence length distribution modules.

I quality-trimmed the reads using a sliding window of 5 with a step of 2 and a minimum quality score of 20, then filtered out reads shorter than 70 bp. This removes the low-quality bases, but when I look at the sequence length distribution, the number of reads of length 300 has dropped from nearly 500,000 to 160,000. I'd appreciate any advice on this.
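For readers following along, here is a minimal sketch of what a sliding-window trim with these settings might do. The trimming tool is not named in this post, so the scan direction (5' to 3'), the cut point (start of the first failing window), and the function names `sliding_window_trim` and `trim_and_filter` are all assumptions made for illustration, not the behaviour of any particular program.

```python
def sliding_window_trim(quals, window=5, step=2, min_mean_q=20):
    """Return how many leading bases to keep (assumed 5'->3' scan).

    The read is cut at the start of the first window whose mean
    Phred score is below min_mean_q. A trailing stretch shorter
    than one full window is not examined in this sketch.
    """
    n = len(quals)
    if n == 0:
        return 0
    for start in range(0, max(n - window + 1, 1), step):
        chunk = quals[start:start + window]
        if sum(chunk) / len(chunk) < min_mean_q:
            return start
    return n


def trim_and_filter(reads, window=5, step=2, min_mean_q=20, min_len=70):
    """Trim each (sequence, qualities) pair, then drop reads < min_len."""
    kept = []
    for seq, quals in reads:
        keep = sliding_window_trim(quals, window, step, min_mean_q)
        if keep >= min_len:
            kept.append((seq[:keep], quals[:keep]))
    return kept
```

Under these assumptions, any 300 bp read with a low-quality tail comes out shorter than 300 bp, so a drop in the count of full-length reads is what this kind of trimming would be expected to produce.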

Thanks

next-gen • 3.1k views

Can you provide details (images) of your FastQC results? Can you elaborate on what you mean by "number of reads of length 300 were reduced from nearly 500,000 to 160,000"?


Thanks a lot. I'm new to this kind of analysis and really appreciate your advice.

The images below show the initial per-base qualities and sequence length distribution:

[images not preserved] http://www.freeimagehosting.net/upl.php

After trimming and filtering reads <70 bp:

[images not preserved]

I hope the images are clear. Am I doing the correct thing? If you need any more clarification, please ask me.

Thanks, Sumudu


Thanks! Those look fine to me. I might not have trimmed so aggressively (I usually use a Phred cutoff of 5), but otherwise that looks correct.
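To put the two cutoffs side by side, a Phred score Q corresponds to an error probability of 10^(-Q/10), so Q20 is about a 1% chance of a wrong base call and Q5 is about 32%. A small sketch of the conversion follows; the function names are hypothetical, and the Phred+33 ASCII offset is an assumption that holds for current Illumina FASTQ output but should be checked for older data.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a FASTQ quality line, assuming the Phred+33 ASCII offset."""
    return [ord(ch) - 33 for ch in quality_string]


if __name__ == "__main__":
    for cutoff in (5, 20):
        print(f"Q{cutoff}: error probability {phred_to_error_prob(cutoff):.3f}")
```

A cutoff of 5 only removes bases that are close to uninformative, while a cutoff of 20 removes anything worse than roughly one error in a hundred, which is why the second is the more aggressive choice.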


If you want, you can trim the reads beyond position 250. You may get better alignment.


The sequence length distribution will always fail if any reads have been trimmed; just ignore that. It sounds like either you're trimming too aggressively or the sequencing quality just wasn't that good. That said, quality at the end of really long reads tends to drop off a fair bit, so it's more or less expected for trimming to do that. If you're happy with the resulting quality scores, then continue with mapping or assembly or whatever. If not, re-trim appropriately. As Satya mentioned, we can't give any other advice without seeing the plots.
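If it helps to see the length effect in numbers rather than in the plot, the read lengths before and after trimming can be tallied directly. This is only a sketch: `length_distribution` and `count_full_length` are hypothetical helper names, and the reads are assumed to be already loaded as plain sequence strings.

```python
from collections import Counter


def length_distribution(reads):
    """Map each read length to the number of reads with that length."""
    return Counter(len(seq) for seq in reads)


def count_full_length(reads, full_length=300):
    """Count reads that still have the full (untrimmed) length."""
    return length_distribution(reads)[full_length]


if __name__ == "__main__":
    # Toy data: five untrimmed reads, three of which lose their tails.
    before = ["A" * 300] * 5
    after = ["A" * 300] * 2 + ["A" * 250] * 2 + ["A" * 120]
    print("full-length before:", count_full_length(before))
    print("full-length after: ", count_full_length(after))
    print("lengths after:     ", sorted(length_distribution(after).items()))
```

Untrimmed reads all share one length, while trimmed reads spread over many lengths, and that spread is what the length distribution module reacts to.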


Very much this. I'd also add that various tools are getting better and better at soft-clipping and handling poor-quality bases all the time, so depending on your application, trimming may not even be that necessary.


Thank you all very much !!!
