I am new to metagenomics and I am confused about the quality control (QC) strategy for 16S paired-end sequences from the Illumina MiSeq platform. The QC strategy will no doubt affect the downstream analysis. I find that a sliding window of 50 bp is commonly used, but for paired-end reads that I assembled into contigs with FLASH, I am not sure which QC strategy to apply. With the sliding window, reads are truncated at the end of the last window before the average quality score falls below the threshold, even if downstream windows would again rise above the threshold. Unfortunately, about half of my reads were truncated too short. So I used FASTX with -p 60 -q 20 instead, and very few reads were trimmed. Is that not strict enough? Any suggestions? Thanks.
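To make the truncation rule described above concrete, here is a minimal Python sketch. It is not the code of any particular tool: the function name `truncate_by_window`, the one-base step between windows, and the handling of reads shorter than one window are all assumptions made for illustration.

```python
def phred33(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def truncate_by_window(quals, window=50, min_avg=20.0):
    """Return how many leading bases to keep (hypothetical helper).

    The window slides one base at a time (an assumption). The read is cut
    at the end of the last window whose mean quality is still >= min_avg;
    windows further downstream are never inspected, even if they recover.
    """
    n = len(quals)
    if n < window:
        # Assumption: a read shorter than one window is judged as a whole.
        return n if n and sum(quals) / n >= min_avg else 0
    keep = 0
    for start in range(n - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            break  # first failing window: stop here for good
        keep = start + window
    return keep


if __name__ == "__main__":
    # Toy read: good start, a short dip, then good quality again.
    quals = [30] * 10 + [5] * 5 + [30] * 10
    kept = truncate_by_window(quals, window=5, min_avg=20)
    print("bases kept:", kept, "of", len(quals))
```

In the toy read the quality recovers after the dip, but the good tail is still discarded. That is the behaviour that can leave many reads too short when a dip occurs early.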
Thanks for the reply. I just did as you said. What confuses me is which QC strategy should be performed after FLASH.
Neither the sliding window of 50 bp nor FASTX with -p 60 -q 20 seems suitable.
I see. Sorry, I didn't get that from your post. I had that problem with FASTX before and I am not sure what the reason for it is. It seems that other people also have that problem: https://biostar.usegalaxy.org/p/7715/ In the end I used a quality trimmer built into Pipeline Pilot, but I haven't found a free equivalent. Have you tried other trimmers, like Trimmomatic?
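On the -p 60 -q 20 setting discussed in this thread: assuming it refers to FASTX-Toolkit's fastq_quality_filter, those options keep or discard whole reads and do not trim them. A read is kept when at least 60% of its bases have a quality of 20 or higher. The sketch below shows that rule in Python; the function name `passes_percent_filter` is hypothetical and this is not FASTX code.

```python
def passes_percent_filter(quals, min_qual=20, min_percent=60):
    """Return True if the read would be kept (hypothetical helper).

    Mirrors the "-q min_qual -p min_percent" rule as understood here:
    at least min_percent % of the bases must have quality >= min_qual.
    The read is either kept whole or dropped whole; nothing is trimmed.
    """
    if not quals:
        return False  # assumption: empty reads are discarded
    good = sum(1 for q in quals if q >= min_qual)
    return 100.0 * good / len(quals) >= min_percent


if __name__ == "__main__":
    # 60% of the bases are good, so the read passes with its low-quality tail.
    read = [35] * 60 + [8] * 40
    print("kept:", passes_percent_filter(read))
```

Under this rule a read can carry a long low-quality tail and still pass unchanged. That would explain why so few reads were affected, and it does suggest the setting is lenient for reads that are going to be merged.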