FastQC reports a warning in the overrepresented sequences module for all R2 FASTQ files, corresponding to a stretch of >50 Gs. It amounts to <0.5% and <0.3% of the whole library before and after adapter removal (TruSeq Nano), respectively.
Our partner that performed the lab work advised us not to worry about this, indicating that it is a known artifact of NovaSeq runs and that their quality checks did not pick up on it.
I would like some advice on whether I should deal with this warning (i.e. add this GGG...GGG to the adapter file passed to trimmomatic), or just leave it as it is and proceed with the alignments.
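For anyone who wants to double-check the percentage independently of FastQC, here is a minimal Python sketch that counts how many reads contain a long run of Gs. The function names and the `min_run=50` threshold are my own assumptions (the threshold is taken from the ">50 Gs" above), and the file name in the usage comment is hypothetical.

```python
import re
from typing import Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def poly_g_fraction(lines: Iterable[str], min_run: int = 50) -> float:
    """Return the fraction of reads containing at least min_run consecutive Gs.

    min_run is an illustrative threshold, not a value used by FastQC.
    """
    pattern = re.compile("G{%d,}" % min_run)
    total = 0
    hits = 0
    for seq in fastq_sequences(lines):
        total += 1
        if pattern.search(seq):
            hits += 1
    return hits / total if total else 0.0


# Usage (hypothetical file name):
#   import gzip
#   with gzip.open("sample_R2.fastq.gz", "rt") as fh:
#       print(poly_g_fraction(fh))
```

This only looks for exact runs of G, so reads where the run is interrupted by a miscalled base would be missed; it is meant as a sanity check, not a replacement for the QC report.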
Thanks Michael, is there a tool that you would recommend to do the job? I am not sure if trimmomatic is the right tool for this. Thanks
After checking the literature, I found two tools that address, among other things, the poly-G problem: AfterQC and fastp. Both are from the same author, and the latter is faster than the former.
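To make the idea of poly-G trimming concrete, here is a minimal Python sketch that strips a trailing run of Gs from a read and its quality string. This is not how AfterQC or fastp implement it; the function name and the `min_len=10` default are assumptions of mine for illustration only.

```python
from typing import Tuple


def trim_poly_g_tail(seq: str, qual: str, min_len: int = 10) -> Tuple[str, str]:
    """Remove a trailing run of Gs if it is at least min_len bases long.

    min_len is an illustrative default, not taken from any specific tool.
    The quality string is cut to the same length as the trimmed sequence.
    """
    stripped = seq.rstrip("G")
    tail_len = len(seq) - len(stripped)
    if tail_len >= min_len:
        return stripped, qual[: len(stripped)]
    # Tail too short to be confident it is an artifact: leave the read alone.
    return seq, qual
```

Dedicated trimmers typically tolerate a few mismatches inside the G run and handle both mates of a pair consistently, which this sketch does not do.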
I've worked with bbduk.sh from BBMap and with cutadapt. The first is quite fast and flexible; the latter is better suited for PE reads (IMHO).