Hi all, I found a high level of Illumina adapter content in a batch of samples that were sequenced on a NovaSeq, so I set out to trim the adapters using cutadapt:
R1:
cutadapt -a GATCGGAAGAGCACGTCTGAACTCCAGTCAC -o outputR1.fastq.gz R1_001.fastq.gz
R2:
cutadapt -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -o outputR2.fastq.gz R2_001.fastq.gz
These reads map fine with bwa mem, but I have a different type of analysis that was originally written for bwa backtrack (aln + sampe), and when I run bwa sampe on them it reports mismatched read pairs (the x, y coordinates are off).
If I trim the adapters with Illumina's standard bcl2fastq trimming I see some evidence that adapters were trimmed, but not to the degree I want. Cutadapt trimmed more bases, which was required in this case. I was wondering if there was a way for cutadapt to not change the x,y in the read name. Thanks
Thanks for the info. I have not done any kind of trimming before, so this has been good to know! Do you have any preference for adapter trimming software?
bbduk/trimmomatic/cutadapt/sickle etc. should all work. bbduk is easy to use, but that is just a personal preference. You can find a guide for bbduk here.
Thank you! I see that the documentation says plainly that paired reads should always be processed together (this seems like a no-brainer). Definitely a pretty bad oversight on my part.
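For anyone finding this later, here is a minimal sketch of what trimming both mates in a single cutadapt call could look like. It only builds and prints the command; the function name `paired_cutadapt_cmd` is made up for illustration, and the file names and adapter sequences are simply the ones from the original post. It relies on cutadapt's paired-end options (`-A` for the R2 adapter, `-p` for the R2 output), so treat it as a starting point rather than a tuned pipeline.

```python
import shlex


def paired_cutadapt_cmd(r1_in, r2_in, r1_out, r2_out, adapter_r1, adapter_r2):
    """Build one cutadapt call that trims R1 and R2 together (hypothetical helper)."""
    return [
        "cutadapt",
        "-a", adapter_r1,  # 3' adapter to look for in R1
        "-A", adapter_r2,  # 3' adapter to look for in R2
        "-o", r1_out,      # trimmed R1 output
        "-p", r2_out,      # trimmed R2 output, written alongside R1
        r1_in, r2_in,      # both inputs given to the same call
    ]


cmd = paired_cutadapt_cmd(
    "R1_001.fastq.gz", "R2_001.fastq.gz",
    "outputR1.fastq.gz", "outputR2.fastq.gz",
    "GATCGGAAGAGCACGTCTGAACTCCAGTCAC",
    "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
)
print(shlex.join(cmd))
# To actually run it (needs cutadapt on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Because both files go through one call, the two outputs are written as pairs instead of being produced by two independent runs.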
Whenever you care about the reads staying in order/sync in the resulting R1/R2 files. Aligners assume they are and may not always check, which can lead to discordant mapping.
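If you want to check this yourself before aligning, a rough sketch like the one below walks both files in parallel and reports the first record whose read names disagree. The function names are hypothetical, and it assumes plain 4-line FASTQ records (optionally gzipped) where the read name is the first whitespace-separated field of the header, with an optional `/1` or `/2` suffix.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read name from each 4-line FASTQ record in `path`."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 0:  # header line of each record
                fields = line[1:].split()  # drop '@', ignore the comment part
                name = fields[0] if fields else ""
                if name.endswith(("/1", "/2")):
                    name = name[:-2]  # strip old-style mate suffix
                yield name


def first_out_of_sync(r1_path, r2_path):
    """Return (record_index, r1_name, r2_name) for the first mismatch, else None."""
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:  # also catches one file being shorter (name is None)
            return index, id1, id2
    return None
```

Calling `first_out_of_sync("outputR1.fastq.gz", "outputR2.fastq.gz")` returns `None` when every record lines up, and otherwise the zero-based record index together with the two names that differ.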