Hello everyone, I have a question (probably a very basic one) about read processing. I have paired-end Illumina reads from a virus (which one is not important). A few words about the library preparation: a 3000 bp amplicon was generated, and then something like Nextera was used for fragmentation.
My task is to trim the PCR primers and adapters. Question: how do I do this properly? I am confused about primer orientation and location. Should I trim both the forward and the reverse primer from each read of the pair? Should I also include the reverse complements of the forward and reverse primers and trim those? Where should they be located, at the beginning or at the end of each read in the pair? The same questions apply to the adapters.
And the general question: how do I figure out whether it is necessary to trim the reverse complements of primers/adapters? Does it depend on the library preparation method (PCR, ligation)? If so, how? Thanks.
Not sure if it helps, but I like to use cutadapt (https://cutadapt.readthedocs.io/en/stable/guide.html). If your reads are longer than the amplicon, each read runs through into the primer at the far end, so you need to reverse complement the reverse primer and trim it from the 3' end of the R1 reads (and likewise trim the reverse complement of the forward primer from the 3' end of the R2 reads). If the amplicon is longer than the read length, it is mostly fine to trim the forward primer from the forward reads and the reverse primer from the reverse reads.
This can also be useful: https://cutadapt.readthedocs.io/en/stable/recipes.html
But there are many ways to do things. I would just play around with a trimming tool and check the log output every time you change a setting.
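To make the two cases concrete, here is a rough Python sketch that reverse complements a primer and assembles a cutadapt command line for either situation. The function names, file names and example primers are made up for illustration; the flags (-g/-G for 5' adapters, -a/-A for 3' or linked adapters, -o/-p for paired output) are cutadapt's own, and the linked form follows the pattern shown on the recipes page. Treat it as a starting point, not a validated pipeline.

```python
# Complement table for plain DNA bases (IUPAC ambiguity codes are not handled).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def cutadapt_args(fwd_primer, rev_primer, read_through,
                  r1="R1.fastq", r2="R2.fastq"):
    """Build a cutadapt argument list for paired-end primer trimming.

    read_through=True  : reads can reach the far end of the amplicon, so the
                         reverse complement of the opposite primer may appear
                         at the 3' end (linked adapters, FWD...RC(REV)).
    read_through=False : only the 5' primer is expected in each read.
    File names here are placeholders.
    """
    if read_through:
        r1_opts = ["-a", f"{fwd_primer}...{reverse_complement(rev_primer)}"]
        r2_opts = ["-A", f"{rev_primer}...{reverse_complement(fwd_primer)}"]
    else:
        r1_opts = ["-g", fwd_primer]
        r2_opts = ["-G", rev_primer]
    return (["cutadapt"] + r1_opts + r2_opts
            + ["-o", "trimmed_" + r1, "-p", "trimmed_" + r2, r1, r2])


# Hypothetical primers, just to show the shape of the command.
print(" ".join(cutadapt_args("AACGGTCA", "GGTACCTT", read_through=True)))
```

The printed command can be pasted into a shell, or the list can be handed to subprocess.run once the primers and file names are replaced with the real ones.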
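One way to "play around" without guessing is to count how often a primer, or its reverse complement, actually shows up in the raw reads before deciding what to trim. The sketch below is only an exact-match check (no mismatches allowed, uncompressed 4-line FASTQ assumed), and the function name and example primer are hypothetical, but it should give a rough idea of whether the reverse complement is worth trimming at all.

```python
from itertools import islice

# Complement table for plain DNA bases (IUPAC ambiguity codes are not handled).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_report(fastq_lines, primer):
    """Count exact primer hits in an iterable of FASTQ lines.

    Reports how many reads start with the primer and how many contain its
    reverse complement anywhere. Assumes strict 4-line FASTQ records.
    """
    primer = primer.upper()
    rc = reverse_complement(primer)
    counts = {"reads": 0, "starts_with_primer": 0, "contains_rc": 0}
    # The sequence is the 2nd line of every 4-line record.
    for line in islice(fastq_lines, 1, None, 4):
        seq = line.strip().upper()
        counts["reads"] += 1
        if seq.startswith(primer):
            counts["starts_with_primer"] += 1
        if rc in seq:
            counts["contains_rc"] += 1
    return counts


# Usage on a real file (file name and primer are placeholders):
# with open("R1.fastq") as handle:
#     print(primer_report(handle, "AAGGTCAG"))
```

Running it on R1 and R2 with both primers, before and after trimming, gives a quick sanity check alongside the tool's own log.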
If you get sick of going through logs to see if the trimming is doing what you think it's doing, I've written a tool to visualise trimming - just feed it the before-trimming and after-trimming .fastq files: https://github.com/MonashBioinformaticsPlatform/trimviz
It's also capable of summarising sequences that are soft-clipped during mapping (from a .bam file) - if you get lots of the same sequence getting clipped, it might be an adapter sequence that would be better to trim prior to mapping.
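For anyone curious what tallying soft-clipped sequences involves, here is a minimal standard-library sketch that reads SAM text (for example the output of samtools view on a .bam file) and counts the clipped ends. This is not how trimviz does it; the function names are invented for this example and it only illustrates the idea.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped(cigar, seq):
    """Return (left_clip, right_clip) sequences for one alignment.

    Hard clips (H) at the ends are skipped because they are absent from SEQ.
    """
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    while ops and ops[0][1] == "H":
        ops = ops[1:]
    while ops and ops[-1][1] == "H":
        ops = ops[:-1]
    left = seq[:ops[0][0]] if ops and ops[0][1] == "S" else ""
    right = seq[-ops[-1][0]:] if len(ops) > 1 and ops[-1][1] == "S" else ""
    return left, right


def count_soft_clips(sam_lines):
    """Tally soft-clipped sequences over an iterable of SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        cigar, seq = fields[5], fields[9]  # CIGAR and SEQ columns
        if seq == "*":
            continue
        # Note: SEQ is stored in reference orientation, so clips from
        # reverse-strand alignments appear reverse complemented.
        for clip in soft_clipped(cigar, seq):
            if clip:
                counts[clip] += 1
    return counts


# Usage (placeholder): samtools view aligned.bam | python this_script.py
# import sys; print(count_soft_clips(sys.stdin).most_common(10))
```

If one sequence dominates the counts, comparing it (and its reverse complement) against the adapter and primer sequences is a reasonable next step.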