Hello,
I'm new to bioinformatics and am at the read-trimming stage of my pipeline. I'm confused about how many trimming operations I am supposed to run on my samples.
I have whole genome sequenced samples from Illumina sequencing.
From reading around, my understanding is that I should at least use ILLUMINACLIP to get rid of the adapter sequences, followed by SLIDINGWINDOW, but there are lots of other options available, like MINLEN, LEADING, TRAILING, etc.
Are ILLUMINACLIP and SLIDINGWINDOW enough? Should I be doing additional trimming operations? Should I be using FastQC/MultiQC to inform which operations I run? If so, what should I be looking at in the FastQC/MultiQC reports to make those decisions?
Thanks for your time.
It would definitely be useful to first run FastQC/MultiQC on your data to understand its overall characteristics. Make a note of any adapters FastQC detects (note: your data need not contain adapters; in fact, with good-quality libraries there should be no adapter present in your data). Post the MultiQC adapter and quality plots if you want a second opinion from Biostars.
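If you have many samples, a quick way to see which ones deserve a closer look is to read the summary.txt file that FastQC writes into each output folder/zip. It is a tab-separated list of status (PASS/WARN/FAIL), module name, and file name. Below is a minimal Python sketch, not a definitive recipe: the function name, the choice of the two modules to watch, and the sample file name are my own assumptions for illustration.

```python
# Modules most relevant to trimming decisions, named as FastQC prints them.
# Restricting the check to these two is an assumption made for this sketch.
TRIM_RELEVANT = ("Per base sequence quality", "Adapter Content")


def flagged_modules(summary_text):
    """Return {module: status} for trimming-relevant modules that did not PASS."""
    flagged = {}
    for line in summary_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        status, module = fields[0].strip(), fields[1].strip()
        if module in TRIM_RELEVANT and status != "PASS":
            flagged[module] = status
    return flagged


# Hypothetical summary.txt content for one sample.
example = (
    "PASS\tBasic Statistics\tsample_R1.fastq.gz\n"
    "WARN\tPer base sequence quality\tsample_R1.fastq.gz\n"
    "FAIL\tAdapter Content\tsample_R1.fastq.gz\n"
)
print(flagged_modules(example))
```

A WARN or FAIL on "Adapter Content" suggests adapter trimming is worth doing, and one on "Per base sequence quality" suggests looking at the quality plot before deciding on quality trimming. The status only tells you where to look; the plots themselves are still what you should judge from.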
Please consider using fastp or bbduk.sh from the BBMap suite for scanning/trimming. Both programs have easy-to-understand options and are easy to use compared to trimmomatic.
Follow-up question: how do I share the .html file here to show you the MultiQC report?
You can post screenshots of plots you have questions about.