I am working with bulk RNA-seq data (Illumina, paired-end, 101 bp): 3 samples of condition X and 3 control samples.
I ran FastQC on all of them and in general they look okay (adapter content included). However, one condition X sample, although it passes the Adapter Content module, lists "TruSeq Adapter, Index 1 (100% over 50bp)" under Overrepresented sequences.
None of the other samples (all the controls and the other two condition X samples) contain that overrepresented sequence.
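To put a number on how much adapter a sample contains, something like the following could be used. It is only a rough sketch: it counts reads containing an exact copy of `AGATCGGAAGAGC`, the 13 bases shared by the start of the TruSeq read 1 and read 2 adapters (worth double-checking against Illumina's adapter documentation). The function names and the file path in the usage note are made up for illustration.

```python
import gzip
from typing import Iterable, Iterator

# Assumption: shared start of the TruSeq read 1 / read 2 adapter sequences.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd of every 4 lines) from a FASTQ stream."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def adapter_fraction(lines: Iterable[str], adapter: str = ADAPTER_PREFIX) -> float:
    """Fraction of reads that contain an exact copy of `adapter`."""
    total = 0
    hits = 0
    for seq in fastq_sequences(lines):
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


def adapter_fraction_in_file(path: str, adapter: str = ADAPTER_PREFIX) -> float:
    """Same as adapter_fraction, but reading a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return adapter_fraction(handle, adapter)
```

For example, `adapter_fraction_in_file("X1_R1.fastq.gz")` (hypothetical file name) would give the fraction of read 1 sequences with an exact adapter match. Because it ignores sequencing errors and partial adapters at the read end, it will underestimate the true adapter content.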
My initial thought is to remove the adapters (using the sequences that Illumina provides) from all the samples, so that every sample goes through the same pipeline and they remain comparable (DE analysis, enrichment...).
However:
I do not know if it makes sense for the samples that do not have adapters, because 1) I would be using extra disk space for "nothing" and 2) I might modify them in an unnecessary step (?)
I also think that if a sample does not contain that sequence, Cutadapt (the tool I will use) will not change it and the output will be exactly the same (?)
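The second point can be illustrated with a toy trimmer. This is not Cutadapt's actual algorithm (Cutadapt uses error-tolerant alignment), just a simplified exact-match sketch of 3' adapter removal. The function name and the example reads are assumptions for illustration, and the adapter is the TruSeq read 1 sequence as I understand it from Illumina's documentation.

```python
# Assumption: TruSeq read 1 adapter sequence; verify against Illumina's documentation.
ADAPTER_R1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"


def trim_3prime_adapter(read: str, adapter: str = ADAPTER_R1, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from `read` using exact matching only.

    The read is cut at a full adapter occurrence, or at a suffix of the read
    that equals a prefix of the adapter at least `min_overlap` bases long.
    Reads with no such match are returned unchanged.
    """
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends part-way through the adapter (longest overlap first).
    longest = min(len(adapter), len(read))
    for overlap in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[: len(read) - overlap]

    # No adapter found: nothing to do.
    return read


if __name__ == "__main__":
    insert = "ACGTTGCAAGGCTTAACC"
    clean_read = "ACGTTGCAAGGCTTAACCGGTTAACGT"
    print(trim_3prime_adapter(clean_read) == clean_read)                # read without adapter
    print(trim_3prime_adapter(insert + ADAPTER_R1[:12]) == insert)      # partial adapter at the end
```

One caveat the sketch makes visible: with a short minimum overlap, a read that happens to end in the first few adapter bases (here `AGA`) loses those bases even in an adapter-free sample, so the output is usually almost, but not necessarily exactly, identical to the input.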
So my question would be:
When you are trimming (adapters only, not by Phred score), what do you usually do? Do you run the same pipeline on all of your samples, or do you only run the tool on the sample that "should be trimmed"?
Needless to say, I have read all the posts/papers I could find on whether to trim RNA-seq data, and there is no clear consensus. Some people recommend it (better safe than sorry), others do not (they trust the soft-clipping done by newer aligners), and others say it is only recommended for small RNA-seq, or when the data is used for variant analysis, genome annotation, or genome or transcriptome assembly.
Any feedback will be really appreciated.
Thanks in advance
Just do it. There is no harmful effect. If there are no adapters present nothing happens, except you spend a bit of time on the scan/trim run.
Yes, run it for all samples. I generally delete the trimmed files immediately afterwards (or, if possible, I pipe them directly from cutadapt into the next step of my pipeline, so they use no disk space).
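Running the same command on every sample could look roughly like this. It is a sketch under assumptions: the sample names, directory layout and file naming pattern are hypothetical, the adapter sequences are the TruSeq read 1 and read 2 adapters and should be checked against Illumina's documentation, and only cutadapt's `-a`, `-A`, `-o` and `-p` options are used. By default it only prints the commands (dry run).

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: TruSeq adapters for read 1 and read 2.
ADAPTER_R1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
ADAPTER_R2 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"

# Hypothetical sample names: 3 condition X samples and 3 controls.
SAMPLES = ["X1", "X2", "X3", "C1", "C2", "C3"]


def cutadapt_command(sample, in_dir="raw", out_dir="trimmed"):
    """Build the paired-end cutadapt command for one sample."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    return [
        "cutadapt",
        "-a", ADAPTER_R1,   # 3' adapter on read 1
        "-A", ADAPTER_R2,   # 3' adapter on read 2
        "-o", str(out_dir / f"{sample}_R1.trimmed.fastq.gz"),
        "-p", str(out_dir / f"{sample}_R2.trimmed.fastq.gz"),
        str(in_dir / f"{sample}_R1.fastq.gz"),
        str(in_dir / f"{sample}_R2.fastq.gz"),
    ]


def trim_all(samples=SAMPLES, dry_run=True):
    """Apply identical trimming settings to every sample."""
    commands = [cutadapt_command(s) for s in samples]
    for cmd in commands:
        if dry_run or shutil.which("cutadapt") is None:
            # Only show what would be run.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    trim_all(dry_run=True)
```

Treating all six samples identically means the one sample with adapter contamination is not processed differently from the rest, and the trimmed files can be deleted once the next step has consumed them.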
6 samples is nothing. If you were analyzing three thousand samples, I might recommend skipping adapter trimming because of the computational cost, but I don't see any real cost in trimming adapters from 6 samples to improve the quality of your alignments (even if only very slightly).