Hello --
I've begun pre-processing my paired-end RNA-seq data (run on an Illumina HiSeq).
After running FastQC on my samples, I noticed that some have overrepresented sequences corresponding to adaptors.
I've been trying to use Trimmomatic to remove the adaptors. However, after trimming I get MORE overrepresented sequences than I do before trimming! I'm not sure what's going on.
For instance, in my unprocessed reads, I'll have a single overrepresented sequence corresponding to adapter index 1. Once trimmed and processed by Trimmomatic, I'll have 25 overrepresented sequences, all corresponding to different variants of the adapter index 1 sequence.
Here is my command line:
Code:
TrimmomaticPE -phred33 /R1_001.fastq.gz /R2_001.fastq.gz /R1_pairedout /R1_unpairedout /R2_pairedout /R2_unpairedout ILLUMINACLIP:/TruSeq3-PE.fa:2:30:10 LEADING:5 TRAILING:5 AVGQUAL:20
Any idea what I'm doing wrong? The same thing occurs even if I leave out the ILLUMINACLIP step.
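One way to cross-check the FastQC report is to count the most frequent read sequences directly in the raw and trimmed files and compare the two lists. The following is only a minimal sketch: the function name `top_seqs` is hypothetical (it is not part of Trimmomatic or FastQC), and it assumes standard four-line FASTQ records, either plain or gzipped. It counts exact full-length matches, so its numbers will not necessarily agree with FastQC's.

```shell
# Hypothetical helper: print the N most frequent read sequences in a FASTQ file.
# Assumes four lines per record (header, sequence, "+", qualities).
# Usage: top_seqs FILE [N]   (N defaults to 10)
top_seqs() (
    fastq="$1"
    n="${2:-10}"
    # Decompress only if the file name ends in .gz
    case "$fastq" in
        *.gz) gzip -dc "$fastq" ;;
        *)    cat "$fastq" ;;
    esac |
        awk 'NR % 4 == 2' |   # keep the sequence line of each record
        sort |                # group identical sequences together
        uniq -c |             # count each distinct sequence
        sort -k1,1nr |        # most frequent first
        head -n "$n"
)

# Example (paths as in the command above):
#   top_seqs /R1_001.fastq.gz 25
#   top_seqs /R1_pairedout 25
```

Running it on the input and on the paired output should show whether the adapter-derived sequences really became more numerous or were just split into several shorter variants.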
Hello samantha_jeschonek!
It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=44949
This is typically not recommended as it runs the risk of annoying people in both communities.
Whoops, I'm not sure how to delete the post so it isn't in both places!