I'm using Trimmomatic 0.33 to process my raw Illumina reads. FastQC showed some adapter contamination, so I first ran Trimmomatic as follows:
java -jar trimmomatic-0.33.jar PE R1.fastq.gz R2.fastq.gz PR1.fastq.gz UPR1.fastq.gz PR2.fastq.gz UPR2.fastq.gz ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:1:keepBothReads
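To make clear what I think each colon-separated value in the ILLUMINACLIP step means, here is a small Python sketch that labels them. The field names are my reading of the usage line in the Trimmomatic manual, and `label_illuminaclip` is a hypothetical helper of my own. It only splits the string and says nothing about how Trimmomatic itself interprets the values.

```python
# Field names as I understand them from the Trimmomatic manual's
# ILLUMINACLIP usage line (an assumption, used only as labels here).
FIELDS = [
    "fastaWithAdaptersEtc",
    "seedMismatches",
    "palindromeClipThreshold",
    "simpleClipThreshold",
    "minAdapterLength",
    "keepBothReads",
]


def label_illuminaclip(step: str) -> dict:
    """Pair each colon-separated value of an ILLUMINACLIP step with a field name."""
    name, _, args = step.partition(":")
    if name != "ILLUMINACLIP":
        raise ValueError(f"not an ILLUMINACLIP step: {step!r}")
    # zip() stops at the shorter sequence, so missing optional fields are simply absent.
    return dict(zip(FIELDS, args.split(":")))


# The step exactly as I passed it on the command line.
step = "ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:1:keepBothReads"
labelled = label_illuminaclip(step)

for field, value in labelled.items():
    print(f"{field:>24} = {value}")
```

So, as I read it, I allowed 2 seed mismatches, used a palindrome clip threshold of 30, a simple clip threshold of 10, and a minimum adapter length of 1, with the last field being whatever I typed in the keepBothReads position.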
TruSeq3-PE-2.fa contains:
>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PrefixPE/2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>PE1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PE1_rc
AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
>PE2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>PE2_rc
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
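To rule out a problem with the adapter file itself, I checked that the entries are internally consistent. This is only a sketch in Python with the sequences pasted in from above. `reverse_complement` is my own helper, not part of Trimmomatic.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from my TruSeq3-PE-2.fa as listed above.
adapters = {
    "PrefixPE/1": "TACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "PrefixPE/2": "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT",
    "PE1": "TACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "PE1_rc": "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA",
    "PE2": "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT",
    "PE2_rc": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
}

# Each *_rc entry should be the reverse complement of its partner.
pe1_rc_ok = reverse_complement(adapters["PE1"]) == adapters["PE1_rc"]
pe2_rc_ok = reverse_complement(adapters["PE2"]) == adapters["PE2_rc"]

# The PrefixPE entries carry the same sequences as PE1 and PE2.
prefix1_ok = adapters["PrefixPE/1"] == adapters["PE1"]
prefix2_ok = adapters["PrefixPE/2"] == adapters["PE2"]

print("PE1_rc is revcomp of PE1:", pe1_rc_ok)
print("PE2_rc is revcomp of PE2:", pe2_rc_ok)
print("PrefixPE/1 equals PE1:", prefix1_ok)
print("PrefixPE/2 equals PE2:", prefix2_ok)
```

All four checks pass for the sequences above, so the file looks self-consistent to me.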
Trimmomatic reports back:
Input Read Pairs: 39618168
Both Surviving: 28128581 (71.00%)
Forward Only Surviving: 8927582 (22.53%)
Reverse Only Surviving: 47164 (0.12%)
Dropped: 2514841 (6.35%)
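As a sanity check on these numbers, a short Python sketch (the counts are copied from the report above, and the variable names are mine) confirms that the four categories add up to the input and shows how many pairs lost their reverse read:

```python
# Counts copied from the Trimmomatic summary above.
input_pairs = 39618168
counts = {
    "Both Surviving": 28128581,
    "Forward Only Surviving": 8927582,
    "Reverse Only Surviving": 47164,
    "Dropped": 2514841,
}

# The four categories should account for every input pair.
total = sum(counts.values())
print("categories sum to input:", total == input_pairs)

# Recompute the percentages reported by Trimmomatic.
for label, n in counts.items():
    print(f"{label}: {n} ({100 * n / input_pairs:.2f}%)")

# Pairs in which the reverse read did not survive: either only the
# forward read was kept, or the whole pair was dropped.
reverse_lost = counts["Forward Only Surviving"] + counts["Dropped"]
print(f"reverse read lost: {reverse_lost} ({100 * reverse_lost / input_pairs:.2f}%)")
```

The asymmetry stands out: the reverse read is lost far more often than the forward read, which is what puzzles me.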
I was surprised that, despite passing "keepBothReads" so that both reads are kept even when adapter read-through is detected in palindrome mode, only 71% of pairs are reported as "Both Surviving". Which reads did not survive? My only guess is that they consist entirely of adapter sequence. Is that correct? Thanks for any suggestions!