Hi, guys.
I'm having trouble using cutadapt to trim raw single-end reads from small RNA-seq (Illumina TruSeq). I got the adapter index information from the sequencing facility, so I tried to trim the adapters with cutadapt as shown below:
cutadapt -b TGGAATTCTCGGGTGCCAAGG -b AACTCCAGTCAC"$INDEX_SEQ"ATCTCGTATGCC -O 3 -m 17 -o $trimmed_fastq.gz $fastq.gz
The first adapter is the Illumina Small RNA Adapter, and the second is the adapter that includes the index (the 6 index bases, given as "$INDEX_SEQ" in the command). However, the FastQC report for the trimmed reads shows that the TGGAATTCTCGGGTGCCAAGG sequence is still overrepresented.
If I run cutadapt again with "-b TGGAATTCTCGGGTGCCAAGG" on $trimmed_fastq.gz, the sequence is gone. Why doesn't cutadapt remove TGGAATTCTCGGGTGCCAAGG the first time?
I'm asking because I've been told to use cutadapt. If anybody knows what's going on or how to change the cutadapt options, please help me.
Thank you
You should be able to see in the log file how many times it found each adapter in each position. Try to compare the two runs.
I compared the two runs. The first run reports that it trimmed the first adapter 13298819 times, and the second run reports that it trimmed it 432477 times. Doesn't this mean cutadapt failed to trim all of the "TGGAATTCTCGGGTGCCAAGG" adapter sequences in the first run?
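As a rough cross-check of those log counts, the reads that still contain the adapter can be counted directly in the FASTQ files. This is only a sketch: the function name count_adapter_reads is made up for illustration, it assumes gzipped four-line FASTQ records, and it counts exact matches only, so the numbers will not line up exactly with cutadapt's error-tolerant matching.

```shell
# Hypothetical helper: count reads whose sequence line contains an exact
# copy of the adapter. Assumes a gzipped FASTQ with 4 lines per record.
count_adapter_reads() {
  # $1 = gzipped FASTQ file, $2 = adapter sequence
  gzip -dc "$1" |
    awk -v a="$2" 'NR % 4 == 2 && index($0, a) { n++ } END { print n + 0 }'
}

# Example call (file name is a placeholder):
# count_adapter_reads sample.trimmed.fastq.gz TGGAATTCTCGGGTGCCAAGG
```

Running it on the file before and after the first trimming should show roughly how many reads kept the adapter.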
Looks like it. That's weird.
Nothing to say in that case.
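One possibility that may be worth checking (an assumption, not something confirmed in this thread): by default cutadapt removes at most one adapter per read, namely the best match among the adapters given. If a read contains both the small RNA adapter and the longer index adapter after it, the index adapter may win and the first one would stay in the read. The -n (--times) option raises that limit. A minimal sketch, where the function name trim_small_rna and its argument order are made up for illustration:

```shell
# Hypothetical wrapper around the command from the original post, with
# -n 2 added so cutadapt may remove up to two adapters per read.
trim_small_rna() {
  # $1 = input FASTQ, $2 = output FASTQ, $3 = 6-base index sequence
  cutadapt -n 2 \
    -b TGGAATTCTCGGGTGCCAAGG \
    -b "AACTCCAGTCAC${3}ATCTCGTATGCC" \
    -O 3 -m 17 \
    -o "$2" "$1"
}

# Example call (file names and index are placeholders):
# trim_small_rna sample.fastq.gz sample.trimmed.fastq.gz ATCACG
```

If this is the cause, the log of a run with -n 2 should report a count for the first adapter close to the sum of the two separate runs.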
Putting bbduk.sh from BBMap out there in case you are able to use a different program. The option to use would be literal=TGGAATTCTCGGGTGCCAAGG,AACTCCAGTCAC
Thank you genomax!
I'll try comparing the results from both. Thank you for your advice.