Question

Possible tiny amount of adaptor sequence left

7.7 years ago

GLFR ▴ 10

Hello,

I used trimmomatic to remove adaptor sequences and reads shorter than 36bp from my RNA-seq data. At first I used the TruSeq 3 adaptors that came with the trimmomatic binary, but this didn't remove much at all. I then used the TruSeq3-2 adaptors, which removed nearly everything. However, unless my eyes are playing tricks on me, on the FASTQC "adaptor sequences" graph I can still see a tiny red line indicating "Illumina universal primer" at position 85 on the read. As the reads are 85bp long, does this indicate that I have just 1bp of adaptor sequence left in my reads? If so, is this anything to worry about? FASTQC didn't highlight any specific primer sequence; I only spotted the adaptors from the graph and made an educated guess that they would be TruSeq3, so I'm not sure how I'd go about identifying which specific primer I have in my sample.

Many thanks for your help as always

EDIT: I should probably add my sequences are Illumina HiSeq 2X100bp PE reads

RNA-Seq

I suggest that you use bbduk.sh from BBMap with the tbo and tpe flags to take care of these types of issues (see: A: BBDuk and qtrim parameter).

7.7 years ago by GenoMax 146k

"I have just 1bp of adaptor sequence left in my reads"

It may be that some reads extend into the adapter by just 1bp and the trimmer doesn't "see" that last base as adapter. That is probably the right behaviour, unless you want to be very stringent and systematically remove that last base (which in turn could introduce some biases). I don't know about trimmomatic, but cutadapt by default trims only if the overlap with the adapter is 3 bases or more, so reads ending with 1 or 2 adapter bases are left untouched. Consider also that these trimmers, as far as I know, process each read independently of the others, so the trimmer "doesn't know" whether a base at a given position is overrepresented across reads.

My guess, anyway, is that the minimal contamination left is negligible, as the aligner will align these reads fine.
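
To make the minimum-overlap idea above concrete, here is a minimal Python sketch of 3' adapter trimming with an overlap threshold. It is not how cutadapt or trimmomatic are implemented: it only does exact matching (no mismatches or quality scores), the function name trim_3prime_adapter is made up for illustration, and the adapter string is an assumption (the short sequence I believe FastQC uses for its "Illumina Universal Adapter" track).

```python
# Assumed adapter prefix; swap in whatever sequence applies to your library.
ADAPTER = "AGATCGGAAGAG"


def trim_3prime_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 3) -> str:
    """Remove an exact-match 3' adapter from a read.

    A partial adapter at the very end of the read is removed only if at
    least `min_overlap` adapter bases are present. Hypothetical helper for
    illustration; real trimmers also tolerate mismatches.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")

    # Case 1: the complete adapter occurs somewhere inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: the read ends with the first k bases of the adapter.
    # Try the longest possible overlap first and stop at min_overlap.
    max_k = min(len(adapter) - 1, len(read))
    for k in range(max_k, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[: len(read) - k]

    # Overlap shorter than min_overlap (or none at all): leave the read alone.
    return read


insert = "TTGACCGTAC"

# 5 adapter bases at the end: trimmed with the default threshold of 3.
print(trim_3prime_adapter(insert + "AGATC"))

# 1 adapter base at the end: left untouched with the default threshold.
print(trim_3prime_adapter(insert + "A"))

# The same read with min_overlap=1 loses its last base.
print(trim_3prime_adapter(insert + "A", min_overlap=1))
```

With min_overlap=1, any read that happens to end in "A" is shortened, whether or not that base really came from the adapter. Assuming roughly uniform base composition, that would be about a quarter of all reads, which is the kind of bias mentioned above and the reason a threshold of a few bases is a sensible default.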

7.7 years ago by dariober 15k