Question

How to remove Illumina_small_rna_3'_adapter adapters without knowing their sequence?

2.9 years ago

Wendy Lorena • 0

I ran Trimmomatic with an adapter file containing all known Illumina adapter sequences, including these small RNA ones:

>Illumina Small RNA v1.5 3p Adapter
ATCTCGTATGCCGTCTTCTGCTTG
>Illumina RNA 3p Adapter (RA3)
TGGAATTCTCGGGTGCCAAGG
>Illumina RNA 5p Adapter (RA5)
GTTCAGAGTTCTACAGTCCGACGATC
>Illumina 3p RNA Adapter
TCGTATGCCGTCTTCTGCTTGT

The command removed every adapter except the Illumina_small_rna_3'_adapter. Could someone please help me fix this?

rna small • 2.9k views

ADD COMMENT • link updated 2.9 years ago by GenoMax 147k • written 2.9 years ago by Wendy Lorena • 0

Please provide some additional information. For instance, why do you say that the Illumina_small_rna_3'_adapters were not trimmed? Was this specific adapter used during library preparation? Is it contaminating your samples? Have you used FastQC or a similar tool to assess that?

ADD REPLY • link 2.9 years ago by Carlo Yague 8.9k

Hello Carlo, after trimming I ran a quality check with FastQC, and its Adapter Content module reports the presence of the Illumina Small RNA 3' adapter. For this reason I assume the adapters were not trimmed, even though I gave Trimmomatic the sequence that the Illumina manual lists for the small RNA 3' adapter. [FastQC report image]
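
One way to cross-check the FastQC result is to count the reads that still contain the adapter sequence after trimming. This is only a rough sketch: the function name `count_adapter_reads` is made up for illustration, it assumes standard 4-line FASTQ records, it only finds exact matches, and it relies on `gzip -cdf` passing uncompressed input through unchanged.

```bash
# Count reads whose sequence line contains a given adapter sequence.
# Usage: count_adapter_reads <fastq or fastq.gz> <adapter sequence>
count_adapter_reads() {
    local fastq="$1" adapter="$2"
    # gzip -cdf decompresses .gz input and copies plain text through as-is
    gzip -cdf "$fastq" |
        awk -v a="$adapter" 'NR % 4 == 2 { total++; if (index($0, a)) hits++ }
             END { printf "%d of %d reads contain %s\n", hits, total, a }'
}

# Example call (file name is a placeholder):
# count_adapter_reads sample_R1_001.clean.fastq.gz TGGAATTCTCGGGTGCCAAGG
```

If the count is high for the RA3 sequence from adapters.fa, the adapter really is still in the reads; if it is near zero, FastQC may be matching a shorter or different sequence.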

ADD REPLY • link 2.9 years ago by Wendy Lorena • 0

If this is small RNA data then having ~275 bp reads is very odd. Generally one would need only 50 bp reads.

ADD REPLY • link 2.9 years ago by GenoMax 147k

They are amplicons of the V4 region of the 16S rRNA gene.

ADD REPLY • link 2.9 years ago by Wendy Lorena • 0

Are you following a standard protocol for 16S or trying to roll something of your own? I am not sure why small RNA adapters are involved in 16S amplicons.

ADD REPLY • link 2.9 years ago by GenoMax 147k

I'm a bit confused too, because I have never seen such an adapter contamination pattern: the small RNA 3' adapter is 25 nt long, yet it covers about 125 nt of the read... Just thinking out loud here, but could serial ligation have happened during adapter ligation because the adapter lacked a ddC-3' modification (which blocks self-ligation)?

But perhaps the explanation is more trivial. Can you show the code you used for trimming the reads?
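
The serial-ligation idea could be checked by tabulating how many copies of the adapter each read contains; several tandem copies per read would be consistent with it. A minimal sketch, assuming 4-line FASTQ records and exact, non-overlapping matches (the function name `adapter_copies_per_read` is hypothetical):

```bash
# Print a table "copies reads": how many reads carry 0, 1, 2, ... adapter copies.
# Usage: adapter_copies_per_read <fastq or fastq.gz> <adapter sequence>
adapter_copies_per_read() {
    local fastq="$1" adapter="$2"
    gzip -cdf "$fastq" |
        awk -v a="$adapter" 'NR % 4 == 2 {
                 n = gsub(a, a)   # gsub returns the number of matches; the read is unchanged
                 count[n]++
             }
             END { for (n in count) print n, count[n] }' |
        sort -n
}

# Example call (file name is a placeholder):
# adapter_copies_per_read sample_R1_001.fastq.gz TGGAATTCTCGGGTGCCAAGG
```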

ADD REPLY • link 2.9 years ago by Carlo Yague 8.9k

Sure. The command I used was this:

#!/bin/bash
# Trim adapters from every R1/R2 pair in $datadir with Trimmomatic.
adapter="adapters.fa"
trimmer="ILLUMINACLIP:$adapter:2:12:6"
datadir="."
for r1 in "$datadir"/*_R1_001.fastq.gz; do
    r1="${r1%.fastq.gz}"            # strip the extension
    r2="${r1//_R1_001/_R2_001}"     # derive the matching R2 prefix
    command="trimmomatic PE -threads 20 \
    $r1.fastq.gz $r2.fastq.gz \
    $r1.clean.fastq.gz $r1.unpaired.fastq.gz \
    $r2.clean.fastq.gz $r2.unpaired.fastq.gz \
    $trimmer"
    echo $command
    $command
done

The adapters.fa file contained all known Illumina adapters (including the small_rna_3' one).
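
For readers unfamiliar with the `ILLUMINACLIP` step: as far as I understand the Trimmomatic manual, the three numbers after the adapter file are the seed mismatches, the palindrome clip threshold, and the simple clip threshold, in that order. The sketch below only makes that labelling explicit; the helper name `build_illuminaclip` is made up for illustration and is not part of Trimmomatic.

```bash
# Build an ILLUMINACLIP step from named values, so each number is labelled.
build_illuminaclip() {
    local adapter_fasta="$1"
    local seed_mismatches="$2"        # mismatches allowed in the initial seed match
    local palindrome_threshold="$3"   # score needed for a paired-end "palindrome" match
    local simple_threshold="$4"       # score needed for a plain adapter-vs-read match
    printf 'ILLUMINACLIP:%s:%s:%s:%s\n' \
        "$adapter_fasta" "$seed_mismatches" "$palindrome_threshold" "$simple_threshold"
}

# Values from the script above:
trimmer="$(build_illuminaclip adapters.fa 2 12 6)"
echo "$trimmer"
```

The manual's own example uses `2:30:10`, so trying different thresholds on a single sample and re-running FastQC would show whether these settings make any difference here.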

ADD REPLY • link updated 2.9 years ago by GenoMax 147k • written 2.9 years ago by Wendy Lorena • 0

In other words, it could have been an error in the sequencing process?

ADD REPLY • link 2.9 years ago by Wendy Lorena • 0

No, I don't think the sequencing itself is a problem, but there is definitely something weird in those results... Perhaps you could check with whoever provided the data or made the sequencing libraries whether the adapter contamination makes sense?

Keeping this issue in mind, it may be possible to find a workaround. What is the next step in your analysis: assembly or mapping? It is probably safe to skip adapter trimming before mapping (but not before assembly!), since most modern aligners use soft-clipping and can ignore parts of a read (i.e., adapters) that do not map to the reference.
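
If you do map without trimming, you can see how much the aligner soft-clipped by looking for `S` operations in the CIGAR column of the SAM output. A rough sketch, assuming an uncompressed SAM file (for a BAM you would first convert it, e.g. with `samtools view -h`); the function name `count_soft_clipped` is hypothetical:

```bash
# Count aligned reads whose CIGAR string (SAM column 6) contains a soft clip.
# Usage: count_soft_clipped <file.sam>
count_soft_clipped() {
    local sam="$1"
    # Skip header lines (starting with @) and unmapped reads (CIGAR is "*")
    awk '!/^@/ && $6 != "*" { total++; if ($6 ~ /S/) clipped++ }
         END { printf "%d of %d aligned reads are soft-clipped\n", clipped, total }' "$sam"
}

# Example call (file name is a placeholder):
# count_soft_clipped sample.sam
```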

ADD REPLY • link 2.9 years ago by Carlo Yague 8.9k

The next step is to remove the miscalled bases, specifically the "N" ones. Could you suggest a command or program to do this?

Thank you!

ADD REPLY • link 2.9 years ago by Wendy Lorena • 0

It would not be advisable to remove N calls from the middle of reads. If you have too many of them, there may be an issue with the sequencing. Otherwise, aligners may be able to handle them during the normal course of alignment by treating them as mismatches.
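
To judge whether there are "too many", you could first count the N calls rather than remove them. A minimal sketch, assuming 4-line FASTQ records (the function name `n_content` is made up for illustration):

```bash
# Summarise N calls in a FASTQ file without modifying any read.
# Usage: n_content <fastq or fastq.gz>
n_content() {
    local fastq="$1"
    gzip -cdf "$fastq" |
        awk 'NR % 4 == 2 {
                 reads++
                 bases += length($0)
                 n = gsub(/N/, "N")    # number of N calls in this read
                 n_bases += n
                 if (n > 0) reads_with_n++
             }
             END { printf "reads=%d reads_with_N=%d N_bases=%d total_bases=%d\n",
                          reads, reads_with_n, n_bases, bases }'
}

# Example call (file name is a placeholder):
# n_content sample_R1_001.clean.fastq.gz
```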

ADD REPLY • link 2.9 years ago by GenoMax 147k