Is It Necessary To Remove Adapters In All Orientations When Preprocessing NGS Data?
11.1 years ago
JacobS ▴ 990

I'm prepping some Illumina NGS data for downstream analysis. To begin, I want to remove any sequencing/ligation adapters and multiplexing (barcoding) tags. To do this, I am using fastx_clipper, which is part of the FASTX-Toolkit. I've also used Trimmomatic for this in the past.

Example command usage: fastx_clipper -Q33 -a TGGAATTCTCGGGTGCCAAGGAACTCCA-mid_tag_insert-AATCTCGTATGCCGTCTTCTGCTTG -l 14 -M 7 -i Input.fastq -o Output.fastq -v -c

Here is my question: both of these software packages only scan for a single orientation of the adapter you provide within the Illumina reads. However, I find many sequences in all orientations of the adapter, namely: forward, reverse, forward complement, and reverse complement. In the forward orientation, the software detects and trims the adapter in >90% of the reads, but in the other 3 orientations the software only detects and trims adapters in ~5% of the reads.
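To make the four orientations concrete, here is a minimal Python sketch that generates them for a short adapter seed and counts how many reads contain each one. The function names (`orientations`, `count_orientations`) and the `seed_len` parameter are made up for illustration. The sketch uses exact substring matching only, so it is a rough sanity check and not a reproduction of how fastx_clipper or Trimmomatic align adapters.

```python
# Translation table for the DNA complement (uppercase bases plus N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def orientations(seq):
    """Return the four orientations of a DNA sequence as a dict."""
    seq = seq.upper()
    comp = seq.translate(_COMPLEMENT)
    return {
        "forward": seq,
        "reverse": seq[::-1],
        "complement": comp,  # called "forward complement" in the question
        "reverse_complement": comp[::-1],
    }


def count_orientations(reads, adapter, seed_len=7):
    """Count reads that contain an exact copy of the adapter seed.

    The seed is the first `seed_len` bases of the adapter (an assumption
    made for this sketch); each orientation of that seed is searched for
    as a plain substring, with no mismatches allowed.
    """
    seeds = orientations(adapter[:seed_len])
    counts = {name: 0 for name in seeds}
    for read in reads:
        read = read.upper()
        for name, seed in seeds.items():
            if seed in read:
                counts[name] += 1
    return counts
```

Calling `count_orientations(reads, "TGGAATTCTCGGGTGCCAAGGAACTCCA")` on a list of read strings would give a per-orientation tally that can be compared against the clipper's verbose output.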

So, is it possible for the adapters to be found in different orientations than the forward sense, or am I seeing artifacts of non-strict adapter matching? Do people usually trim adapters in every possible orientation? Any other suggestions for successfully handling adapters?

Thanks!

trimming filtering ngs • 8.3k views
11.1 years ago

Seeing an adapter in the forward orientation is the result of a DNA fragment being shorter than the read length; it is a "normal" occurrence in these cases. The Illumina TruSeq indexed sequencing adapters were designed in such a way that the same adapter sequence will be found on reads coming from both strands.

In that case, adapters present in any other orientation most likely indicate a protocol failure, and the entire read should probably be removed.


@Istvan Thanks for the answer. This is what I expected, which makes me unsure of the FASTX results. Perhaps my stringency is simply too loose, only requiring 7 sequential adapter bases to be matched. I'll try a few variations and report back later.
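As a rough way to judge whether a 7-base requirement is loose, the sketch below estimates how often one specific 7-mer would appear in a read purely by chance. It assumes uniformly random, independent bases and exact matching at any position, which is a simplification and not a description of fastx_clipper's actual alignment; the function name `random_match_probability` is hypothetical.

```python
def random_match_probability(read_len, k):
    """Approximate chance that one specific k-mer occurs in a random read.

    Assumes each base is A, C, G or T with probability 1/4 and treats the
    (read_len - k + 1) start positions as independent, which is only an
    approximation because overlapping windows are correlated.
    """
    positions = read_len - k + 1
    if positions <= 0:
        return 0.0  # the read is too short to contain the k-mer
    p_single = 0.25 ** k  # chance of an exact match at one position
    return 1.0 - (1.0 - p_single) ** positions


if __name__ == "__main__":
    # Compare a few match lengths for a hypothetical 100 bp read.
    for k in (7, 10, 14):
        print(k, random_match_probability(100, k))
```

Multiplying the result by the number of reads gives an expected count of chance hits per orientation under these assumptions, which can be set against the observed hit rate.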


Also, to correct myself: some (not all) Illumina adapters are designed to produce the same sequence on both strands.


Can you please specify which ones do and which ones don't? Thanks!
