Newbie question: why is the length of the adapter not fixed?
5 months ago
Dora ▴ 10

To remove the adapter from my raw reads I used fastp. The problem is that I still need to trim 2 N bases from the 5' end.

My classmate told me to use fastp to trim the 5' adapter first, and then run fastp again to trim the 2 N bases from the 5' end.

But why can't I just run it once and remove a fixed number of bases? For example, trim length = length of adapter + 2 (N).

My classmate said that the length of the adapter is not the same in every read, so you don't know the exact length to trim.

I am a little confused, because the adapter is shown like this:

5’   ABCDEFGHIJK NNXXXXXX ABCDEFGHIJKHL    3'

3’   ABCDEFGHIJKHL NNXXXXXX ABCDEFGHIJKH 5’

I thought the length of the adapter would be the length of "ABCDEFGHIJK". I want to confirm whether my classmate is right, and if so, why.

Also, can someone recommend some reading material to better understand library construction?

Library construction Sequencing • 382 views
5 months ago
GenoMax 147k

Imagine trying to make libraries from a set of fragments with identical (or very similar) sequences, e.g. amplicons. Such a library has low nucleotide diversity. Illumina sequencing assumes a roughly even (~25% each) distribution of A/C/G/T at each sequencing cycle. With amplicons this assumption can be violated, because every fragment in the library has the same base at a given position. This affects imaging/spot identification. To avoid this low nucleotide diversity, variable-length PCR primers (as in your case) shift the sequence frame. Such phased primers equalize nucleotide diversity across cycles. Adding a neutral DNA such as a phiX spike-in at the time of the run is also used. See this paper for a way to do this: https://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-015-0450-4
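To make the frame-shift idea concrete, here is a minimal Python sketch. The amplicon sequence and the spacer bases are made up for illustration; they are not taken from any real primer set. It counts how many distinct bases appear at each cycle for an unphased library versus a phased one.

```python
# Hypothetical amplicon: every fragment in the library starts with this sequence.
AMPLICON = "ACGTTGCAAGCT"

def per_cycle_diversity(reads, cycles):
    """Return the number of distinct bases observed at each cycle (read position)."""
    return [len({read[i] for read in reads}) for i in range(cycles)]

# Unphased library: all reads are identical, so each cycle sees a single base.
unphased = [AMPLICON] * 4

# Phased library: primers carry 0 to 3 extra spacer bases (made-up values),
# so the same amplicon base lands on a different cycle in different reads.
spacers = ["", "T", "GA", "CAT"]
phased = [spacer + AMPLICON for spacer in spacers]

cycles = 8
unphased_div = per_cycle_diversity(unphased, cycles)
phased_div = per_cycle_diversity(phased, cycles)

print("unphased:", unphased_div)
print("phased:  ", phased_div)
```

In this toy case the unphased library shows one base per cycle, while the phased library shows at least two at every cycle. This is also why the leading part of the read does not have the same length in every read.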

While trimming, you can use ABCDEFGHIJK to remove the fixed part of the adapter, but then the 2 additional bases will need to be addressed. Depending on the type of data, these bases may be soft-clipped by the aligner even if they are not explicitly removed.
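As a rough illustration of that two-part removal (matching the fixed adapter by sequence, then dropping the 2 extra bases), here is a small Python sketch. `trim_five_prime` is a hypothetical helper, not part of fastp, and `ABCDEFGHIJK` is just the placeholder from the question.

```python
ADAPTER = "ABCDEFGHIJK"  # placeholder adapter from the question, not a real sequence
N_EXTRA = 2              # number of random (N) bases that follow the adapter

def trim_five_prime(read, adapter=ADAPTER, n_extra=N_EXTRA):
    """Remove a leading adapter (exact match only) plus n_extra following bases."""
    if read.startswith(adapter):
        return read[len(adapter) + n_extra:]
    return read  # adapter not found at the 5' end: leave the read untouched

read = "ABCDEFGHIJK" + "NN" + "XXXXXX"
trimmed = trim_five_prime(read)
print(trimmed)
```

A real trimmer would also tolerate mismatches and trim the quality string by the same amount; this sketch only shows the order of operations.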

But why can't I just run it once and remove a fixed number of bases? For example, trim length = length of adapter + 2 (N).

If fastp allows you to do that then you can.

Thank you very much! It is much clearer to me now. fastp does allow me to trim a fixed number of bases, so I am thinking of counting the extra 2 N bases as part of the adapter. Another reason I want to trim by length is that N bases are not accepted when specifying the adapter sequence to trim.

So the length of the adapter is indeed fixed.

Thank you very much again!
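For reference, the fixed-length approach described above can be sketched in a few lines of Python. `trim_front` is a hypothetical helper written for illustration; in fastp itself this kind of trimming is done with the `--trim_front1` option (check the documentation for your version). The sketch assumes every read starts with the full 11-base adapter followed by exactly 2 N bases.

```python
ADAPTER_LEN = len("ABCDEFGHIJK")  # 11 bases, using the placeholder from the question
N_EXTRA = 2                       # random bases that follow the adapter

def trim_front(seq, qual, n):
    """Drop the first n bases of a read and the matching quality scores."""
    return seq[n:], qual[n:]

# Toy FASTQ record: adapter + NN + insert, with a dummy quality string.
seq = "ABCDEFGHIJK" + "NN" + "XXXXXX"
qual = "I" * len(seq)

new_seq, new_qual = trim_front(seq, qual, ADAPTER_LEN + N_EXTRA)
print(new_seq, new_qual)
```

This only works if the leading sequence really has the same length in every read; with phased (variable-length) primers a fixed cut would leave or remove a few bases too many in some reads.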
