To remove the adapters from my raw reads, I used fastp. The problem is that I also need to trim 2 N bases from the 5' end.
My classmate told me to run fastp once to trim the 5' adapter, and then run fastp again to trim the 2 N bases from the 5' end.
But why can't I just run it once and remove a fixed number of bases? For example, remove length = length of adapter + 2 (N).
My classmate said that the length of the adapter is not the same in every read, so I cannot know the exact length to remove.
I am a little confused, because the adapter is drawn like this:
5' ABCDEFGHIJK NNXXXXXX ABCDEFGHIJKHL 3'
3' ABCDEFGHIJKHL NNXXXXXX ABCDEFGHIJKH 5'
I thought the length of the adapter would be the length of "ABCDEFGHIJK". I would like to confirm whether my classmate is right, and why.
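
To make the two options concrete, here is a minimal Python sketch (not fastp itself) comparing a one-pass fixed-length trim with the two-step approach my classmate suggested. The adapter sequence, the insert, and the function names are all made up for illustration, and the "partial" read only shows what would happen if my classmate's premise were true, that is, if some reads carried only part of the 5' adapter.

```python
ADAPTER = "AGATCGGAAG"  # hypothetical 5' adapter (10 bases), for illustration only
N_RANDOM = 2            # the two N bases that follow the adapter


def trim_fixed(read, adapter_len=len(ADAPTER), n_random=N_RANDOM):
    """One pass: drop a fixed number of bases from the 5' end."""
    return read[adapter_len + n_random:]


def trim_two_step(read, adapter=ADAPTER, n_random=N_RANDOM):
    """Two passes: remove whatever part of the adapter is found at the
    5' end (longest match first), then drop the N bases."""
    for k in range(len(adapter), 0, -1):
        if read.startswith(adapter[-k:]):
            read = read[k:]
            break
    return read[n_random:]


insert = "TTGGCCAATT"                   # hypothetical insert sequence
full = ADAPTER + "GA" + insert          # whole adapter present
partial = ADAPTER[4:] + "GA" + insert   # first 4 adapter bases missing (assumed case)

print(trim_fixed(full), trim_two_step(full))        # both recover the insert
print(trim_fixed(partial), trim_two_step(partial))  # fixed trim removes too much
```

If every read starts with the complete adapter, both approaches give the same result, so a single fixed-length trim is enough. They only differ when the adapter length varies between reads.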
Also, can someone recommend some reading material to help me better understand library construction?
Thank you very much! I have a much clearer picture now.
fastp does allow me to trim a fixed length for the adapter, so I am thinking of counting the extra 2 N bases as part of the adapter. Another reason I want to trim by length is that N bases are not accepted when specifying the adapter sequence for trimming. So the length of the adapter is indeed fixed.
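
For reference, this is roughly what a fixed-length 5' trim does to a single FASTQ record, written as a small Python sketch rather than as fastp's actual implementation. As far as I know, fastp exposes this kind of trim through its `--trim_front1` (`-f`) option. The record below is invented, and the adapter length of 11 is just the length of the placeholder "ABCDEFGHIJK" from my question.

```python
ADAPTER_LEN = 11  # assumed: length of the placeholder "ABCDEFGHIJK"
N_RANDOM = 2      # the two N bases counted as part of the adapter


def trim_front(record, n):
    """Remove the first n bases, and their quality scores, from a FASTQ record.

    record is a (header, sequence, separator, quality) tuple.
    """
    header, seq, plus, qual = record
    return header, seq[n:], plus, qual[n:]


# Hypothetical read: 11 adapter bases + 2 N bases + 6 insert bases
record = ("@read1", "AGATCGGAAGA" + "NN" + "ACGTAC", "+", "I" * 19)
trimmed = trim_front(record, ADAPTER_LEN + N_RANDOM)
print(trimmed[1], trimmed[3])
```

The sequence and the quality string have to be cut by the same amount, which is why trimming by length also sidesteps the problem of N bases not being accepted in the adapter sequence.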
Thank you very much again!