Hi,
I have around 70 samples which have undergone WES.
I have a list of the adapters used for each sample; however, it seems that the sequencing company used many different adapters (i.e., every sample has different ones). Originally, I was going to pass the adapter sequences to fastp and then run it in parallel. However, because the adapter sequences are all different, I don't think I can do this.
Is there a way to just run fastp with its defaults so that it finds the adapters itself, or is it good practice to provide each individual adapter sequence?
Thanks! Amy
You absolutely can. In Illumina sequencing there is a core sequence at the beginning of the adapters, as @Istvan showed below. So any adapter sequence that is present will be at the 3' end of the reads (unless you have adapter dimers). Scanning/trimming programs identify this core sequence and then trim it together with everything 3' of it.
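To make the idea concrete, here is a minimal Python sketch of 3' adapter trimming. It assumes the shared core is `AGATCGGAAGAGC` (the common start of the Illumina TruSeq adapters; check this against the list from your provider) and uses exact matching only, whereas real trimmers also tolerate mismatches. The function name `trim_adapter` and the `min_overlap` parameter are made up for illustration.

```python
# Shared start of the Illumina TruSeq adapters (an assumption for this sketch;
# check it against the adapter list from your sequencing provider).
ADAPTER_CORE = "AGATCGGAAGAGC"


def trim_adapter(read: str, adapter: str = ADAPTER_CORE, min_overlap: int = 3) -> str:
    """Return the read with the adapter and everything 3' of it removed."""
    # Case 1: the full adapter core occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the adapter, so only a prefix
    # of the adapter is visible at the 3' end of the read.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter found: leave the read untouched.
    return read


if __name__ == "__main__":
    insert = "ACGTTGCAAGGCTTAACCGT"  # made-up genomic insert
    print(trim_adapter(insert + ADAPTER_CORE + "ACACGTCTGAAC"))  # full core present
    print(trim_adapter(insert + ADAPTER_CORE[:6]))                # partial adapter at 3' end
    print(trim_adapter(insert))                                   # no adapter
```

All three calls print the bare insert, because the core (or the visible part of it) is cut off along with anything after it. This is also why the sample-specific part of each adapter, which sits after the core, does not need to be known for trimming.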
Thanks GenoMax and @Istvan! So, would it be okay to use a scanning/trimming program without giving it these adapter sequences because they will already look for this core sequence?
Or would you give fastp the adapters for each sample separately?
Thank you for your patience!! Amy
First I would establish that the adapter does indeed exist.
Many adapters are automatically recognized by fastp and reported in the HTML file that gets generated by default. FastQC also recognizes a number of common adapters and shows them in the report.
Run these tools on a few samples and see what they report.
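A rough sketch of how that could be scripted for paired-end data, covering both options discussed in this thread: letting fastp detect the adapters, or passing per-sample sequences. The sample names, directory names, file naming pattern (`_R1.fastq.gz`/`_R2.fastq.gz`) and the helper `build_fastp_cmd` are all assumptions for illustration; the fastp options used are `-i/-I/-o/-O`, `--html`, `--json`, `--thread`, `--detect_adapter_for_pe`, `--adapter_sequence` and `--adapter_sequence_r2`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

DRY_RUN = True  # only print the commands; set to False to actually run fastp


def build_fastp_cmd(sample: str, fastq_dir: str, out_dir: str,
                    adapter_r1: Optional[str] = None,
                    adapter_r2: Optional[str] = None,
                    threads: int = 4) -> List[str]:
    """Build a paired-end fastp command for one sample (hypothetical file layout)."""
    fq, out = Path(fastq_dir), Path(out_dir)
    cmd = [
        "fastp",
        "-i", str(fq / f"{sample}_R1.fastq.gz"),
        "-I", str(fq / f"{sample}_R2.fastq.gz"),
        "-o", str(out / f"{sample}_R1.trimmed.fastq.gz"),
        "-O", str(out / f"{sample}_R2.trimmed.fastq.gz"),
        "--html", str(out / f"{sample}.fastp.html"),
        "--json", str(out / f"{sample}.fastp.json"),
        "--thread", str(threads),
    ]
    if adapter_r1 and adapter_r2:
        # Explicit per-sample adapters, if you decide to supply them.
        cmd += ["--adapter_sequence", adapter_r1, "--adapter_sequence_r2", adapter_r2]
    else:
        # Otherwise ask fastp to detect the adapters for paired-end reads.
        cmd.append("--detect_adapter_for_pe")
    return cmd


if __name__ == "__main__":
    # Made-up sample sheet: sample name -> (R1 adapter, R2 adapter) or (None, None).
    samples = {
        "S01": (None, None),
        "S02": ("AGATCGGAAGAGCACACGTCTGAACTCCAGTCA", "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"),
    }
    for name, (a1, a2) in samples.items():
        cmd = build_fastp_cmd(name, "raw", "trimmed", a1, a2)
        print(" ".join(cmd))
        if not DRY_RUN:
            subprocess.run(cmd, check=True)
```

Because each sample gets its own command, differing adapters do not prevent running the samples in parallel. After a test run on a few samples, the adapter section of each HTML report shows what fastp found.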
See also the similar posts in the right hand sidebar ---->, for example:
illumina adapter specifying and removing using fastp
Thank you! Will give this a try
I will put in a plug for bbduk.sh from the BBMap suite. It is also easy to use and includes a full set of commercially available adapter sequences in the adapters.fa file in the resources directory of the software bundle. A guide to using bbduk.sh is available here: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/
Thank you!!
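For the bbduk.sh suggestion above, a minimal sketch of an adapter-trimming command built from Python. The settings (`ktrim=r k=23 mink=11 hdist=1 tpe tbo`) follow the adapter-trimming example in the BBDuk guide; the file names, the path to `adapters.fa` and the helper `build_bbduk_cmd` are placeholders, so adjust them to your installation.

```python
from typing import List


def build_bbduk_cmd(r1: str, r2: str, out1: str, out2: str,
                    adapters_fa: str = "resources/adapters.fa") -> List[str]:
    """Build a paired-end bbduk.sh adapter-trimming command (placeholder paths)."""
    return [
        "bbduk.sh",
        f"in1={r1}", f"in2={r2}",
        f"out1={out1}", f"out2={out2}",
        f"ref={adapters_fa}",  # adapter collection shipped with BBMap
        "ktrim=r",             # trim the adapter and everything to its right (3' side)
        "k=23", "mink=11",     # k-mer size, and shorter k-mers allowed at read ends
        "hdist=1",             # allow one mismatch per k-mer
        "tpe", "tbo",          # trim both mates to equal length; also trim by pair overlap
    ]


if __name__ == "__main__":
    cmd = build_bbduk_cmd("S01_R1.fastq.gz", "S01_R2.fastq.gz",
                          "S01_R1.clean.fastq.gz", "S01_R2.clean.fastq.gz")
    print(" ".join(cmd))
```

Since the reference file already contains many commercial adapters, the same command can be reused for every sample without looking up its specific sequences.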