Trimming of reverse complement primers
2.7 years ago
aupris • 0

Hello everyone, I have a question (probably a very basic one) about read processing. I have paired-end Illumina reads of a virus (not important which one). A few words about library preparation: a 3000 bp amplicon was generated, then something like Nextera was used for fragmentation.

My task is to trim PCR primers and adapters. Question: how do I do it properly? I am confused about primer orientation and location. Should I trim both the forward and the reverse primer from each read of the pair? Should I also include the reverse complements of the forward and reverse primers and trim them? Where should they be located, at the beginning or at the end of each read in the pair? Same question about adapters.

And the general question: how do I figure out whether it is necessary to trim the reverse complements of primers/adapters? Does it depend on the method of library preparation (PCR, ligation)? And how? Thanks.

Primer trimming • 2.0k views

Not sure if it helps, but I like to use cutadapt (https://cutadapt.readthedocs.io/en/stable/guide.html). If your reads are longer than the amplicon, the read runs through into the opposite primer, so you need to reverse complement the reverse primer and trim it from the R1 reads. If the amplicon is longer than the read length, it is mostly fine to trim the forward primer from the forward reads and the reverse primer from the reverse reads.
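To make the orientation rule above concrete, here is a minimal Python sketch. It assumes an unfragmented amplicon library in which R1 always starts at the forward primer; with tagmented (Nextera-style) or ligation-based libraries, fragments can end up in either orientation, so treat the mapping only as a starting point. The primer sequences and the `primer_sites` helper are hypothetical and are not part of cutadapt.

```python
# Complement table for A/C/G/T plus common IUPAC ambiguity codes.
# S and W are their own complements, so they are left untouched.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVNacgtrykmbdhvn",
    "TGCAYRMKVHDBNtgcayrmkvhdbn",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_sites(fwd_primer: str, rev_primer: str) -> dict:
    """Sequences to look for at each read end.

    Assumption: R1 starts at the forward primer and R2 at the reverse primer.
    The 3' entries only matter when a read is long enough to run through
    the insert into the opposite primer.
    """
    return {
        "R1_5prime": fwd_primer,
        "R1_3prime": reverse_complement(rev_primer),
        "R2_5prime": rev_primer,
        "R2_3prime": reverse_complement(fwd_primer),
    }


if __name__ == "__main__":
    # Hypothetical primers, for illustration only
    for site, seq in primer_sites("AACGGTCA", "GGTACCTT").items():
        print(site, seq)
```

The four sequences can then be passed to whichever trimming tool you use, with the 5' ones anchored at the read start and the 3' ones searched for toward the read end.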

This can also be useful: https://cutadapt.readthedocs.io/en/stable/recipes.html

But there are many ways to do things. I would just play around with a trimming tool and check the log output every time you change a setting.


If you get sick of going through logs to see if the trimming is doing what you think it's doing, I've written a tool to visualise trimming. Just feed it the before-trimming .fastq and the after-trimming .fastq: https://github.com/MonashBioinformaticsPlatform/trimviz

It's also capable of summarising sequences that are soft-clipped during mapping (from a .bam file) - if you get lots of the same sequence getting clipped, it might be an adapter sequence that would be better to trim prior to mapping.
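The soft-clip idea can be illustrated with a small Python sketch that works on SAM text records (a .bam file would first need converting, e.g. with `samtools view`). This is not how trimviz is implemented; the function names, the `min_len` threshold and the toy records are assumptions made for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, seq):
    """Return (left_clip, right_clip) sequences implied by a CIGAR string."""
    # Hard clips (H) are not present in SEQ, so drop them before looking at the ends
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    left = right = ""
    if ops and ops[0][1] == "S":
        left = seq[:ops[0][0]]
    if len(ops) > 1 and ops[-1][1] == "S":
        right = seq[len(seq) - ops[-1][0]:]
    return left, right


def count_soft_clips(sam_lines, min_len=5):
    """Count soft-clipped sequences of at least min_len bases in SAM records."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        cigar, seq = fields[5], fields[9]  # CIGAR and SEQ columns
        if cigar == "*" or seq == "*":
            continue
        for clip in soft_clips(cigar, seq):
            if len(clip) >= min_len:
                counts[clip] += 1
    return counts


if __name__ == "__main__":
    # Toy SAM records, made up for illustration
    def record(name, cigar, seq):
        return "\t".join([name, "0", "ref", "1", "60", cigar, "*", "0", "0", seq, "I" * len(seq)])

    toy = [
        "@HD\tVN:1.6",
        record("r1", "6S4M", "CTGTCTACGT"),
        record("r2", "4M6S", "ACGTCTGTCT"),
        record("r3", "10M", "ACGTACGTAC"),
    ]
    print(count_soft_clips(toy).most_common(1))  # [('CTGTCT', 2)]
```

Note that SEQ in a SAM record is stored in reference-forward orientation, so for reads aligned to the reverse strand the clipped sequence is the reverse complement of what was in the original FASTQ.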

2.7 years ago

Trimming is a dark art. Nobody really knows what your sequences contain; there is huge variability in strategies, library preparations, sequencing protocols, etc.

But then it is really easy to visualize the content:

cat mydata.fq | grep ATGC --color

Count the hits and work out what percentage of reads contain them.

Now look at where the matches fall within each read; the colors will show it clearly. Visually extend the matches past the hits: are the flanking bases the same?

Then repeat with the reverse complement, and with just the start or end fragments of the primers, indices, etc.

Pretty soon you'll have a solid understanding of what is in your data.
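The grep-based inspection above can also be approximated in a few lines of Python when you want counts and positions rather than colors. This is a minimal sketch using exact matching only (no mismatches or partial primers); the function names and the toy reads are made up for illustration.

```python
from collections import Counter


def reverse_complement(seq):
    """Reverse complement of a plain A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def locate(query, read):
    """Classify where the first exact match of query sits in the read."""
    pos = read.find(query)
    if pos < 0:
        return "absent"
    if pos == 0:
        return "start"
    if pos + len(query) == len(read):
        return "end"
    return "internal"


def summarise(fastq_lines, query):
    """Count reads by where (if anywhere) the query sequence occurs."""
    return Counter(locate(query, read) for read in read_sequences(fastq_lines))


if __name__ == "__main__":
    # Toy FASTQ records; for a real file, pass an open file handle instead
    toy = [
        "@r1", "ACGGAAAA", "+", "IIIIIIII",
        "@r2", "TTTTCCGT", "+", "IIIIIIII",
        "@r3", "GGGGGGGG", "+", "IIIIIIII",
    ]
    primer = "ACGG"  # hypothetical primer fragment
    print("as given:          ", dict(summarise(toy, primer)))
    print("reverse complement:", dict(summarise(toy, reverse_complement(primer))))
```

Running it once on the R1 file and once on the R2 file, for each primer and its reverse complement, gives a quick table of what sits where.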

Now think about how you are going to use the data. Does the presence of a primer matter? If not, leave it on. If yes, cut it off :-)


Thank you, Albert.
