Question

Primer sequences in FASTQ files from a 16S rRNA experiment

0

6.2 years ago

wangdp123 ▴ 340

Hi there,

I am working on the analysis of 16S rRNA datasets using the mothur toolkit.

In the FASTQ files, the Illumina adapter overhang sequences have been removed, but the primer sequences are still present.

Could you advise whether the primer sequences should be removed from the FASTQ files before creating contigs?
Do I also need to account for primer removal when creating the customized reference alignment that will be used by align.seqs?

Many thanks,

Tom

16s rRNA mothur • 3.7k views

ADD COMMENT • link updated 6.2 years ago by gb ★ 2.2k • written 6.2 years ago by wangdp123 ▴ 340

score 3 · Accepted Answer · 2019-04-25

3

6.2 years ago

gb ★ 2.2k

You should always remove the primers. They are "non informative" because the primer sequences in your fastq are not (always) the same as the real biological sequence. This has to do with the PCR and the annealing temperature. I don't know what you mean by creating contigs; if you mean merging/assembling the paired ends, you can remove the primers before or afterwards. Afterwards is easier, I think. When creating a reference especially, you don't want the primer sequence still on.
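To make the trimming step concrete, here is a minimal Python sketch of removing a 5' primer from a read while honouring IUPAC ambiguity codes and tolerating a few mismatches. The primer, the reads, the function names and the `max_mismatches` default are all hypothetical choices for illustration; this is not how mothur does it internally, and it assumes uppercase sequences.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer, segment):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, segment))


def trim_5prime_primer(read, primer, max_mismatches=2):
    """Return the read without its leading primer, or None if no primer is found."""
    segment = read[:len(primer)]
    if len(segment) < len(primer):
        return None  # read is shorter than the primer
    if primer_mismatches(primer, segment) > max_mismatches:
        return None  # 5' end does not look like the primer
    return read[len(primer):]


# Hypothetical toy primer and reads, for illustration only.
primer = "GTGYCAGC"
print(trim_5prime_primer("GTGTCAGCAAACCCGGG", primer))  # Y accepts C or T
print(trim_5prime_primer("TTTTTTTTAAACCCGGG", primer))  # no primer at the 5' end
```

In practice you would more likely use an existing tool. As far as I know, mothur's make.contigs and trim.seqs accept an oligos file listing the primers, and pcr.seqs can cut a reference alignment down to the amplified region, but check the mothur documentation for the exact parameters.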

ADD COMMENT • link 6.2 years ago by gb ★ 2.2k

0

primers are "non informative"

I'd agree on that, however my reasoning is a bit different. (Traditionally) you target the conserved regions, so there's close to no variation in that part of the sequence anyway (except the designed ambiguities). What you don't know is how much of the sample you didn't amplify because your primers didn't match in the first place.

primer sequences in your fastq are not (always) the same as the real biological sequence

Out of curiosity, is this personal experience or can you link to some systematic observations?

ADD REPLY • link 6.2 years ago by Carambakaracho ★ 3.3k

2

Yes, they are also non informative because they are all the same anyway. As you said, with the primer you mostly target a conserved region, but even though it is conserved there are differences (that you cannot resolve with ambiguities). In other words, the primer is not exactly the same as that conserved region. A mismatch is tolerated and the primer still binds, depending on the annealing temperature. When you sequence the product, you sequence the primer sequence and not the template's conserved region. In practice this reason does not matter much, because the fact that the primers are all the same is reason enough to trim them off, but I wanted to try to explain.

If you need a reference, I think you can use this paper:

Primer mismatch is an inherent characteristic of PCR with ‘universal’ primers, while, owing to single nucleotide variability even in the evolutionarily highly conserved regions of the rRNA genes, the designation of a perfectly matching ‘universal’ primer is simply not possible (Schmalenberger et al., 2001; Baker et al., 2003).

Rita Sipos, Anna J. Székely, Márton Palatinszky, Sára Révész, Károly Márialigeti, Marcell Nikolausz, Effect of primer mismatch, annealing temperature and PCR cycle number on 16S rRNA gene-targetting bacterial community analysis, FEMS Microbiology Ecology, Volume 60, Issue 2, May 2007, Pages 341–350, https://doi.org/10.1111/j.1574-6941.2007.00283.x

https://academic.oup.com/femsec/article/60/2/341/584515
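The mismatch argument above can be sketched with a toy model in Python. It is a deliberate simplification (one primer, one strand, a fixed mismatch tolerance standing in for the annealing temperature), and the sequences and function names are hypothetical. The point it shows is only that the product starts with the primer's bases, not with the template's bases at the binding site.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def amplify(template, primer, max_mismatches=1):
    """Toy PCR: if the primer anneals within the mismatch tolerance,
    the product begins with the primer itself, not with the template site."""
    for start in range(len(template) - len(primer) + 1):
        site = template[start:start + len(primer)]
        if hamming(primer, site) <= max_mismatches:
            # Primer is incorporated; downstream bases are copied from the template.
            return primer + template[start + len(primer):]
    return None  # primer did not anneal anywhere


# Hypothetical template whose binding site (GTGACAGC) differs from the primer by one base.
template = "CCGTGACAGCAAACCCGGG"
primer = "GTGTCAGC"
product = amplify(template, primer)

print(template[2:10])  # true biological sequence at the binding site
print(product[:8])     # what ends up in the read: the primer sequence
```

So the first bases of the read report the primer, and the real variation at the binding site is lost, which is one more reason to trim them.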

ADD REPLY • link 6.2 years ago by gb ★ 2.2k

0

Awesome, thank you - I learned something!

ADD REPLY • link 6.2 years ago by Carambakaracho ★ 3.3k

0

Thanks for this comment.

Further to this question: after removing the 5' primers from both the R1 and R2 reads in the FASTQ files and merging the read pairs, primer-like sequences appear at the 5' and 3' ends of the merged contigs. This is because the reads were sequenced through to the other end of the targeted region of the 16S rRNA gene.

Do I need to remove these primer-like sequences from the FASTA files as well, or is it OK to leave them there?

Thanks,

ADD REPLY • link 6.2 years ago by wangdp123 ▴ 340

0

You need to remove both primers.
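As a rough sketch of what removing both primers from a merged contig means: the forward primer sits at the 5' end, and the reverse primer shows up at the 3' end as its reverse complement. The primers, the contig and the function names below are hypothetical, the mismatch tolerance is an arbitrary assumption, and sequences are assumed to be uppercase.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also covers the IUPAC ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def matches(primer, segment, max_mismatches):
    """True if the segment fits the degenerate primer within the tolerance."""
    if len(segment) != len(primer):
        return False
    mismatches = sum(base not in IUPAC[code] for code, base in zip(primer, segment))
    return mismatches <= max_mismatches


def trim_both_primers(contig, forward, reverse, max_mismatches=2):
    """Strip the forward primer from the 5' end and the reverse complement of
    the reverse primer from the 3' end. Returns None unless both are found."""
    reverse_rc = reverse_complement(reverse)
    if len(contig) < len(forward) + len(reverse_rc):
        return None  # contig too short to hold both primers
    head = contig[:len(forward)]
    tail = contig[-len(reverse_rc):]
    if not (matches(forward, head, max_mismatches)
            and matches(reverse_rc, tail, max_mismatches)):
        return None
    return contig[len(forward):len(contig) - len(reverse_rc)]


# Hypothetical toy primers and merged contig, for illustration only.
forward, reverse = "GTGYCAGC", "GGACTACH"
contig = "GTGTCAGCAAACCCGGGAGTAGTCC"
print(trim_both_primers(contig, forward, reverse))
```

Contigs where either primer is missing come back as None here, so they can be counted or discarded rather than silently kept with a primer still attached.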

ADD REPLY • link 6.2 years ago by gb ★ 2.2k