Question

Should we assemble/merge R1 and R2 reads from Illumina MiSeq of fungal ITS amplicon before further analysis?

1

10.1 years ago

sentausa ▴ 650

Dear all,

I'm new to fungal ITS (internal transcribed spacer) metabarcoding analysis, so please ask if anything in my question is unclear.

Our lab received Illumina MiSeq sequencing results for our 150 samples from a company, but they gave us only the R1 and R2 FASTQ files, without consensus sequences. Our lab previously used 454 sequencing, where we usually got consensus sequences. My question is: should we assemble R1 and R2 into a consensus sequence before further analysis?

We have a downstream pipeline that was built around 454 data, and I'm afraid that without consensus sequences the pipeline would treat the R1 and R2 sequences as two different ITS sequences.

Any idea?

And thanks a lot in advance.

MiSeq metagenomics ITS Assembly • 7.1k views

updated 2.6 years ago by Ram 44k • written 10.1 years ago by sentausa ▴ 650

1

10.1 years ago

5heikki 11k

What's your fragment size? Do the pairs overlap? They should, if you knew what you were ordering. If they do, you should most definitely merge the pairs, QC the seqs, and proceed to OTU clustering and taxonomy assignment.
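
To check whether the pairs can overlap at all, compare the read lengths with the amplicon length. Here is a minimal sketch, assuming 2x250 bp reads and made-up amplicon lengths; the function names and the `min_overlap` default of 20 are illustrative and do not come from any particular tool:

```python
def expected_overlap(r1_len: int, r2_len: int, amplicon_len: int) -> int:
    """Bases shared by R1 and R2 on the amplicon; a negative value means a gap.

    A read longer than the amplicon only covers the amplicon itself
    (the remainder reads into the adapter), hence the min() calls.
    """
    covered_r1 = min(r1_len, amplicon_len)
    covered_r2 = min(r2_len, amplicon_len)
    return covered_r1 + covered_r2 - amplicon_len


def can_merge(r1_len: int, r2_len: int, amplicon_len: int,
              min_overlap: int = 20) -> bool:
    """True if the pair overlaps by at least min_overlap bases (assumed threshold)."""
    return expected_overlap(r1_len, r2_len, amplicon_len) >= min_overlap


if __name__ == "__main__":
    # Hypothetical amplicon lengths, for illustration only.
    for amplicon in (200, 400, 600):
        print(amplicon,
              expected_overlap(250, 250, amplicon),
              can_merge(250, 250, amplicon))
```

ITS length varies between fungal taxa, so a single run can contain amplicons that merge and amplicons that do not.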

0

10.1 years ago

marina.v.yurieva ▴ 580

If you are talking about the QIIME pipeline, you should merge them. You can use FLASH to do that, but check the read quality afterwards; you might have to trim the reads if the quality is not very good. Also double-check that the adapters have been trimmed. Alternatively, you can use just one read, but if you want to take advantage of the paired-end data, you should merge them.
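
To illustrate what merging a pair means, here is a naive sketch: reverse-complement R2, look for the longest suffix of R1 that matches a prefix of it, and join the two. This is not FLASH's algorithm; real mergers also use base qualities and a more careful scoring scheme. The function names and the thresholds (`min_overlap`, `max_mismatch_frac`) are assumptions made for illustration.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1) -> Optional[str]:
    """Merge R1 with the reverse complement of R2, or return None.

    Tries the longest candidate overlap first and accepts the first one
    whose mismatch fraction is within max_mismatch_frac. Bases in the
    overlap are taken from R1 (no quality-aware consensus here).
    """
    r2_rc = reverse_complement(r2)
    longest = min(len(r1), len(r2_rc))
    for olen in range(longest, min_overlap - 1, -1):
        tail = r1[-olen:]      # end of R1
        head = r2_rc[:olen]    # start of reverse-complemented R2
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            return r1 + r2_rc[olen:]
    return None  # no acceptable overlap; the pair stays unmerged


if __name__ == "__main__":
    # Toy 36 bp fragment sequenced as 2x24 bp, so the true overlap is 12 bp.
    fragment = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATTGACC"
    r1 = fragment[:24]
    r2 = reverse_complement(fragment[-24:])
    print(merge_pair(r1, r2) == fragment)
```

Pairs for which `merge_pair` returns `None` do not overlap well enough to merge.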

updated 2.8 years ago by Ram 44k • written 10.1 years ago by marina.v.yurieva ▴ 580

0

9.8 years ago

mikhail.shugay 3.5k

More details on the library structure are needed; however, I completely agree with sentausa. For example, in this case

  ----  ---- read1
 ----  ----  read2
----  ----   read3
======       consensus1
      ====== consensus2
...******... region of interest (barcode)

only the assembled consensus sequences can be overlapped, so overlapping the reads beforehand can decrease barcode yield.

updated 2.8 years ago by Ram 44k • written 9.8 years ago by mikhail.shugay 3.5k

Ram · Accepted Answer · 2015-03-31

3

9.8 years ago

sentausa ▴ 650

After a few months, I've learnt a thing or two about ITS analysis, and one of them is that R1 and R2 reads can be used without merging them beforehand. In fact, we might fail to identify many species if we use only the merged read pairs (this is pointed out in this paper, for example).

updated 2.6 years ago by Ram 44k • written 9.8 years ago by sentausa ▴ 650