Quality filtering of MiSeq dataset
10.0 years ago
User 4014 ▴ 40

Hi folks,

I am new to NGS, but I am about to start working on amplicon sequencing with a MiSeq (paired-end). I am currently studying the article by Balint et al. (2014), "An Illumina metabarcoding pipeline for fungi", which says: 'It is important to preserve the order of the reads in both forward and reverse read files: the paired-end read assembler needs corresponding read orders in both files'.

This might be a naive question, but how do I obtain files in which the read order is preserved in both the forward and reverse read files? Does the sequencing facility deliver them that way on request, or do I need a program or script to fix the order afterwards?

Thank you in advance!

Johan

Amplicon-sequencing • 2.5k views

Thank you very much for super quick replies! :)

10.0 years ago
Dan D 7.4k

Your FASTQ data will come off the MiSeq like this:

[Sample Name]_S[Sample number]_L001_R[read number]_001.fastq.gz
#                                   ^^^^^^^^^^^^^^

Read 1 data will have R1 for the highlighted portion of the file name.

Read 2 data will have R2 for the highlighted portion.
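If you end up with a folder full of these files, the naming scheme can be used to match each R1 file with its R2 mate. Here is a minimal Python sketch of that idea. The function name `pair_miseq_files` and the example sample names are made up for illustration, and the pattern assumes the lane is always L001 as in the template above.

```python
import re
from collections import defaultdict

# Pattern for the MiSeq naming scheme shown above; lane is assumed to be L001.
MISEQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<number>\d+)_L001_R(?P<read>[12])_001\.fastq\.gz$"
)

def pair_miseq_files(filenames):
    """Group file names into {(sample, number): {"R1": name, "R2": name}}."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = MISEQ_NAME.match(name)
        if match is None:
            continue  # not a MiSeq-style FASTQ name, skip it
        key = (match.group("sample"), int(match.group("number")))
        pairs[key]["R" + match.group("read")] = name
    return dict(pairs)

# Hypothetical file names, only for demonstration.
example = [
    "soilA_S1_L001_R1_001.fastq.gz",
    "soilA_S1_L001_R2_001.fastq.gz",
    "soilB_S2_L001_R1_001.fastq.gz",
]
for key, files in pair_miseq_files(example).items():
    status = "paired" if {"R1", "R2"} <= set(files) else "missing mate"
    print(key, status)
```

A sample reported as "missing mate" is worth checking before you run any paired-end tool on it.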

What the paper is saying is not to combine those files before feeding them to the assembler. Some (mostly old) programs (trimmers, groomers, etc.) combine paired-end data into a single file in the course of processing; this is sometimes called an interleaved dataset.
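If a tool does hand you an interleaved file, it can be split back into separate R1 and R2 streams. A rough sketch in Python, assuming plain four-line FASTQ records that strictly alternate read 1, read 2, read 1, read 2; `split_interleaved` is a name I made up, not part of any existing package.

```python
import io
from itertools import islice

def split_interleaved(handle):
    """Yield (read1_lines, read2_lines) from an interleaved FASTQ stream.

    Assumes 4 lines per record and strict R1/R2 alternation (8 lines per pair).
    """
    while True:
        chunk = [line.rstrip("\n") for line in islice(handle, 8)]
        if not chunk:
            return  # end of file
        if len(chunk) != 8:
            raise ValueError("truncated pair: expected 8 lines per read pair")
        yield chunk[:4], chunk[4:]

# Tiny made-up interleaved dataset with a single read pair.
interleaved = "@read1/1\nACGT\n+\nIIII\n@read1/2\nTTGA\n+\nIIII\n"
for read1, read2 in split_interleaved(io.StringIO(interleaved)):
    print(read1[0], read2[0])
```

For real data you would open the file (with `gzip.open(path, "rt")` if it is compressed) and write the two halves to separate output files in the same loop, which keeps the order identical in both.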

10.0 years ago

If you have paired reads, they should come from the facility in two files, both in the same order (or in one file with read 1 and read 2 alternating). To keep them in the same order during filtering or trimming, you just need to use a pair-aware program such as BBDuk that processes both reads at the same time.
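To show what "pair-aware" means in practice, here is a small Python sketch (not how BBDuk itself is implemented, just the general idea). It walks through both files in lockstep, checks that the read names match, and keeps or drops both mates together. The function names, the mean-quality threshold of 20, and the Phred+33 encoding are assumptions for illustration; the read-name handling assumes Illumina-style headers with either a space-separated comment or a /1 and /2 suffix.

```python
import io
from itertools import zip_longest

def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def read_id(header):
    """Return the read name without the mate-specific part."""
    name = header[1:].split()[0]  # drop '@' and any ' 1:N:0:...' comment
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name

def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)

def filter_pairs(r1_handle, r2_handle, min_mean_q=20):
    """Yield (rec1, rec2) where both mates pass; raise if the order differs."""
    for rec1, rec2 in zip_longest(read_fastq(r1_handle), read_fastq(r2_handle)):
        if rec1 is None or rec2 is None:
            raise ValueError("files contain different numbers of reads")
        if read_id(rec1[0]) != read_id(rec2[0]):
            raise ValueError(f"read order mismatch: {rec1[0]} vs {rec2[0]}")
        if mean_quality(rec1[2]) >= min_mean_q and mean_quality(rec2[2]) >= min_mean_q:
            yield rec1, rec2

# Made-up data: read2 has a poor-quality forward mate, so the whole pair is dropped.
r1_text = "@read1 1:N:0:1\nACGT\n+\nIIII\n@read2 1:N:0:1\nACGT\n+\n!!!!\n"
r2_text = "@read1 2:N:0:1\nTTGA\n+\nIIII\n@read2 2:N:0:1\nTTGA\n+\nIIII\n"
for rec1, rec2 in filter_pairs(io.StringIO(r1_text), io.StringIO(r2_text)):
    print(read_id(rec1[0]), "kept")
```

Because a pair is always kept or discarded as a unit, the two outputs stay in the same order. Filtering each file on its own is what breaks the correspondence.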
