Split Fastq files
5.0 years ago
bsmith030465 ▴ 240

Hi,

I just got some fastq files from our sequencing center. The folder names are:

DXC-1-1_lane2_20180520000/
DXC-1-1_lane4_20180520000/
DXC-1-1_lane1_20180520000/
DXC-1-1_lane3_20180520000/

DXC-1-2_lane2_20180520000/
DXC-1-2_lane4_20180520000/
DXC-1-2_lane1_20180520000/
DXC-1-2_lane3_20180520000/
DXC-1-3_lane2_20180520000/
DXC-1-3_lane4_20180520000/
DXC-1-3_lane1_20180520000/
DXC-1-3_lane3_20180520000/

.

.

DXC-1-5_lane3_20180520000/

Each folder above has a forward and a reverse fastq.gz file. Does this mean that, for each sample, the fastq has been split in 5 parts (across each lane) and that I'll have to combine the forward and reverse reads for each lane to get one set of fastq files for each sample?

Is there a webpage that would explain this?

thanks!


Hello bsmith030465

Why did you edit your post? There is no change in the content and you already have answers that solved the question.

5.0 years ago

Does this mean that, for each sample, the fastq has been split in 5 parts (across each lane) and that I'll have to combine the forward and reverse reads for each lane to get one set of fastq files for each sample?

In theory, there can be QC issues between lanes, for example a fluid blockage or a bubble, but in general you can and should combine data from different lanes. If it all comes from one Illumina library, being split across different lanes, or even different flowcells, is not a problem. The Illumina instrument doesn't add any technical batch effects at that step.

I assume DXC-1-1 and DXC-1-2 are different samples and should not be combined, but you would know that better than anyone here.
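
If you do want one forward and one reverse file per sample, here is a minimal sketch of the combining step. It assumes the folders follow the pattern `<sample>_lane<N>_<run id>` seen in the question, and that each folder holds exactly one file matching `*_R1*.fastq.gz` and one matching `*_R2*.fastq.gz`. Those file names are a guess on my part, since the question does not show them, and the function names are just for illustration. It relies on the fact that gzip files appended byte for byte still form a valid gzip file.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed folder pattern: <sample>_lane<N>_<run id>, e.g. DXC-1-1_lane2_20180520000
LANE_DIR = re.compile(r"^(?P<sample>.+)_lane(?P<lane>\d+)_(?P<run>\d+)/?$")


def group_lane_dirs(dir_names):
    """Map each sample name to its lane folders, sorted by lane number."""
    groups = defaultdict(list)
    for name in dir_names:
        match = LANE_DIR.match(name)
        if match is None:
            continue  # skip anything that does not look like a lane folder
        groups[match["sample"]].append((int(match["lane"]), name.rstrip("/")))
    return {sample: [d for _, d in sorted(lanes)] for sample, lanes in groups.items()}


def concat_gzip(parts, out_path):
    """Append gzip files byte for byte; the result is a multi-member gzip file."""
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def merge_sample(root, sample, lane_dirs, out_dir):
    """Write one R1 and one R2 file for a sample, using the same lane order for both."""
    for tag in ("R1", "R2"):
        parts = []
        for lane_dir in lane_dirs:
            # Hypothetical naming: one *_R1*.fastq.gz and one *_R2*.fastq.gz per folder
            found = sorted((Path(root) / lane_dir).glob(f"*_{tag}*.fastq.gz"))
            if len(found) != 1:
                raise ValueError(f"expected one {tag} file in {lane_dir}, found {len(found)}")
            parts.append(found[0])
        concat_gzip(parts, Path(out_dir) / f"{sample}_{tag}.fastq.gz")
```

The important detail is that the forward and reverse files are concatenated in the same lane order, so that read N in the merged R1 file still pairs with read N in the merged R2 file.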

5.0 years ago

that I'll have to combine the forward and reverse reads for each lane to get one set of fastq files for each sample?

If there is more than one pair of FASTQ files for each sample, you can take advantage of this by parallelizing your processes.

First, map each pair of FASTQ files with bwa and sort the resulting SAM (one process per pair).

Then merge the resulting BAM files by sample.
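
The two steps above could be sketched roughly as follows. This only builds and runs the `bwa mem`, `samtools sort` and `samtools merge` command lines; the reference path, file names, thread count and helper names are placeholders, not anything from the original post. Giving each lane its own read group is my addition: it keeps the lanes distinguishable after the merge while `SM` ties them to the same sample.

```python
import subprocess


def map_lane_commands(ref, r1, r2, sample, lane, out_bam, threads=4):
    """Build the bwa mem and samtools sort commands for one lane's FASTQ pair."""
    # Read group (an assumption, not from the post): ID separates lanes, SM names the sample.
    # The raw string keeps a literal backslash-t, which bwa converts to a tab.
    read_group = rf"@RG\tID:{sample}.lane{lane}\tSM:{sample}\tPL:ILLUMINA"
    bwa = ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, r1, r2]
    sort = ["samtools", "sort", "-o", out_bam, "-"]  # "-" reads the SAM from stdin
    return bwa, sort


def run_lane(bwa, sort):
    """Pipe bwa mem output straight into samtools sort (one process pair per lane)."""
    with subprocess.Popen(bwa, stdout=subprocess.PIPE) as aligner:
        subprocess.run(sort, stdin=aligner.stdout, check=True)
    if aligner.returncode != 0:
        raise subprocess.CalledProcessError(aligner.returncode, bwa)


def merge_command(sample_bam, lane_bams):
    """Build the samtools merge command that combines the sorted lane BAMs of one sample."""
    return ["samtools", "merge", sample_bam, *lane_bams]
```

Each `run_lane` call is independent of the others, so the lanes can be submitted as separate cluster jobs or run side by side, and `merge_command` is only needed once all lanes of a sample have finished.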
