Question

RNA-Seq multiple lane

1

7 weeks ago

MaxMin ▴ 10

Hello everyone,

I have paired-end fastq files from different lanes. When is it most appropriate to merge them during a transcriptomic analysis?

The files are named like: File_condition1_R1_L001.fastq, File_condition1_R2_L001.fastq, File_condition1_R1_L002.fastq, File_condition1_R2_L002.fastq, etc.

I aligned the reads with HISAT2 and generated BAM files, then merged the BAM files that came from the same sample but different lanes (e.g. File_condition1_R1_L001.bam and File_condition1_R1_L002.bam).

After that, I sorted the merged BAM files using Samtools.

Is this correct? Do you have any suggestions or opinions on the best method to work with the same sample from different lanes?

Thank you very much in advance.

bam RNA-seq transcriptomics • 476 views

ADD COMMENT • link 7 weeks ago by MaxMin ▴ 10

1

"After that, I sorted the merged BAM files using Samtools. Is this correct?"

As long as you aligned R1 and R2 of each lane together as pairs, you can merge the per-lane BAM files for a sample afterwards.

"File_condition1_R1_L001.bam"

But if you aligned the reads independently, i.e. R1 and R2 separately, that would be wrong. You always want to align the read pairs of a sample together.

ADD REPLY • link 7 weeks ago by GenoMax 147k

0

Yes, sorry, my mistake. I meant to say that I used R1 and R2 of a sample, generated the BAM, and then merged the BAM files from different lanes for the same sample. On the merged BAM files, I then performed sorting and indexing. Is that correct?
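For reference, the workflow described here (align each lane as a pair, then merge, sort and index) could be sketched roughly as below. This is only an illustration: the index name, the output file names and the helper functions `lane_commands` and `run_all` are hypothetical, and the sketch sorts each lane before merging so that the merged BAM is already coordinate-sorted and can be indexed directly.

```python
import subprocess


def lane_commands(index, sample, lanes, threads=4):
    """Build the command lines: align each lane as a pair, sort, merge, index.

    File names follow the pattern from the question (<sample>_R1_<lane>.fastq);
    every output name below is an assumption made for illustration.
    """
    commands = []
    sorted_bams = []
    for lane in lanes:
        r1 = f"{sample}_R1_{lane}.fastq"
        r2 = f"{sample}_R2_{lane}.fastq"
        sam = f"{sample}_{lane}.sam"
        bam = f"{sample}_{lane}.sorted.bam"
        # R1 and R2 of the same lane are always given to HISAT2 together.
        commands.append(
            ["hisat2", "-p", str(threads), "-x", index, "-1", r1, "-2", r2, "-S", sam]
        )
        # Coordinate-sort each lane so the merge output stays sorted.
        commands.append(["samtools", "sort", "-@", str(threads), "-o", bam, sam])
        sorted_bams.append(bam)
    merged = f"{sample}.merged.bam"
    commands.append(["samtools", "merge", merged, *sorted_bams])
    commands.append(["samtools", "index", merged])
    return commands


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `lane_commands("genome_index", "File_condition1", ["L001", "L002"])` only builds the command lists, so they can be inspected before anything is run with `run_all`.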

ADD REPLY • link 7 weeks ago by MaxMin ▴ 10

score 1 · Answer 1 · 2024-10-02

1

7 weeks ago

Trivas ★ 1.8k

I merge the raw fastq files upfront before any downstream processing.
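A minimal sketch of that upfront merge, assuming the file naming from the question (`<sample>_R1_L001.fastq` and so on). The regular expression and the helpers `group_lanes` and `merge_lanes` are hypothetical names for illustration. The one point that matters is that R1 and R2 are concatenated in the same lane order, so the read pairs stay in sync.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming pattern, taken from the question: <sample>_R1_L001.fastq
LANE_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<read>R[12])_(?P<lane>L\d+)\.fastq$")


def group_lanes(fastq_dir):
    """Map (sample, read) to that sample's lane files, ordered by lane."""
    groups = defaultdict(list)
    for path in Path(fastq_dir).iterdir():
        match = LANE_PATTERN.match(path.name)
        if match:
            groups[(match["sample"], match["read"])].append((match["lane"], path))
    # Sorting by lane gives R1 and R2 the same lane order.
    return {key: [p for _, p in sorted(files)] for key, files in groups.items()}


def merge_lanes(fastq_dir, out_dir):
    """Concatenate the lane files of each (sample, read) into one FASTQ."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    merged = {}
    for (sample, read), files in group_lanes(fastq_dir).items():
        target = out_dir / f"{sample}_{read}.fastq"
        with open(target, "wb") as out:
            for lane_file in files:
                with open(lane_file, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[(sample, read)] = target
    return merged
```

For example, `merge_lanes("fastq", "merged")` would write `merged/File_condition1_R1.fastq` and `merged/File_condition1_R2.fastq`; both directory names here are placeholders.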

ADD COMMENT • link 7 weeks ago by Trivas ★ 1.8k

0

Hi, thank you very much for your opinion! In your approach, isn't there a risk of compromising the quality if one of the lanes has low quality for a sample? If QC doesn't pick up anything, the mapping might, right?

ADD REPLY • link 7 weeks ago by MaxMin ▴ 10

2

I have never seen a failed run on a modern Illumina machine that was actually released to the customer by a sequencing center. Since sequencing replicates, in my experience, introduce no batch effects, I agree with Trivas: simply cat the files together.

ADD REPLY • link 7 weeks ago by ATpoint 85k

1

I've never experienced an instance where one lane failed but the rest of the flowcell was fine. In that scenario, I'd guess that those reads would have lower quality and would likely be removed by trimming steps.
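If you still want a quick per-lane sanity check before concatenating, something like the following could be used to compare mean base quality between lane files. It is a rough sketch that assumes uncompressed four-line FASTQ records with Phred+33 quality encoding; `mean_phred` is a hypothetical helper, and a dedicated QC tool will give a much fuller picture.

```python
def mean_phred(fastq_path, offset=33):
    """Return the mean Phred score over all bases in a FASTQ file.

    Assumes four lines per record and Phred+33 encoding (offset=33).
    """
    total = 0
    count = 0
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 3:  # the fourth line of each record holds qualities
                quals = line.rstrip("\n")
                total += sum(ord(char) - offset for char in quals)
                count += len(quals)
    return total / count if count else float("nan")
```

Running it on each lane file of a sample and comparing the values would show whether one lane is clearly worse than the others.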

ADD REPLY • link 7 weeks ago by Trivas ★ 1.8k