Hi,
I am fairly new to bioinformatics (genomics, to be specific), so excuse me if this is a straightforward question.
I have performed paired-end WES, which was sequenced across different lanes, so I have 2 FASTQ files per lane (4 in total).
I know that I can merge BAM files after aligning each fastq like this:
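# ref.fa is a placeholder for the bwa-indexed reference; samtools merge expects sorted input
bwa mem ref.fa lane1_R1.fq lane1_R2.fq | samtools sort -o lane1.bam -
bwa mem ref.fa lane2_R1.fq lane2_R2.fq | samtools sort -o lane2.bam -
samtools merge merged.bam lane1.bam lane2.bam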
If you don't care about read groups or potential batch effects, is it also possible to just concatenate lane1_R1 with lane2_R1 (and likewise for R2) and then do the alignment, so something like this:
cat lane1_R1.fq lane2_R1.fq > WES_R1.fq
cat lane1_R2.fq lane2_R2.fq > WES_R2.fq
bwa mem ref.fa WES_R1.fq WES_R2.fq   # ref.fa: placeholder for the bwa-indexed reference
If anyone could tell me what the "best practice" for this is, I'd be very thankful!
Cheers!
Good question! I don't see this as a hard rule about what you have to do... I like to concatenate all my R1 files and all my R2 files before alignment. But it should be fine to align each lane separately and merge the BAM files afterwards, too!
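In case it helps, here is a rough sketch of two small helpers, not a definitive pipeline. `mates_in_sync` checks that concatenated R1/R2 files still pair up (the lanes must be concatenated in the same order for both), and `rg_tag` builds a per-lane read group string for `bwa mem -R` if you ever do want to keep lane information. Both helper names are made up for illustration, `sampleA` and `ref.fa` are placeholders, and `PL:ILLUMINA` is an assumption about your platform.

```shell
# Sketch only: helper names, sampleA and ref.fa are placeholders, not from the original post.

# Succeed only if both FASTQ files have the same number of reads and the
# read names match record by record (ignoring a trailing /1 or /2).
# Assumes plain 4-line FASTQ records; R1 names are held in memory,
# so for big files run it on a subset (e.g. the first few thousand lines).
mates_in_sync() {
    awk '
        FNR % 4 != 1 { next }                    # look at header lines only
        { name = $1; sub(/\/[12]$/, "", name) }  # drop mate suffix if present
        NR == FNR { r1[++n1] = name; next }      # first file: remember names
        { n2++; if (r1[n2] != name) bad = 1 }    # second file: compare in order
        END { exit (bad || n1 != n2) }
    ' "$1" "$2"
}

# Print a read group line for bwa mem -R ($1 = sample name, $2 = lane label).
# The "\t" is emitted literally; bwa converts it to a TAB in the SAM header.
# PL:ILLUMINA is an assumption about the sequencing platform.
rg_tag() {
    printf '@RG\\tID:%s.%s\\tSM:%s\\tPL:ILLUMINA\n' "$1" "$2" "$1"
}

# Possible usage (commented out because it needs bwa, samtools and real data):
# mates_in_sync WES_R1.fq WES_R2.fq && bwa mem ref.fa WES_R1.fq WES_R2.fq | samtools sort -o WES.bam -
# bwa mem -R "$(rg_tag sampleA lane1)" ref.fa lane1_R1.fq lane1_R2.fq | samtools sort -o lane1.bam -
# bwa mem -R "$(rg_tag sampleA lane2)" ref.fa lane2_R1.fq lane2_R2.fq | samtools sort -o lane2.bam -
# samtools merge merged.bam lane1.bam lane2.bam
```

With per-lane read groups the merged BAM keeps track of which lane each read came from, which the concatenation route cannot recover afterwards.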
Cross-posted on reddit: https://www.reddit.com/r/bioinformatics/comments/idum3k/question_merging_bam_vs_concatenate_fastq/
What's up with that, OP?