Question

Are merged BAM files the same as a BAM file mapped from merged FASTQ files?

6.7 years ago

woongjaej ▴ 30

Hi, guys

I'm processing NGS data and have a question.

I need each of my datasets to have more than 100,000,000 reads, so once the first round of processing is done, I check whether each BAM file reaches that number. When a BAM file has 100,000,000 reads or fewer, I sequence that library again to get the reads it still needs.

Here are my questions. 1. Assuming the library, sample, sequencing machine and everything else are exactly the same, is a BAM file made by merging BAM files after mapping the same as a BAM file made by merging the FASTQ files first and then mapping?

2. If they are the same, can I merge the BAM files using samtools or sambamba?

Thank you very much.

Woongjae

sequencing bam merge • 2.9k views

Updated 6.7 years ago by Devon Ryan 104k • written 6.7 years ago by woongjaej ▴ 30

score 0 · Answer 1 · 2018-03-06

6.7 years ago

Devon Ryan 104k

Yes, they'll be essentially the same. There's always a bit of randomness with aligners, so you might find things like a different primary alignment for some multimappers. But everything high quality should be the same. Yes, you can then merge the BAM files with sambamba or samtools. If you're doing variant calling, be sure to assign appropriate read groups to each run.
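As a rough sketch of the read-group and merge steps, the snippet below only builds and prints samtools command lines; it does not run them. The file names, the run IDs and the sample name `sample1` are hypothetical placeholders. It assumes each per-run BAM is already coordinate-sorted and that `samtools addreplacerg` and `samtools merge` are available in your samtools version.

```python
import shlex

def addreplacerg_cmd(in_bam, out_bam, rg_id, sample):
    """Build a `samtools addreplacerg` command that tags the reads of one run."""
    return ["samtools", "addreplacerg",
            "-r", f"ID:{rg_id}", "-r", f"SM:{sample}",
            "-o", out_bam, in_bam]

def merge_cmd(out_bam, in_bams):
    """Build a `samtools merge` command (output first, then the inputs)."""
    return ["samtools", "merge", out_bam, *in_bams]

# Hypothetical per-run BAM files for one sample; the keys are used as read group IDs.
runs = {"run1": "sample1.run1.bam", "run2": "sample1.run2.bam"}

commands = []
tagged = []
for rg_id, bam in runs.items():
    out = bam.replace(".bam", ".rg.bam")
    commands.append(addreplacerg_cmd(bam, out, rg_id, "sample1"))
    tagged.append(out)
commands.append(merge_cmd("sample1.merged.bam", tagged))

# Print the commands so they can be reviewed before running them by hand.
for cmd in commands:
    print(shlex.join(cmd))
```

Giving each run its own read group ID while keeping the same `SM` value lets a variant caller treat the runs as one sample and still tell them apart.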

6.7 years ago by Devon Ryan 104k

I suspect this does not hold in an RNA-seq setting, since some aligners apply a threshold for junction detection. If the reads are split across files, you may miss junctions whose support stays below that threshold in each file. The merged BAM will still lack those junctions, whereas mapping the full read set in one go finds them and stores them in the BAM file.

For DNA-seq, I agree.
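To illustrate the threshold effect, here is a toy model, not any real aligner's algorithm. It assumes a junction is reported only when at least `MIN_SUPPORT` reads in the same alignment run span it; the junction labels, read counts and threshold are made up.

```python
from collections import Counter

def detected_junctions(reads, min_support):
    """Return the junctions supported by at least `min_support` reads in one run."""
    counts = Counter(reads)
    return {junction for junction, n in counts.items() if n >= min_support}

# Each list entry is the junction spanned by one spliced read (hypothetical labels).
run1 = ["chr1:100-200"] * 5 + ["chr1:300-400"] * 2
run2 = ["chr1:100-200"] * 4 + ["chr1:300-400"] * 2
MIN_SUPPORT = 3  # assumed detection threshold

# Align each run separately and pool the junctions, versus align all reads together.
separate = detected_junctions(run1, MIN_SUPPORT) | detected_junctions(run2, MIN_SUPPORT)
together = detected_junctions(run1 + run2, MIN_SUPPORT)

print(sorted(together - separate))  # ['chr1:300-400']
```

The weakly supported junction has two reads in each run, so it passes the threshold only when all four reads are seen in the same run.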

6.7 years ago by michael.ante ★ 3.9k

If you're doing something with 2-pass then yes, you could theoretically miss something. Given the numbers getting tossed around by OP I suspect that's not the case.

6.7 years ago by Devon Ryan 104k

Thank you for the replies, guys.

So you mean I can either merge the FASTQ files first and then map them, or map the additional FASTQ file first and then merge the resulting BAM file with the existing BAM file, right?

6.7 years ago by woongjaej ▴ 30

Right. Some things, like looking for novel splice junctions, work better if you align everything in a single go (so merge the fastq files). For most other things it doesn't much matter if you merge fastq or BAM files, you get more or less the same result either way.
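For the "merge the fastq files" route, a minimal sketch is below. It assumes the FASTQ files are gzip-compressed and relies on the fact that gzip files concatenated byte-for-byte still form a valid gzip file. The function name and the file names in the usage comment are hypothetical.

```python
import gzip
import shutil

def concat_fastq_gz(inputs, output):
    """Concatenate gzipped FASTQ files byte-for-byte into one gzipped file."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)

def count_reads(fastq_gz):
    """Count reads in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(fastq_gz, "rt") as handle:
        return sum(1 for _ in handle) // 4

# Hypothetical usage:
# concat_fastq_gz(["run1_R1.fastq.gz", "run2_R1.fastq.gz"], "merged_R1.fastq.gz")
# print(count_reads("merged_R1.fastq.gz"))
```

For paired-end data, concatenate the R1 files and the R2 files in the same run order so that mates stay at matching positions in the two merged files.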