Question

Insert size distribution for BWA

5.9 years ago

seta ★ 1.9k

Hi all,

I know that bwa mem estimates the insert size distribution during mapping. However, in some cases I suspect this estimate may be wrong, so I calculated the distribution with Picard (CollectInsertSizeMetrics). I am not sure how to feed these values into bwa mem via the -I option. Could you please help me with this?

Thanks

insert size BWA genome sequencing • 2.1k views

ADD COMMENT • link 5.9 years ago by seta ★ 1.9k

Picard is only reading what bwa calculated, so I doubt you gain anything.

ADD REPLY • link 5.9 years ago by ATpoint 88k

OK, thanks for the point. Do you think it is possible for bwa mem to estimate the insert size incorrectly? If so, should I use an alternative tool to calculate the insert size distribution and then feed it to bwa mem?

ADD REPLY • link 5.9 years ago by seta ★ 1.9k

Usually you can estimate the library fragment size distribution from gel or Bioanalyzer results before sequencing, then compare it to the fragment size estimated by BWA. Alternatively, you can map with another read aligner and compare its estimate with BWA's.

ADD REPLY • link 5.9 years ago by Vitis ★ 2.6k
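For the aligner comparison suggested above, one rough way to get comparable numbers is to read the TLEN column (column 9) from each aligner's SAM output. The sketch below is only an illustration: the function name, the size cap, and the sample records are made up, and it deliberately ignores the "properly paired" flag because the aligner sets that flag from its own insert size estimate.

```python
import statistics

def insert_sizes_from_sam(lines, max_size=10_000):
    """Collect one positive TLEN per read pair from SAM text lines.

    Hypothetical helper: max_size is an arbitrary cap to drop extreme outliers.
    """
    sizes = []
    for line in lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        rnext = fields[6]
        tlen = int(fields[8])
        paired = flag & 0x1
        # 0x4 read unmapped, 0x8 mate unmapped, 0x100 secondary, 0x800 supplementary
        unusable = flag & (0x4 | 0x8 | 0x100 | 0x800)
        # RNEXT "=" means the mate is on the same reference;
        # keeping only positive TLEN counts each pair once.
        if paired and not unusable and rnext == "=" and 0 < tlen <= max_size:
            sizes.append(tlen)
    return sizes

# Made-up SAM records, for illustration only.
SAMPLE_SAM = [
    "@SQ\tSN:chr1\tLN:100000\n",
    "r1\t99\tchr1\t100\t60\t4M\t=\t350\t350\tACGT\tIIII\n",
    "r1\t147\tchr1\t350\t60\t4M\t=\t100\t-350\tACGT\tIIII\n",
    "r2\t97\tchr1\t500\t60\t4M\t=\t810\t410\tACGT\tIIII\n",
    "r3\t73\tchr1\t900\t60\t4M\t=\t900\t0\tACGT\tIIII\n",
    "r4\t65\tchr1\t100\t60\t4M\tchr2\t500\t0\tACGT\tIIII\n",
]

sizes = insert_sizes_from_sam(SAMPLE_SAM)
print(statistics.mean(sizes), statistics.stdev(sizes))
```

Running the same function on the output of two aligners (for example via `samtools view` piped to a text file) gives two mean and standard deviation pairs that can be compared with each other and with the Bioanalyzer trace.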

Is this a non-standard library? What kind of experiment is it? Could this be data that was converted back to FASTQ from BAM without shuffling the reads?

ADD REPLY • link 5.9 years ago by ATpoint 88k

It's whole genome sequencing by Illumina (100bp PE). I posted the original problem here, could you please take a look at it, or should I explain it again here?

ADD REPLY • link 5.9 years ago by seta ★ 1.9k

Yeah, this is pretty much what I was suspecting. You will have quite a lot of multimapping because of the HLA alleles, so it is not unexpected that the insert size calculation is off. That is why bwa expects random read order: normally most reads in the batch that is processed together come from well-mappable regions, which stabilizes the insert size estimation. I have no experience with HLA mapping, so I cannot contribute any further, but I will ask around in the Slack if someone has a recommendation.

ADD REPLY • link 5.9 years ago by ATpoint 88k

Thanks for your feedback. Assuming the insert size estimation by bwa mem may be wrong, could you please tell me how I can pass the insert size distribution (which I calculated with Picard) to bwa mem via the -I option?

ADD REPLY • link 5.9 years ago by seta ★ 1.9k
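As a possible starting point for the -I question: according to the bwa manual, `bwa mem -I` takes `FLOAT[,FLOAT[,INT[,INT]]]`, meaning mean, standard deviation, max and min of the insert size distribution, and it only applies to FR-oriented pairs. The sketch below pulls `MEAN_INSERT_SIZE` and `STANDARD_DEVIATION` from a Picard CollectInsertSizeMetrics file and builds the option. The helper names, the file names in the command, and the sample numbers are made up for illustration; max and min are left out so that bwa falls back to its documented defaults.

```python
import io

def read_picard_insert_metrics(handle):
    """Return the first metrics row of a CollectInsertSizeMetrics file as a dict."""
    lines = iter(handle)
    for line in lines:
        if line.startswith("## METRICS CLASS"):
            # The next two lines are the tab-separated header and the first value row.
            header = next(lines).rstrip("\n").split("\t")
            values = next(lines).rstrip("\n").split("\t")
            return dict(zip(header, values))
    raise ValueError("no METRICS CLASS section found")

def bwa_insert_size_option(metrics):
    """Build the -I argument (mean,stddev) for bwa mem from Picard metrics."""
    mean = float(metrics["MEAN_INSERT_SIZE"])
    std = float(metrics["STANDARD_DEVIATION"])
    return ["-I", f"{mean:.1f},{std:.1f}"]

# Made-up, abbreviated metrics text; a real file has more columns and a histogram.
SAMPLE_METRICS = (
    "## METRICS CLASS\tpicard.analysis.InsertSizeMetrics\n"
    "MEDIAN_INSERT_SIZE\tMIN_INSERT_SIZE\tMAX_INSERT_SIZE\t"
    "MEAN_INSERT_SIZE\tSTANDARD_DEVIATION\tREAD_PAIRS\tPAIR_ORIENTATION\n"
    "348\t35\t24500000\t350.2\t45.7\t1000000\tFR\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
)

option = bwa_insert_size_option(read_picard_insert_metrics(io.StringIO(SAMPLE_METRICS)))
# ref.fa, reads_1.fq and reads_2.fq are placeholder file names.
print(" ".join(["bwa", "mem", *option, "ref.fa", "reads_1.fq", "reads_2.fq"]))
```

Note that, as pointed out earlier in the thread, Picard computes these metrics from the alignments bwa produced, so the values are only independent of bwa's own estimate if they come from a different source such as another aligner or a Bioanalyzer trace.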