Bwa Map Single And Pe Reads To Ref?

1

Entering edit mode

11.4 years ago

Adrian Pelin ★ 2.7k

Hello,

I want to store detect my variants in a VCF format. For that I need to map my PE reads (100x2) to reference. What I have discovered however is that for a lot of my PE reads, the fragments size is smaller than 200bp, which means some PE reads overlap, increasing the coverage of a region artificially.

I thought I would merge the PE reads with SeqPrep to place all PE reads that overlap in a merged.fastq file.

My question is, how do I then map both unmerged PE reads and merged SE reads to the genome with bwa? I only know how to do it if it's only SE or only PE.

Also, does this approach sound reasonable? Adrian

bwa vcf • 3.2k views

ADD COMMENT • link 11.4 years ago by Adrian Pelin ★ 2.7k

1

Entering edit mode

At least GATK is supposed to handle this properly. In fact, I'd have serious misgivings about a variant caller that handled this situation incorrectly.

ADD REPLY • link 11.4 years ago by Devon Ryan 105k

0

Entering edit mode

@dpryan79 : very useful, thanks.

ADD REPLY • link 11.4 years ago by Pierre Lindenbaum 166k

0

Entering edit mode

" increasing the coverage of a region artificially. " why ? here, a fragment of the genome is sequenced twice -> it increases your confidence of the calling.

ADD REPLY • link 11.4 years ago by Pierre Lindenbaum 166k

0

Entering edit mode

Yes, the caller does not know that, it thinks 2 separate reads are calling one variant. What if there is more than 1 variant at that position? What if it's a bi-allelic SNP, and one alleles happens to get more calls because of fragments being sequenced twice.

ADD REPLY • link 11.4 years ago by Adrian Pelin ★ 2.7k

Login before adding your answer.