BWA-MEM alignment on duplicated regions
8.6 years ago
annecarol7 ▴ 10

I have aligned ChIP-seq data using BWA-MEM. The reference genome has a duplicated region on 2 different chromosomes. When I count the reads aligned to each copy of the region, I get different numbers. If BWA-MEM places reads randomly between duplicated regions, shouldn't I get roughly the same number of reads for each copy?

bwa mem -B 40 -O 60 -E 10 -L 50 -M -R '@RG\tID:WTCHG_261086_294\tSM:WTCHG_261086_294' -t 10 /home/plxacb/Fixed_fasta/Ade_fasta/Cen8Ade6.fa WTCHG_261086_294.unmapped_ecoli_1.fastq WTCHG_261086_294.unmapped_ecoli_2.fastq | samtools view -bS - > WTCHG_261086_294.strict_bwamem.bam

samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr2_ref:125168-133320 | wc -l

222754

samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr4_ref:1619693-1627846 | wc -l

271696

ChIP-Seq alignment • 3.1k views

To be pedantic, your Chr2 region is 1 bp shorter than the Chr4 region (end minus start is 8152 vs 8153). It shouldn't make much difference, but who knows...
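As a quick check of the coordinates: samtools region strings are 1-based and inclusive, so each region's length is end - start + 1, one more than the plain difference. A minimal sketch, where `region_len` is a hypothetical helper name and not part of samtools:

```shell
# Hypothetical helper: length of a 1-based, inclusive region START-END.
region_len() {
    echo $(( $2 - $1 + 1 ))
}

region_len 125168 133320     # Not76_Chr2_ref region
region_len 1619693 1627846   # Not76_Chr4_ref region
```

Either way of counting gives the same 1 bp difference between the two regions.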


When I remove this extra 1 bp from Chr4, the result is 271688, only 8 reads fewer.

8.6 years ago

It looks like you're using paired-end reads. BWA-MEM disambiguates a multi-mapping read if its mate is uniquely aligned. Your results suggest that more mates are uniquely mapped near the chr4 copy than near the chr2 copy, which would explain the discrepancy.
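One way to look into this, as a rough sketch rather than a definitive test: split the reads in each region by mapping quality. BWA-MEM typically reports MAPQ 0 for reads it cannot place unambiguously, so a large share of MAPQ > 0 reads in a duplicated region would hint that mates are anchoring them. The helper name `count_by_mapq` is hypothetical; it only assumes the standard SAM column layout (MAPQ in column 5) and reads SAM records on stdin.

```shell
# Hypothetical helper: count SAM records with MAPQ == 0 vs MAPQ > 0.
# Reads SAM text on stdin; header lines (starting with "@") are skipped.
count_by_mapq() {
    awk -F'\t' '
        /^@/    { next }              # skip header lines, if any
        $5 == 0 { zero++; next }      # column 5 of a SAM record is MAPQ
                { nonzero++ }
        END     { printf "mapq0=%d mapq_gt0=%d\n", zero, nonzero }
    '
}

# Usage with the regions from the question (requires samtools and the sorted, indexed BAM):
#   samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr2_ref:125168-133320 | count_by_mapq
#   samtools view WTCHG_261086_294.strict_bwamem.sorted.bam Not76_Chr4_ref:1619693-1627846 | count_by_mapq
```

If the MAPQ 0 counts are similar between the two copies and the difference sits in the MAPQ > 0 reads, that would be consistent with the mate-based explanation.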


That makes sense, but the regions are 8 kb long and flanked by identical sequences where my precipitated protein doesn't bind, so I would not expect a unique alignment for any of the reads.
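This expectation can be checked against the alignments themselves. The sketch below counts, for the reads overlapping a region, how many have a mate that starts outside a given window or on another reference sequence. `count_mates_outside` is a hypothetical helper, not a samtools command; it assumes standard SAM columns (RNAME in 3, RNEXT in 7, PNEXT in 8), and the window to use (for example the duplicated region plus its identical flanks) is left to the reader.

```shell
# Hypothetical helper: count reads whose mate starts inside vs outside START..END.
# Usage: samtools view file.bam REGION | count_mates_outside START END
# Mates on another reference sequence, or unmapped (RNEXT "*"), count as outside.
count_mates_outside() {
    awk -F'\t' -v s="$1" -v e="$2" '
        /^@/ { next }                                    # skip header lines, if any
        ($7 != "=" && $7 != $3) || $8 < s || $8 > e { outside++; next }
        { inside++ }
        END { printf "mate_inside=%d mate_outside=%d\n", inside, outside }
    '
}
```

A noticeably higher `mate_outside` count for the Chr4 copy would suggest that some pairs do reach unique sequence there.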
