Samtools Mpileup Snp Calling Improperly Paired Reads
2
0
Entering edit mode
10.9 years ago
rob234king ▴ 610

I'm doing samtools mpileup snp calling but samtools is not including reads that are improperly paired or mates unmapped when I examine the snps using the bam file. Due to the nature of the reference I have I want to turn this behavior off. Turning quality scores filters for base and reads do not have an effect. Any ideas how to do this, or a snp caller that can set to include reads that are improperly paired to the snp calling process?

samtools mpileup -I -q 20 -Q 20 -DSuf Triticum_aestivum.IWGSP1.21.dna.genome.fa A6_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam C6_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bam| bcftools view -bvcg - > A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bcf

bcftools view A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.bcf | ./vcfutils.pl varFilter -D100 > A6C6D3_P33_Triticum_aestivum.IWGSP1.20.dna.genome_novo_H15_t60.sort.rmdup.vcf
samtools mpileup • 8.6k views
ADD COMMENT
3
Entering edit mode
10.9 years ago
Andreas ★ 2.5k

Hi,

The sam flag filter description that ashutoshmits listed are correct. However, samtools mpileup will by default ignore what it calls "anomalous read pairs" or "orphans". Those are paired reads where the mate is not paired. This behaviour can be toggled with

       -A           count anomalous read pairs

If you're interested in digging deeper: the corresponding source code hides in bam_plcmd.c:

       else if ((ma->conf->flag&MPLP_NO_ORPHAN) && (b->core.flag&1) && !(b->core.flag&2)) skip = 1;

i.e. if orphans are to be ignored (default, if -A is not given) then reads will be ignored if they are paired (b->core.flag&1) but are not a proper pair !(b->core.flag&2).

Andreas

PS Edit: -A is all you need. No need to modify the source code.

ADD COMMENT
0
Entering edit mode

Thanks Andreas for such a great answer.

ADD REPLY
0
Entering edit mode

So just to check I understand correctly if I want it to include anomalous read pairs I use the -A option and delete "&& (b->core.flag&1) && !(b->core.flag&2)" to remove the skipping of improperly paired reads?

ADD REPLY
0
Entering edit mode

Yes you are right. You may have to do "make samtools" again in order for that change to take place.

ADD REPLY
0
Entering edit mode

Thanks a lot, should be a big help.

ADD REPLY
0
Entering edit mode

No sorry, that's not what I wanted to say. Just specifying -A is enough to include orphan reads!

ADD REPLY
0
Entering edit mode

It worked by doing what I said but if same can be done just with the -A option, would be more reproducible for others, I'll check it thanks.

ADD REPLY
0
Entering edit mode

My bad. I thought you asked to change the code. But the code was for the purpose of explaining the parameter.

ADD REPLY
1
Entering edit mode
10.9 years ago

I am not sure but it may be the case that your mate-pair reads that are improperly paired or mate-pair reads with only one read mapped are being discarded due to one of the following reasons.

The default settings for mpileup is to filter reads with bitwise flag 0X704.

So for pileup generation the following reads will not considered at all from the bam files:

1) 0x0400 (aka 1024 or "d") duplicate

2) 0x0200 (aka 512 or "f") failed QC

3) 0x0100 (aka 256 or "s") non primary alignment

4) 0x0004 (aka 4 or "u") unmapped

I think in your case, the improperly paired reads and mate-pairs with only one read mapped may belong to non-primary alignment or duplicates. I am just guessing. Unfortunately, you can't tell mpileup which reads to choose from command line but you can modify the code as explained in this post (http://seqanswers.com/forums/showthread.php?t=22752).

ADD COMMENT
0
Entering edit mode

Thanks very much ill give it a try.

ADD REPLY

Login before adding your answer.

Traffic: 1966 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6