Samtools Mpileup And Bcftools Calling Snps Againt Non-Existent N Characters
1
1
Entering edit mode
13.0 years ago
Travis ★ 2.8k

Hi,

I am using the standard mpileup/bcftools route to call variants with some publicly available Ion Torrent E.coli DH10B data. In my final results I get a couple of SNPs called where the reference base is an N character. The problem is that I have checked the entire genome and can't find any N characters! The bam file I am using was also publicly available so I am wondering if maybe the alignments were originally done against a version of the genome that contained Ns however I am not certain on the mechanics of how samtools/bcftools call the variants so am unsure if this explanation could be correct. Can anyone lend their opinion?

Thanks in advance.

next-gen sequencing samtools snp bcftools • 4.5k views
ADD COMMENT
2
Entering edit mode
13.0 years ago
Swbarnes2 ★ 1.6k

Check the gneome for other ambiguous characters; MRWSYK. mpileup treats them as N's.

ADD COMMENT
1
Entering edit mode

If mpileup thinks that every single position is an N, that means that there's some discrepancy between the reference sequence you used, and what mpileup thinks you used. For instance, if the reference name in the fasta differs from the name in the .sam file, or if the fasta.fai was made wrong (or if that step failed, the software will yell, but it will gamely go on without it), you get that result.

ADD REPLY
0
Entering edit mode

I found literally two non A|T|G|C characters in the genome. These don't account for the number of variants I am seeing called against N's unfortunately...

ADD REPLY
0
Entering edit mode

I just performed the same operation with a publicly available MiSeq dataset for the same genome - a SNP against an N character was called for every genomic position covered by a read!

ADD REPLY
0
Entering edit mode

Cheers for the input. I'll have a look at it asap and will get back to you. The perils of doing too much at once!

ADD REPLY

Login before adding your answer.

Traffic: 1996 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6