What's the difference between mpileup output and bcftools call?
9.7 years ago
sacha ★ 2.4k

I guess it's a simple question. Could you explain the difference between the VCF file generated by samtools and the one generated by bcftools?

samtools mpileup - ref.fa file.bam > file.bcf
bcftools call file.vcf > file2.bcf
bam pipeline vcf • 7.6k views
9.7 years ago

Usually these commands are used together.

The samtools mpileup command scans every position covered by an aligned read, works out the possible genotypes supported by the raw reads, and then calculates the likelihood of each of these genotypes given the reads in your sample.

For example, consider the first 1000 bases of the reference genome. Suppose position 35 (G in the reference) is covered by 27 reads with a G base and two reads with a T, for a total read depth of 29. In this case, the tool concludes with high probability that the sample has genotype G, and that the T reads are likely due to sequencing errors. In contrast, if position 400 is T in the reference but is covered by 2 reads with a C base and 66 reads with a G (total read depth 68), the sample most likely has genotype G, which differs from the reference.
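To make the arithmetic behind that example concrete, here is a toy sketch. It is not the samtools algorithm: it assumes a single fixed per-read error rate (0.01, picked arbitrarily) instead of real base and mapping qualities, and the function name `genotype_loglik` is made up for illustration. It prints a log10 likelihood for each of the three diploid genotypes of a two-allele site.

```shell
# Toy model only (NOT what samtools does internally): every read is assumed
# to carry the same error rate, and only two alleles (A and B) are considered.
genotype_loglik() {
    # $1 = reads supporting allele A
    # $2 = reads supporting allele B
    # $3 = assumed per-read error rate
    awk -v a="$1" -v b="$2" -v e="$3" 'BEGIN {
        ln10 = log(10)
        # Homozygous A: A reads are correct, B reads are errors.
        printf "AA %.2f\n", (a * log(1 - e) + b * log(e)) / ln10
        # Heterozygous: each read shows either allele with probability 0.5.
        printf "AB %.2f\n", ((a + b) * log(0.5)) / ln10
        # Homozygous B: B reads are correct, A reads are errors.
        printf "BB %.2f\n", (a * log(e) + b * log(1 - e)) / ln10
    }'
}

# Position 35 from the example: 27 reads with G (allele A), 2 reads with T (allele B).
genotype_loglik 27 2 0.01
```

Under these assumptions the AA line (homozygous G) has the highest log-likelihood, which matches the conclusion that the two T reads are most likely sequencing errors.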

The bcftools call command uses the genotype likelihoods generated by samtools mpileup to call genetic variants, and outputs all the variants it identifies.

This means that file.bcf will contain genotype likelihoods for every covered position in the genome, whereas the BCF file from bcftools call will contain only the sites that were found to be variant (this is what its -v option selects).

If you are interested in specific sites that were not called by bcftools, you can run the two commands as separate steps and inspect the intermediate file.
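As a rough sketch of the two-step version: ref.fa and file.bam are the placeholder names from the question, while likelihoods.bcf, calls.bcf and the function name `two_step_call` are made up here. The flags are the ones I would expect on a samtools 1.x install that still writes BCF from mpileup; check them against your own versions.

```shell
# Sketch only; output file names are hypothetical.
two_step_call() {
    ref=$1
    bam=$2
    # Step 1: genotype likelihoods for every covered position, written as BCF.
    # On recent releases, where samtools mpileup no longer emits BCF, the
    # equivalent is: bcftools mpileup -f "$ref" -O b -o likelihoods.bcf "$bam"
    samtools mpileup -g -f "$ref" -o likelihoods.bcf "$bam"
    # Step 2: call genotypes from the likelihoods.
    # -m = multiallelic caller, -v = keep variant sites only, -O b = BCF output.
    bcftools call -m -v -O b -o calls.bcf likelihoods.bcf
}

# Run only if the tools and the input files are actually present.
if command -v samtools >/dev/null 2>&1 && command -v bcftools >/dev/null 2>&1 \
   && [ -f ref.fa ] && [ -f file.bam ]; then
    two_step_call ref.fa file.bam
fi
```

Keeping likelihoods.bcf on disk lets you look up any position that is missing from calls.bcf. Dropping -v from the second step should also keep the non-variant sites in the output.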

Would you like to see example VCF files from both commands?

Thanks for your enlightenment!

Could you edit your post? I think you made a mistake: "Total read depth will be 26" should be "29". Or I may be wrong!

So, in brief, mpileup computes the frequency of each base (homozygote/heterozygote/error), and bcftools is used to filter and keep only the interesting variants. So bcftools should have a threshold parameter?

If you have a short head of both VCF files, you could post it here. It would be useful for me and for other people.

Yes, 29 is correct. Thanks.
