Correct way to report variant statistics from bcftools stats report
5.8 years ago
prasundutta87 ▴ 670

Hi,

I am making a table of per-family variant statistics. I have a multisample VCF file of biallelic SNPs, on which I ran BCFtools v1.6 with the following command:

bcftools stats -s - <multisample VCF file>

This gives me the following output (edited for brevity):

sample              nRefHom   nNonRefHom   nHets
family_1_sample1    191929    159          24424
family_1_sample2    185432    522          30505
family_2_sample1    186873    538          29132
family_2_sample2    189493    632          26333

So, when I report the number of variants per family, is this the right calculation?

Family 1:

no. of variants = nNonRefHom + nHets, summed over both samples, i.e. (159 + 522) + (24424 + 30505) = 55610

The same calculation would apply to family 2.
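
For reference, here is a minimal Python sketch of that per-family sum, using the four rows pasted above. The helper names (`family_of`, `per_family_nonref_genotypes`) and the assumption that the family is everything before the last underscore in the sample name are mine, not anything bcftools provides.

```python
from collections import defaultdict

# Per-sample counts copied from the table above:
# sample, nRefHom, nNonRefHom, nHets
PSC_ROWS = """\
family_1_sample1 191929 159 24424
family_1_sample2 185432 522 30505
family_2_sample1 186873 538 29132
family_2_sample2 189493 632 26333
"""


def family_of(sample):
    # Assumed naming convention: "family_<n>_sample<m>" -> "family_<n>"
    return sample.rsplit("_", 1)[0]


def per_family_nonref_genotypes(text):
    """Sum nNonRefHom + nHets over all samples of each family."""
    totals = defaultdict(int)
    for line in text.splitlines():
        if not line.strip():
            continue
        sample, _ref_hom, nonref_hom, hets = line.split()
        totals[family_of(sample)] += int(nonref_hom) + int(hets)
    return dict(totals)


totals = per_family_nonref_genotypes(PSC_ROWS)
print(totals)
```

Note that this adds up non-reference genotypes per sample, so a site where both samples of a family carry the alternate allele is counted twice. The result is therefore not necessarily the number of distinct variant sites in the family.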

Second question. The same report also contains this summary (SN) section:

SN  id  key value
SN  0   number of samples:  2
SN  0   number of records:  216734
SN  0   number of no-ALTs:  0
SN  0   number of SNPs: 216734

When should we report the number of records as opposed to the number of SNPs? Do we report these when we are not interested in sample-specific information, for example the number of SNPs discovered by a multisample variant-calling pipeline?
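
To make the comparison concrete, here is a small Python sketch that reads the SN lines pasted above into a dictionary and checks whether the two counts agree. The `parse_sn` helper is hypothetical and only assumes the `SN <id> <key>: <value>` layout shown in this post. As far as I understand, "number of records" counts rows in the VCF while "number of SNPs" counts rows that contain a SNP, so the two should only coincide when every row is a SNP, as is the case here.

```python
import re

# SN lines copied from the report excerpt above.
SN_TEXT = """\
SN  0   number of samples:  2
SN  0   number of records:  216734
SN  0   number of no-ALTs:  0
SN  0   number of SNPs: 216734
"""


def parse_sn(text):
    """Map each SN key (without the trailing colon) to its integer value."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"SN\s+\d+\s+(.+?):\s+(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


sn = parse_sn(SN_TEXT)
# True when every record in the file was counted as a SNP.
all_records_are_snps = sn["number of records"] == sn["number of SNPs"]
print(sn, all_records_are_snps)
```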
