9.3 years ago
james.blackshaw
I have been using bcftools stats, but I'm uncertain about what several fields in the output mean. The documentation is good for what the command line options do, but has no breakdown of what the output means or how it is calculated.
This is part of the output from bcftools stats on my file:
# SN, Summary numbers:
# SN [2]id [3]key [4]value
SN 0 number of samples: 4301
SN 0 number of records: 803
SN 0 number of SNPs: 714
SN 0 number of MNPs: 0
SN 0 number of indels: 94
SN 0 number of others: 7
SN 0 number of multiallelic sites: 33
SN 0 number of multiallelic SNP sites: 2
Things I can't find in the documentation:
- Does "multiallelic" denote "more than 2 alleles" rather than "not monomorphic"?
- The number of SNPs + indels + others does not sum to the total number of records. Is this because an SNP can also be an indel or "other"?
- What types of variants are covered by "others" here?
- Why is the [2]id field blank for all sections of my output?
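The mismatch in the second question can be checked directly from the SN lines. This is only a minimal Python sketch, assuming the lines are pasted in as text exactly as quoted above; the names `SN_TEXT`, `SN_LINE` and `parse_sn` are made up for illustration and are not part of bcftools.

```python
import re

# The SN lines as quoted in the question (assumption: pasted in as plain text).
SN_TEXT = """\
SN 0 number of samples: 4301
SN 0 number of records: 803
SN 0 number of SNPs: 714
SN 0 number of MNPs: 0
SN 0 number of indels: 94
SN 0 number of others: 7
SN 0 number of multiallelic sites: 33
SN 0 number of multiallelic SNP sites: 2
"""

# Matches "SN <id> <key>: <value>", with spaces or tabs as separators.
SN_LINE = re.compile(r"^SN\s+(\S+)\s+(.+?):\s+(\d+)\s*$")


def parse_sn(text):
    """Return {key: integer value} for every SN line found in the text."""
    counts = {}
    for line in text.splitlines():
        match = SN_LINE.match(line)
        if match:
            _id, key, value = match.groups()
            counts[key] = int(value)
    return counts


counts = parse_sn(SN_TEXT)

# Add up the per-type counts and compare them with the record count.
by_type = sum(
    counts[f"number of {kind}"] for kind in ("SNPs", "MNPs", "indels", "others")
)
records = counts["number of records"]

print(f"sum of per-type counts: {by_type}")
print(f"number of records:      {records}")
print(f"difference:             {by_type - records}")
```

With the numbers above, the per-type counts add up to 815 against 803 records, a difference of 12.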
Thanks,
James
To anyone looking for the answer to these questions, it's here:
https://github.com/samtools/bcftools/issues/316
The poster also posted the same question on the tool's GitHub page, where the contributors responded.