6.8 years ago
oars
The following is a neat feature found in bcftools...
bcftools stats file.vcf > file.stats
...however, it doesn't seem to differentiate between insertions and deletions; it only reports a combined indel count.
Here is an example of the output:
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 1761
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 1663
SN 0 number of MNPs: 0
SN 0 number of indels: 98
SN 0 number of others: 0
SN 0 number of multiallelic sites: 2
SN 0 number of multiallelic SNP sites: 0
# TSTV, transitions/transversions:
# TSTV [2]id [3]ts [4]tv [5]ts/tv [6]ts (1st ALT) [7]tv (1st ALT) [8]ts/tv (1st ALT)
TSTV 0 1267 396 3.20 1267 396 3.20
Is there a way to separate the insertions and deletions using bcftools?
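One possible route, as a minimal sketch: the full `bcftools stats` report normally also contains an `IDD` (indel distribution) section, where column 3 is the indel length (negative for deletions) and column 4 is a count. Assuming that layout holds for your bcftools version (check the `# IDD` header line in your own file), the rows can be summed by sign. The function name `split_indels` is made up for illustration.

```shell
# Sum the IDD (indel distribution) rows of a `bcftools stats` report.
# Assumption: column 3 = indel length (deletions negative), column 4 = count,
# as described by the "# IDD" header line in the report.
split_indels() {
    awk '
        $1 == "IDD" && $3 > 0 { ins += $4 }   # positive length: insertion
        $1 == "IDD" && $3 < 0 { del += $4 }   # negative length: deletion
        END { printf "insertions\t%d\ndeletions\t%d\n", ins, del }
    '
}

# Usage, with the report produced by the command in the question:
#   split_indels < file.stats
```

Note that the two totals may not add up to the "number of indels" line if the IDD rows count alleles rather than sites, for example at multiallelic sites.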
Pie in the sky would be a stats read out option that would also provide information about heterozygous genotypes and dbSNP sites.
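For the heterozygous and dbSNP part, a rough sketch with plain awk on the VCF body might look like the following. It assumes an uncompressed VCF on stdin, looks only at the first sample column, treats a diploid genotype with two different non-missing alleles as heterozygous, and counts a record as a "dbSNP site" only if its ID column carries an rs number (which is only true if the file was annotated). The function name `het_and_dbsnp` is hypothetical.

```shell
# Count heterozygous genotypes (first sample only) and records with an rs ID.
# Assumptions: uncompressed VCF on stdin; GT is the first FORMAT key;
# rs IDs in column 3 come from a prior dbSNP annotation step.
het_and_dbsnp() {
    awk -F'\t' '
        /^#/ { next }                          # skip meta and header lines
        $3 ~ /(^|;)rs[0-9]+/ { rs++ }          # ID column holds an rs number
        NF >= 10 && $9 ~ /^GT/ {
            split($10, f, ":")                 # f[1] is the GT value
            n = split(f[1], a, "[/|]")         # unphased (/) or phased (|)
            if (n == 2 && a[1] != "." && a[2] != "." && a[1] != a[2]) het++
        }
        END { printf "het_genotypes\t%d\ndbsnp_ids\t%d\n", het, rs }
    '
}

# Usage:
#   het_and_dbsnp < file.vcf
```

For a compressed file, the input could be streamed through `bcftools view file.vcf.gz` first.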
Is there an SVTYPE tag in the INFO column?
I don't think so, I see the following...
I've also tried vcftools with its vcf-stats feature:
This also provides an indel count but does not separate insertions and deletions. It does, however, print a confusing list after the indel count. I'm not sure what it represents, and it's not clear from the manual pages.
However, this simple command-line vcftools script from matt (Count Of Variants) seems to do the trick: