Hey all!
I ran snippy-core but I'm a little confused about the output. Among other files, I get a core.txt with the following content:
ID    LENGTH   ALIGNED  UNALIGNED  VARIANT  HET   MASKED  LOWCOV
DNA1  4641652  3935699  690053     100612   1777  0       14123
I tried looking for some documentation but I couldn't find any. My guess is that the column "VARIANT" tells me how many SNPs this particular isolate has compared with my reference, is that right? I can guess what LOWCOV means, but what do HET and MASKED represent?
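A quick sanity check on the row above can be sketched in Python. It only uses the numbers pasted from core.txt. The assumption (not taken from any snippy documentation) is that the columns after LENGTH are counts of reference positions. In this row, ALIGNED + UNALIGNED + HET + MASKED + LOWCOV adds up exactly to LENGTH, which hints that those five columns partition the reference and that VARIANT is counted separately, possibly within ALIGNED. The function names are made up for illustration.

```python
# Header and row copied from the core.txt shown above.
# Split on any whitespace, so it works for tab- or space-separated files.
CORE_TXT = """ID LENGTH ALIGNED UNALIGNED VARIANT HET MASKED LOWCOV
DNA1 4641652 3935699 690053 100612 1777 0 14123
"""

# Assumption: these five columns are mutually exclusive per-site categories.
PARTITION_COLUMNS = ("ALIGNED", "UNALIGNED", "HET", "MASKED", "LOWCOV")


def parse_core_txt(text):
    """Return one dict per sample, with every column except ID as an int."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    records = []
    for fields in rows:
        record = dict(zip(header, fields))
        records.append(
            {key: (value if key == "ID" else int(value)) for key, value in record.items()}
        )
    return records


def partition_total(record):
    """Sum the columns assumed to partition the reference length."""
    return sum(record[column] for column in PARTITION_COLUMNS)


records = parse_core_txt(CORE_TXT)
for record in records:
    total = partition_total(record)
    print(record["ID"], "sum of categories:", total, "matches LENGTH:", total == record["LENGTH"])
    print(record["ID"], "fraction of reference aligned:", round(record["ALIGNED"] / record["LENGTH"], 3))
```

If the sum matches LENGTH for every sample in a real core.txt, that supports reading HET and MASKED as numbers of sites rather than numbers of variants, but it does not by itself say what makes a site "het" or "masked".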
On the other hand, when I check core.vcf, I see a matrix with samples as columns and SNPs as rows, and the total number of rows is 91. There is a huge difference between that 91 and, for example, the 100612 reported for DNA1. Does this file represent the SNPs common to all the samples? If so, what do the values in the matrix mean? At first I thought it was a presence/absence matrix, but then I noticed there are values other than 0 and 1.
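One way to inspect such a matrix is sketched below. In the VCF specification, a genotype (GT) value is an allele index: 0 is REF, 1 is the first ALT allele, 2 is the second, and so on, which would explain values other than 0 and 1 at sites with more than one ALT allele. Whether core.vcf follows that convention exactly is an assumption here. The three-line VCF in the sketch is invented purely for illustration and is not real snippy output; `tally_genotypes` is a hypothetical helper name.

```python
from collections import Counter

# Invented example data (NOT real snippy output), tab-separated like a VCF.
EXAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tDNA1\tDNA2\tDNA3
ref\t1021\t.\tA\tG\t.\tPASS\t.\tGT\t1\t0\t1
ref\t5310\t.\tC\tT,G\t.\tPASS\t.\tGT\t1\t2\t0
ref\t9876\t.\tG\tA\t.\tPASS\t.\tGT\t0\t1\t1
"""


def tally_genotypes(vcf_text):
    """Count sites and, per sample, how often each decoded allele occurs.

    Assumes each sample field starts with a single allele index (haploid GT).
    """
    samples = []
    counts = {}
    n_sites = 0
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            counts = {sample: Counter() for sample in samples}
            continue
        n_sites += 1
        alleles = [fields[3]] + fields[4].split(",")  # index 0 = REF, 1.. = ALT
        for sample, value in zip(samples, fields[9:]):
            gt = value.split(":")[0]
            # Anything that is not a valid allele index (e.g. ".") is counted as "?".
            known = gt.isdigit() and int(gt) < len(alleles)
            counts[sample][alleles[int(gt)] if known else "?"] += 1
    return n_sites, counts


n_sites, counts = tally_genotypes(EXAMPLE_VCF)
print("sites:", n_sites)
for sample, counter in counts.items():
    print(sample, dict(counter))
```

To run it on a real file, the contents of core.vcf could be read into a string and passed in place of `EXAMPLE_VCF`.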
Sorry for the long question. I tried reading the docs and got nowhere. Any help would be really appreciated, even just a pointer to a thread or file to read.
Many many thanks!