Hello,
I have numerous samples which were both sequenced and genotyped, resulting in two VCF files with the same samples in each. I am hoping to calculate the percentage concordance between the files on a sample-by-sample basis. I have used the SnpEff/SnpSift concordance command, but I do not understand the column headings of the output file. Any help in understanding the meaning of these headings, and/or advice on which columns to use when calculating concordance, would be greatly appreciated.
The output column headings are as follows:
MISSING_ENTRY_array/MISSING_ENTRY_WGS
MISSING_ENTRY_array/MISSING_GT_WGS
MISSING_ENTRY_array/REF
MISSING_ENTRY_array/ALT_1
MISSING_ENTRY_array/ALT_2
MISSING_GT_array/MISSING_ENTRY_WGS
MISSING_GT_array/MISSING_GT_WGS
MISSING_GT_array/REF
MISSING_GT_array/ALT_1
MISSING_GT_array/ALT_2
REF/MISSING_ENTRY_WGS
REF/MISSING_GT_WGS
REF/REF
REF/ALT_1
REF/ALT_2
ALT_1/MISSING_ENTRY_WGS
ALT_1/MISSING_GT_WGS
ALT_1/REF
ALT_1/ALT_1
ALT_1/ALT_2
ALT_2/MISSING_ENTRY_WGS
ALT_2/MISSING_GT_WGS
ALT_2/REF
ALT_2/ALT_1
ALT_2/ALT_2
ERROR
Hi,
Can you please tell me whether you figured out what these columns represent and how to interpret the result?
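
Not a definitive answer, but here is one way the headings could be read and turned into a per-sample percentage. As far as I understand it, each heading is "state in the first file / state in the second file", where REF means homozygous reference, ALT_1 means one ALT allele (heterozygous) and ALT_2 means two ALT alleles (homozygous ALT). MISSING_ENTRY would then mean the variant is absent from that file, and MISSING_GT that the variant is present but the genotype is not called. Please check this reading against the SnpSift documentation before relying on it. Under that assumption, concordance is the sum of REF/REF, ALT_1/ALT_1 and ALT_2/ALT_2 divided by the sum of all columns where both sides are called. The sketch below assumes a tab-separated per-sample table with a first column called `sample`; that column name and the counts in the example are made up for illustration.

```python
import csv
import io

# States that count as a called genotype (assumed reading of the headings).
CALLED = {"REF", "ALT_1", "ALT_2"}

def concordance(counts):
    """Return (matches, comparable, fraction) for one sample's row of counts."""
    matches = 0
    comparable = 0
    for column, value in counts.items():
        left, sep, right = column.partition("/")
        # Skip the sample name, ERROR, and any MISSING_* combination.
        if not sep or left not in CALLED or right not in CALLED:
            continue
        n = int(value)
        comparable += n
        if left == right:
            matches += n
    fraction = matches / comparable if comparable else float("nan")
    return matches, comparable, fraction

# Hypothetical one-sample table; real output has all of the columns listed above.
header = ["sample", "MISSING_GT_array/REF", "REF/REF", "REF/ALT_1", "ALT_1/REF",
          "ALT_1/ALT_1", "ALT_1/ALT_2", "ALT_2/ALT_1", "ALT_2/ALT_2", "ERROR"]
row = ["S1", "50", "900", "5", "3", "80", "2", "1", "9", "0"]
example_tsv = "\t".join(header) + "\n" + "\t".join(row) + "\n"

results = {}
for record in csv.DictReader(io.StringIO(example_tsv), delimiter="\t"):
    results[record["sample"]] = concordance(record)

for name, (matches, comparable, fraction) in results.items():
    print(f"{name}: {matches}/{comparable} = {100 * fraction:.2f}% concordant")
```

Whether the MISSING_* columns should be left out of the denominator depends on the question being asked: excluding them measures agreement only where both platforms made a call, while including them would also penalise missing data.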