I aligned my paired-end WES data and called variants against both GRCh38 and T2T-CHM13, following the GATK Best Practices for germline variant calling. Comparing variant counts per gene with a paired t-test, I observed significant differences in some genes: a few genes have additional variants in GRCh38, and a few have additional variants in T2T-CHM13.
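For reference, the per-gene comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the gene names and counts are invented, and `paired_t` is a hypothetical helper, not part of GATK or any other tool. It returns the t statistic and degrees of freedom; turning that into a p-value would need a t-distribution, for example from a statistics package.

```python
import math
import statistics


def paired_t(counts_a, counts_b):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    counts_a[i] and counts_b[i] must refer to the same gene.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both lists must cover the same genes")
    diffs = [a - b for a, b in zip(counts_a, counts_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two genes")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up per-gene variant counts, for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
grch38_counts = [10, 12, 9, 15]
t2t_counts = [11, 12, 7, 18]

t_stat, dof = paired_t(grch38_counts, t2t_counts)

# Per-gene differences show which genes drive any overall shift.
per_gene_diff = {g: a - b for g, a, b in zip(genes, grch38_counts, t2t_counts)}
```

Note that a paired t-test of this kind yields one statistic across all genes; the per-gene differences are what point to the individual genes that gain or lose calls.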
To compare the variants called, I lifted over the T2T-CHM13 variants to GRCh38 and ran bcftools isec to identify common and unique variants between the two datasets.
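The comparison step amounts to set operations on variant records once both call sets share GRCh38 coordinates. Below is a minimal sketch of that idea, not a replacement for bcftools isec: it assumes plain-text VCF lines, matches on exact chromosome, position, REF and ALT, and assumes both call sets were normalized the same way beforehand. The `Variant` type, the function names and the example records are all hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chrom, pos, ref, alt) keys from VCF text lines.

    Header lines are skipped and multi-allelic records are split
    into one key per ALT allele.
    """
    variants = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        for alt in alts.split(","):
            variants.add(Variant(chrom, pos, ref, alt))
    return variants


def compare_callsets(grch38: Set[Variant], lifted_t2t: Set[Variant]) -> dict:
    """Split two call sets into shared and reference-specific variants."""
    return {
        "shared": grch38 & lifted_t2t,
        "grch38_only": grch38 - lifted_t2t,
        "t2t_only": lifted_t2t - grch38,
    }


# Invented records, for illustration only.
grch38_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tC\tT,G\t60\tPASS\t.",
]
lifted_t2t_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t48\tPASS\t.",
    "chr2\t500\t.\tG\tA\t70\tPASS\t.",
]

result = compare_callsets(
    parse_vcf_lines(grch38_vcf), parse_vcf_lines(lifted_t2t_vcf)
)
```

Exact matching like this is strict: the same indel written with a different representation would count as unique to one side, which is why consistent normalization of both files matters before comparing.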
Since the downstream tools that I am using only work with GRCh38, is it a good idea to add the unique T2T-CHM13 variants (lifted over to GRCh38) to get a combined list of variants for downstream analysis? I haven't come across any papers doing this, so any advice is most welcome!
T2T adds almost 200 million base pairs of new sequence, which is essentially the size of an additional chromosome. Hence, I would say you cannot directly compare the variant calls: the novel sequence can cause reads to align differently in some regions (or genes), and so the variant calling at those loci might be affected. Use T2T, it is currently the most comprehensive human reference assembly. What is the point of using two references anyway?
I would be alarmed if new stuff was showing up in exomes. It is an interesting idea, though.