I am trying to call variants on RNA-seq tumour-normal pairs with Mutect2.
Reads were aligned with HISAT2 and the alignment rate is 97-98%.
I ran Mutect2 and it gave me about 100 SNPs and 100 indels (across the whole exome). I validated these against known variants from previous targeted sequencing on the same subject.
I then went back and did some preprocessing: marking duplicates, SplitNCigarReads and BQSR.
The SNP count is about the same (slightly lower, which is probably just improved specificity), but I now have 5000 indels.
I have a feeling that SplitNCigarReads should not be used on HISAT2-aligned reads and that this is the source of these 'indels'.
Has anyone else seen something similar? Can I call variants on HISAT2-aligned reads and just skip the SplitNCigarReads step?
As far as I understand it, SplitNCigarReads was designed specifically to make data aligned with a splice-aware aligner compatible with the GATK pipeline. It's clearly helpful for RNA-seq because it takes a spliced read and splits it at each splice junction into two or more non-spliced reads.
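To make the splitting idea concrete, here is a rough Python sketch of cutting a spliced alignment at every N operation in its CIGAR string. This is only an illustration and not GATK's code: the function name `split_at_n` is made up, and as far as I know the real tool also hard-clips the overhanging bases and adjusts mapping qualities, which this sketch ignores.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def split_at_n(pos, cigar):
    """Split a spliced alignment into (start, cigar) segments at each N.

    pos is the 1-based leftmost reference position of the read.
    Hypothetical helper for illustration only.
    """
    segments = []
    current = []        # CIGAR operations of the segment being built
    seg_start = pos     # reference start of the segment being built
    ref_pos = pos       # running reference position
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Close the segment before the intron and jump across it.
            if current:
                segments.append((seg_start, "".join(current)))
            current = []
            ref_pos += length
            seg_start = ref_pos
        else:
            current.append(f"{length}{op}")
            if op in REF_CONSUMING:
                ref_pos += length
    if current:
        segments.append((seg_start, "".join(current)))
    return segments


if __name__ == "__main__":
    # A 100 bp read spanning a 1000 bp intron becomes two 50 bp segments.
    print(split_at_n(100, "50M1000N50M"))
```

A read with one junction yields two segments, and a read with two junctions yields three. Note that genuine insertions and deletions (I and D operations) inside a segment are left untouched.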
Thanks. After some discussion, I think the best thing to do is just focus on SNVs.
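For anyone wanting to do the same, restricting a call set to SNVs can be sketched in a few lines of plain Python. This assumes an uncompressed, tab-delimited VCF with REF in column 4 and ALT in column 5; the helper names `is_snv` and `keep_snvs` are made up for this example, and in practice an existing tool such as GATK SelectVariants or bcftools is probably the safer choice.

```python
def is_snv(ref, alt):
    """True if REF is one base and every ALT allele is a single A/C/G/T base."""
    alleles = alt.split(",")
    return len(ref) == 1 and all(
        len(a) == 1 and a.upper() in "ACGT" for a in alleles
    )


def keep_snvs(vcf_lines):
    """Yield header lines unchanged and only those records that are SNVs."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and is_snv(fields[3], fields[4]):
            yield line


if __name__ == "__main__":
    # Toy records, invented purely to exercise the filter.
    example = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "chr1\t100\t.\tA\tG\t.\tPASS\t.\n",    # SNV, kept
        "chr1\t200\t.\tAT\tA\t.\tPASS\t.\n",   # deletion, dropped
        "chr1\t300\t.\tC\tCTT\t.\tPASS\t.\n",  # insertion, dropped
    ]
    for record in keep_snvs(example):
        print(record, end="")
```

Multi-allelic sites are kept only if every ALT allele is a single base, so a site with one SNV allele and one indel allele is dropped. That is a deliberately conservative choice for this sketch.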