Hello everyone
I am using GATK Mutect2 to identify somatic SNPs, following the procedure given at https://gatkforums.broadinstitute.org/gatk/discussion/24057/how-to-call-somatic-mutations-using-gatk4-mutect2#latest . One of the steps is to run GetPileupSummaries to summarize read support for a set of known variant sites, using the following command:
gatk GetPileupSummaries -I tumor.bam -V af-only-gnomad.hg38.vcf.gz -L af-only-gnomad.hg38.vcf.gz -O getpileupsummaries.table
However, this step did not work for me: it ran into a memory issue because af-only-gnomad.hg38.vcf.gz is so large.
Therefore, I split af-only-gnomad.hg38.vcf.gz by chromosome and generated a separate table for each chromosome (chr1_getpileupsummaries.table, chr2_getpileupsummaries.table, and so on). At the end, I merged all of the tables into a single "final_getpileupsummaries.table". To avoid introducing an error into the workflow, I would like to know whether this strategy is OK. I would appreciate any suggestions.
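
To make the strategy concrete, here is a minimal sketch of the per-chromosome run and the merge. Several things in it are assumptions rather than anything confirmed above: the per-chromosome VCF name (af-only-gnomad.hg38.chrN.vcf.gz) is hypothetical, each split VCF is assumed to be indexed, and the merge assumes each table starts with "#" metadata lines followed by one column-header row whose first field is "contig". The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: file names and function names are hypothetical.

# Run GetPileupSummaries for one chromosome, using a per-chromosome
# subset of the gnomAD VCF as both the variant (-V) and interval (-L) input.
pileup_for_contig() {
    local contig="$1"
    local vcf="af-only-gnomad.hg38.${contig}.vcf.gz"   # assumed naming scheme
    gatk GetPileupSummaries \
        -I tumor.bam \
        -V "$vcf" \
        -L "$vcf" \
        -O "${contig}_getpileupsummaries.table"
}

# Merge per-chromosome tables into one file.
# The first table is copied whole (metadata + column header + rows);
# for the others, '#' metadata lines and the 'contig ...' header row are skipped.
merge_pileup_tables() {
    local out="$1"
    shift
    local first=1
    local t
    : > "$out"                      # create or truncate the output file
    for t in "$@"; do
        if [ "$first" -eq 1 ]; then
            cat "$t" >> "$out"
            first=0
        else
            awk '!/^#/ && $1 != "contig"' "$t" >> "$out"
        fi
    done
}

# Example usage (not executed here; needs gatk, tumor.bam and the split VCFs):
#   for c in chr1 chr2 chr3; do pileup_for_contig "$c"; done
#   merge_pileup_tables final_getpileupsummaries.table \
#       chr1_getpileupsummaries.table chr2_getpileupsummaries.table \
#       chr3_getpileupsummaries.table
```

The tables should be passed to the merge in reference order so the final file stays sorted by contig. Recent GATK4 releases also include a GatherPileupSummaries tool that appears intended for this kind of merge; it may be worth checking its documentation before relying on a manual concatenation.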