GetPileupSummaries for each chromosome
4.3 years ago

Hello everyone

I am using GATK Mutect2 to identify somatic SNPs, following the procedure given at https://gatkforums.broadinstitute.org/gatk/discussion/24057/how-to-call-somatic-mutations-using-gatk4-mutect2#latest . One of the steps is to run GetPileupSummaries to summarize read support for a set of known variant sites, using the following command:

gatk GetPileupSummaries -I tumor.bam -V af-only-gnomad.hg38.vcf.gz -L af-only-gnomad.hg38.vcf.gz -O getpileupsummaries.table

However, this step failed for me with a memory error because af-only-gnomad.hg38.vcf.gz is so large.

Therefore, I split af-only-gnomad.hg38.vcf.gz by chromosome and generated a separate pileup summary table for each one, e.g. chr1_getpileupsummaries.table, chr2_getpileupsummaries.table. At the end, I merged all of the tables into a single "final_getpileupsummaries.table". To rule out any error in this approach, I would like to know whether this strategy is OK. I would appreciate any suggestions.
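To make the strategy concrete, here is a rough sketch of the per-chromosome run and the merge. It is only an illustration under some assumptions: the per-chromosome VCF names (chr1_af-only-gnomad.hg38.vcf.gz and so on) and the helper names run_per_chromosome and merge_pileup_tables are hypothetical, and the merge assumes each table starts with optional "#" metadata lines followed by one column-header line whose first field is "contig".

```shell
# Run GetPileupSummaries once per chromosome.
# Assumption: the gnomAD VCF was already split into files named
# <chr>_af-only-gnomad.hg38.vcf.gz (hypothetical names), each indexed.
run_per_chromosome() {
    for n in $(seq 1 22) X Y; do
        chr="chr$n"
        gatk GetPileupSummaries \
            -I tumor.bam \
            -V "${chr}_af-only-gnomad.hg38.vcf.gz" \
            -L "${chr}_af-only-gnomad.hg38.vcf.gz" \
            -O "${chr}_getpileupsummaries.table"
    done
}

# Merge per-chromosome tables: keep the metadata and column-header lines
# from the first table only, then append the data rows of every table
# in the order the files are given.
# Usage: merge_pileup_tables OUTPUT TABLE1 [TABLE2 ...]
merge_pileup_tables() {
    merged_out="$1"
    shift
    awk 'FNR == 1 { file_index++ }
         /^#/ || $1 == "contig" { if (file_index == 1) print; next }
         { print }' "$@" > "$merged_out"
}
```

For example, `merge_pileup_tables final_getpileupsummaries.table chr1_getpileupsummaries.table chr2_getpileupsummaries.table` would write one table with a single header. The tables should be passed in the same contig order as the reference so that the merged rows stay sorted.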

GATK • 1.2k views