Hello,
Has anyone worked on mtDNA NGS analysis? I'm developing an in-house pipeline using the rCRS reference. The pipeline is as follows:
1) BWA map to reference; Picard/Sambamba sort, index, and mark duplicates
2) GATK base recalibration
3) Samtools mpileup followed by VarScan for variant calling (--strand-filter 0 --min-var-freq 0.01 --min-avg-qual 20 --min-coverage 50)
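For anyone who wants to reproduce step 3, here is a minimal Python sketch of how the mpileup and VarScan calls could be chained. It assumes the options listed in step 3 correspond to VarScan's `--strand-filter`, `--min-var-freq`, `--min-avg-qual` and `--min-coverage` flags of `mpileup2snp`, and that VarScan reads the pileup from standard input when no input file is given. The function names, the `VarScan.jar` path and the file names are hypothetical, not part of the original pipeline.

```python
import subprocess


def mpileup_command(reference_fasta, bam_path):
    """Argument list for `samtools mpileup` against the reference FASTA."""
    return ["samtools", "mpileup", "-f", reference_fasta, bam_path]


def varscan_snp_command(jar_path="VarScan.jar"):
    """Argument list for VarScan mpileup2snp with the thresholds from step 3."""
    return [
        "java", "-jar", jar_path, "mpileup2snp",
        "--min-coverage", "50",
        "--min-avg-qual", "20",
        "--min-var-freq", "0.01",
        "--strand-filter", "0",
    ]


def call_snps(reference_fasta, bam_path, out_path, jar_path="VarScan.jar"):
    """Pipe samtools mpileup into VarScan and write the calls to out_path."""
    pileup = subprocess.Popen(
        mpileup_command(reference_fasta, bam_path), stdout=subprocess.PIPE
    )
    with open(out_path, "w") as out:
        varscan = subprocess.run(
            varscan_snp_command(jar_path), stdin=pileup.stdout, stdout=out
        )
    pileup.stdout.close()
    # Fail loudly if either tool exited with an error.
    if pileup.wait() != 0 or varscan.returncode != 0:
        raise RuntimeError("samtools mpileup or VarScan failed")
```

Calling `call_snps("rCRS.fasta", "sample.recal.bam", "sample.snp.txt")` would run both tools, provided samtools and Java are on the PATH; the three file names here are placeholders.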
We know that the rCRS reference has an artefact at position 3107, which should be detected after alignment and should be called homozygous with a high heteroplasmy level. With the above steps, however, 3107 is not detected in all samples, and when it is, it is classified as heterozygous with a low heteroplasmy level (less than 15%).
Has anyone encountered such issues, or can anyone suggest ways to improve the pipeline?
I know this artefact is usually excluded from the reference, but in this particular case we do not exclude it; we classify it as an artefact instead.
Thank you
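To make the heterozygous/homozygous labels for a site like 3107 concrete, here is a small Python sketch of how a call could be labelled from its read counts. It assumes the variant frequency is the variant-supporting reads divided by reference plus variant reads, and that the homozygous cut-off is 0.75, which is VarScan's documented default for `--min-freq-for-hom`. The function name and the read counts used with it are made up for illustration.

```python
def classify_call(ref_reads, var_reads, min_var_freq=0.01, min_freq_for_hom=0.75):
    """Return (variant frequency, label) for one site.

    min_var_freq mirrors the 0.01 threshold used in the pipeline;
    min_freq_for_hom is assumed to be VarScan's default of 0.75.
    """
    depth = ref_reads + var_reads
    if depth == 0:
        return 0.0, "no coverage"
    freq = var_reads / depth
    if freq < min_var_freq:
        return freq, "not called"
    if freq >= min_freq_for_hom:
        return freq, "homozygous"
    return freq, "heterozygous"
```

Under these assumptions, any variant whose frequency stays below 0.75 is labelled heterozygous, so a site reported at under 15% would get that label regardless of what is expected biologically.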
Hey Nandini, I used mpileup + VarScan as you suggested. I still have some questions/problems I want to ask you about. Did you use the varscan mpileup2cns command? I used it and found that it only calls one variant frequency per position, even when there is more than one variant at that site. Check here for a detailed description. I also tried HaplotypeCaller but still have some problems. Have you thought about treating mtDNA as tumor and using VarScan's tumor-normal comparison, with rCRS set as the normal tissue data, or using the Mutect tool from GATK? I am thinking about this because mtDNA and tumors both show heterogeneity. However, tumors have other features that differ from mtDNA, such as more frequent structural variants. Thanks.

Hi MatthewP, I use varscan mpileup2snp and mpileup2indel and then combine the two result files, followed by annotation. If you are using GATK, then I THINK you can use GATK's Mutect rather than HaplotypeCaller (though I do not have much experience with GATK and mtDNA analysis).

So can you get one frequency value per variant instead of per site (POS)? I really need a frequency for each variant (clients need mtDNA heterogeneity information). I have already found that HaplotypeCaller is not suited to mtDNA data (because it is amplicon-based and haploid).
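On the per-variant versus per-site question, here is a minimal Python sketch of what one frequency per variant would look like if you had the per-base read counts at a position. It is not what VarScan or GATK do internally; the function name, the input dictionary and the example counts are hypothetical, and the frequency is simply each non-reference allele's reads divided by the total depth at that site.

```python
def per_variant_frequencies(ref_base, base_counts):
    """Return {alt_allele: frequency} for every non-reference allele at one site.

    base_counts maps each observed allele to its supporting read count,
    e.g. {"A": 700, "G": 200, "T": 100}.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        return {}
    return {
        allele: count / depth
        for allele, count in base_counts.items()
        if allele != ref_base and count > 0
    }


# Hypothetical site with two different variants on a reference "A".
site = per_variant_frequencies("A", {"A": 700, "G": 200, "T": 100})
for allele, freq in sorted(site.items()):
    print(f"A>{allele}: {freq:.1%}")
```

With counts like these, both G and T get their own frequency instead of a single value for the position, which is the kind of per-variant heteroplasmy table being asked for.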