Hi Folks,
I am carrying out exome-seq analysis for a trio (son is affected, father is affected, mother is unaffected). I did the following steps:
Trimmed using Trimmomatic
Aligned using BWA mem, Sorted, Marked Duplicates and Recalibrated.
For all 3 samples, the coverage of target bases is 92% at 20X, 86% at 30X, 80% at 40X, and 72% at 50X. (Is this coverage good enough for exome sequencing?)
Variant calling using GATK UnifiedGenotyper (Is UnifiedGenotyper good for trio analysis?)
Got raw variants (900,000), Filtered variants, Annotated with SnpEff
Selected PASS variants (750,000), Selected Exonic Variants (12,000)
Then I checked in disease-related genes (Got 15 variants)
Finally, I selected the variants shared by the affected son and the affected father (3000 heterozygous and 2800 homozygous-alternate variants)
Planning to check them in dbSNP.
What are other general ways to reduce the number of variants?
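One further way to narrow the list is to use the unaffected mother as well, not only the overlap between son and father. The sketch below assumes a dominant, fully penetrant model, which is not stated above and may not hold for this family. The function names are hypothetical and the genotypes are plain VCF GT strings.

```python
def alleles(gt):
    """Split a VCF GT string such as '0/1' or '1|1' into its allele codes."""
    return gt.replace("|", "/").split("/")


def is_carrier(gt):
    """True if the genotype contains at least one non-reference allele."""
    return any(a not in ("0", ".") for a in alleles(gt))


def is_hom_ref(gt):
    """True only for a fully called homozygous-reference genotype."""
    return all(a == "0" for a in alleles(gt))


def segregates_dominant(son_gt, father_gt, mother_gt):
    """Keep a variant carried by both affected males and absent in the mother.

    Assumption: dominant inheritance with full penetrance. A missing
    maternal genotype ('./.') is treated as uninformative and rejected.
    """
    return is_carrier(son_gt) and is_carrier(father_gt) and is_hom_ref(mother_gt)


# Hypothetical genotypes for four sites: (son, father, mother)
sites = [("0/1", "0/1", "0/0"), ("0/1", "1/1", "0/1"),
         ("0/1", "0/0", "0/0"), ("1/1", "0/1", "./.")]
kept = [s for s in sites if segregates_dominant(*s)]
```

Only the first site survives here. Rejecting sites where the mother has no call is a conservative choice; with uneven coverage it may be worth reviewing those by hand instead.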
Thanks, WouterDeCoster.
Yeah, I am also going to run HaplotypeCaller (with GVCFs) individually for the 3 samples and then run joint genotyping. I thought of taking the union of the calls from UnifiedGenotyper and HaplotypeCaller, since I thought those calls might be true positives. Does that make sense or not?
I am working on hypercholesterolemia. Sure, I will compare with ExAC. Can the ExAC database be used to filter out variants with MAF > 1% as frequent variants?
Don't you mean the intersection rather than the union, to increase true positives?
Anyway, HaplotypeCaller is supposed to be superior, so I'm not sure combining both would add value. You could even lose variants which were found by HaplotypeCaller but missed by UnifiedGenotyper.
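To make the difference concrete, here is a minimal sketch that compares two call sets as Python sets keyed by (chrom, pos, ref, alt). The VCF lines are made-up examples and `variant_keys` is a hypothetical helper. It also assumes both callers represent each variant identically, which is often not true for indels unless the files are normalized first.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF data lines.

    Header lines are skipped and multi-allelic records are split
    into one key per alternate allele.
    """
    keys = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


# Made-up records standing in for the two callers' output
ug_vcf = [
    "##fileformat=VCFv4.1",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tC\tT\t60\tPASS\t.",
]
hc_vcf = [
    "chr1\t2000\t.\tC\tT\t80\tPASS\t.",
    "chr2\t3000\t.\tG\tA,C\t90\tPASS\t.",
]

ug = variant_keys(ug_vcf)
hc = variant_keys(hc_vcf)

both = ug & hc     # intersection: called by both tools
either = ug | hc   # union: called by at least one tool
hc_only = hc - ug  # what an intersection would throw away from HaplotypeCaller
```

The intersection keeps only calls both tools agree on, while `hc_only` shows the HaplotypeCaller calls that would be lost by requiring agreement.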
As far as I know, that's not exactly a rare condition, right?
You can download summary data from ExAC and use the VCF to annotate your variants for filtering of frequent variants.
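Once the variants carry a population frequency, the 1% cutoff can be applied roughly as in the sketch below. It assumes the annotation step copied the frequency into an `AF` key in the INFO column. That key name, the helper names, and the example INFO strings are assumptions, so check what your annotated file actually contains.

```python
def info_af(info, key="AF"):
    """Return the allele frequencies stored under `key` in a VCF INFO string."""
    for field in info.split(";"):
        if field.startswith(key + "="):
            values = field.split("=", 1)[1].split(",")
            return [float(v) for v in values if v not in (".", "")]
    return []  # key absent: the variant was not found in the database


def is_rare(info, max_af=0.01):
    """True if every listed frequency is at or below the cutoff.

    Variants without a frequency are kept, since absence from the
    database suggests they are rare. For multi-allelic sites this is a
    simplification: a stricter version would look up only the
    frequency of the allele actually carried.
    """
    return all(af <= max_af for af in info_af(info))


# Made-up INFO strings for four annotated variants
infos = ["AC=5;AN=1000;AF=0.005", "AF=0.2", "DP=30", "AF=0.001,0.05"]
rare = [i for i in infos if is_rare(i)]
```

For a condition that is itself fairly common, 1% may be too strict or too lenient depending on the inheritance model, so `max_af` is left as a parameter.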