Question

Best Practices for Tumor-Only WES Data Analysis: Seeking Feedback on Variant Calling Pipeline

9 months ago

George ▴ 20

Hello everyone,

I’m currently building a pipeline for analyzing tumor-only WES data, which I understand is challenging and not well covered by existing resources. I’d greatly appreciate any feedback or suggestions on my current approach.

Here's what I’m doing so far:

Somatic Variant Calling:
- I use GATK’s Mutect2 to call somatic variants, supplying a germline resource and a panel of normals (PoN). Since my samples are FFPE, I also collect the F1R2 counts (--f1r2-tar-gz) during this step.
FFPE Artifacts & Contamination Handling:
- I run LearnReadOrientationModel on the F1R2 counts to model read-orientation (FFPE) artifacts.
- I use GetPileupSummaries and CalculateContamination to estimate contamination.
- Then, I run FilterMutectCalls with both --contamination-table and --tumor-segmentation to apply the appropriate filters.
Variant Filtering:
- I use SelectVariants to retain only the variants that pass the filters.
- Next, I filter out variants that are common across several germline databases to reduce the likelihood of retaining germline polymorphisms.
Functional Annotation:
- Finally, I focus on functional filtering by retaining only variants that are confirmed in either COSMIC or OncoKB.
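
For concreteness, the GATK part of the steps above can be sketched as a chain of commands. This is a minimal sketch, not a validated pipeline: all file names are placeholders, the `common_sites` input (a VCF of common biallelic sites for GetPileupSummaries) is an assumption, and passing the orientation model to FilterMutectCalls via `--ob-priors` is assumed here because the model has no effect on filtering otherwise.

```python
import shlex
from typing import Dict, List


def build_gatk_commands(p: Dict[str, str]) -> List[List[str]]:
    """Return the GATK command lines, in run order, for a tumor-only Mutect2 workflow.

    `p` maps hypothetical keys (ref, tumor_bam, germline_resource, pon,
    common_sites) to file paths. Intermediate file names are placeholders.
    """
    mutect2 = [
        "gatk", "Mutect2",
        "-R", p["ref"],
        "-I", p["tumor_bam"],
        "--germline-resource", p["germline_resource"],
        "--panel-of-normals", p["pon"],
        "--f1r2-tar-gz", "f1r2.tar.gz",  # raw counts for the orientation model
        "-O", "unfiltered.vcf.gz",
    ]
    orientation = [
        "gatk", "LearnReadOrientationModel",
        "-I", "f1r2.tar.gz",
        "-O", "read-orientation-model.tar.gz",
    ]
    pileups = [
        "gatk", "GetPileupSummaries",
        "-I", p["tumor_bam"],
        "-V", p["common_sites"],  # assumed: common biallelic germline sites
        "-L", p["common_sites"],
        "-O", "pileups.table",
    ]
    contamination = [
        "gatk", "CalculateContamination",
        "-I", "pileups.table",
        "--tumor-segmentation", "segments.table",
        "-O", "contamination.table",
    ]
    filtering = [
        "gatk", "FilterMutectCalls",
        "-R", p["ref"],
        "-V", "unfiltered.vcf.gz",
        "--contamination-table", "contamination.table",
        "--tumor-segmentation", "segments.table",
        "--ob-priors", "read-orientation-model.tar.gz",
        "-O", "filtered.vcf.gz",
    ]
    passing = [
        "gatk", "SelectVariants",
        "-V", "filtered.vcf.gz",
        "--exclude-filtered",  # keep only records that passed all filters
        "-O", "pass.vcf.gz",
    ]
    return [mutect2, orientation, pileups, contamination, filtering, passing]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    example = {
        "ref": "ref.fasta",
        "tumor_bam": "tumor.bam",
        "germline_resource": "germline_resource.vcf.gz",
        "pon": "pon.vcf.gz",
        "common_sites": "common_biallelic.vcf.gz",
    }
    for cmd in build_gatk_commands(example):
        print(shlex.join(cmd))
```

The function only builds and prints the command lines; running them (for example with `subprocess.run(cmd, check=True)`) is left out on purpose so the order and inputs of each step are easy to review first.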

Given that I only have tumor data without a matched normal, do you think this approach is robust and reliable for calling somatic variants? I'm particularly interested in suggestions for refining the contamination estimation or the filtering strategy, and in any best practices I might have missed.

Also, I want to call CNVs with CNVkit. Can I use the contamination estimate produced by the Mutect2 workflow (CalculateContamination) as the value for --purity in the call -m clonal --purity step?
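
To make the CNVkit step concrete, here is a small sketch of the command in question. The helper name and file names are hypothetical. One caveat worth stating: CNVkit documents `--purity` as the estimated tumor cell fraction, while CalculateContamination reports the fraction of reads from cross-sample contamination, so the two values measure different things and the purity passed below is assumed to come from a separate estimate.

```python
from typing import List


def cnvkit_call_command(cns_path: str, purity: float, out_path: str) -> List[str]:
    """Build a `cnvkit.py call` command using the clonal method.

    `purity` is the estimated tumor cell fraction in (0, 1]; it is NOT the
    cross-sample contamination fraction from CalculateContamination.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError(f"purity must be in (0, 1], got {purity}")
    return [
        "cnvkit.py", "call", cns_path,
        "-m", "clonal",
        "--purity", str(purity),
        "-o", out_path,
    ]


if __name__ == "__main__":
    # Placeholder file names and an illustrative purity value.
    print(" ".join(cnvkit_call_command("sample.cns", 0.7, "sample.call.cns")))
```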

Thanks in advance for any insights!

GATK Mutect WES Cancer • 391 views
