Hi all!
Where can I find the exact steps used in the GATK Structural Variants pipeline?
I understand that there are various pipelines. My goal is to compare the GATK Somatic Short Variant pipeline (SNPs and short indels) with the GATK Somatic Structural Variants pipeline. In particular, I am looking for a schematic that illustrates each step of both pipelines, from fasta to vcf/maf.
This blog post https://gatk.broadinstitute.org/hc/en-us/articles/9022487952155-Structural-variant-SV-discovery describes the steps in general terms, but I would like one that lists all the steps.
Sorry if this question is too general, but I believe that having it answered here would help other researchers find these resources.
Note: Structural variants (SVs) are DNA rearrangements that involve at least 50 nucleotides. The variant types that GATK-SV is able to detect are:
- Copy number variants (CNVs), including deletions and duplications
- Insertions
- Inversions
- Reciprocal chromosomal translocations
- Complex structural variants involving two or more distinct SV signatures in a single mutational event
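To make the 50-nucleotide cutoff in the note above concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function name, the labels, and the idea of passing a signed event length (similar to the `SVLEN` field in a VCF) are hypothetical and are not taken from GATK or GATK-SV. Events without a meaningful length, such as translocations, are not covered by it.

```python
# Threshold taken from the note above: SVs involve at least 50 nucleotides.
SV_MIN_LENGTH = 50


def classify_by_length(event_length: int) -> str:
    """Label an event by size alone (hypothetical helper, not a GATK function).

    event_length is a signed length in base pairs, negative for deletions,
    which is an assumption modelled on the VCF SVLEN convention.
    """
    if abs(event_length) >= SV_MIN_LENGTH:
        return "structural variant"
    return "short variant"


if __name__ == "__main__":
    for length in (3, -12, 50, -1200):
        print(length, classify_by_length(length))
```

This is only a size check; it does not say anything about which caller or pipeline would actually report the event.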
If it is just the tools they used that you are looking for, the first link in the overview section of that page leads to the following GitHub repository: https://github.com/broadinstitute/gatk-sv?tab=readme-ov-file#overview. It contains the pipeline, the tools they used, and some descriptions.
Thank you for the response. What I mean is that those tools do not cover all the steps from fasta to vcf. It would be key to understand the aligner, the reference used, whether they mark duplicates, and so on, and then to compare the short variant / indel pipeline (Best Practices), which uses Mutect, with GATK-SV, which uses Manta, MELT, and Wham for SVs and cn.MOPS and GATK gCNV for copy-number variants.