Jordi · 3.4 years ago
Hi,
I am building a clinical genetics germline variant detection secondary-analysis pipeline from scratch, mainly for WES and WGS data, following GATK's Best Practices guidelines.
The resulting pipeline involves the following tools:
Picard FastqToSam
Picard MarkIlluminaAdapters
Picard SamToFastq
BWA-MEM
Picard MergeBamAlignment
Picard MarkDuplicates
GATK BaseRecalibrator (based on the dbSNP common variants VCF file)
GATK ApplyBQSR
Picard ValidateSamFile
GATK HaplotypeCaller
GATK CNNScoreVariants
GATK FilterVariantTranches
GATK Funcotator
I will extend the pipeline to include SV/CNV calling in the near future; however, I wanted input from the community on whether any of the steps listed here is redundant and, if so, why. All of these steps are computationally intensive and take a significant amount of time to complete on WGS data.
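To make the data flow between the listed tools explicit, here is a minimal sketch of the step order with a one-line note on what each step consumes and produces. This is only an illustration of the sequence described above; it does not include the actual command-line arguments, which depend on your reference bundle and sample naming:

```python
# Ordered sketch of the per-sample steps listed above.
# The descriptions summarize each tool's role in the data flow
# (uBAM -> FASTQ -> aligned BAM -> recalibrated BAM -> VCF).
STEPS = [
    ("picard FastqToSam", "FASTQ -> unmapped BAM (uBAM) with read-group metadata"),
    ("picard MarkIlluminaAdapters", "flag adapter sequence in the uBAM"),
    ("picard SamToFastq", "convert back to FASTQ for alignment"),
    ("bwa mem", "align reads against the reference"),
    ("picard MergeBamAlignment", "merge the aligned BAM with the uBAM metadata"),
    ("picard MarkDuplicates", "mark PCR/optical duplicates"),
    ("gatk BaseRecalibrator", "model base-quality errors from known sites (dbSNP)"),
    ("gatk ApplyBQSR", "apply the recalibration table to the BAM"),
    ("picard ValidateSamFile", "sanity-check the final analysis-ready BAM"),
    ("gatk HaplotypeCaller", "call germline SNVs/indels into a VCF"),
    ("gatk CNNScoreVariants", "annotate variants with a CNN-based score"),
    ("gatk FilterVariantTranches", "filter variants on the CNN score tranches"),
    ("gatk Funcotator", "functionally annotate the filtered VCF"),
]

for i, (tool, purpose) in enumerate(STEPS, 1):
    print(f"{i:2d}. {tool:30s} {purpose}")
```

Laying the steps out this way also makes it easier to see which ones are candidates for dropping: for example, the FastqToSam/MarkIlluminaAdapters/SamToFastq/MergeBamAlignment detour exists only to preserve read-group metadata and adapter marking through alignment.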
Why reinvent the wheel? There are dozens of these kinds of pipelines already available, for example sarek from nf-core (Nextflow), which builds upon the GATK Best Practices: https://github.com/nf-core/sarek
Do you really want to build these very common things from scratch?
Not really re-inventing anything. We have been using an old pipeline for years now. It is time to update to the latest versions and recommendations. We want to have more control over the workflow and be able to tweak it, if necessary.
Ready-made pipelines often become outdated quickly, and managing versions and libraries can be a problem.
I will look into it, though.
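One practical payoff of keeping control over a hand-rolled workflow is cheap resumability: skip a step if its output already exists, instead of rerunning hours of WGS processing after a failure. Below is a minimal sketch of such a step runner; the function name and the commented example paths are hypothetical placeholders, not part of any real pipeline:

```python
import subprocess
from pathlib import Path


def run_step(name: str, cmd: list[str], output: str) -> bool:
    """Run one pipeline step unless its output already exists (crude resume).

    Returns True if the step was executed, False if it was skipped.
    """
    if Path(output).exists():
        print(f"[skip] {name}: {output} already present")
        return False
    print(f"[run ] {name}: {' '.join(cmd)}")
    subprocess.run(cmd, check=True)  # raise if the tool exits non-zero
    return True


# Hypothetical usage (paths and arguments are placeholders):
# run_step("MarkDuplicates",
#          ["picard", "MarkDuplicates", "I=merged.bam",
#           "O=dedup.bam", "M=dup_metrics.txt"],
#          "dedup.bam")
```

Workflow engines like Nextflow give you this (and caching, containers, and scheduling) for free, which is part of the argument for starting from something like sarek even if you end up customizing it.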
nf-core pipelines are actively maintained, and you can either use the provided container images or build your own, so software versions are not an issue. I do see your point about wanting control, though. Maybe you can simply use it as a template to draw inspiration from.