How to analyze whole-genome data using Picard and GATK?
2.9 years ago
Dan ▴ 180

Hello,

I am new to whole-genome sequencing data analysis, and I am not sure whether I am using Picard and GATK correctly. Could someone check my pipeline below? Thanks a lot!

TrimGalore-0.6.6/trim_galore --paired --cores 4 --retain_unpaired H_5_S12_L001_R1_001.fastq.gz H_5_S12_L001_R2_001.fastq.gz -o ./out

bwa mem -t 18 bwa_index/GRCh38_Broad/GRCh38_Broad H_5/out/*val_1.fq.gz H_5/out/*val_2.fq.gz > H_5_mem_val.sam

samtools view -Sb -T hg20/Broad_Homo_sapiens_assembly38.fasta H_5_mem_val.sam > H_5_mem_val.bam

samtools sort -n H_5_mem_val.bam -o H_5_namesorted.bam

samtools fixmate -m H_5_namesorted.bam H_5_fixed.bam

samtools sort H_5_fixed.bam -o H_5_sorted.bam

samtools markdup -r H_5_sorted.bam H_5_dedup.bam

samtools view H_5_dedup.bam | head -1 | awk '{print $1}'
# A01494:44:H53Y7DMXY:1:2301:27579:4883

java -jar ~/picard.jar AddOrReplaceReadGroups I=H_5_dedup.bam O=H_5_dedup.RG.bam RGID=A01494.44 RGLB=lib RGPL=illumina RGSM=H_5 RGPU=A01494.44.H53Y7DMXY.1
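
For reference, the RGID and RGPU values above were taken from the first four fields of the read name. A minimal sketch of that derivation, assuming the read name follows the usual Illumina layout instrument:run:flowcell:lane:tile:x:y (the variable names are only for illustration, and the read name is hard-coded here instead of being read from the BAM):

```shell
# Sketch: derive read-group fields from an Illumina read name.
# Assumed layout: instrument:run:flowcell:lane:tile:x:y
# In practice the name could come from:
#   samtools view H_5_dedup.bam | head -1 | cut -f1
read_name="A01494:44:H53Y7DMXY:1:2301:27579:4883"

# Split on ':' and keep the first four fields; the remainder goes to "rest".
IFS=: read -r instrument run flowcell lane rest <<EOF
$read_name
EOF

# RGID = instrument.run, RGPU = instrument.run.flowcell.lane
RGID="${instrument}.${run}"
RGPU="${instrument}.${run}.${flowcell}.${lane}"

echo "RGID=${RGID} RGPU=${RGPU}"
```

The resulting values could then be passed to AddOrReplaceReadGroups as RGID="$RGID" and RGPU="$RGPU".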

java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar BaseRecalibrator -I H_5_dedup.RG.bam -O H_5/recal.txt -R hg20/Broad_Homo_sapiens_assembly38.fasta --known-sites vcf/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz --known-sites vcf/Homo_sapiens_assembly38.dbsnp138.vcf --known-sites vcf/1000G.phase3.integrated.sites_only.no_MATCHED_REV.hg38.vcf


java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar ApplyBQSR -I H_5_dedup.RG.bam -O H_5.final.bam -R hg20/Broad_Homo_sapiens_assembly38.fasta --bqsr-recal-file H_5/recal.txt


samtools index H_5.final.bam



java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar Mutect2 \
-R hg20/Broad_Homo_sapiens_assembly38.fasta \
--germline-resource vcf/af-only-gnomad.hg38.vcf.gz \
--panel-of-normals vcf/1000g_pon.hg38.vcf.gz \
-I H_5.final.bam \
-O vcf/H_5.vcf.gz \
--f1r2-tar-gz tmp/H_5.f1.tar.gz \
--af-of-alleles-not-in-resource -1.0

java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar GetPileupSummaries \
-I H_5.final.bam \
-V vcf/af-only-gnomad.hg38.vcf.gz \
-O H_5.pileup.txt \
--intervals Bed/Agilent.71M.Covered.hg38.bed


java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar CalculateContamination \
-I H_5.pileup.txt \
-O tmp/H_5.contamination.table \
--tumor-segmentation tmp/H_5.segments.table


java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar LearnReadOrientationModel \
-I tmp/H_5.f1.tar.gz \
-O tmp/H_5.prior.tar.gz


java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar FilterMutectCalls \
-V vcf/H_5.vcf.gz \
-R hg20/Broad_Homo_sapiens_assembly38.fasta \
--contamination-table tmp/H_5.contamination.table \
-O vcf/H_5.filtered.vcf.gz \
--ob-priors tmp/H_5.prior.tar.gz \
--tumor-segmentation tmp/H_5.segments.table


java -jar ~/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar Funcotator \
--variant vcf/H_5.filtered.vcf.gz \
--reference hg20/Broad_Homo_sapiens_assembly38.fasta \
--ref-version hg38 \
--data-sources-path gatk/funcotator_dataSources.v1.7.20200521s \
--output vcf/H_5.filtered.func.vcf \
--output-file-format VCF
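
Every GATK step above repeats the full java -jar path. One possible way to shorten the calls is a small wrapper function; this is only a sketch, and the names GATK_JAR and run_gatk are made up for illustration (they are not part of GATK):

```shell
# Hypothetical helper: wrap the local GATK jar so each step only
# needs the tool name and its arguments.
GATK_JAR="$HOME/gatk-4.2.3.0/gatk-package-4.2.3.0-local.jar"

run_gatk() {
    # "$@" forwards the tool name and all of its options unchanged.
    java -jar "$GATK_JAR" "$@"
}
```

With this in place, a step would be written as, for example, run_gatk LearnReadOrientationModel -I tmp/H_5.f1.tar.gz -O tmp/H_5.prior.tar.gz.
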
whole-genome GATK Picard • 743 views