variant calling through gatk
0
0
Entering edit mode
13 hours ago
ashaneev07 ▴ 30

Hello all,

I have been working with a sample dataset and have developed a workflow that generates a VCF file through GATK. This is typically used for identifying genetic variants in human samples. It would mean a lot to me if anyone could kindly review my commands, as your feedback would help boost my confidence in my approach. Only then can I apply this to the original sample.

Thank you..

bwa mem -t 4 -R "@RG\tID:sample003\tPL:ILLUMINA\tSM:sample003" /hg38/hg38.fa sample_R1.fastp.fastq.gz sample_R2.fastp.fastq.gz  > sample003.paired.sam

OR

bwa mem -t 4 -R "@RG\tID:sample003\tPL:ILLUMINA\tSM:sample003" /hg38/hg38.fa sample_R1.fastp.fastq.gz sample_R2.fastp.fastq.gz | samtools view -Sb - > sample003.paired.bam 

samtools view -Sb sample003.paired.sam > sample003.paired.bam
samtools sort sample003.paired.bam -o sample003.paired_sorted.bam

gatk MarkDuplicatesSpark -I sample003.paired_sorted.bam -O sample003.sorted_dedup_reads.bam 

gatk BaseRecalibrator -I sample003.sorted_dedup_reads.bam -R  /hg38/hg38.fa  --known-sites  /hg38/Homo_sapiens_assembly38.dbsnp138.vcf -O recal_data.table 

gatk ApplyBQSR -I sample003.sorted_dedup_reads.bam -R /hg38/hg38.fa --bqsr-recal-file recal_data.table -O sample003_sorted_dedup_bqsr_reads.bam 

gatk CollectAlignmentSummaryMetrics R=/hg38/hg38.fa I=sample003_sorted_dedup_bqsr_reads.bam O=alignment_metrics.txt

gatk HaplotypeCaller -R /hg38/hg38.fa -I sample003_sorted_dedup_bqsr_reads.bam -O 53_raw_variants.vcf --dbsnp /dbsnp/hg38_All_20180418.vcf

gatk SelectVariants  -R /hg38/hg38.fa   -V 53_raw_variants.vcf   -O snps_only.vcf   --select-type-to-include SNP

gatk VariantFiltration \
  -R /hg38/hg38.fa \
  -V snps_only.vcf \
  -O snp.filtered.unnorm.vcf \
  --filter-expression "MQ0 >= 5 && ((MQ0 / (1.0 * DP)) > 0.1)" --filter-name "HARD_TO_VALIDATE" \
  --filter-expression "DP < 10" --filter-name "LowCoverage" \
  --filter-expression "QUAL < 30.0" --filter-name "VeryLowQual" \
  --filter-expression "QUAL > 30.0 && QUAL < 50.0" --filter-name "LowQual" \
  --filter-expression "QD < 1.5" --filter-name "LowQD"

bcftools norm -c w  -f /hg38/hg38.fa  -m -any  -o variants.filtered.norm.vcf.gz   -Oz snp.filtered.unnorm.vcf
gatk vcf variant-calling • 101 views
ADD COMMENT

Login before adding your answer.

Traffic: 2049 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6