Question

NGS and Recalibration

15.2 years ago

Pierre Lindenbaum 166k

In NGS, what is recalibration? How should I do it? Why should I care?

Many thanks

Pierre

next-gen sequencing • 4.4k views

Updated 14.5 years ago by Brad Chapman 9.7k • written 15.2 years ago by Pierre Lindenbaum 166k

No idea! But do you mean this: http://www.broadinstitute.org/gsa/wiki/index.php/Base_quality_score_recalibration ?

15.2 years ago by Michael 56k

Ram · Answer 1 · 2010-06-01

Pierre;

I've been using Broad's GATK recalibration that Michael mentions to recalibrate quality scores after alignment and before SNP calling.

Because the recalibrated scores combine machine data (the original score) with alignment information and cycle, tile, and sequence context (both from the recalibration), they should be more stable for SNP finding.
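To make the idea concrete, here is a minimal Python sketch of the Phred relationship behind it. This is not GATK's code. A quality score Q claims an error probability of 10^(-Q/10), and recalibration compares that claim with the mismatch rate actually seen against the reference. The function names, the add-one smoothing, and the counts are assumptions chosen for illustration.

```python
import math

def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10.0)

def empirical_quality(mismatches, observations):
    """Phred score from observed mismatch counts.

    The add-one smoothing is an assumption of this sketch; it keeps the
    result finite when a bin has zero mismatches.
    """
    return -10.0 * math.log10((mismatches + 1.0) / (observations + 1.0))

# Hypothetical bin: bases reported as Q30 that mismatch the reference
# about 1 time in 100 (made-up counts).
reported = 30
observed = empirical_quality(mismatches=1000, observations=100000)

print("claimed error rate:", phred_to_error_prob(reported))
print("empirical quality: ", round(observed, 1))
```

In this made-up bin the machine claims one error in a thousand, while the data support only about Q20, so a SNP caller trusting the original scores would be overconfident.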

In practice, it's a two-step process. First, you count the covariate data according to the criteria you specify:

java -Xmx4g -jar GenomeAnalysisTK.jar \
-T CountCovariates \
-cov ReadGroupCovariate \
-cov QualityScoreCovariate \
-cov CycleCovariate \
-cov DinucCovariate \
-cov TileCovariate \
-recalFile recal_data.csv \
-I aligned.duplicates_marked.bam \
-R /path/to/reference.fasta \
-l INFO -U \
--use_original_quals \
-B dbsnp,PicardDbSNP,/path/to/reference.dbsnp

and then use this to provide recalibrated quality scores:

java -jar GenomeAnalysisTK.jar \
-T TableRecalibration \
-recalFile recal_data.csv \
-R reference.fasta \
-I aligned.duplicates_marked.bam \
-outputBam recal.gatk.bam \
-l INFO -U \
--use_original_quals
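
Conceptually, the two passes can be sketched in a few lines of Python. This is a simplified toy, not what GATK does internally: it bins bases jointly by read group, reported quality, and cycle, whereas GATK's model is more elaborate. The data, function names, and add-one smoothing are all assumptions made for illustration. Known variant sites (the dbSNP track above) are skipped when counting, since a mismatch there may be real variation rather than a sequencing error.

```python
import math
from collections import defaultdict

# Toy base observations (all values made up):
# (read_group, reported_quality, cycle, is_mismatch, at_known_variant)
observations = (
    [("rg1", 30, 1, True, False)] * 9
    + [("rg1", 30, 1, False, False)] * 91
    + [("rg1", 20, 2, False, False)] * 1000
    + [("rg1", 30, 1, True, True)]  # known SNP site: excluded from counting
)

def count_covariates(obs):
    """Pass 1: tally [mismatches, total] per covariate bin."""
    table = defaultdict(lambda: [0, 0])
    for read_group, qual, cycle, is_mismatch, known in obs:
        if known:
            continue  # do not count mismatches at known variant sites
        entry = table[(read_group, qual, cycle)]
        entry[0] += int(is_mismatch)
        entry[1] += 1
    return dict(table)

def recalibrate(obs, table):
    """Pass 2: replace each reported quality with its bin's empirical quality."""
    new_quals = []
    for read_group, qual, cycle, _is_mismatch, _known in obs:
        counts = table.get((read_group, qual, cycle))
        if counts is None:
            new_quals.append(qual)  # unseen bin: keep the reported score
            continue
        mismatches, total = counts
        # Add-one smoothing (an assumption) avoids log10(0) for clean bins.
        new_quals.append(round(-10 * math.log10((mismatches + 1) / (total + 1))))
    return new_quals

table = count_covariates(observations)
new_quals = recalibrate(observations, table)
print(table)
print(sorted(set(new_quals)))
```

In this toy data the overconfident Q30 bin is pulled down and the pessimistic Q20 bin is pushed up, which is the behaviour the recalibration table encodes.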

I've written some code to analyze and display the recalibration scores and should have a full blog post on it, but in the meantime here's a plot that shows the score distributions before and after recalibration.