NGS and Recalibration
14.5 years ago

In NGS, what is recalibration? How should I do it? Why should I care?

Many thanks

Pierre

next-gen sequencing • 4.3k views
14.5 years ago

Pierre;

I've been using Broad's GATK recalibration that Michael mentions to recalibrate quality scores after alignment and before SNP calling.

Since the recalibrated scores are based on machine data (the original score), alignment information (the recalibration) and cycle/tile/sequence information (also part of the recalibration), they should be more stable for SNP finding.
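To make the idea concrete, here is a rough sketch of the quantity recalibration estimates: for a group of bases that share the same covariates, the mismatch rate against the reference is turned back into a Phred score, Q = -10 * log10(error rate). This is not GATK's code; the function name `empirical_q` and the pseudo-count of 1 (added so that zero mismatches does not give log(0)) are my own assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: empirical Phred quality from mismatch counts.
# Usage: empirical_q MISMATCHES OBSERVATIONS
# The +1 pseudo-count on both counts is an illustrative choice to avoid log(0).
empirical_q() {
  awk -v m="$1" -v n="$2" 'BEGIN {
    p = (m + 1) / (n + 1)                    # smoothed mismatch rate
    printf "%.1f\n", -10 * log(p) / log(10)  # awk log() is natural log
  }'
}
```

For example, `empirical_q 9 9999` prints 30.0: a smoothed rate of 10 mismatches in 10000 observations is an error rate of 1 in 1000. If the machine had reported those bases as Q20, the recalibrated score would move them up towards Q30.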

Practically, it's a two-step process. First you count the covariate data according to specified criteria:

java -Xmx4g -jar GenomeAnalysisTK.jar \
-T CountCovariates \
-cov ReadGroupCovariate \
-cov QualityScoreCovariate \
-cov CycleCovariate \
-cov DinucCovariate \
-cov TileCovariate \
-recalFile recal_data.csv \
-I aligned.duplicates_marked.bam \
-R /path/to/reference.fasta \
-l INFO -U \
--use_original_quals \
-B dbsnp,PicardDbSNP,/path/to/reference.dbsnp

and then use this to provide recalibrated quality scores:

java -jar GenomeAnalysisTK.jar \
-T TableRecalibration \
-recalFile recal_data.csv \
-R reference.fasta \
-I aligned.duplicates_marked.bam \
-outputBam recal.gatk.bam \
-l INFO -U \
--use_original_quals

I've written some code to analyze and display the recalibration scores and should have a full blog post on it, but in the meantime here's a plot that shows the original/post-calibration score distribution.
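Until then, a quick way to eyeball a covariate table is to pool the counts by reported quality and compare each reported score with the empirical one. The sketch below is not the code mentioned above. It assumes a comma-separated table in which you tell it which columns hold the reported quality, the observation count and the mismatch count; the function name `summarize_by_quality`, the column positions and the +1 smoothing are all hypothetical, so check them against your own recal file before trusting the numbers.

```shell
#!/usr/bin/env bash
# Sketch only: pool a covariate table by reported quality score.
# Usage: summarize_by_quality QCOL NCOL MCOL < table.csv
#   QCOL = column with the reported quality (assumed layout)
#   NCOL = column with the observation count
#   MCOL = column with the mismatch count
# Prints: reported_q <TAB> observations <TAB> mismatches <TAB> empirical_q
summarize_by_quality() {
  awk -F, -v q="$1" -v n="$2" -v m="$3" '
    /^#/ { next }                      # skip comment lines
    $n !~ /^[0-9]+$/ { next }          # skip header or trailer rows
    { obs[$q] += $n; mis[$q] += $m }   # pool counts per reported quality
    END {
      for (k in obs)
        printf "%d\t%d\t%d\t%.1f\n", k, obs[k], mis[k],
               -10 * log((mis[k] + 1) / (obs[k] + 1)) / log(10)
    }' | sort -n
}
```

With a table whose second, third and fourth columns are quality, observations and mismatches, you would run `summarize_by_quality 2 3 4 < table.csv`. Rows where the last column differs a lot from the first are the bins that recalibration shifts the most.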
