SNP Calling Error by GATK UnifiedGenotyper Program
11.1 years ago
Tonyzeng ▴ 310

Hi, I tried to do SNP calling on mouse data using the GATK UnifiedGenotyper program. Here is the command:

java -jar /raid1/rzeng/softwares/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar -glm BOTH -R genome.fa -T UnifiedGenotyper -I output.bam --dbsnp SNPMOUSENEW.vcf -o snps.vcf -metrics snps.metrics -stand_call_conf 50.0 -stand_emit_conf 10.0 -dcov 200 -A DepthOfCoverage -A AlleleBalance -L targets.interval_list

I get this error message:

ERROR MESSAGE: Badly formed genome loc: Contig 'chr1    134202950    134203590    NM_001008533_cds_0_0_chr1_134202951_r    0    -' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?

Here is what my targets.interval_list looks like:

chr1    134202950    134203590    NM_001008533_cds_0_0_chr1_134202951_r    0    -
chr1    134234014    134234355    NM_001008533_cds_1_0_chr1_134234015_r    0    -
chr1    134202950    134203590    NM_001039510_cds_0_0_chr1_134202951_r    0    -
chr1    134234014    134234355    NM_001039510_cds_1_0_chr1_134234015_r    0    -
chr1    134202950    134203590    NM_001282945_cds_0_0_chr1_134202951_r    0    -
chr1    134234014    134234355    NM_001282945_cds_1_0_chr1_134234015_r    0    -
chr1    25068167            25068356            NM_175642_cds_0_0_chr1_25068168_r            0    -
chr1    25074684       25074789            NM_175642_cds_1_0_chr1_25074685_r            0    -

I downloaded targets.interval_list from

http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=351729785&clade=mammal&org=Mouse&db=mm10&hgta_group=allTracks&hgta_track=refGene&hgta_table=0&hgta_regionType=genome&position=chr12%3A56694976-56714605&hgta_outputType=bed&hgta_outFileName=targets.interval_list

Can anyone help me take a look at it?

Thank you,

snp gatk • 3.6k views
11.1 years ago

Try using

-L:capture,BED targets.interval_list

instead of

-L targets.interval_list

Also check that the chromosome names match those in the reference, and that the tabs in the file were not changed to spaces.
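Both checks can be scripted before rerunning GATK. This is only a rough sketch: it assumes the reference has a samtools-style index (a hypothetical genome.fa.fai whose first tab-separated column is the contig name), and the function names are made up for illustration.

```python
def contigs_from_fai(path):
    """Read contig names from the first column of a .fai index (assumed layout)."""
    with open(path) as handle:
        return {line.split("\t")[0] for line in handle if line.strip()}


def check_bed_lines(lines, reference_contigs):
    """Return (line_number, problem) tuples for lines that look suspicious."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            # A line without tabs collapses into a single field.
            problems.append((number, "fewer than 3 tab-separated fields (spaces instead of tabs?)"))
            continue
        contig, start, end = fields[0], fields[1], fields[2]
        if contig not in reference_contigs:
            problems.append((number, f"contig {contig!r} not in reference"))
        if not (start.isdigit() and end.isdigit()):
            problems.append((number, "start/end are not integers"))
    return problems


if __name__ == "__main__":
    demo = [
        "chr1\t134202950\t134203590\tNM_001008533_cds_0_0_chr1_134202951_r\t0\t-",
        "chr1    134234014    134234355    NM_001008533_cds_1_0_chr1_134234015_r    0    -",
    ]
    # In real use: reference_contigs = contigs_from_fai("genome.fa.fai")
    for number, problem in check_bed_lines(demo, {"chr1"}):
        print(number, problem)
```

A line flagged with "fewer than 3 tab-separated fields" is the same situation as in the error message, where the whole line ends up being read as the contig name.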

11.1 years ago
vdauwera ★ 1.2k

My bet is that the tabs got changed to spaces, which is why the program is reading the entire line in as the contig.
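If that is what happened, one way to repair the file is to collapse every run of whitespace back into a single tab. A minimal sketch, assuming no field is supposed to contain spaces itself (true for the lines shown in the question); the function name and the output file name targets.fixed.bed are placeholders, not anything GATK requires.

```python
import re


def spaces_to_tabs(line):
    """Rejoin whitespace-separated fields with single tabs; blank lines become ''."""
    stripped = line.strip()
    if not stripped:
        return ""
    return "\t".join(re.split(r"\s+", stripped))


def rewrite_with_tabs(src_path, dst_path):
    """Write a tab-delimited copy of src_path to dst_path, dropping blank lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            fixed = spaces_to_tabs(line)
            if fixed:
                dst.write(fixed + "\n")


if __name__ == "__main__":
    sample = "chr1    25068167            25068356            NM_175642_cds_0_0_chr1_25068168_r            0    -"
    print(repr(spaces_to_tabs(sample)))
    # In real use: rewrite_with_tabs("targets.interval_list", "targets.fixed.bed")
```

The rewritten file can then be passed to -L in place of the original.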

11.1 years ago
Tonyzeng ▴ 310

Hi guys, I shortened the command to keep it simple, dropping the -A annotations and the -L interval file, and it works this time. Thank you.

java -jar /raid1/rzeng/softwares/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar -glm BOTH -R genome.fa -T UnifiedGenotyper -I output.bam --dbsnp SNPMOUSENEW.vcf -o snps.vcf -metrics snps.metrics -stand_call_conf 50.0 -stand_emit_conf 10.0 -dcov 200