6.8 years ago
win
So I downloaded a .bam from phase 3 of 1000 Genomes and I want to run RealignerTargetCreator on it using GATK 3.8.
The following command works fine:
sudo java -jar algorithms/gatk/gatk3.8.jar -T RealignerTargetCreator -R references/hg38.fasta -I data/HG100.reordered.bam -o data/HG100.realignertargetcreator.intervals
But when this command is run with -known, I get a contig mismatch error between my BAM and the VCF file. The known indels VCF used is located here:
https://storage.googleapis.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz
In the prior step I applied Picard ReorderSam using hg38.
Am I using the incorrect indels file?
Thanks in advance.
Thanks. In the known_indels.vcf.gz I can see a lot of HLA-A, HLA-B, HLA-C contigs, which are not in the reordered BAM. Can this be an issue?
yes
...and how do I take care of this issue?
Either change the sequence dictionary in the BAM header (insert the missing @SQ lines using samtools reheader), or remove the extra ##contig lines and the variants on those contigs from the VCF.
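For the second option, here is a rough sketch of how one might list the contigs that differ and then drop the extra ones from the VCF. The helper names (sam_contigs, vcf_contigs, filter_vcf_by_contigs) and the intermediate file names are made up for illustration; they are not part of samtools or GATK. It assumes plain awk, sed, sort and comm are available, and the filtered VCF may still need to be indexed and may not resolve every dictionary incompatibility.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file names here are hypothetical.

# Print contig names found in SAM header text on stdin (SN: field of @SQ lines).
sam_contigs() {
  awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' | sort -u
}

# Print contig names found in VCF header text on stdin (##contig=<ID=...> lines).
vcf_contigs() {
  sed -n 's/^##contig=<ID=\([^,>]*\).*/\1/p' | sort -u
}

# Read a VCF on stdin and keep only ##contig lines and records whose contig
# is listed in the file given as $1 (one name per line). Other header lines
# pass through unchanged.
filter_vcf_by_contigs() {
  awk -F'\t' -v keep="$1" '
    BEGIN { while ((getline name < keep) > 0) ok[name] = 1 }
    /^##contig=<ID=/ {
      id = $0; sub(/^##contig=<ID=/, "", id); sub(/[,>].*/, "", id)
      if (id in ok) print
      next
    }
    /^#/ { print; next }
    ($1 in ok) { print }
  '
}

# Example usage (commented out; needs samtools and the real files):
# samtools view -H data/HG100.reordered.bam | sam_contigs > bam.contigs
# gzip -dc Homo_sapiens_assembly38.known_indels.vcf.gz | vcf_contigs > vcf.contigs
# comm -13 bam.contigs vcf.contigs    # contigs present only in the VCF
# gzip -dc Homo_sapiens_assembly38.known_indels.vcf.gz \
#   | filter_vcf_by_contigs bam.contigs > known_indels.filtered.vcf
```

The comm step is only there to confirm that the HLA contigs really are the difference before anything gets removed.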
OK, will try. I have never done any of this before.
Consider @Devon's instructions as you try samtools reheader: A: Problems while reheadering BAM with Samtools 1.3.1