21 months ago
kamanovae
Hi, I am running GATK HaplotypeCaller and hope to get a VCF file in which each sample has its own column.
My BAM list file (wgs_test.bam.list) looks like this:
input_bam/SRR8859080.bam
input_bam/ENCFF477JTA_new.bam
This is my GATK command:
allele_chunk_file=rs_coord.vcf
gatk_run_line="../bin/gatk-4.1.2.0/gatk"
outfile=wgs_test_out.genotypes.vcf
bam_file=wgs_test.bam.list
genome_seq="../hg38.fa"
intervals=wgs_test.bed
$gatk_run_line \
HaplotypeCaller \
--reference $genome_seq \
--input $bam_file \
--genotyping-mode GENOTYPE_GIVEN_ALLELES \
--alleles $allele_chunk_file \
--intervals $intervals \
--output $outfile
As a result I get a VCF file like this (only the first three positions are shown):
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TUMOR
chr5 33987450 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=38;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.693 GT:AD:DP:GQ:PL 0/0:38,0:38:99:0,114,1404
chr5 33994716 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=40;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.287 GT:AD:DP:GQ:PL 0/0:39,0:39:99:0,117,1348
chr6 341321 . C T 0 LowQual AC=0;AF=0.00;AN=2;DP=40;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.269 GT:AD:DP:GQ:PL 0/0:40,0:40:99:0,120,1873
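The sample column name in a VCF is taken from the SM field of the @RG lines in the BAM header, so a single TUMOR column may mean that both BAMs carry SM:TUMOR. Below is a minimal sketch for checking this, assuming samtools is installed and wgs_test.bam.list is in the working directory. The helper name rg_samples is hypothetical.

```shell
# Print the distinct SM (sample name) values found in @RG header lines on stdin.
rg_samples() {
  awk -F'\t' '$1 == "@RG" {
    for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) print substr($i, 4)
  }' | sort -u
}

# For every BAM in the list, print its path and the sample names in its header.
# The guard skips this step when samtools or the list file is not available.
if command -v samtools >/dev/null 2>&1 && [ -f wgs_test.bam.list ]; then
  while IFS= read -r bam; do
    printf '%s\t%s\n' "$bam" "$(samtools view -H "$bam" | rg_samples | paste -sd, -)"
  done < wgs_test.bam.list
fi
```

If both files report the same sample name, that would be consistent with the reads being pooled into one column.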
Both samples are merged into a single column, TUMOR. But when I run HaplotypeCaller separately on each file, I get the following:
SRR8859080.bam
chr5 33987450 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=13;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.368 GT:AD:DP:GQ:PL 0/0:13,0:13:39:0,39,323
chr5 33994716 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=17;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.495 GT:AD:DP:GQ:PL 0/0:16,0:16:48:0,48,456
chr6 341321 . C T 0 LowQual AC=0;AF=0.00;AN=2;DP=5;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.027 GT:AD:DP:GQ:PL 0/0:5,0:5:15:0,15,220
ENCFF477JTA_new.bam
chr5 33987450 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=25;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.495 GT:AD:DP:GQ:PL 0/0:25,0:25:75:0,75,1081
chr5 33994716 . N C 0 LowQual AC=0;AF=0.00;AN=2;DP=23;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.095 GT:AD:DP:GQ:PL 0/0:23,0:23:69:0,69,892
chr6 341321 . C T 0 LowQual AC=0;AF=0.00;AN=2;DP=35;ExcessHet=3.0103;FS=0.000;MLEAC=0;MLEAF=0.00;MQ=60.00;SOR=0.382 GT:AD:DP:GQ:PL 0/0:35,0:35:99:0,105,1653
The AD, DP and PL values in the last column of the combined VCF are the sums of the corresponding values in the two per-sample VCFs (for example, DP 38 = 13 + 25 at chr5:33987450). But I want to get a VCF with a separate column for each sample. I would be grateful for any hint!
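The summing can be checked directly on the first record. This is a minimal sketch in plain shell and awk. The helper names field and sum_lists are hypothetical, and the three strings are copied from the chr5:33987450 lines above.

```shell
# Sample columns of the first record (chr5:33987450) from the three VCFs.
combined="0/0:38,0:38:99:0,114,1404"
sample_a="0/0:13,0:13:39:0,39,323"    # SRR8859080.bam
sample_b="0/0:25,0:25:75:0,75,1081"   # ENCFF477JTA_new.bam

# Print field number $2 (1-based) of the colon-separated sample column $1.
field() { printf '%s\n' "$1" | cut -d: -f"$2"; }

# Add two comma-separated integer lists element by element.
sum_lists() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, ","); split(b, y, ",")
    for (i = 1; i <= n; i++) printf "%s%d", (i > 1 ? "," : ""), x[i] + y[i]
    print ""
  }'
}

# FORMAT is GT:AD:DP:GQ:PL, so AD is field 2, DP is field 3 and PL is field 5.
for idx in 2 3 5; do
  expected=$(sum_lists "$(field "$sample_a" "$idx")" "$(field "$sample_b" "$idx")")
  observed=$(field "$combined" "$idx")
  echo "field $idx: per-sample sum=$expected combined=$observed"
done
```

GQ (field 4) is left out because it is capped at 99 and so does not add up.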
Please acknowledge people's answers: kamanovae?active=posts