Hi, I am using the Variant Effect Predictor (VEP) from the Ensembl Virtual Machine 71 to annotate NGS whole-genome sequence variants from two individuals. The input VCF file contains both unique and common SNPs between the two samples, and I am mostly interested in the SNPs that differ between the two samples when each is compared to the reference genome.
VCFv4.1 sample
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT IND1 IND2
1 615519 . C T 721.4 PASS AC=2;AF=0.500;AN=4;BaseQRankSum=-0.462;DP=30;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=3;MLEAF=0.750;MQ=58.11;MQ0=0;MQRankSum=1.271;QD=24.88;ReadPosRankSum=1.040 GT:AD:DP:GQ:PL 1/1:0,29:29:69:782,69,0 0/0:1,0:1:3:0,3,32
1 87593200 rs197529280 C A,T 1288 PASS AC=2,2;AF=0.500,0.500;AN=4;DB;DP=51;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;MLEAC=2,2;MLEAF=0.500,0.500;MQ=59.63;MQ0=0;QD=25.25 GT:AD:DP:GQ:PL 1/1:0,34,0:34:78:936,78,0,936,78,936 2/2:0,0,17:17:51:548,548,548,51,51,0
1 2863220 . A C,G 1791.73 PASS AC=3,1;AF=0.750,0.250;AN=4;DP=94;Dels=0.00;FS=0.000;HaplotypeScore=0.4470;MLEAC=3,1;MLEAF=0.750,0.250;MQ=59.01;MQ0=0;QD=19.06 GT:AD:DP:GQ:PL 1/1:0,43,5:46:18:1058,93,0,983,18,974 1/2:0,35,11:44:99:1115,244,151,871,0,844
1 282540299 . A T 288.2 PASS AC=3;AF=0.750;AN=4;BaseQRankSum=1.004;DP=14;Dels=0.00;FS=3.256;HaplotypeScore=0.0000;MLEAC=3;MLEAF=0.750;MQ=59.44;MQ0=0;MQRankSum=-0.091;QD=20.59;ReadPosRankSum=1.917 GT:AD:DP:GQ:PL 1/1:2,4:6:47:95,0,47 1/1:0,8:8:24:253,24,0
Problem: When I run VEP on the two samples with the flag --individual all, the output includes the common SNPs, whether the two samples have the same genotype (see line 5 of the sample) or different genotypes (see lines 3 and 4), and each of them is annotated on two separate lines, one for each individual.
Question: Can I filter out the common SNPs that have the same genotype in both individuals, either from the VCF (with a command line) or in VEP using an additional flag or plugin? And can the common SNPs with different genotypes (or all the SNPs) from the two samples be reported on one line each, like this:
SNP Variant Chr FromPosition End Reference-allele Alternate-Allele IND1 IND2
1_9612728_G/A 1 9612727 9612728 G A 2 1
where 2 means homozygous for the alternate allele, 1 means heterozygous for the alternate allele, and 0 means the same as the reference allele.
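To make the desired result concrete, here is a minimal Python sketch of the filtering and 0/1/2 coding described above. It is not a VEP flag or plugin, only a plain script working on the VCF text. The function names (`split_gt`, `dosage`, `unique_rows`) are made up for illustration, and it assumes exactly two sample columns (IND1, IND2) with a GT entry in FORMAT. Multi-allelic sites are written as one row per alternate allele, and missing genotypes (`./.`) are simply counted as 0 copies, which may not be what you want. The INFO column in the demo lines is shortened to `.` for readability.

```python
def split_gt(gt):
    """Split a GT string such as '0/1' or '1|2' into a sorted list of allele codes."""
    return sorted(gt.replace("|", "/").split("/"))


def dosage(gt, allele_index):
    """Count copies of one ALT allele (1-based position in the ALT column) in a genotype."""
    return split_gt(gt).count(str(allele_index))


def unique_rows(vcf_lines):
    """Yield one tab-separated row per ALT allele for sites where the two genotypes differ."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        alts = fields[4].split(",")
        gt_index = fields[8].split(":").index("GT")
        gt1 = fields[9].split(":")[gt_index]   # IND1
        gt2 = fields[10].split(":")[gt_index]  # IND2
        if split_gt(gt1) == split_gt(gt2):
            continue  # same genotype in both individuals: drop the site
        for i, alt in enumerate(alts, start=1):
            name = f"{chrom}_{pos}_{ref}/{alt}"
            row = [name, chrom, pos - 1, pos, ref, alt, dosage(gt1, i), dosage(gt2, i)]
            yield "\t".join(str(value) for value in row)


# Small demo on two records from the sample above (INFO shortened to ".")
sample = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT IND1 IND2",
    "1 615519 . C T 721.4 PASS . GT:AD 1/1:0,29 0/0:1,0",
    "1 282540299 . A T 288.2 PASS . GT:AD 1/1:2,4 1/1:0,8",
]
for row in unique_rows(sample):
    print(row)
```

In the demo, the second record is dropped because both individuals are 1/1, while the first is kept because the genotypes differ. For a real file the same function could be fed an open file handle instead of the `sample` list.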
Thanks.