VCF sample fields output
6.5 years ago
Emilio Marmol ▴ 180

Hi,

I am analysing some SNPs in a VCF and have found some mutations of interest, but I would like to know what the output in the sample fields means.

The two SNPs I am interested in are these:

1   270574995   .   T   C   8523.06 PASS    
AC=64;AF=0.889;AN=72;DP=333;FS=0.000;MQ=60.00;set=Intersection  GT:AD:DP:GQ:PGT:PID:PL  
1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   
0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   
0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   
1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,6:6:18:1|1:270574995_T_C:249,18,0 
0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   1/1:0,10:10:36:1|1:270574995_T_C:509,36,0   
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/1:2,4:6:99:0|1:270574995_T_C:159,0,99 
1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  
1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   
1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   
1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

1   270574996   .   T   A   8523.06 PASS    AC=64;AF=0.889;AN=72;DP=335;MQ=60.00;set=Intersection   
GT:AD:DP:GQ:PGT:PID:PL  1/1:0,20:20:60:1|1:270574995_T_C:900,60,0   
1/1:0,13:13:39:1|1:270574995_T_C:585,39,0   0/1:1,5:6:27:0|1:270574995_T_C:207,0,27 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,12:12:36:1|1:270574995_T_C:540,36,0   
1/1:0,16:16:48:1|1:270574995_T_C:720,48,0   0/1:1,7:8:21:0|1:270574995_T_C:291,0,21 
1/1:0,19:19:57:1|1:270574995_T_C:855,57,0   1/1:0,29:29:87:1|1:270574995_T_C:1305,87,0  
1/1:0,21:21:63:1|1:270574995_T_C:945,63,0   1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
1/1:0,5:5:18:1|1:270574995_T_C:249,18,0 0/1:5,6:11:99:0|1:270574995_T_C:237,0,192   
1/1:0,12:12:36:1|1:270574995_T_C:509,36,0   1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 
0/1:3,4:7:99:0|1:270574995_T_C:159,0,99 1/1:0,7:7:21:1|1:270574995_T_C:305,21,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:5,0:5:0:.:.:0,0,92  1/1:0,3:3:9:1|1:270574995_T_C:135,9,0   
1/1:0,4:4:12:1|1:270574995_T_C:180,12,0 1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0   1/1:0,8:8:24:1|1:270574995_T_C:355,24,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,7:7:21:1|1:270574995_T_C:315,21,0 1/1:0,9:9:27:1|1:270574995_T_C:372,27,0 
1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 1/1:0,9:9:27:1|1:270574995_T_C:405,27,0 
1/1:0,6:6:18:1|1:270574995_T_C:270,18,0 0/0:4,0:4:12:.:.:0,12,109   1/1:0,8:8:24:1|1:270574995_T_C:360,24,0 
1/1:0,11:11:33:1|1:270574995_T_C:495,33,0

These are consecutive SNPs, one very deleterious and the other compensating for it. I would like to know why the sample fields, where I get the genotype for both alleles in each sample, look that way. I would expect these fields to look like, say, 1/1:0,20:20:60:1, but I get 1/1:0,20:20:60:1|1:270574995_T_C:900,60,0. Why is that? I've checked other SNPs and they look as expected.

I would also like to know why the second mutation has the first mutation cited in its sample fields.

Does anyone know whether this is a special type of output with a particular meaning, or can I simply ignore it?

Thanks

vcf sample SNP • 1.6k views

You should look at your VCF header. From the FORMAT column, it is evident the field you're looking for information on is called PID, so look for that in the header's ##FORMAT section.
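As a rough sketch of that lookup, the snippet below pulls the Description of each ##FORMAT entry out of a list of header lines. The function name `format_descriptions` and the header lines in `example_header` are illustrative assumptions (the descriptions are placeholders, not copied from the poster's file); the real wording has to come from the VCF's own header.

```python
import re

def format_descriptions(header_lines):
    """Map each FORMAT ID to its Description text from VCF header lines."""
    pattern = re.compile(r'^##FORMAT=<ID=([^,]+),.*Description="([^"]*)"')
    found = {}
    for line in header_lines:
        match = pattern.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found

# Hypothetical header lines for illustration only; read the real ones from your file.
example_header = [
    '##fileformat=VCFv4.2',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    '##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information">',
    '##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information">',
]

descriptions = format_descriptions(example_header)
for key in ("PGT", "PID"):
    print(key, "->", descriptions.get(key))
```

For a real file, the same function can be fed every line that starts with `##`, read from the top of the VCF.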


I looked at the header and indeed it refers to the PID and PGT fields. I have been reading about what these mean in relation to physical phasing. From what I understand, this applies to consecutive or nearby variants. I do not understand what the implications are, as I also noticed that the allele frequency (AF) is the same for both SNPs.

Could this mean that both SNPs are always present as a haplotype and always segregate together?


this applies for consecutive variants or near variants

It means that the variants are located on the same homologous chromosome: records that share a PID were phased together, and PGT shows which haplotype carries each allele.
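To make the sample fields easier to read, here is a minimal sketch that pairs each colon-separated value with its key from the FORMAT column, using one heterozygous sample from the two records in the question. The helper names `parse_sample` and `alt_in_cis` are made up for this example, and the check assumes PGT and PID carry their usual physical-phasing meaning, which the header should confirm.

```python
def parse_sample(format_keys, sample_field):
    """Pair each FORMAT key with the matching value of one sample column."""
    return dict(zip(format_keys.split(":"), sample_field.split(":")))

def alt_in_cis(sample_a, sample_b):
    """True if both records share a phase set (PID) and carry ALT on the same haplotype."""
    pid_a = sample_a.get("PID", ".")
    if pid_a == "." or pid_a != sample_b.get("PID", "."):
        return False  # unphased, or phased in different groups
    hap_a = sample_a.get("PGT", ".").split("|")
    hap_b = sample_b.get("PGT", ".").split("|")
    return any(a == "1" and b == "1" for a, b in zip(hap_a, hap_b))

fmt = "GT:AD:DP:GQ:PGT:PID:PL"

# Third sample of each record in the question (position 270574995 and 270574996).
het_first = parse_sample(fmt, "0/1:1,5:6:27:0|1:270574995_T_C:207,0,27")
het_second = parse_sample(fmt, "0/1:1,5:6:27:0|1:270574995_T_C:207,0,27")

# A homozygous-reference sample has no phasing information ('.').
hom_ref = parse_sample(fmt, "0/0:5,0:5:0:.:.:0,0,92")

print(het_first)
print(alt_in_cis(het_first, het_second))
print(alt_in_cis(hom_ref, hom_ref))
```

Read this way, 1/1:0,20:20:60:1|1:270574995_T_C:900,60,0 is simply GT=1/1, AD=0,20, DP=20, GQ=60, PGT=1|1, PID=270574995_T_C and PL=900,60,0. The second SNP cites the first because the PID is named after the first variant in the phased group.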
