Understanding VCF file for downstream analysis
7.6 years ago

I got this VCF file from my collaborator and have been trying to understand it for the last few days, to no avail. I need to use it for all downstream analyses, such as Structure, PC, and Linear Discriminant Analysis. I have worked with VCF files before, but I have never seen this kind.

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  1KS 2861    2862    2A
Potrs007791 23342   .   G   T   427264.55   PASS    AC=342;AF=0.777;AN=440;BaseQRankSum=14.742;DP=16493;Dels=0.00;ExcessHet=32.5636;FS=1.803;HaplotypeScore=1.1332;InbreedingCoeff=-0.2081;MLEAC=342;MLEAF=0.777;MQ=34.94;MQ0=208;MQRankSum=-58.531;QD=26.19;ReadPosRankSum=4.181;SOR=0.995 GT:AD:DP:GQ:PL  B   B   B   B

Here 1KS, 2861, 2862 and 2A are the samples, and B indicates the alternate allele. Can someone help me convert this to a regular VCF so that I can start using it? I understand that I need values for each of the GT:AD:DP:GQ:PL fields, but I don't know how to get these from the INFO column.

vcf SNP GATK • 2.5k views

It looks like a VCF file, but it is not (anymore) a VCF file.


Is there a way to convert this to a VCF file based on the information in the INFO column?


The INFO column needs a header, and the header is missing. Nobody but your collaborator knows what those "B" values in the genotype columns mean.
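
To make those two problems concrete, here is a rough Python sketch that flags them in a VCF-like file. It is only an illustration under some assumptions: the function name `check_vcf_like` is made up for this example, columns are split on any whitespace (the excerpt pasted above has lost its tabs), and only the header lines and the syntax of the GT subfield are checked. It is not a full VCF validator and it cannot recover the missing genotypes.

```python
import re

# A standard GT value is made of allele indices (or ".") joined by "/" or "|",
# e.g. "0/1", "1|1", "./." or ".". Anything else (such as "B") is flagged.
GT_PATTERN = re.compile(r"^(\.|\d+)([/|](\.|\d+))*$")


def check_vcf_like(lines):
    """Return a list of problems found in VCF-like text lines.

    Hypothetical helper for illustration; checks only the meta-information
    header and the GT subfield of each sample column.
    """
    problems = []

    # Meta-information lines start with "##" and define INFO/FORMAT keys.
    meta = [ln for ln in lines if ln.startswith("##")]
    if not any(ln.startswith("##fileformat=") for ln in meta):
        problems.append("missing ##fileformat header line")
    if not any(ln.startswith("##INFO=") for ln in meta):
        problems.append("missing ##INFO definitions")

    for line_no, ln in enumerate(lines, start=1):
        if not ln.strip() or ln.startswith("#"):
            continue  # skip blank lines and header lines
        fields = ln.split()  # assumption: whitespace-separated columns
        if len(fields) < 10:
            continue  # no FORMAT/sample columns on this record
        format_keys = fields[8].split(":")
        if format_keys[0] != "GT":
            continue  # this sketch only inspects GT
        for col, sample in enumerate(fields[9:], start=10):
            gt = sample.split(":")[0]
            if not GT_PATTERN.match(gt):
                problems.append(
                    f"line {line_no}: column {col} has non-standard GT {gt!r}"
                )
    return problems
```

Run on the header line and record from the question, this reports the missing header lines plus one non-standard GT per sample column, which is why the file cannot be used as-is by tools that expect real genotypes.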


OK, I will ask my collaborator. Thanks.
