Extracting Genotype Information From VCF
11.3 years ago
rickyflintoff ▴ 100

Is there a quick way to extract genotype information for a sample from a VCF file? For a given VCF, I would essentially like to create a table with the following columns for a given sample: POS CHROM GT


Did you try a few 'cut's? Something like: cut -f 1,2,10 | cut -d ':' -f1 (columns 1, 2 and 10 are CHROM, POS and the first sample)


I have a 10-line Python script. Do you know how to run a Python script? I can send it to you. If you want to extract all positions, regardless of whether the given sample is polymorphic (compared to the reference) at that position, then you can simply use the "cut" command from Unix.


I would like to have a look at that script and will most probably use it. Could you share it?
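The script mentioned above was not posted in this thread. As a stand-in, here is a minimal sketch of how such a script might look, assuming an uncompressed, tab-delimited VCF with a #CHROM header line. The function name extract_genotypes and the demo data are hypothetical and are not the commenter's code.

```python
def extract_genotypes(lines, sample):
    """Yield (POS, CHROM, GT) tuples for one sample from an iterable of VCF lines."""
    sample_col = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the tenth column (index 9) of the header line.
            sample_col = fields.index(sample, 9)
            continue
        if sample_col is None:
            raise ValueError("no #CHROM header line before the first record")
        # The FORMAT column (index 8) names the per-sample keys, e.g. GT:DP.
        keys = fields[8].split(":")
        values = fields[sample_col].split(":")
        gt = values[keys.index("GT")] if "GT" in keys else "."
        yield fields[1], fields[0], gt


# Made-up two-record VCF, used only to demonstrate the function.
demo_vcf = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2
1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1/1:9
1\t250\t.\tC\tT\t60\tPASS\t.\tGT:DP\t0/0:15\t0/1:7
"""

print("POS", "CHROM", "GT", sep="\t")
for pos, chrom, gt in extract_genotypes(demo_vcf.splitlines(), "S2"):
    print(pos, chrom, gt, sep="\t")
```

For a real file, the same function can be fed an open file handle instead of demo_vcf.splitlines(). Unlike the plain 'cut' approach, it looks the sample up by name and does not rely on GT being the first FORMAT key.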

11.3 years ago
Adam ★ 1.0k

You can do this in vcftools:

vcftools --vcf <your_vcf> --indv <your_sample> --extract-FORMAT-info GT --out <prefix>

The results will be in a file called "<prefix>.GT.FORMAT".


I am having some issues reading the exported file (extension .GT.FORMAT) in my Python script, even after copying it into a text file (OS X). Is some conversion required to read the file properly?
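The thread does not say what went wrong here, so the following is only a sketch of one way to read such a file. It assumes the .GT.FORMAT output is plain tab-delimited text whose header line is CHROM, POS and then the sample name. The function name read_gt_format and the demo content are hypothetical.

```python
import csv
import io


def read_gt_format(handle):
    """Return a list of (CHROM, POS, GT) tuples from a tab-delimited stream."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header line, assumed to be: CHROM, POS, <sample name>
    rows = []
    for record in reader:
        if not record:
            continue  # ignore blank lines
        chrom, pos, gt = record[0], record[1], record[2]
        rows.append((chrom, int(pos), gt))
    return rows


# Made-up content in the assumed layout, used only for demonstration.
demo = io.StringIO("CHROM\tPOS\tS1\n1\t100\t0/1\n1\t250\t0/0\n")
genotypes = read_gt_format(demo)
for chrom, pos, gt in genotypes:
    print(chrom, pos, gt, sep="\t")
```

For a file on disk, open it with open(path, newline="") and pass the handle to the function, so the csv module deals with the line endings itself regardless of platform.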

7.7 years ago

Use GATK VariantsToTable.

11.3 years ago
Rm 8.3k

Check vcflib, which has various VCF utility programs, including vcfgenotypes.

You can clone the Git repository and install it.

10.1 years ago

There's a vcftools module that does exactly what you were asking for: vcf-to-tab. The syntax is very simple:

vcf-to-tab <in.vcf >out.tab