Extract several fields from vcf file
7.6 years ago

Hello all, I'd like to trim my VCF files so that they keep only a few FORMAT fields. The FORMAT column currently contains "GT:AD:DP:FT:GQ:PL:PP", but I want to keep only GT, DP, and GQ.

I ran vcftools with "--extract-FORMAT-info", but the output file is not in VCF format. What I ultimately want is a VCF file containing only the GT, DP, and GQ fields.

Does anyone know how to handle this? Thanks.

vcf • 6.0k views

Have you tried AWK already?
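
For reference, an AWK approach might look like the sketch below. It is not a validated tool. It assumes an uncompressed, tab-delimited VCF in which every record contains at least one of the requested fields, and the function name subset_format is made up for illustration. Header lines are passed through untouched, so the ##FORMAT definitions of the dropped fields stay in the header even though no record uses them.

```shell
#!/bin/sh
# Sketch: keep only the listed FORMAT subfields (default GT,DP,GQ).
# Reads a VCF on stdin and writes the trimmed VCF to stdout.
# subset_format is a hypothetical helper name, not part of any tool.
subset_format() {
    awk -v keep="${1:-GT,DP,GQ}" '
    BEGIN {
        FS = OFS = "\t"
        n = split(keep, k, ",")
        for (i = 1; i <= n; i++) want[k[i]] = 1
    }
    # Pass header lines and sites-only records through unchanged.
    /^#/ || NF < 10 { print; next }
    {
        # Record the positions of the wanted keys in the FORMAT column.
        nf = split($9, fmt, ":")
        m = 0
        for (i = 1; i <= nf; i++)
            if (fmt[i] in want) idx[++m] = i

        out = ""
        for (j = 1; j <= m; j++)
            out = out (j > 1 ? ":" : "") fmt[idx[j]]
        $9 = out

        # Apply the same positions to every sample column.
        for (c = 10; c <= NF; c++) {
            ns = split($c, val, ":")
            out = ""
            for (j = 1; j <= m; j++) {
                # Trailing subfields may be omitted in a VCF; fill with "."
                v = (idx[j] <= ns) ? val[idx[j]] : "."
                out = out (j > 1 ? ":" : "") v
            }
            $c = out
        }
        print
    }'
}

# Usage: subset_format GT,DP,GQ < input.vcf > output.vcf
```

The kept fields come out in the order they appear in the FORMAT column, so GT stays first as the VCF specification requires.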


Actually, I extracted these fields with Python. But when I tried to analyze the resulting file with rare-variant association tools such as pseq, vtools, and rvtest, something did not work properly.

I wonder if there are validated tools for this extraction, rather than Linux commands or Python.


Can you please provide a line having GT:AD:DP:FT:GQ:PL:PP from the vcf file?


Thanks, but I am looking for tools to handle this.


You can use bcftools annotate to keep or remove FORMAT fields.
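
A possible invocation is sketched below. As far as I know, the -x/--remove option of bcftools annotate accepts a leading "^" meaning "remove everything except the listed tags", so keeping GT, DP, and GQ becomes "^FORMAT/GT,FORMAT/DP,FORMAT/GQ". The two function names and the file names are made up for illustration, and it is worth checking the option against the manual of your installed bcftools version.

```shell
#!/bin/sh
# Sketch: build the -x expression that removes every FORMAT tag except the
# ones listed, e.g. GT,DP,GQ -> ^FORMAT/GT,FORMAT/DP,FORMAT/GQ
# build_remove_expr and keep_format_fields are hypothetical helper names.
build_remove_expr() {
    printf '^FORMAT/%s\n' "$(printf '%s' "$1" | sed 's#,#,FORMAT/#g')"
}

# Write a bgzipped VCF that keeps only the requested FORMAT tags.
# Arguments: input VCF, output VCF, comma-separated tags (default GT,DP,GQ).
keep_format_fields() {
    bcftools annotate \
        -x "$(build_remove_expr "${3:-GT,DP,GQ}")" \
        -Oz -o "$2" "$1"
}

# Usage (requires bcftools on PATH):
#   keep_format_fields input.vcf.gz output.vcf.gz GT,DP,GQ
```

Unlike a hand-written script, bcftools should also drop the matching ##FORMAT header lines, which may matter to downstream tools such as pseq or rvtest.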

7.6 years ago
Len Trigg ★ 1.6k

Using RTG Tools:

rtg vcfsubset -i input.vcf.gz -o output.vcf.gz --keep-format GT,DP,GQ
6.9 years ago
erdiazval ▴ 110

I would use SnpSift. Here is an example of how I use it to extract the position, reference allele, alternate allele, allelic depth (AD, from the genotype field), and functional annotation by gene ID:

#!/bin/bash
for i in *.vcf; do
    java -jar /data/software/snpEff/snpEff/SnpSift.jar \
        extractFields "$i" \
        POS REF ALT "GEN[*].AD" "ANN[*].GENEID" > "filt_${i}"
done
7.6 years ago
nsmi8446 ▴ 170

Have you come across SnpSift in your search for tools to do this?

http://snpeff.sourceforge.net/SnpSift.html

Section 10 of the SnpSift documentation linked above, "SnpSift Extract Fields", could be useful.
