Extracting parts of a VCF file
3.5 years ago
storm1907 ▴ 30

Hello, I have the following VCF file with a header; its records look like this:

chr1    10643146    .   G   GC  63.2    PASS    CSQ=|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|CASZ1|ENSG00000130940|ENST00000377022|protein_coding||19/20|||||,|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|AL139423.1|ENSG00000272078|ENST00000606802|lncRNA||1/1|||||  GT:GQ:DP:AD:VAF:PL  0/1:58:86:40,45:0.523256:63,0,59
chr1    10646034    .   G   C   64.8    PASS    CSQ=|FAIL|0.00|0.00|0.00|0.00|22|3|1|2|||MODIFIER|CASZ1|ENSG00000130940|ENST00000377022|protein_coding||17/20|||||,|FAIL|0.00|0.00|0.00|0.00|22|3|1|2|||MODIFIER|AL139423.1|ENSG00000272078|ENST00000606802|lncRNA||1/1|||||    GT:GQ:DP:AD:VAF:PL  0/1:59:27:13,14:0.518519:64,0,60

I would like to extract only the chromosomal position in the first column and the gene name in the second column, so that my final file looks like:

chr1:10643146             CASZ1

Is there a way to do this with awk?

Thank you!

vcf plink • 958 views
3.5 years ago
Ram 44k

Take a look at bcftools query. The VCF data you show above seems to be the output of VEP, so searching online for "VEP extract fields" might yield some interesting results. See: https://samtools.github.io/bcftools/howtos/plugin.split-vep.html
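Tools like split-vep refer to CSQ subfields by the names declared in the VCF header, so it helps to see which names a given file actually declares. Here is a minimal Python sketch (not part of bcftools) that reads those names from the `Format:` string of the `##INFO=<ID=CSQ...>` header line. The function name `csq_field_names` and the shortened header line are hypothetical, for illustration only; the header in a real file will likely list different fields.

```python
import re

def csq_field_names(header_line):
    """Return the CSQ subfield names declared in a VEP-style ##INFO header line."""
    # The names follow "Format: " and are separated by "|"; some writers
    # wrap the list in single quotes, so an optional quote is skipped.
    match = re.search("Format: '?([^\"'>]+)", header_line)
    if match is None:
        return []
    return match.group(1).strip().split("|")

# Hypothetical, shortened header line (assumption: not taken from the asker's file).
example_header = (
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene">'
)

names = csq_field_names(example_header)
print(names)                  # the declared subfield names, in order
print(names.index("SYMBOL"))  # 0-based position of the gene symbol subfield
```

If the name you ask for (for example `SYMBOL` or `Consequence`) is not in this list, a name-based tool has nothing to look up and will report the field as missing.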


This tool did not work on my file; I got this message:

The field "Consequence" is not present in INFO/CSQ: "Consequence annotations from Ensembl VEP. Format: 'Allele

This is the header of my vcf:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample.F
chr1    69270   .       A       G       55.4    PASS    CSQ=|FAIL|0.01|0.00|0.00|0.00|-10|26|-28|-25|||LOW|OR4F5|ENSG00000186092|ENST00000335137|protein_coding|1/1||216|60|S|tcA/tcG|,|FAIL|0.01|0.00|0.00|0.00|-10|26|-28|-25|||LOW|OR4F5|ENSG00000186092|ENST00000641515|protein_coding|3/3||303|81|S|tcA/tcG|    GT:GQ:DP:AD:VAF:PL      1/1:55:18:0,18:1:55,65,0 

That plugin was just an example of a search result one would find. There should be other tools to parse VEP output. Plus, you may want to update that plugin so it works - that's how open source software stays relevant.

What was the result of your trials with bcftools query?
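If a dedicated parser does not fit, the records can also be split by hand. The sketch below is in Python rather than the awk asked for, and it rests on one assumption that should be checked against the file's CSQ header: that the gene symbol is the 14th `|`-separated subfield (0-based index 13) of each CSQ entry, as it appears to be in the records posted above. The function name `position_and_gene` is made up for illustration.

```python
def position_and_gene(vcf_line, symbol_index=13):
    """Return ("chrom:pos", [gene symbols]) for one VCF data line.

    Assumption: the gene symbol sits at `symbol_index` within each
    pipe-delimited CSQ entry; verify this against the CSQ header.
    """
    fields = vcf_line.split()  # VCF columns contain no internal whitespace
    chrom, pos, info = fields[0], fields[1], fields[7]

    # Pull the CSQ value out of the semicolon-separated INFO column.
    csq = next((item[4:] for item in info.split(";") if item.startswith("CSQ=")), "")

    genes = []
    for entry in csq.split(","):  # one entry per annotated transcript
        parts = entry.split("|")
        if len(parts) > symbol_index:
            symbol = parts[symbol_index]
            if symbol and symbol not in genes:
                genes.append(symbol)
    return f"{chrom}:{pos}", genes

# First record from the question, with tabs between columns.
record = (
    "chr1\t10643146\t.\tG\tGC\t63.2\tPASS\t"
    "CSQ=|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|CASZ1|ENSG00000130940|"
    "ENST00000377022|protein_coding||19/20|||||,"
    "|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|AL139423.1|ENSG00000272078|"
    "ENST00000606802|lncRNA||1/1|||||\t"
    "GT:GQ:DP:AD:VAF:PL\t0/1:58:86:40,45:0.523256:63,0,59"
)

locus, genes = position_and_gene(record)
print(locus, genes[0], sep="\t")  # first gene only, as in the requested output
```

Each record here carries two CSQ entries (CASZ1 and AL139423.1), so the function returns every distinct symbol; printing only `genes[0]` matches the one-gene-per-line output in the question, but whether to drop the others is a choice for the asker. To run it over a file, call the function on each line that does not start with `#`.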
