3.5 years ago
storm1907
Hello, I have the following VCF file with a header and records like these:
chr1 10643146 . G GC 63.2 PASS CSQ=|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|CASZ1|ENSG00000130940|ENST00000377022|protein_coding||19/20|||||,|FAIL|0.00|0.00|0.01|0.00|13|40|-3|13|||MODIFIER|AL139423.1|ENSG00000272078|ENST00000606802|lncRNA||1/1||||| GT:GQ:DP:AD:VAF:PL 0/1:58:86:40,45:0.523256:63,0,59
chr1 10646034 . G C 64.8 PASS CSQ=|FAIL|0.00|0.00|0.00|0.00|22|3|1|2|||MODIFIER|CASZ1|ENSG00000130940|ENST00000377022|protein_coding||17/20|||||,|FAIL|0.00|0.00|0.00|0.00|22|3|1|2|||MODIFIER|AL139423.1|ENSG00000272078|ENST00000606802|lncRNA||1/1||||| GT:GQ:DP:AD:VAF:PL 0/1:59:27:13,14:0.518519:64,0,60
I would like to extract only the chromosomal position and the gene name, one per column, so that my final file looks like:
chr1:10643146 CASZ1
Is there a way to do this with awk?
Thank you!
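A minimal awk sketch for this kind of extraction, under two assumptions taken from the example records above: the INFO column (column 8) holds a CSQ= entry, and the gene symbol is the 14th "|"-separated sub-field of each annotation. The position of the symbol depends on how VEP was run, so it should be checked against the "Format:" part of the ##INFO=<ID=CSQ ...> header line. The function name extract_genes is hypothetical, and only the first annotation of each record is reported.

```shell
# Hypothetical helper: print "chrom:pos<TAB>gene" for each VCF record.
# Assumption: the gene symbol is sub-field 14 of the CSQ annotation,
# as in the example records; adjust the index to match the CSQ header.
extract_genes() {
  awk '
    /^#/ { next }                        # skip VCF header lines
    {
      csq = ""
      n = split($8, kv, ";")             # INFO entries are ";"-separated
      for (i = 1; i <= n; i++)
        if (kv[i] ~ /^CSQ=/) csq = substr(kv[i], 5)
      if (csq == "") next                # record has no CSQ annotation
      split(csq, ann, ",")               # one annotation per transcript
      split(ann[1], f, "|")              # sub-fields of the first annotation
      print $1 ":" $2 "\t" f[14]
    }
  ' "$@"
}
```

It could be called as extract_genes input.vcf > genes.txt, where both file names are placeholders. To list every annotated gene rather than only the first, the last two statements would need to loop over all elements of ann.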
This tool is not appropriate, as I got this message:
This is the header of my VCF:
That plugin was just an example of a search result one would find. There should be other tools to parse VEP output. Plus, you may want to update that plugin so it works - that's how open source software stays relevant.
What was the result of your trials with bcftools query?