Hi all,
I recently started working with VCF (Variant Call Format) files, and I am not very familiar with them. My current task is to retrieve all the genes associated with each VCF file. How can I do this?
To read the VCF files, I used the read.vcfR function from the vcfR R package:
library("vcfR")
fileName <- "CODE.gatk.snp.indel.vcf"
vcf <- read.vcfR(fileName)
print(head(vcf))
print(colnames(getFIX(vcf)))  # column names of the fixed fields (CHROM, POS, ID, ...)
Other than that, I don't know how to proceed. Can someone suggest how to move forward?
Should I convert this file to a BED file and then intersect it with the human genome reference file using bedtools intersect?
Something else?
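If you want to stay within R, there is a Bioconductor package, ensemblVEP, an R interface to the Ensembl Variant Effect Predictor (VEP), which annotates each variant with the genes it affects.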
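As for the intersect idea from the question: what it boils down to is checking, per chromosome, whether each variant position falls inside a gene's start/end interval. Here is a minimal base R sketch of that logic, not a definitive implementation. The variant and gene tables are made-up toy data (the gene names and coordinates are hypothetical); in practice CHROM and POS would come from getFIX(vcf) and the gene intervals from an annotation file such as a GTF for the same genome build.

```r
# Toy variant table (hypothetical values); with real data, CHROM and POS
# would be taken from getFIX(vcf) and POS converted with as.integer().
variants <- data.frame(
  CHROM = c("chr1", "chr1", "chr2"),
  POS   = c(150L, 900L, 50L),
  stringsAsFactors = FALSE
)

# Toy gene table with made-up names and coordinates (1-based, inclusive).
genes <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(100L, 500L, 10L),
  end   = c(200L, 600L, 100L),
  gene  = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Pair every variant with every gene on the same chromosome.
pairs <- merge(variants, genes, by.x = "CHROM", by.y = "chrom")

# Keep only the pairs where the variant lies inside the gene interval.
hits <- pairs[pairs$POS >= pairs$start & pairs$POS <= pairs$end, ]

# Unique genes overlapped by at least one variant.
genes_hit <- sort(unique(hits$gene))
print(genes_hit)
```

Note that merge() builds every variant/gene pair per chromosome, so this only scales to small inputs. For a full VCF, bedtools intersect or findOverlaps() from the Bioconductor GenomicRanges package would be the more realistic choice, and the chromosome naming ("chr1" vs "1") has to match between the VCF and the annotation.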