Hi all, I have a rather large VCF file with about 2.8 million variants. I want to know whether each variant falls within an exon, an intron, or a promoter region.
I have downloaded the corresponding files from RefSeq and wanted to know whether I could use BEDTools intersect to accomplish this.
Can anyone help with the syntax? In particular, I need the output file in a form that I can parse easily and that tells me whether each variant is exonic, intronic, or in a promoter region.
Thanks in advance.
Unfortunately I won't be able to flesh out a more complete answer, but have you taken a look at using a tool like ANNOVAR to annotate your VCF file?
http://www.openbioinformatics.org/annovar/
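In case it helps to see the underlying overlap logic, here is a minimal Python sketch. It is not ANNOVAR and not BEDTools, just a plain standard-library illustration. It assumes you have already turned the RefSeq download into three separate BED files (exons, introns, promoters). The function names (build_index, overlaps, classify, annotate_vcf) are made up for this example, and the exon > promoter > intron precedence is an arbitrary choice you may want to change. It only looks at the VCF POS column, so an indel is classified by its first base.

```python
import bisect
from collections import defaultdict

# Hypothetical precedence: the first matching label wins.
PRIORITY = ("exon", "promoter", "intron")


def build_index(bed_lines):
    """Group BED intervals by chromosome, merge overlapping or touching
    ones, and return {chrom: (sorted_starts, matching_ends)}."""
    by_chrom = defaultdict(list)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, pos):
    """True if 1-based VCF position `pos` lies inside a BED interval.
    BED is 0-based and half-open, so the base at POS is [pos - 1, pos)."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last interval whose start is <= pos - 1.
    i = bisect.bisect_right(starts, pos - 1) - 1
    return i >= 0 and pos - 1 < ends[i]


def classify(chrom, pos, indexes):
    """Return the first label in PRIORITY whose intervals contain the
    position, or 'intergenic' if none do."""
    for label in PRIORITY:
        if overlaps(indexes.get(label, {}), chrom, pos):
            return label
    return "intergenic"


def annotate_vcf(vcf_lines, indexes):
    """Yield 'chrom<TAB>pos<TAB>label' for every non-header VCF line."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        yield f"{chrom}\t{pos}\t{classify(chrom, pos, indexes)}"
```

To use it, you would open each BED file, pass the file object to build_index, collect the results in a dict keyed by "exon", "promoter" and "intron", and then write the lines from annotate_vcf to a file. The result is a three-column tab-separated table that is easy to parse. Note that the chromosome names must match between the VCF and the BED files (for example "chr1" versus "1"), otherwise everything comes out as intergenic.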