Hi, I want to find the significant variants that fall in non-coding regions, using WGS data from cancer patients. I have called variants and now have variant-calling reports (VCF files) that look like this:
CHROM POS REF ALT ....
1 567579 C T
1 569624 T C
1 808156 A G
....
Next, I want to add a column to these VCF files showing whether each variant is located in a regulatory region such as a promoter or an enhancer, e.g.:
CHROM POS REF ALT INFO
1 567579 C T promoter of gene1
1 569624 T C enhancer of gene2
1 808156 A G intron of gene3
I have tried some tools with web interfaces, but I kept running into strange problems. I have also found many databases, but I don't know which one is good or how to use them. Can anyone help me? As a beginner, I find this very confusing.
Thanks, but the effects are already included in my variant reports.
If the effects are already in your reports, then what are you asking for?
How do I know which variants are in which gene's promoter or enhancer?
For that you'd need to know comprehensively which promoters and enhancers belong to which genes, and that is something we don't know.
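That said, once you have settled on some set of regulatory regions that you are willing to trust, the annotation step itself is just an interval overlap. Here is a minimal Python sketch of that step. The region coordinates and the function names (`build_index`, `annotate`, `annotate_variants`) are made up for illustration; only the variant positions and labels come from the example in the question. Any region-to-gene assignment you feed in is putative at best.

```python
from collections import defaultdict

# HYPOTHETICAL regions (chrom, start, end, label); 1-based, inclusive.
# Coordinates are invented for illustration, not taken from any database.
REGIONS = [
    ("1", 567000, 568000, "promoter of gene1"),
    ("1", 569500, 570000, "enhancer of gene2"),
    ("1", 808000, 809000, "intron of gene3"),
]

def build_index(regions):
    """Group regions by chromosome and sort them by start position."""
    index = defaultdict(list)
    for chrom, start, end, label in regions:
        index[chrom].append((start, end, label))
    for intervals in index.values():
        intervals.sort()
    return index

def annotate(chrom, pos, index):
    """Return the labels of all regions that contain the position."""
    hits = []
    for start, end, label in index.get(chrom, []):
        if start > pos:
            break  # sorted by start, so no later region can contain pos
        if pos <= end:
            hits.append(label)
    return hits

def annotate_variants(variants, index):
    """Append an INFO field to each (chrom, pos, ref, alt) tuple."""
    rows = []
    for chrom, pos, ref, alt in variants:
        labels = annotate(chrom, pos, index)
        # "." marks a variant that overlaps none of the supplied regions
        rows.append((chrom, pos, ref, alt, "; ".join(labels) or "."))
    return rows

if __name__ == "__main__":
    index = build_index(REGIONS)
    variants = [
        ("1", 567579, "C", "T"),
        ("1", 569624, "T", "C"),
        ("1", 808156, "A", "G"),
    ]
    print("CHROM POS REF ALT INFO")
    for row in annotate_variants(variants, index):
        print(*row)
```

For whole-genome call sets a linear scan per variant will be slow, so in practice you would probably do the same overlap with a dedicated tool such as `bedtools intersect` against a BED file of regions. Keep in mind that BED coordinates are 0-based and half-open, unlike the 1-based positions in a VCF.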