Hi all, I have a de novo assembled reference genome and a GFF file. I aligned my data to this new assembly and now have some SNPs. I want to annotate these SNPs with snpEff to find out which genes they are linked to. The problem is creating the snpEff database: I don't understand how to build one for a newly assembled genome. Does anyone know how to do this? Is there a guide or manual for it?
Problem #2: I want to divide my reference genome into 100 kb bins and, based on gene density, assign the SNPs (the ones I got after aligning my data to the new assembly) to each bin. So far I have a raw VCF file (not annotated) and a GFF file containing the gene information. How can I use the GFF file to calculate gene density in each bin and then distribute the SNPs into bins according to that density? Should I annotate my VCF first?
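For the binning part, here is a minimal sketch of one way to count genes and SNPs per 100 kb bin using only the positions in the two files, so it should not need an annotated VCF. It assumes an uncompressed, tab-separated GFF3 and VCF, that genes are rows whose third column is `gene`, and that a gene is counted in the bin containing its start coordinate (a gene spanning two bins is counted once). The function names and that assignment rule are illustrative choices, not something from snpEff or any other tool.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # 100 kb bins, as in the question


def bin_index(pos, bin_size=BIN_SIZE):
    """Map a 1-based coordinate to a 0-based bin index."""
    return (pos - 1) // bin_size


def genes_per_bin(gff_lines, bin_size=BIN_SIZE, feature="gene"):
    """Count GFF features of the given type per (sequence, bin).

    Assumption: a gene belongs to the bin that contains its start.
    """
    counts = defaultdict(int)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, headers and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue  # not a 9-column feature row of the wanted type
        counts[(cols[0], bin_index(int(cols[3]), bin_size))] += 1
    return counts


def snps_per_bin(vcf_lines, bin_size=BIN_SIZE):
    """Count VCF records per (sequence, bin) using the CHROM and POS columns."""
    counts = defaultdict(int)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip meta-information and the column header
        cols = line.split("\t")
        counts[(cols[0], bin_index(int(cols[1]), bin_size))] += 1
    return counts


def summarize(gff_path, vcf_path, bin_size=BIN_SIZE):
    """Print one row per bin: sequence, bin start, bin end, genes, SNPs."""
    with open(gff_path) as gff:
        genes = genes_per_bin(gff, bin_size)
    with open(vcf_path) as vcf:
        snps = snps_per_bin(vcf, bin_size)
    for seq, idx in sorted(set(genes) | set(snps)):
        start, end = idx * bin_size + 1, (idx + 1) * bin_size
        print(seq, start, end, genes.get((seq, idx), 0),
              snps.get((seq, idx), 0), sep="\t")


# Example call (hypothetical file names):
# summarize("assembly.gff", "raw_snps.vcf")
```

Bins with neither a gene nor a SNP are not printed, and the last bin on each sequence is reported with a nominal end that may run past the real sequence length. If you would rather count a gene in every bin it overlaps, loop from `bin_index(start)` to `bin_index(end)` instead of using the start alone.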
There is a section in the snpEff manual on building a database for a custom genome, which I am 100% certain you can find. I've done it myself; it is a little challenging.
It can lead to weird results (I used GMAP for annotation and had to use a strange switch to get the "correct", i.e. expected, behaviour).
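As a rough sketch of the setup that manual section walks through, the helper below copies the assembly and GFF into the layout snpEff expects, adds a genome entry to `snpEff.config`, and returns the build command. The file names `sequences.fa` and `genes.gff`, the `<id>.genome : <name>` config line, and the `build -gff3 -v` command are as I understand them from the manual, so check them against the manual for your snpEff version. The function name, the genome ID `mygenome`, and the paths in the usage comment are made up for illustration.

```python
from pathlib import Path
import shutil


def prepare_snpeff_genome(snpeff_dir, genome_id, fasta, gff, description=None):
    """Lay out the files for a custom snpEff genome and return the build command.

    Assumed layout (verify against the snpEff manual for your version):
      <snpeff_dir>/data/<genome_id>/sequences.fa
      <snpeff_dir>/data/<genome_id>/genes.gff
    plus a line "<genome_id>.genome : <description>" in snpEff.config.
    """
    snpeff_dir = Path(snpeff_dir)
    genome_dir = snpeff_dir / "data" / genome_id
    genome_dir.mkdir(parents=True, exist_ok=True)

    # Copy the inputs under the names snpEff looks for.
    shutil.copyfile(fasta, genome_dir / "sequences.fa")
    shutil.copyfile(gff, genome_dir / "genes.gff")

    # Register the genome in the config file, only if it is not there yet.
    entry = f"{genome_id}.genome : {description or genome_id}"
    config = snpeff_dir / "snpEff.config"
    existing = config.read_text() if config.exists() else ""
    if entry not in existing.splitlines():
        prefix = "" if (not existing or existing.endswith("\n")) else "\n"
        with config.open("a") as handle:
            handle.write(prefix + entry + "\n")

    # The command is returned rather than executed, so it can be inspected first.
    return ["java", "-jar", str(snpeff_dir / "snpEff.jar"),
            "build", "-gff3", "-v", genome_id]


# Example (hypothetical paths):
# cmd = prepare_snpeff_genome("snpEff", "mygenome", "assembly.fa", "assembly.gff")
# print(" ".join(cmd))
```

Once the build succeeds, the same genome ID is what you pass when annotating the VCF. Newer snpEff releases may also check CDS and protein sequences during the build; the manual describes how to supply those files or skip the checks.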