I am working with a non-model organism. I annotated SNPs from a VCF file with the Ensembl Variant Effect Predictor (VEP), using the Ensembl assembly, and obtained a list of synonymous and non-synonymous mutations, stop codons, etc. for each contig/scaffold ID. The Ensembl assembly is annotated by the automated genebuild pipeline, so it has only contig/scaffold IDs, not chromosome IDs.
My questions: (1) Is there a tool that takes Ensembl VEP output and marks the synonymous and non-synonymous mutations at the protein domain level? (2) I would like to show these mutations at the gene or protein level, or link them to any associated disease. Any suggestions or related papers are welcome. The VCF file format is given below:
##fileformat=VCFv4.0
##source=VarScan2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
gi|xxxxxxxxx|gb|RAxxxxx1.1| xxxxxxxx . G A PASS DP=21 GT:GQ:DP 0/1:1:21
What does your VCF look like?
I have added the first few lines of the VCF file; the column alignment was lost when I copied and pasted it. Any solutions to my questions are welcome.
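For question (1), one possible approach is to script the mapping yourself. The sketch below is only an illustration, not an existing tool: it assumes the VEP run produced the default tab-delimited output (with the `Feature`, `Consequence`, `Protein_position` and `Amino_acids` columns), and that you already have domain coordinates per transcript from a separate step such as a domain search on the predicted proteins. The function names, the `domains` dictionary layout, the transcript ID `TRANSCRIPT_1` and the domain name `DomainA` are all hypothetical.

```python
from collections import defaultdict


def parse_vep(handle):
    """Yield one dict per variant row of VEP default tab-delimited output."""
    header = None
    for line in handle:
        if line.startswith("##"):
            continue  # metadata lines
        if line.startswith("#"):
            # column header line, e.g. "#Uploaded_variation\tLocation\t..."
            header = line.lstrip("#").rstrip("\n").split("\t")
            continue
        if header is None or not line.strip():
            continue
        yield dict(zip(header, line.rstrip("\n").split("\t")))


def protein_start(pos):
    """Return the first residue number from '45' or '45-46'; None for '-'."""
    first = pos.split("-")[0]
    return int(first) if first.isdigit() else None


def variants_by_domain(rows, domains):
    """Group coding variants by the domain they fall in.

    domains (hypothetical layout): {transcript_id: [(name, start, end), ...]}
    with 1-based inclusive protein coordinates.
    """
    hits = defaultdict(list)
    for row in rows:
        pos = protein_start(row.get("Protein_position", "-"))
        if pos is None:
            continue  # non-coding or no protein position reported
        for name, start, end in domains.get(row.get("Feature", ""), []):
            if start <= pos <= end:
                hits[(row["Feature"], name)].append(
                    (pos, row.get("Amino_acids", "-"), row.get("Consequence", "-"))
                )
    return dict(hits)


# Made-up example rows; IDs and coordinates are placeholders, not real data.
example = (
    "#Uploaded_variation\tLocation\tAllele\tGene\tFeature\tFeature_type\t"
    "Consequence\tcDNA_position\tCDS_position\tProtein_position\t"
    "Amino_acids\tCodons\tExisting_variation\tExtra\n"
    "var1\tscaffold_1:100\tA\tGENE_1\tTRANSCRIPT_1\tTranscript\t"
    "missense_variant\t150\t134\t45\tR/H\tcGc/cAc\t-\t-\n"
    "var2\tscaffold_1:900\tT\tGENE_1\tTRANSCRIPT_1\tTranscript\t"
    "intron_variant\t-\t-\t-\t-\t-\t-\t-\n"
)
example_domains = {"TRANSCRIPT_1": [("DomainA", 30, 80)]}

print(variants_by_domain(parse_vep(example.splitlines(keepends=True)), example_domains))
```

In practice you would pass an open file handle for the VEP output instead of the in-memory example, and build the domain dictionary from whatever domain annotation you have. The resulting groups could then be plotted along each protein to show which domains carry synonymous or non-synonymous changes.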