Hi all,
I called variants using GATK and annotated the results using snpEff. Since the organism is Chinese hamster, there is not much information available. I want to use PROVEAN to predict the effects of the variants. The problem is that, for organisms other than human and mouse, PROVEAN requires the amino acid sequence and the amino acid variants as input. What I have now are variants at the genomic level. Does anyone know of a tool that can convert genomic VCF files into input files for PROVEAN? Thanks.
Did snpEff list the coding variants? You can use that to filter the VCF and keep only the relevant variants (as a first step).
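A rough sketch of that filtering step, in case it helps. It assumes the VCF carries snpEff's pipe-delimited `ANN` field in INFO (older snpEff versions write `EFF` instead, which this does not handle), with the effect in the 2nd subfield, the transcript ID in the 7th, and HGVS.p in the 11th. The function names are made up for illustration. It keeps only missense annotations and rewrites `p.Gly12Asp` as `G12D`; whether that one-letter style is exactly what your PROVEAN version expects is worth checking against its documentation.

```python
import re

# Three-letter to one-letter amino acid codes.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple substitutions such as p.Gly12Asp.
HGVS_P_SUBST = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def hgvs_p_to_one_letter(hgvs_p):
    """Convert 'p.Gly12Asp' to 'G12D'; return None for anything else."""
    match = HGVS_P_SUBST.match(hgvs_p)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    if ref not in AA3_TO_1 or alt not in AA3_TO_1:
        return None
    return f"{AA3_TO_1[ref]}{pos}{AA3_TO_1[alt]}"


def missense_from_vcf_line(line):
    """Return (transcript_id, variant) pairs for missense ANN entries."""
    if line.startswith("#"):
        return []
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return []
    # INFO is the 8th VCF column; find the ANN=... entry in it.
    ann = next(
        (kv[4:] for kv in fields[7].split(";") if kv.startswith("ANN=")),
        None,
    )
    if ann is None:
        return []
    results = []
    for entry in ann.split(","):  # one entry per allele/transcript pair
        parts = entry.split("|")
        if len(parts) < 11:
            continue
        if "missense_variant" not in parts[1].split("&"):
            continue
        variant = hgvs_p_to_one_letter(parts[10])
        if variant is not None:
            results.append((parts[6], variant))
    return results
```

Run it over each line of the annotated VCF and group the variants by transcript ID, so that each protein ends up with its own list of variants.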
Thanks. Yes, there are coding variants, but PROVEAN also needs the whole protein sequence. After building the database with snpEff, there is only one .bin file. Do you know how to get the whole protein sequence? The only way I can think of is to use the genome annotation file, but that would be a bit tricky.
Standalone PROVEAN? I've only used the web-based PROVEAN. If you know the start codon location, translation shouldn't be a huge problem. I can't recall any tool that gives you protein mutations from nucleotide changes, sorry :-(
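For what it's worth, the translation idea can be sketched in a few lines of plain Python. This assumes you have already extracted the spliced coding sequence (CDS) from the start codon onward, on the coding strand (reverse-complement it first for minus-strand genes), and that you know the variant's 1-based position within that CDS. It uses the standard genetic code, and the function names are only illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a CDS up to (not including) the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def protein_variant(cds, cds_pos, alt_base):
    """Return e.g. 'G2D' for a single-base change, or None if synonymous.

    cds_pos is the 1-based position of the changed base within the CDS.
    """
    cds = cds.upper()
    idx = cds_pos - 1
    if not 0 <= idx < len(cds):
        raise ValueError("cds_pos is outside the CDS")
    mutated = cds[:idx] + alt_base.upper() + cds[idx + 1:]
    start = idx - idx % 3  # first base of the affected codon
    ref_aa = CODON_TABLE.get(cds[start:start + 3], "X")
    alt_aa = CODON_TABLE.get(mutated[start:start + 3], "X")
    if ref_aa == alt_aa:
        return None
    return f"{ref_aa}{idx // 3 + 1}{alt_aa}"


if __name__ == "__main__":
    cds = "ATGGGTGCCTAA"
    print(translate(cds))                # protein sequence for the FASTA file
    print(protein_variant(cds, 5, "A"))  # GGT -> GAT in codon 2
```

The output of `translate` would go into the protein FASTA file and the variant strings into the variation file. Indels and changes that hit the stop codon need separate handling and are not covered here.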
Hi @shl198, I have the same issue. Did you find a solution?
Do not add answers if you're not answering the top level question. Use Comments instead.