How to get input for provean from vcf files annotated by snpeff?
10.0 years ago
shl198 ▴ 440

Hi all,

I called variants using GATK and annotated the results using snpEff. Since the organism is the Chinese hamster, there is not much information available. I want to use PROVEAN to predict the effects of the variants. The problem is that for organisms other than human or mouse, PROVEAN requires the amino acid sequence and the amino acid variants as input. What I have now are variants at the genomic level. Does anyone know of a tool that can convert a genomic VCF file into input files for PROVEAN? Thanks.

snpeff vcf provean • 4.4k views

Did snpEff list the coding variants? As a first step, you can use that annotation to filter the VCF and keep only the relevant variants.
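A minimal sketch of that filtering step, assuming the VCF carries snpEff's `ANN` INFO field (older snpEff releases wrote an `EFF` field with a different layout, which this does not handle). It keeps only `missense_variant` annotations and rewrites the HGVS.p value (e.g. `p.Ala123Val`) into one-letter notation (`A123V`), which is my understanding of what PROVEAN's variant file expects. The function name `missense_from_vcf_line` is made up for illustration.

```python
import re

# Three-letter to one-letter amino acid codes.
AA3_TO_1 = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple substitutions such as p.Ala123Val.
MISSENSE_RE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def missense_from_vcf_line(line):
    """Yield (transcript_id, variant) pairs, e.g. ("TX1", "A123V"),
    for every missense annotation on one VCF record.

    Hypothetical helper; assumes the snpEff ANN sub-field order
    Allele|Annotation|Impact|Gene_Name|Gene_ID|Feature_Type|Feature_ID|
    Transcript_BioType|Rank|HGVS.c|HGVS.p|...
    """
    if line.startswith("#"):
        return  # header line, nothing to yield
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return
    for entry in fields[7].split(";"):  # INFO column
        if not entry.startswith("ANN="):
            continue
        for ann in entry[4:].split(","):  # one annotation per transcript
            parts = ann.split("|")
            if len(parts) < 11:
                continue
            if "missense_variant" not in parts[1].split("&"):
                continue
            match = MISSENSE_RE.match(parts[10])
            if not match:
                continue
            ref3, pos, alt3 = match.groups()
            if ref3 in AA3_TO_1 and alt3 in AA3_TO_1:
                yield parts[6], f"{AA3_TO_1[ref3]}{pos}{AA3_TO_1[alt3]}"
```

Running this over every line of the annotated VCF and grouping the results by transcript ID would give one variant list per protein. Indels and stop-gained variants are skipped here and would need their own handling.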

Thanks. Yes, there are coding variants, but PROVEAN also needs the whole protein sequence. After building the database with snpEff, there is only one .bin file. Do you know how to get the whole protein sequence? The only way I can think of is to use the genome annotation file, but that would be a little tricky.

Standalone PROVEAN? I've only used the web-based PROVEAN. If you know the start codon location, translation shouldn't be a huge problem. I can't recall any tool that gives you protein mutations from nucleotide changes, sorry :-(
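To illustrate the translation step, here is a small sketch that assumes you already have each transcript's spliced coding sequence, starting at the start codon and reverse-complemented for minus-strand genes. Extracting that from the genome FASTA and annotation file is left to another tool. It uses the standard genetic code, and the helper names `translate_cds` and `check_variant` are invented for this example. The second helper is a sanity check that the reference residue of a variant such as `A2V` really sits at that position in the translated protein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in the order
# TTT, TTC, TTA, TTG, TCT, ... with the first base varying slowest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a spliced CDS (assumed to begin at the start codon).

    Stops at the first stop codon; codons containing N or other
    ambiguity codes are translated as X.
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def check_variant(protein, variant):
    """Return True if the reference residue of a one-letter substitution
    (e.g. "A2V") matches the protein at that 1-based position."""
    ref, pos = variant[0], int(variant[1:-1])
    return 0 < pos <= len(protein) and protein[pos - 1] == ref


protein = translate_cds("ATGGCCTTTTAA")
print(protein)                        # MAF
print(check_variant(protein, "A2V"))  # True
```

A mismatch from `check_variant` usually means the protein was built from a different transcript or annotation version than the one snpEff used, so it is worth checking before submitting anything to a predictor.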

Hi @shl198, I have the same issue. Did you find a solution?

Do not add answers if you're not answering the top level question. Use Comments instead.

4.0 years ago
predeus ★ 2.1k

Unfortunately, it's not easy, because this requires mutation effect predictors that run from the command line, and most of those are old and work poorly. I've spent some time trying to get PROVEAN to work, for example, but without any luck.

One tool that works brilliantly (and is well supported) is SIFT4G, a re-implementation of the older SIFT. You can get all the instructions here: https://github.com/rvaser/sift4g

You'll need to work out the calibration (cutoffs) since databases change all the time.
