Dear all,
Currently, I am analyzing variants in genomes of Aspergillus fumigatus strains and would like to predict the consequences of amino acid changes, frameshifts, etc. For human genome data I am familiar with SIFT, PolyPhen, etc. and routinely use them for annotation with ANNOVAR.
So far, I couldn't find any preconfigured databases for A. fumigatus, so I tried to build my own genome-scale variant predictor database with PROVEAN (though I would be happy with any prediction tool). As I understand it, I would have to generate a list of every single possible nucleotide exchange in the genome, but it is not clear to me how to get to that point. Ideally, the output would be a list that ANNOVAR could use for annotation. Maybe someone else has already run into these issues with custom genomes and could share some knowledge?
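To make the step I am asking about concrete, here is a minimal sketch of what I mean by "every single possible nucleotide exchange". It assumes the reference genome is a plain FASTA file and writes the five-column ANNOVAR input layout (chromosome, start, end, ref, alt). The function names and the file names in the usage note are made up for illustration, and this is not a tested pipeline.

```python
import sys
from typing import Iterable, Iterator, TextIO, Tuple

BASES = "ACGT"


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Keep only the first word of the header as the sequence name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def all_snvs(name: str, seq: str) -> Iterator[Tuple[str, int, int, str, str]]:
    """Yield every possible substitution as (chrom, start, end, ref, alt), 1-based."""
    for pos, ref in enumerate(seq, start=1):
        if ref not in BASES:
            continue  # skip N and other ambiguity codes
        for alt in BASES:
            if alt != ref:
                yield name, pos, pos, ref, alt


def write_avinput(fasta_lines: Iterable[str], out: TextIO) -> int:
    """Write all substitutions as tab-separated lines; return the number written."""
    count = 0
    for name, seq in read_fasta(fasta_lines):
        for record in all_snvs(name, seq):
            out.write("\t".join(map(str, record)) + "\n")
            count += 1
    return count


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python all_snvs.py genome.fa > all_snvs.avinput
    with open(sys.argv[1]) as handle:
        write_avinput(handle, sys.stdout)
```

This yields three lines per unambiguous base, so the output grows to three times the genome length. Since PROVEAN and similar tools score amino acid substitutions, the list would presumably have to be restricted to coding positions using the gene annotation before any prediction step.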
Thanks!
Thanks for the feedback Charles!
I have already worked with SnpEff and it annotates variants nicely. My problem is rather to create a protein effect database that SnpEff/ANNOVAR can use to annotate a given (amino acid changing) variant with its predicted effect on the protein, e.g. damaging/deleterious/tolerated. For human genomes this is not a problem since PolyPhen/SIFT databases are readily available, but this is not the case for A. fumigatus (as far as I know), meaning I would have to create such a database from scratch. I am currently stuck at this step :(
What about VEP/Ensembl? ftp://ftp.ensemblgenomes.org/pub/release-28/fungi/vep/
Thanks for the hint! I already checked VEP from Ensembl, but it seems to lack any PolyPhen/SIFT information (or I couldn't find it?). However, I found SIFT databases for A. fumigatus on (surprise) the SIFT homepage, and they can easily be adapted for annotation with ANNOVAR:
http://sift-db.bii.a-star.edu.sg/public/Aspergillus_fumigatus/