Hi,
I've been using the Ensembl Variant Effect Predictor (VEP) to determine the effect of variants in a non-model organism. Although it is a non-model organism, it has a reference genome and a VEP database hosted on the Ensembl Fungi portal.
So I ran VEP with a multi-sample VCF as input and got a final VCF annotated with the consequence of each variant. Looking at the results, I noticed a variant that supposedly introduces a stop codon (stop_gained) in some samples, as you can see in the first line below.
Allele Consequence IMPACT SYMBOL Gene Feature_type Feature BIOTYPE EXON
2055 T stop_gained HIGH ATEG_09961 CADATEAG00000507 Transcript CADATEAT00000507 protein_coding 8/8
2059 C missense_variant MODERATE ATEG_09961 CADATEAG00000507 Transcript CADATEAT00000507 protein_coding 8/8
     INTRON  HGVSc  HGVSp  cDNA_position  CDS_position  Protein_position  Amino_acids  Codons   Existing_variation  DISTANCE
2055 -       -      -      8260           8260          2754              Q/*          Cag/Tag  -                   -
2059 -       -      -      8262           8262          2754              Q/H          caG/caC  -                   -
STRAND SYMBOL_SOURCE
2055 1 CADRE
2059 1 CADRE
But if you look closely, there is also a second variant in the same codon (same Protein_position), two bases downstream of the first one. Shouldn't VEP take into account all variants in the same codon before reporting the consequence?
If we also consider the second variant, and suppose that a sample is homozygous for the alternate allele at both positions, the resulting codon is TaC, which translates to tyrosine rather than a stop codon.
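To make the point concrete, here is a minimal Python sketch of what I mean. It is not VEP code: the function name `combined_codon` is my own, and it assumes the standard genetic code and that both alternate alleles sit on the same haplotype (for example, a sample homozygous at both sites). It applies the SNVs at CDS positions 8260 and 8262 to the reference codon Cag, one at a time and then together.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def combined_codon(ref_codon, snvs):
    """Apply SNVs given as (1-based CDS position, alt base) to one reference codon.

    Hypothetical helper: assumes all SNVs fall in the same codon and are
    carried on the same haplotype.
    """
    codon_numbers = {(pos - 1) // 3 for pos, _ in snvs}
    if len(codon_numbers) > 1:
        raise ValueError("SNVs are not all in the same codon")
    codon = list(ref_codon.upper())
    for pos, alt in snvs:
        codon[(pos - 1) % 3] = alt.upper()  # offset of the base within the codon
    return "".join(codon)


ref = "CAG"  # reference codon at Protein_position 2754 (Q)
first_only = combined_codon(ref, [(8260, "T")])
second_only = combined_codon(ref, [(8262, "C")])
both = combined_codon(ref, [(8260, "T"), (8262, "C")])

for label, codon in [("ref", ref), ("8260 only", first_only),
                     ("8262 only", second_only), ("both", both)]:
    print(label, codon, CODON_TABLE[codon])
# ref CAG Q
# 8260 only TAG *
# 8262 only CAC H
# both TAC Y
```

Taken separately, the two variants reproduce the stop_gained and missense_variant calls in the table, but taken together on one haplotype they give a Q/Y missense change.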
I don't know whether I did something wrong or missed a parameter in my analysis, but it seems a bit strange to me that VEP's algorithm doesn't handle variants located in the same codon.