I am having a hard time deciphering the real meaning of the warnings issued by SnpEff.
The SnpEff docs state:
WARNING_TRANSCRIPT_INCOMPLETE -- A protein coding transcript having a non-multiple of 3 length. It indicates that the reference genome has missing information about this particular transcript.
WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS -- A protein coding transcript has two or more STOP codons in the middle of the coding sequence (CDS). This should not happen and it usually means the reference genome may have an error in this transcript.
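To make my reading of these two definitions concrete, here is a minimal Python sketch of the checks as I understand them. It is not SnpEff's actual implementation: the function name `check_cds` is hypothetical, it assumes the standard genetic code (TAA, TAG, TGA as stops), and it treats "the middle of the CDS" as every codon except the last one.

```python
# Sketch of the two checks as described in the docs (my interpretation only).
# Assumption: standard genetic code; other codon tables use different stops.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds):
    """Return the warning names a coding sequence would trigger under this reading."""
    cds = cds.upper()
    warnings = []

    # A CDS whose length is not a multiple of 3 cannot be split into whole codons.
    if len(cds) % 3 != 0:
        warnings.append("WARNING_TRANSCRIPT_INCOMPLETE")

    # Split into complete codons, ignoring any trailing 1-2 bases.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # Count stops everywhere except the final codon (the expected terminator).
    internal_stops = sum(codon in STOP_CODONS for codon in codons[:-1])
    if internal_stops >= 2:
        warnings.append("WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS")

    return warnings
```

For example, `check_cds("ATGAAATA")` flags the sequence as incomplete because it has 8 bases, while a 9-base sequence ending in a single stop codon triggers nothing.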
I am using the standard S. cerevisiae genome as a reference (the database provided with SnpEff).
The input VCF contains variant calls for ~80 strains, produced by a bowtie2 and GATK pipeline.
I run SnpEff as follows:
java -Xmx6g -jar /bin/snpEffv3.6/snpEff.jar eff -c /bin/snpEffv3.6/snpEff.config EF3.64 -ud 1000 -v Scerevisiae.vcf > results.vcf
It reports:
WARNINGS: Some warning were detected
Warning type                               Number of warnings
WARNING_TRANSCRIPT_INCOMPLETE              6459
WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS    425
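To see which variants carry these warnings, I put together the rough Python sketch below. It assumes, and I have not confirmed this against the docs, that the warning names appear verbatim in the annotated lines of results.vcf. The function name `count_warning_lines` is my own. It counts variant lines rather than individual annotations, so its numbers need not match the summary above.

```python
from collections import Counter

# Warning names taken from the SnpEff summary shown above.
WARNING_NAMES = (
    "WARNING_TRANSCRIPT_INCOMPLETE",
    "WARNING_TRANSCRIPT_MULTIPLE_STOP_CODONS",
)


def count_warning_lines(vcf_lines):
    """Count non-header VCF lines that mention each warning name at least once."""
    counts = Counter()
    for line in vcf_lines:
        # Skip VCF header lines ("##..." metadata and the "#CHROM" column row).
        if line.startswith("#"):
            continue
        for name in WARNING_NAMES:
            if name in line:
                counts[name] += 1
    return counts


# Usage (assumes results.vcf is the annotated output of the command above):
# with open("results.vcf") as handle:
#     print(count_warning_lines(handle))
```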
Can anyone explain what these warnings mean in practice?
Thank you