Hi
I am using the AUGUSTUS gene prediction software because my organism is a unicellular eukaryote. Within my nucleotide dataset, I am mainly interested in membrane proteins.
I trained AUGUSTUS using a genome file and a protein file as the training input. The protein file contained some incomplete transmembrane protein sequences. After training, I used the trained model to predict genes, and I found that a single gene was often predicted as several separate genes. This happened mostly with transmembrane proteins.
I don't have a closely related species whose transmembrane protein sequences are available with annotation. Any suggestions for overcoming this problem?
Thank you,
raghul
Thanks for the answer!
Is it acceptable to include protein sequences from three related species in the "training" set for gene prediction? Would this introduce errors, or would the additional information improve the gene predictions?