Hi! I did functional annotation (using KEGG, eggNOG, TRAPID, NR, UniProt, and Mercator) of 5 plant cultivars for my analysis. Long story short, I identified the genes that are private to each cultivar. Almost 70% of the genes in every cultivar have no functional domain. What can I do about them? What would be the best way to proceed with my analysis? I am mostly looking for your opinions here. I also functionally annotated the transcripts, but I am not sure how I can use them!
I have around 33-35k predicted genes per cultivar. No, I have not used InterProScan yet; I will run it on our server. By the way, I manually BLASTed a few random proteins with their online tool but got no functional domains.
I think complete functional domains might be rarer than you think. I'm not sure, but you can check some of the more recent comparative literature on what you might expect. This might be a good place to start:
https://biofunctionprediction.org/cafa/
Also, a lot of researchers look at GO terms, not just domains.
InterProScan does give results, but I don't think they are very reliable, because it annotates even small parts of a protein, e.g. amino acids 1-34 match Xxxx protein, and so on. I had read about CAFA before your comment, but I think it is still at an experimental stage, so we can't use it like the other databases? Please correct me if I am wrong! And thank you for taking part in this discussion. I already have GO terms from eggNOG; should I generate them separately as well? What do you suggest here?
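One way to put a number on the partial-match concern is to compute, for each protein, what fraction of its length is covered by at least one InterProScan match. Below is a minimal sketch, assuming the standard InterProScan TSV output where column 1 is the protein ID, column 3 the sequence length, and columns 7 and 8 the match start and stop. The function name `domain_coverage`, the protein IDs, and the signature accessions in the sample rows are made up for illustration; they are not real results.

```python
import csv
from collections import defaultdict


def domain_coverage(tsv_lines):
    """Return {protein_id: fraction of residues covered by >= 1 match}.

    Assumes InterProScan TSV columns: 0 = protein ID, 2 = sequence length,
    6 = match start, 7 = match stop (1-based, inclusive coordinates).
    """
    lengths = {}
    intervals = defaultdict(list)
    reader = csv.reader(tsv_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 8:
            continue  # skip blank or truncated lines
        protein = row[0]
        lengths[protein] = int(row[2])
        intervals[protein].append((int(row[6]), int(row[7])))

    coverage = {}
    for protein, spans in intervals.items():
        covered = 0
        cur_start, cur_stop = None, None
        # Merge overlapping matches so shared residues are counted once.
        for start, stop in sorted(spans):
            if cur_stop is None or start > cur_stop:
                if cur_stop is not None:
                    covered += cur_stop - cur_start + 1
                cur_start, cur_stop = start, stop
            else:
                cur_stop = max(cur_stop, stop)
        covered += cur_stop - cur_start + 1
        coverage[protein] = covered / lengths[protein]
    return coverage


# Hypothetical rows: protA has one short match, protB two overlapping ones.
sample = [
    "\t".join(["protA", "md5a", "340", "Pfam", "PF00000", "made-up domain", "1", "34"]),
    "\t".join(["protB", "md5b", "200", "Pfam", "PF11111", "made-up domain", "10", "110"]),
    "\t".join(["protB", "md5b", "200", "SMART", "SM00000", "made-up domain", "100", "190"]),
]
coverage = domain_coverage(sample)
for protein, fraction in sorted(coverage.items()):
    print(protein, round(fraction, 3))
```

On a real run you would pass an open file handle for the `.tsv` output instead of `sample`. A low coverage value does not by itself mean the hit is wrong, since many real domains are short, so any cutoff you choose is a judgment call rather than an established threshold.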