Hello all,
I have just completed my first RNA-seq analysis using edgeR.
Since I am working on a non-model organism, I do not have an annotation file or corresponding gene names for it. I obtained 22864 transcripts by using the reference transcript data of the closest relative. My question is: how can I assign gene names to those transcripts? BLAST is one way, but it does not seem practical for such a big data set. I have run InterProScan, but that only gives me domains. This information is of limited help, since several genes may share the same domains. Can somebody please tell me any method or software that I can use to give proper functional names to my transcripts?
Thank you in advance
Amol
Hi, I think the easiest method for functional annotation is homology search, i.e., BLAST analysis. You probably cannot use NCBI NR as the database, but you can use the genes/proteins from your close relative if they are already annotated. Otherwise, you will have to create your own database from a set of related organisms. Either way, any method you choose will be computationally demanding and time-consuming.
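To make the homology-search idea a bit more concrete, here is a minimal Python sketch of the last step: taking BLAST tabular output (`-outfmt 6`, default 12 columns) and keeping the best hit per transcript as its putative name. It assumes you have already run something like blastx of your transcripts against the relative's annotated proteins. The IDs, the example rows, the `names` lookup table, and the e-value cutoff of 1e-5 are all made up for illustration, so adjust them to your data.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: (subject_id, evalue, bitscore)} with the
    highest-bitscore hit per query that passes the e-value cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        evalue = float(rec["evalue"])
        bitscore = float(rec["bitscore"])
        if evalue > max_evalue:
            continue  # hit too weak to transfer a name
        current = best.get(rec["qseqid"])
        if current is None or bitscore > current[2]:
            best[rec["qseqid"]] = (rec["sseqid"], evalue, bitscore)
    return best


def annotate(transcript_ids, hits, names):
    """Map each transcript to the name of its best hit, or 'unannotated'."""
    labels = {}
    for tid in transcript_ids:
        if tid in hits:
            subject = hits[tid][0]
            labels[tid] = names.get(subject, subject)
        else:
            labels[tid] = "unannotated"
    return labels


# Hypothetical example data: three hits for two transcripts.
example = (
    "tx1\tprotA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "tx1\tprotB\t80.0\t150\t30\t0\t1\t150\t1\t150\t1e-40\t160\n"
    "tx2\tprotC\t35.0\t60\t39\t0\t1\t60\t1\t60\t0.5\t30\n"
)
# Hypothetical protein-ID-to-name table taken from the relative's annotation.
names = {"protA": "example protein A", "protB": "example protein B"}

hits = best_hits(io.StringIO(example))
labels = annotate(["tx1", "tx2"], hits, names)
for tid, label in labels.items():
    print(tid, label)
```

With a real file you would pass `open("blast_results.tsv")` (a hypothetical file name) instead of the `io.StringIO` object. Keep in mind that a best-hit name is only a putative annotation, and transcripts without a hit below the cutoff stay unannotated.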
Gene/transcript/protein annotation is a complex task, and you should have a look at some papers about de novo genome assembly and annotation as well. This might give you a feeling for the challenges that are waiting for you ;-)