How To Find Novel Genes From Comparative Transcriptomics?
10.8 years ago

We have sequenced from two tissue of plants which are distantly related but are from same species. I want to compare the transcriptomes of these two plants and find novel genes from the comparison. How can I do that? Any suggestions?

I am thinking of the following workflow:
1. De novo assembly of the two plant transcriptomes separately
2. Gene expression analysis for each de novo assembly separately
3. Find novel overlapping genes between the two plants.

Is this workflow correct? Please correct me if I am wrong.

transcript • 4.8k views

Please let me know if this question is not clear.


Here are some issues:

  • "We have sequenced from two tissue of plants which are distantly related but are from same species." What do you mean by that?
  • We are not going to steal your ideas ;) please tell us the name of the species, and maybe also the tissues.
  • Is there a reference genome?
  • Is there already a gene prediction?
  • You say you want "novel genes". Novel with respect to what: the existing predictions for these plants, or novel to the world of genes?
10.7 years ago
jackuser1979 ▴ 890

I assume you don't have a reference genome. In that case you can do a comprehensive de novo assembly combining the sequence reads from both plants, then map each plant's reads back to that assembly and do differential gene expression analysis. Differential expression tools such as DESeq, edgeR or Cufflinks can do this. I recommend Cufflinks, which can do transcript assembly and find novel genes and transcripts. Refer to the Cufflinks manual for more information.
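
To make the "map reads back, then compare expression" step concrete, here is a minimal Python sketch. It is not DESeq, edgeR or Cufflinks: it only normalises read counts to counts per million (CPM) and reports a log2 fold change, with no replicates and no statistical test. The transcript IDs and counts are made up for illustration, and the function names are hypothetical.

```python
import math

def counts_per_million(counts):
    """Scale raw read counts so that the sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {tid: c * 1e6 / total for tid, c in counts.items()}

def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Return log2(CPM_b / CPM_a) per transcript.

    The pseudocount avoids division by zero and log(0) for transcripts
    that have no reads in one of the two samples.
    """
    cpm_a = counts_per_million(counts_a)
    cpm_b = counts_per_million(counts_b)
    transcripts = sorted(set(cpm_a) | set(cpm_b))
    return {
        tid: math.log2(
            (cpm_b.get(tid, 0.0) + pseudocount)
            / (cpm_a.get(tid, 0.0) + pseudocount)
        )
        for tid in transcripts
    }

# Hypothetical read counts per transcript of the combined assembly.
plant_a = {"tr1": 500, "tr2": 500, "tr3": 0}
plant_b = {"tr1": 500, "tr2": 0, "tr3": 500}

lfc = log2_fold_changes(plant_a, plant_b)
for tid, value in lfc.items():
    print(f"{tid}\t{value:+.2f}")
```

A transcript with a strongly positive or negative value is only a candidate; a dedicated tool with biological replicates is needed to call it differentially expressed.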

10.7 years ago
Adrian Pelin ★ 2.6k

Based on your question, I suggest de novo assembly with Oases, then jumping to step 3 and finding unique/overlapping genes using tblastx with an E-value cutoff of 1e-5. Once you have found something interesting, you can quantify expression.
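
As a rough sketch of the unique/overlapping step, the Python below splits one plant's transcripts into those with a tblastx hit in the other assembly at E-value <= 1e-5 and those without. It assumes the search was run with tabular output (`-outfmt 6`), where the query ID is column 1 and the E-value is column 11. The file names in the comments, the transcript IDs and the function names are hypothetical.

```python
import csv
import io

# Assumed commands that would produce the tabular input (file names are placeholders):
#   makeblastdb -in plantB.fasta -dbtype nucl -out plantB_db
#   tblastx -query plantA.fasta -db plantB_db -evalue 1e-5 -outfmt 6 -out A_vs_B.tsv

def best_hits(tabular_text, max_evalue=1e-5):
    """Map each query ID to its lowest E-value among hits passing the cutoff."""
    hits = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= max_evalue and (query_id not in hits or evalue < hits[query_id]):
            hits[query_id] = evalue
    return hits

def split_shared_unique(all_query_ids, tabular_text, max_evalue=1e-5):
    """Return (shared, unique) query IDs according to the E-value cutoff."""
    hits = best_hits(tabular_text, max_evalue)
    shared = sorted(q for q in all_query_ids if q in hits)
    unique = sorted(q for q in all_query_ids if q not in hits)
    return shared, unique

# Two made-up hit lines in outfmt 6 column order.
example = "\n".join([
    "\t".join(["A_tr1", "B_tr7", "85.0", "120", "18", "0", "1", "360", "10", "369", "1e-30", "150"]),
    "\t".join(["A_tr2", "B_tr9", "40.0", "30", "18", "0", "1", "90", "5", "94", "0.5", "25"]),
])

shared, unique = split_shared_unique(["A_tr1", "A_tr2", "A_tr3"], example)
print("shared:", shared)
print("unique:", unique)
```

Running the search in both directions and repeating the split for the second plant gives the transcripts unique to each side.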


Why use tblastx? Isn't it extremely slow? He can use blastx instead, no?


Think about what you are suggesting. He has 2 databases of assembled transcripts, and both are at the nucleotide (nt) level. How can he use blastx? blastx requires the query to be nt and the database to be protein.
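
The query/database constraint can be written down as a small lookup. This is only an illustrative Python sketch of which standard BLAST programs accept which sequence types; the function name is hypothetical.

```python
# Which BLAST programs fit a given (query type, database type) pair.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["blastn", "tblastx"],  # tblastx translates both sides
    ("protein", "protein"): ["blastp"],
    ("nucleotide", "protein"): ["blastx"],    # query is translated
    ("protein", "nucleotide"): ["tblastn"],   # database is translated
}

def programs_for(query_type, db_type):
    """Return the BLAST programs usable for this query/database combination."""
    return BLAST_PROGRAMS.get((query_type, db_type), [])

# Two assembled transcriptomes are both nucleotide, so blastx is not an option.
options = programs_for("nucleotide", "nucleotide")
print(options)
```

For two nucleotide transcript sets, tblastx compares them at the translated protein level, which is why it fits distantly related sequences better than blastn despite being slower.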


I was not suggesting, I was just wondering. If you are suggesting running tblastx with one assembled transcript database against the other, then tblastx makes sense. I apologise for the misunderstanding; obviously blastx is out in this case. I have some doubts about whether to find overlaps first and then do gene expression (GE) analysis, or vice versa. I suppose the other way round is better, but if you have some facts showing that it is better to find overlaps first and then do GE, it would be interesting to know.
