8.2 years ago
haoyan14ioz
I used TransDecoder to predict ORFs, but many genes still have more than one transcript, as shown below.
>Zhze_TR100572|c0_g1_i1|m.127365 TR100572|c0_g1_i1|g.127365 ORF TR100572|c0_g1_i1|g.127365 TR100572|c0_g1_i1|m.127365 type:5prime_partial len:354 (-) TR100572|c0_g1_i1:322-1383(-)
TMTSAILRRNSSKQGLQNLIRLTAQWSVEDEEEAARERRRREREKQLRSQAEEGLNGTVS
CSESAALAQENHYDFKPSGTSELEEDEGFSDWSQKLEQRKQRSPRQSYEEENSGVREAEV
KLEQIQLDQECLEEKMVGREEGRLCQEEEEAQEQEEGEQAEQEEKKRRRNDGGKEEETPE
KRQKAPSLASLEEEELCSDHTAVCSTKITDRTESLNRSIQKSNSIKRSQPPLPVSKIDDR
LEQYTQAIETSTKAPKPVRQPSLDLPTTSMMVASTKSLWETGEVTAQSAVKPLACKDIVA
GDIVSKRSLWEQKGNPKPESSIKSIHPSGKKYKFVATGHGQYKKVLIDDAAEQ*
>Zhze_TR100572|c1_g1_i1|m.127368 TR100572|c1_g1_i1|g.127368 ORF TR100572|c1_g1_i1|g.127368 TR100572|c1_g1_i1|m.127368 type:complete len:184 (+) TR100572|c1_g1_i1:287-838(+)
MSDEEKKRRAATARRQHLKSAMLQLAATEIEKEAAAKEVEKQNYLAEHCPPLSLPGSMQE
LQDLCKKLHAKIESVDEERYDTEVKLQKTTKELEDLSQKLFDLRGKFKRPPLRRVRMSAD
AMLRALLGSKHKVCMDLRANLKQVKKEDTEKEKDLRDVGDWRKNIEEKSGMEGRKKMFEA
GES*
I would like to keep only the longest transcript to represent each gene for the downstream ortholog prediction, but I have no idea how to do this. Do you have any suggestions or scripts for this problem? I would really appreciate your help. Thank you.
Since these are translated ORFs (protein sequences), picking the longest sequence per gene gives you the longest CDS, not necessarily the longest transcript (cDNA). The longest transcript can encode a shorter CDS (i.e. a shorter protein) if it has long UTRs.
Hi Satya, thanks a lot for your kind reply. But how do I get the best transcript? I need the best one, as you suggested, to represent this gene. Do you have any scripts? Thanks again.
Hi haoyan14ioz,
https://groups.google.com/forum/#!searchin/trinityrnaseq-users/longest$20isoform|sort:relevance/trinityrnaseq-users/cXM1KiJe7dU/rzGAmSt4jc8J
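Following the point above about longest CDS versus longest transcript, here is a minimal sketch (not taken from the linked thread) that keeps the longest predicted protein per gene from a TransDecoder peptide FASTA. It assumes headers look like the records in the question, and that the gene is the part of the first header token before `_i<N>|` (e.g. `Zhze_TR100572|c0_g1`). The function names and the grouping regex are hypothetical. Under this assumption, the two records shown in the question (`c0_g1` and `c1_g1`) count as different genes; adjust `GENE_RE` if you want to group differently.

```python
import re

# Assumed header layout, copied from the records in the question:
#   >Zhze_TR100572|c0_g1_i1|m.127365 ... type:complete len:184 ...
# Assumption: the gene ID is the first token up to (not including) "_i<N>|".
GENE_RE = re.compile(r"^(?P<gene>.+)_i\d+\|")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_id(header):
    """Return the assumed gene ID; fall back to the full first token."""
    first_token = header.split()[0]
    match = GENE_RE.match(first_token)
    return match.group("gene") if match else first_token


def longest_orf_per_gene(lines):
    """Map gene ID -> (header, sequence) of its longest protein.

    Length is counted without the trailing stop symbol '*'.
    On ties, the first record seen is kept.
    """
    best = {}
    for header, seq in read_fasta(lines):
        gene = gene_id(header)
        length = len(seq.rstrip("*"))
        if gene not in best or length > len(best[gene][1].rstrip("*")):
            best[gene] = (header, seq)
    return best
```

To use it, pass an open file handle, e.g. `with open(path) as fh: best = longest_orf_per_gene(fh)`, where `path` is a placeholder for your own peptide file, then write out each `(header, sequence)` pair in `best.values()` as FASTA. Note that this selects by protein length only; it does not consider completeness (`type:complete` vs. partial) or expression level.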
Thank you so much. I will check it.