Gene IDs for a Trinity-assembled de novo transcriptome
4.2 years ago
bry.th • 0

So the RNA-seq data is for a non-model organism. The transcriptome was assembled using Trinity. However, Trinity has labelled the genes with its own made-up identifiers (the TRINITY_DN... names in the header lines):

>TRINITY_DN41182_c0_g1_i1 len=209 path=[1:0-208] [-1, 1, -2]
ATGGTGAGAACTGCCCATGTGATGGAGACTCAGTATGGCCATCTGTTTGAAAAGGTCATA
GTCAACGACGACCTCTCGACCGCCTTCAGCGAGCTGCGGTTGGCACTAAAGAAAGTGGAG
ACGGAGACTCACTGGGTTCCAGTCAGCTGGACCCACTCCTGAGATCCTCACAGACTGTAA
AGGGAGAAAAGGGAAGGACTTTGACAAAA

>TRINITY_DN41181_c0_g1_i1 len=207 path=[1:0-206] [-1, 1, -2]
TATGGACCCCCTCCTCCTCCCCCTGGCGAGTACGGCGGCCATGCTGAGTCTCCGGTTGTC
ATGGTGTACGGATTGGACCCCGTCAAGATGAACGCAGACCGTGTCTTCAACATCTTCTGT
CTCTATGGCAACGTAGAGCGGGTCAAGTTCATGAAGAGTAAGCCCGGAGCAGCCATGGTG
GAAATGGGAGACTGTTACGCGGTGGAT

This means that when you map the reads to the assembled reference, you get:

target_id                 length    eff_length  est_counts        tpm
TRINITY_DN34124_c0_g1_i1     205        27.253           0          0
TRINITY_DN34120_c0_g1_i1     236       34.7816          15    14.2884
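
For what it's worth, the Trinity names are not entirely arbitrary: in an ID like TRINITY_DN34120_c0_g1_i1, the trailing _i1 is the isoform number and everything before it (TRINITY_DN34120_c0_g1) is Trinity's "gene". Below is a minimal Python sketch of collapsing a table like the one above to gene-level counts. It assumes the table is a tab-separated file with the column names shown; the function names and the second isoform row in the toy data are hypothetical, added only for illustration.

```python
import csv
import io
import re
from collections import defaultdict

# Trinity transcript IDs end in _i<N>; the part before that is the Trinity "gene".
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def trinity_gene_id(transcript_id: str) -> str:
    """Strip the isoform suffix, e.g. ..._c0_g1_i1 -> ..._c0_g1."""
    return ISOFORM_SUFFIX.sub("", transcript_id)


def gene_level_counts(abundance_tsv) -> dict:
    """Sum est_counts per Trinity gene from a tab-separated abundance table.

    `abundance_tsv` is any open text file (or file-like object) whose header
    row contains at least `target_id` and `est_counts`.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(abundance_tsv, delimiter="\t"):
        totals[trinity_gene_id(row["target_id"])] += float(row["est_counts"])
    return dict(totals)


# Toy input: the two rows from the post plus one HYPOTHETICAL second isoform.
EXAMPLE = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "TRINITY_DN34124_c0_g1_i1\t205\t27.253\t0\t0\n"
    "TRINITY_DN34120_c0_g1_i1\t236\t34.7816\t15\t14.2884\n"
    "TRINITY_DN34120_c0_g1_i2\t300\t98.5\t5\t1.6\n"  # made up for illustration
)

if __name__ == "__main__":
    for gene, count in gene_level_counts(io.StringIO(EXAMPLE)).items():
        print(gene, count)
```

This only groups isoforms under Trinity's own gene label; it does not attach a real gene name, which still requires annotation against a reference.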

I need to use the sequences to look up gene IDs, but I don't know how to do this. The closest genomes I can find are in Ensembl, for S. orbicularis or A. percula, but I don't know how to use these to convert the Trinity output into something meaningful. I'm more comfortable using R, if possible, but obviously beggars can't be choosers.

RNA-Seq R Trinity • 2.7k views

You need to annotate the transcripts yourself using a program like MAKER (LINK) (eukaryotic genome) or Prokka (LINK) (bacterial genome). Be sure to remove any redundancy before you annotate (using something like CD-HIT).


Thanks! Trinotate sounds like what I would need. I'll check it out.

4.2 years ago
Dave Carlson ★ 2.0k

Note that Maker is used for gene prediction from whole genome assemblies (often using transcriptomes in the process). If all you have is the transcriptome assembly, something like Trinotate (LINK) might be a more appropriate choice.
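
To give a rough idea of what homology-based annotation boils down to: you search each transcript against a database of known sequences (for example with BLAST against proteins from a related species) and keep the best hit per transcript. The Python sketch below is not Trinotate's pipeline, just an illustration. It assumes you already have BLAST tabular output (-outfmt 6, the standard 12 columns); the function name, the E-value cutoff, and the PROT_A/PROT_B/PROT_C subject IDs are hypothetical.

```python
import csv
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast_tsv, max_evalue: float = 1e-5) -> dict:
    """Map each query (transcript) ID to its highest-bitscore subject ID.

    Hits with an E-value above `max_evalue` are ignored, so transcripts
    without a convincing hit are simply absent from the result.
    """
    best = {}  # qseqid -> (sseqid, bitscore)
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST6_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) > max_evalue:
            continue
        bitscore = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# HYPOTHETICAL hits, made up for illustration; only the query IDs come from the post.
EXAMPLE_HITS = (
    "TRINITY_DN41182_c0_g1_i1\tPROT_A\t85.0\t60\t9\t0\t1\t180\t10\t69\t1e-30\t110\n"
    "TRINITY_DN41182_c0_g1_i1\tPROT_B\t92.0\t65\t5\t0\t1\t195\t12\t76\t2e-38\t135\n"
    "TRINITY_DN41181_c0_g1_i1\tPROT_C\t40.0\t30\t18\t0\t1\t90\t5\t34\t0.5\t28.1\n"
)

if __name__ == "__main__":
    for transcript, subject in best_hits(io.StringIO(EXAMPLE_HITS)).items():
        print(transcript, subject)
```

The resulting transcript-to-subject table is the piece that lets you swap TRINITY_DN... labels for identifiers from a reference database. A dedicated pipeline adds further evidence, such as protein domains, on top of this.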


I second Trinotate (+ TransDecoder), as it is tightly integrated with Trinity and does a good job overall. There are other transcriptome annotation pipelines around, like dammit, Annocript, and Sma3s, but I have never used any of them and don't know how they perform.


I've used dammit before. It's very fast and easy to use. I think I like Trinotate better, though, mostly because it provides a somewhat wider set of annotation types and because of the integration with Trinity that you mentioned.
