Reference Guided Transcriptome Assembly
10.8 years ago

I would like to assemble transcripts of several chromosomes. I do have sequences of these chromosomes from related species and would therefore like to do reference-guided transcriptome assembly.

I have two concerns:

1. Which programs should I use? I have read about Cufflinks, though I have very little experience with it, and it seems to produce only a GTF file rather than the sequences of all my isoforms. Since my reference is a related species, I think the differences will be too large to be captured by a GTF/BED file alone.

Is Velvet's Columbus an option? Which program would you recommend?

2. If I use sequences from several related species as references, their gene content will overlap heavily. I suppose this will make it impossible to align many of my reads uniquely. Should I therefore use only one species as the reference at a time?

Thanks a lot for advice.

reference assembly transcriptome • 4.2k views
10.8 years ago
Adrian Pelin ★ 2.6k

The first thing I always try is a de novo assembly. Once you have your transcripts, you can use blastn, or perhaps tblastx, to see which ones are contaminants and which ones come from your chromosomes.
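As a rough illustration of that screening step, the sketch below reads BLAST tabular output (`-outfmt 6` with the default twelve columns) and splits contigs into those with a hit on the reference chromosomes and those without. The function names, the e-value cutoff of 1e-10, and the example IDs are assumptions made for illustration, not part of any tool; contigs without a hit are only candidates for contamination and would still need checking against other databases.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-10):
    """Return {contig_id: (subject_id, bitscore)} with the top-scoring hit per contig.

    max_evalue is a hypothetical cutoff; tune it for your data.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(BLAST6_FIELDS, row))
        if float(rec["evalue"]) > max_evalue:
            continue  # hit too weak to count
        score = float(rec["bitscore"])
        if rec["qseqid"] not in best or score > best[rec["qseqid"]][1]:
            best[rec["qseqid"]] = (rec["sseqid"], score)
    return best


def split_contigs(all_ids, hits):
    """Split contig IDs into (with a reference hit, without a reference hit)."""
    placed = sorted(i for i in all_ids if i in hits)
    unplaced = sorted(i for i in all_ids if i not in hits)
    return placed, unplaced


# Made-up example rows: contig1 hits chr1 twice, contig2 has only a weak hit.
example = (
    "contig1\tchr1\t95.0\t500\t25\t0\t1\t500\t1000\t1499\t1e-150\t800\n"
    "contig1\tchr2\t90.0\t200\t20\t0\t1\t200\t50\t249\t1e-40\t250\n"
    "contig2\tchr1\t80.0\t30\t6\t0\t1\t30\t10\t39\t0.5\t30\n"
)
hits = best_hits(io.StringIO(example))
placed, unplaced = split_contigs(["contig1", "contig2", "contig3"], hits)
```

In practice the handle would be the file written by blastn or tblastx, and `all_ids` would come from the FASTA headers of the de novo assembly.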

If your reference assembly is well annotated, then just extract your predicted genes and use Cufflinks to calculate FPKMs.
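For orientation, FPKM is fragments per kilobase of transcript per million mapped fragments. The sketch below computes the plain textbook definition from a fragment count, a transcript length, and the library size. It is not how Cufflinks estimates abundance internally (Cufflinks fits a statistical model over isoforms), and the function names and the threshold of 1.0 are assumptions for illustration.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene, threshold=1.0):
    """Return gene IDs whose FPKM reaches a (hypothetical) expression threshold."""
    return sorted(g for g, value in fpkm_by_gene.items() if value >= threshold)


# Made-up example: 500 fragments on a 2,000 bp gene in a 10 million fragment library.
value = fpkm(500, 2000, 10_000_000)
calls = expressed_genes({"geneA": value, "geneB": 0.0})
```

Any threshold for calling a gene expressed is a judgment call and depends on sequencing depth and the question being asked.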


Thanks. I did both: a de novo assembly, followed by blastn and tblastx to see how well my newly assembled transcripts align to the chromosomes of my closely related species. Now I want to see whether a reference-guided approach could improve my transcripts.


By looking at calculated FPKMs, you can see which predicted genes are expressed and which ones are not.

10.8 years ago
Prakki Rama ★ 2.7k

As far as I know, the GTF file you generated contains coordinates on your reference species. Even if you extract the regions at those coordinates, you will not get the transcripts of your own species; you will only get the corresponding regions of the reference sequence.
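To make the point concrete, here is a minimal sketch of what extracting by GTF coordinates amounts to: exon intervals (1-based, inclusive, as in the GTF format) are sliced out of the reference and joined. Every base in the output comes from the reference, whatever the reads looked like. The function names and the tiny example reference are hypothetical, and the parser handles only `exon` lines with a `transcript_id` attribute.

```python
import re
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def exons_by_transcript(gtf_lines):
    """Collect (chrom, start, end, strand) exon tuples per transcript_id."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append(
                (fields[0], int(fields[3]), int(fields[4]), fields[6]))
    return exons


def spliced_sequences(gtf_lines, reference):
    """Join exon slices of the REFERENCE for each transcript.

    reference is a dict {chromosome_name: sequence}. The result contains
    reference bases only; it is not the sequence of the sampled species.
    """
    result = {}
    for tid, exon_list in exons_by_transcript(gtf_lines).items():
        exon_list.sort(key=lambda e: e[1])  # order exons by genomic start
        # GTF is 1-based inclusive, so [start-1:end] in Python slicing.
        seq = "".join(reference[c][s - 1:e] for c, s, e, _ in exon_list)
        if exon_list[0][3] == "-":
            seq = reverse_complement(seq)
        result[tid] = seq
    return result


# Made-up reference and annotation.
reference = {"chr1": "AAACCCGGGTTT"}
gtf = [
    'chr1\tCufflinks\texon\t1\t3\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tCufflinks\texon\t7\t9\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tCufflinks\texon\t1\t4\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
sequences = spliced_sequences(gtf, reference)
```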

When assembling a transcriptome for a species that has no genome of its own, only a reference species' transcriptome, you could try a consensus reference assembly: your reads are mapped onto the reference and you derive consensus sequences for your species from them.
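The idea of a consensus call can be sketched with a toy example. It assumes the reads have already been placed on the reference without gaps, given as (0-based start, read sequence) pairs, and takes a majority vote per position. Real workflows use an aligner and a variant or consensus caller and must handle indels and base qualities, which this sketch ignores. The function name, the `min_depth` parameter, and the lower-case convention for uncovered positions are assumptions for illustration.

```python
from collections import Counter


def consensus(reference, placed_reads, min_depth=1):
    """Majority-vote consensus over ungapped read placements.

    placed_reads: iterable of (start, read_sequence), start being 0-based.
    Positions covered by fewer than min_depth reads keep the reference base,
    written in lower case to flag that it is not supported by reads.
    """
    columns = [Counter() for _ in reference]
    for start, read in placed_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    called = []
    for ref_base, counts in zip(reference, columns):
        depth = sum(counts.values())
        if depth < min_depth:
            called.append(ref_base.lower())
        else:
            called.append(counts.most_common(1)[0][0])
    return "".join(called)


# Made-up example: the reads consistently show C where the reference has G.
reference = "ACGTACGT"
reads = [(0, "ACCT"), (1, "CCTA"), (2, "CTAC")]
result = consensus(reference, reads)
```

Here the third base is called as C because all three reads disagree with the reference, while the last two positions stay as lower-case reference bases because no read covers them.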

You can check the post "Reference Assembly - Mapping Reads To A Reference Genome" for much more information on how to do it.

-Rama
