Get transcript sequence from RNA-seq
8.3 years ago
colin.kern ★ 1.1k

I have Cufflinks output from TopHat alignments and I want to get the sequences of the transcripts. So far I've been extracting the sequences from the reference genome, but I'm working in chicken, where the reference genome is constructed from the wild type, and I'm sequencing a very specialized breed, so I would really like to derive the transcript sequences from my own RNA-seq data. I've searched this site and elsewhere and found some solutions, such as generating VCF files with samtools, but they all seem geared towards getting a single sequence rather than thousands, and I think looping over these methods will be extremely slow. Is there a quicker way to get the full set of transcript sequences predicted by Cufflinks from the RNA-seq data?
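
To make the idea concrete, here is a minimal Python sketch of one way to avoid a per-transcript loop over external tools: apply the sample's variants to each chromosome once, then splice every transcript out of that consensus in a single pass over the GTF. Everything here is an assumption for illustration: the function names (`apply_snps`, `parse_gtf_exons`, `transcript_sequences`) are hypothetical, the toy genome and variants are made up, only single-base substitutions are handled (indels would shift coordinates and need extra bookkeeping), and the variant calling itself is not shown.

```python
import re
from collections import defaultdict

# Translation table used to complement bases for minus-strand transcripts.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def apply_snps(seq, snps):
    """Return seq with substitutions applied; snps maps 1-based position -> base."""
    bases = list(seq)
    for pos, alt in snps.items():
        bases[pos - 1] = alt
    return "".join(bases)


def parse_gtf_exons(lines):
    """Collect exon intervals per transcript_id from GTF lines (1-based, inclusive)."""
    transcripts = defaultdict(lambda: {"chrom": None, "strand": None, "exons": []})
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tx = transcripts[match.group(1)]
        tx["chrom"] = fields[0]
        tx["strand"] = fields[6]
        tx["exons"].append((int(fields[3]), int(fields[4])))
    return dict(transcripts)


def transcript_sequences(genome, transcripts):
    """Splice exons from genome (chrom -> sequence) for every transcript at once."""
    sequences = {}
    for tx_id, tx in transcripts.items():
        chrom_seq = genome[tx["chrom"]]
        spliced = "".join(chrom_seq[start - 1:end] for start, end in sorted(tx["exons"]))
        # Unstranded transcripts (strand ".") are left in genomic orientation.
        if tx["strand"] == "-":
            spliced = spliced.translate(COMPLEMENT)[::-1]
        sequences[tx_id] = spliced
    return sequences


# Toy data (made up): one chromosome, one SNP, two transcripts.
genome = {"chr1": "AAACCCGGGTTTACGT"}
snps = {"chr1": {5: "T"}}
gtf_lines = [
    "\t".join(["chr1", "Cufflinks", "exon", "1", "6", ".", "+", ".",
               'gene_id "CUFF.1"; transcript_id "CUFF.1.1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "10", "12", ".", "+", ".",
               'gene_id "CUFF.1"; transcript_id "CUFF.1.1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "4", "9", ".", "-", ".",
               'gene_id "CUFF.2"; transcript_id "CUFF.2.1";']),
]

# Build the sample-specific consensus once, then extract all transcripts from it.
consensus = {chrom: apply_snps(seq, snps.get(chrom, {})) for chrom, seq in genome.items()}
sequences = transcript_sequences(consensus, parse_gtf_exons(gtf_lines))

for tx_id, seq in sequences.items():
    print(f">{tx_id}\n{seq}")
```

The point of the design is that the expensive step (building the consensus) happens once per chromosome, so the cost of extracting thousands of transcripts is just string slicing.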

RNA-Seq • 2.6k views

Read this description:

https://transdecoder.github.io/

And read these papers:

RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-323

MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789545/

These authors were concerned with splicing:

http://www.cs.colostate.edu/~asa/pdfs/spliceGrapherXT.pdf#page=1&zoom=auto,-73,798

http://journals.plos.org/plosone/article?id=10.1371%2Fjournal.pone.0156132


I don't see how TransDecoder is useful here. It seems to require already having a FASTA file of the transcript sequences, or, if you input a GTF, it extracts the sequences from the genome, which is what I don't want to do.

RSEM is not suitable because I'm interested in novel transcripts, and RSEM aligns to a known transcript set rather than the whole genome (unless I'm misunderstanding).

I am not sure MITIE is good for my purpose either. It says it will report a small set of optimal transcripts from a set of RNA-seq libraries; however, I'm interested in finding novel transcripts, especially long non-coding RNAs, with a focus on tissue-specific transcripts. So I think MITIE would miss many of those.
