How to get the actual gene sequence after Cufflinks
9.8 years ago
Michel Edwar ▴ 80

Hello,

I have a BAM file (file.bam) with reads mapped to the human genome. I uploaded it to Galaxy and ran Cufflinks, which produced three outputs:

a) gene expression

b) transcript expression

c) assembled transcriptome.

All I need is the gene sequence, together with the gene ID. How can I download that?

Thanks

RNA-Seq cufflinks • 3.0k views
9.8 years ago
David Fredman ★ 1.1k

Use the gffread tool that comes bundled with Cufflinks:

gffread -g genome_reference.fasta -w transcript_sequences.fasta assembled_transcripts.gtf
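
To make it clearer what that command produces, here is a minimal Python sketch of the underlying idea: collect the exon records of each assembled transcript from the GTF, join their sequences from the reference, and reverse-complement transcripts on the minus strand. This is not gffread's actual implementation, only an illustration under simple assumptions (exons do not overlap, the genome fits in memory as a dict). The function names, the toy genome and the `CUFF.*` IDs are made up for the example. It also keeps the `gene_id` next to each sequence, since that is what the question asks for.

```python
from collections import defaultdict

# Translation table used to complement DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_attributes(field):
    """Turn 'gene_id "G1"; transcript_id "T1";' into a dict."""
    attrs = {}
    for part in field.strip().split(";"):
        part = part.strip()
        if part:
            key, _, value = part.partition(" ")
            attrs[key] = value.strip('"')
    return attrs


def transcript_sequences(gtf_lines, genome):
    """Map transcript_id -> (gene_id, spliced sequence).

    gtf_lines: iterable of tab-separated GTF lines (1-based, inclusive coordinates).
    genome:    dict of chromosome name -> sequence string (hypothetical in-memory reference).
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if cols[2] != "exon":
            continue  # only exon records contribute sequence
        attrs = parse_attributes(cols[8])
        key = (attrs["gene_id"], attrs["transcript_id"], cols[0], cols[6])
        exons[key].append((int(cols[3]), int(cols[4])))

    result = {}
    for (gene_id, tx_id, chrom, strand), spans in exons.items():
        # Join exons in genomic order; GTF start is 1-based, so shift by one.
        seq = "".join(genome[chrom][start - 1:end] for start, end in sorted(spans))
        if strand == "-":
            seq = reverse_complement(seq)
        result[tx_id] = (gene_id, seq)
    return result


# Toy data, invented for illustration only.
toy_genome = {"chr1": "AAACCCGGGTTTACGT"}
toy_gtf = [
    "\t".join(["chr1", "Cufflinks", "exon", "1", "3", ".", "+", ".",
               'gene_id "CUFF.1"; transcript_id "CUFF.1.1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "7", "9", ".", "+", ".",
               'gene_id "CUFF.1"; transcript_id "CUFF.1.1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "4", "6", ".", "-", ".",
               'gene_id "CUFF.2"; transcript_id "CUFF.2.1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "13", "16", ".", "-", ".",
               'gene_id "CUFF.2"; transcript_id "CUFF.2.1";']),
]

seqs = transcript_sequences(toy_gtf, toy_genome)
for tx_id, (gene_id, seq) in sorted(seqs.items()):
    print(f">{tx_id} gene={gene_id}")
    print(seq)
```

For real data gffread is the better choice; the sketch is only meant to show why both the reference FASTA and the assembled GTF are needed.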
9.8 years ago
Manvendra Singh ★ 2.2k

Once you get the GTF file from Cufflinks, convert it to BED using the gtf2bed utility from BEDOPS (see [here][1]), then fetch the FASTA sequences for the BED file using bedtools:

fastaFromBed -fi hg19.fa -bed gtf2bed.bed -fo gtf2bed.fa -s

Hope this helps.

[1]: http://bedops.readthedocs.org/en/latest/content/reference/file-management/conversion/gtf2bed.html
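
As a rough illustration of what the strand-aware extraction step does, here is a small Python sketch that pulls one sequence per BED interval and reverse-complements intervals on the minus strand, which is the effect of the -s flag. It is not how bedtools is implemented. The function name `bed_to_records`, the toy genome and the interval names are assumptions made up for the example. Note that BED coordinates are 0-based and half-open, unlike GTF.

```python
# Translation table used to complement DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def bed_to_records(bed_lines, genome, stranded=True):
    """Return a list of (name, sequence), one per BED6 interval.

    bed_lines: iterable of tab-separated lines: chrom, start, end, name, score, strand.
    genome:    dict of chromosome name -> sequence string (hypothetical in-memory reference).
    stranded:  if True, minus-strand intervals are reverse-complemented.
    """
    records = []
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name, _score, strand = line.rstrip("\n").split("\t")[:6]
        # BED is 0-based, half-open, so it maps directly onto a Python slice.
        seq = genome[chrom][int(start):int(end)]
        if stranded and strand == "-":
            seq = reverse_complement(seq)
        records.append((name, seq))
    return records


# Toy data, invented for illustration only.
toy_genome = {"chr1": "AAACCCGGGTTTACGT"}
toy_bed = [
    "\t".join(["chr1", "0", "3", "exon_a", "0", "+"]),
    "\t".join(["chr1", "3", "6", "exon_b", "0", "-"]),
]

records = bed_to_records(toy_bed, toy_genome)
for name, seq in records:
    print(f">{name}")
    print(seq)
```

One thing to keep in mind with this route: as far as I understand, each line of the converted BED file is treated as a separate interval, so you may end up with one sequence per exon rather than one spliced sequence per transcript. It is worth checking the output against what you expect.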

9.8 years ago

Maybe you can use NCBI. At the very top of the web page, you can enter the gene ID or name. On the left, you can choose whether a nucleotide, gene or protein record should be shown.

You did not specify where the gene ID comes from. There are alternatives to NCBI, depending on the source of the gene ID.


This solution would not work. The transcripts are assembled by Cufflinks. They might not be present in public databases. It would be useful if Cufflinks could output the sequence.
