6.6 years ago
Tania
Hi Guys
I used gffread -w transcripts.fa -g /path/to/genome.fa transcripts.gtf
to get my transcripts in fasta format.
What I got are Cufflinks IDs, because that is what the Cufflinks transcripts.gtf contains:
>CUFF.1.1 gene=CUFF.1
GTGACTGAACTCTTCACCCCAGTCTGTGGCTTTCCCGTTGCAGTGAGAGCCACGAGCCAAGGTGGGCACT
TGATGTCGGATCTCTTCAACAAGCTGGTCATGAGGCGCAAGGGCATCTCTGGGAAAGGACCTGGGGCTGG
TGAGGGGCCCGGAGGAGCCTTTGCCCGCGTGTCAGACTCCATCCCTCCTCTGCCGCCACCGCAGCAGCCA
CAGGCAGAGGAGGACGAGGACGACTGGGAATCGTAGGGGGCTCCATGACACCTTCCCCCCCAGACCCAGA
How can I convert this to something like:
>ENST00000342066.7
CCAGCAGATCCCTGCGGCGTTCGCGAGGGT
or even NCBI codes, fine with me.
I need this to use lncscore
to detect some novel lncRNA.
Thanks
I think this is not what I need. I need my Cufflinks-assembled transcripts with Ensembl or NCBI IDs.
Did you use the -G option and a known reference GTF file with Cufflinks at some point in this process?
No, should I use that?
From the cufflinks manual:
Great, thanks genomax a lot, appreciated :)
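If re-running Cufflinks with -G and a reference GTF is not an option, the FASTA headers could also be renamed after the fact. The following is only a minimal sketch of that idea, not something from the thread: `rename_fasta_headers` and `id_map` are hypothetical names, and the mapping from CUFF IDs to reference IDs is assumed to already exist (for example, built from a comparison of the assembly against a reference annotation). The pairing of CUFF.1.1 with ENST00000342066.7 is purely illustrative. Transcripts without a match, which is what novel lncRNAs would be, keep their original header.

```python
def rename_fasta_headers(lines, id_map):
    """Return FASTA lines with header IDs replaced via id_map.

    Headers whose ID is not in id_map (e.g. novel transcripts)
    are left unchanged. Sequence lines are passed through as-is.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # The transcript ID is the first whitespace-separated token,
            # e.g. "CUFF.1.1" in ">CUFF.1.1 gene=CUFF.1".
            parts = line[1:].split(None, 1)
            old_id = parts[0] if parts else ""
            if old_id in id_map:
                out.append(">" + id_map[old_id])
            else:
                out.append(line)
        else:
            out.append(line)
    return out


# Hypothetical mapping; this pairing is illustrative, not a real match.
id_map = {"CUFF.1.1": "ENST00000342066.7"}

example = [
    ">CUFF.1.1 gene=CUFF.1",
    "GTGACTGAAC",
    ">CUFF.2.1 gene=CUFF.2",
    "CCAGCAGATC",
]
renamed = rename_fasta_headers(example, id_map)
```

For a real file, the lines would come from opening transcripts.fa and the result would be written to a new FASTA file.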