Hello everyone,
I am trying to assign functions to genes that are differentially expressed across different samples, but I am having trouble extracting the FASTA sequences for some of them.
I had a GTF file from Augustus, which I tried to merge with the GTF files generated by Cufflinks using cuffmerge. I have now realized that some of the expressed genes don't have gene names like g123, g6352, etc., but instead have IDs like CUFF 2.1, CUFF 12.1, etc.
I have already extracted the FASTA sequences for the genes that have proper gene names. However, I don't know how to extract sequences for the other genes using the Cufflinks IDs. Is there any solution to my problem? I would really appreciate your help.
Thank you,
Ambika
Have you tried gffread or fastaFromBed? gffread has options to extract only the coding regions, protein sequences, and more from a GFF/GTF file.
No, I haven't tried that. The problem is that my annotation file doesn't contain those Cufflinks IDs.
You can also use gtf_to_fasta (provided by the TopHat package, available in the Ubuntu repos) if you have a GTF file.
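If installing one of those tools is not an option, the same idea (pull each transcript's exons out of the GTF by coordinate and join them) can be sketched in plain Python. This is only a rough illustration, not how gffread or gtf_to_fasta work internally. It assumes the GTF that carries the Cufflinks IDs stores them in the transcript_id attribute and that the chromosome names match the genome FASTA headers. The function names, the toy genome, and the IDs "CUFF.2.1", "g123.t1", "XLOC_000001" and "g123" are made up for the example.

```python
import re
from collections import defaultdict

# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    seqs, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Keep only the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def exons_by_transcript(gtf_lines):
    """Collect (chrom, start, end, strand) exon tuples per transcript_id."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            exons[match.group(1)].append(
                (fields[0], int(fields[3]), int(fields[4]), fields[6])
            )
    return exons


def transcript_sequences(genome, gtf_lines, wanted_ids=None):
    """Return {transcript_id: spliced sequence}, optionally only for wanted_ids."""
    result = {}
    for tid, exon_list in exons_by_transcript(gtf_lines).items():
        if wanted_ids is not None and tid not in wanted_ids:
            continue
        exon_list.sort(key=lambda exon: exon[1])  # order exons by start position
        chrom, strand = exon_list[0][0], exon_list[0][3]
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[chrom][start - 1:end] for _, start, end, _ in exon_list)
        result[tid] = reverse_complement(seq) if strand == "-" else seq
    return result


if __name__ == "__main__":
    # Toy data; with real files, pass open("genome.fa") and open("transcripts.gtf").
    genome = read_fasta([">chr1", "AAACCCGGGTTT"])
    gtf = [
        'chr1\tCufflinks\texon\t1\t3\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "CUFF.2.1";',
        'chr1\tCufflinks\texon\t7\t9\t.\t+\t.\tgene_id "XLOC_000001"; transcript_id "CUFF.2.1";',
        'chr1\tAUGUSTUS\texon\t4\t8\t.\t-\t.\tgene_id "g123"; transcript_id "g123.t1";',
    ]
    for tid, seq in transcript_sequences(genome, gtf, wanted_ids={"CUFF.2.1"}).items():
        print(f">{tid}\n{seq}")
```

Passing a set of IDs as wanted_ids limits the output to those transcripts, so a list of differentially expressed CUFF IDs could be used there. For a large genome, a dedicated tool such as gffread will be much faster than this sketch, which loads the whole genome into memory.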