gffread with wrong output
6.4 years ago

I am trying to get FASTA sequences from a GTF file containing multi-exon transcripts (filtered from merged_transcripts.gtf from the Cufflinks pipeline; to be precise, the transcripts with class_code "u").

So I am running gffread with this filtered GTF file and the genome file to get the sequences for those transcripts.

But looking at the output, for some transcripts the sequence starts from the 2nd exon, and sometimes from the 3rd exon.

Can anyone suggest what may be going wrong here?

sequence RNA-Seq assembly • 2.3k views

Hi Sangram

Can you post some examples?


Sure,

A few lines from the GTF file (intergenic.gtf):

1   Cufflinks   exon    1477451 1477950 0   -   0   gene_id "XLOC_004355"    transcript_id "TCONS_00012049"  exon_number "1"     oId "CUFF.237.1"    class_code "u"  tss_id "TSS5794"
1   Cufflinks   exon    1478389 1478918 0   -   0   gene_id "XLOC_004355"    transcript_id "TCONS_00012049"  exon_number "2"     oId "CUFF.237.1"    class_code "u"  tss_id "TSS5794"

gffread command:

gffread -g genome.fa intergenic.gtf -w out.fa

So when retrieving the sequence with gffread, I should get a sequence containing both exons. But the output contains only the 2nd exon.
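One way to make the problem measurable is to compare the length of each record in out.fa with the summed exon lengths from the GTF. The snippet below is only a rough sketch of that check, not something from gffread itself: the function name `expected_lengths` is made up for illustration, and it assumes the attribute column contains `transcript_id "..."` as in the lines above. For the two exons posted (500 bp and 530 bp) the spliced transcript should be 1030 bp, so a much shorter record would indicate which exon was dropped.

```python
import re
from collections import defaultdict

# Hypothetical helper for illustration; not part of gffread.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')

def expected_lengths(gtf_lines):
    """Sum exon lengths per transcript_id (GTF coordinates are 1-based, inclusive)."""
    totals = defaultdict(int)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Prefer tab-separated columns; fall back to whitespace for pasted text.
        if "\t" in line:
            fields = line.rstrip("\n").split("\t")
        else:
            fields = line.split(None, 8)
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            start, end = int(fields[3]), int(fields[4])
            totals[match.group(1)] += end - start + 1
    return dict(totals)

# The two exon lines from the post (attributes shortened).
example = [
    '1 Cufflinks exon 1477451 1477950 0 - 0 gene_id "XLOC_004355" transcript_id "TCONS_00012049" exon_number "1"',
    '1 Cufflinks exon 1478389 1478918 0 - 0 gene_id "XLOC_004355" transcript_id "TCONS_00012049" exon_number "2"',
]
print(expected_lengths(example))  # {'TCONS_00012049': 1030}
```

For a real run, pass `open("intergenic.gtf")` instead of the example list and compare the totals with the sequence lengths in out.fa.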


I suspect the GTF filtering may be the problem. Could you try the following to be sure?

Run gffread -w on the whole GTF to get the complete transcripts FASTA.

Then get the list of transcripts with class code 'u' from the GTF.

Then extract only those transcript sequences from the FASTA.

Check whether the issue persists.
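The last two steps above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested pipeline: the function names are hypothetical, the attribute is assumed to be written exactly as `class_code "u"`, and the FASTA headers are assumed to start with the transcript ID as the first whitespace-separated token.

```python
import re

# Hypothetical helpers for illustration only.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')

def ids_with_class_code(gtf_lines, code="u"):
    """Collect transcript IDs from GTF lines carrying the given class_code."""
    wanted = set()
    needle = f'class_code "{code}"'
    for line in gtf_lines:
        if needle in line:
            match = TRANSCRIPT_ID.search(line)
            if match:
                wanted.add(match.group(1))
    return wanted

def filter_fasta(fasta_lines, keep_ids):
    """Yield only the FASTA records whose header ID is in keep_ids."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            # Assumption: the first token of the header is the transcript ID.
            keep = bool(tokens) and tokens[0] in keep_ids
        if keep:
            yield line

# Tiny in-memory demonstration.
gtf = [
    '1 Cufflinks exon 100 200 0 - 0 transcript_id "TCONS_1" class_code "u"',
    '1 Cufflinks exon 300 400 0 + 0 transcript_id "TCONS_2" class_code "="',
]
fasta = [">TCONS_1 gene=A\n", "ACGT\n", ">TCONS_2 gene=B\n", "GGCC\n"]
ids = ids_with_class_code(gtf)
print("".join(filter_fasta(fasta, ids)), end="")
```

With real data, the lists would be replaced by open file handles for the full GTF and for the FASTA written by gffread -w.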


Even after doing this, it gives similar results. I tried hard to understand why, but couldn't.

So I took a different approach: after the filtering step, I used bedtools to get the sequence of each exon of the transcripts, and a Python script to combine them. It's rather complicated, but it works for now.

Anyway, thank you Jeiffin :)
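The combining step can be sketched roughly like this. It is not the actual script used here, just a minimal illustration: `splice_transcript` and `reverse_complement` are hypothetical names, and the sketch assumes each exon sequence was extracted in forward (genomic) orientation, so minus-strand transcripts are reverse-complemented once after joining.

```python
# Complement table for DNA, including N and lowercase (soft-masked) bases.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def splice_transcript(exons, strand):
    """Join exon sequences into one transcript sequence.

    exons: list of (start, end, sequence) tuples, with each sequence given
    in forward-strand orientation (an assumption of this sketch).
    strand: "+" or "-".
    """
    # Order exons by genomic start so the joined sequence is contiguous.
    ordered = sorted(exons, key=lambda exon: exon[0])
    spliced = "".join(seq for _, _, seq in ordered)
    return reverse_complement(spliced) if strand == "-" else spliced

# Toy example with two short exons given out of order.
toy_exons = [(10, 12, "AAC"), (1, 3, "GGT")]
print(splice_transcript(toy_exons, "+"))  # GGTAAC
print(splice_transcript(toy_exons, "-"))  # GTTACC
```

If the exon sequences were extracted strand-aware instead, the exons would need to be joined in transcript order without the final reverse complement.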


Okay. When you get time, try it with some other GTF and see how it turns out.
