Question

Is It Feasible To Produce An Intron GFF From A UTR GFF And A CDS GFF?

3

13.6 years ago

Dejian ★ 1.3k

We have two separate GFF files. One is a UTR GFF file describing the 5' and 3' UTR regions. The other is a CDS GFF file describing the coding sequences. Is there a tool to generate an intron GFF from these two files? Or, how is an intron GFF file usually produced?

gff intron • 9.8k views

ADD COMMENT • link updated 13.6 years ago by Abhi ★ 1.6k • written 13.6 years ago by Dejian ★ 1.3k

score 9 · Answer 1 · 2011-09-08

13.6 years ago

Haibao Tang 3.0k

If your files are in GFF3 format, use GenomeTools. Its gt gff3 tool has an -addintrons option:

Usage: gt gff3 [option ...] [GFF3_file ...]
Parse, possibly transform, and output GFF3 files.

-addintrons add intron features between existing exon features
            default: no

2

Be careful with this approach though. If exon features are not explicitly defined, then gt will not create any intron features. I've made that mistake a few times.

ADD REPLY • link 13.6 years ago by Daniel Standage 4.1k

0

Pretty encouraging. I will try it. Thanks!

ADD REPLY • link 13.6 years ago by Dejian ★ 1.3k

score 0 · Answer 2 · 2011-09-08

Let me answer your two questions in reverse order.

Or, how is the intron GFF file usually produced?

In my experience, different feature types aren't typically stored in separate files. In other words, you don't have a CDS file, a UTR file, an intron file, etc.; you simply have a single file with all the features in it. That doesn't mean your approach is incorrect. It just isn't typical and doesn't provide any immediate benefit (unless, of course, you are running scripts that have been built to expect it).

Is there a tool to generate the intron GFF according to these two files?

Perhaps, but it shouldn't be too difficult to do with minimal scripting experience. Once you have determined the exon coordinates using the CDS and UTR data, then simply create an intron feature to fill in the space between each adjacent pair of exon features.
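To illustrate the "fill in the space" step, here is a minimal Python sketch. It assumes you already have the exon coordinates of a single transcript as (start, end) pairs, 1-based and inclusive as in GFF. The function name introns_from_exons and the example coordinates are made up for illustration; they do not come from any existing tool.

```python
def introns_from_exons(exons):
    """Return (start, end) introns lying between adjacent exons.

    exons: iterable of (start, end) pairs, 1-based inclusive (GFF style),
    all assumed to belong to the same transcript.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # An intron is the gap strictly between two neighbouring exons.
        # Skip pairs that touch or overlap, since they leave no gap.
        if next_start - prev_end > 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical exon coordinates for one transcript.
exons = [(100, 200), (301, 400), (551, 700)]
print(introns_from_exons(exons))  # [(201, 300), (401, 550)]
```

In a real file you would first group the exon lines by transcript (for GFF3, typically via the Parent attribute) and run this once per group.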

Haibao mentioned the very useful GenomeTools utility, but to use that you would still first have to determine the exon coordinates and provide them as input. If you can calculate the exon coordinates from a set of CDS and UTR coordinates, then surely you can calculate intron coordinates from a set of exon coordinates.
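For the first step, working out exons from CDS and UTR coordinates, one possible approach is to pool both sets of intervals and merge any that overlap or sit directly next to each other. The sketch below assumes all intervals belong to one transcript, use 1-based inclusive coordinates, and that a UTR segment touching a CDS segment is part of the same exon. The function name exons_from_cds_utr and the sample numbers are hypothetical.

```python
def exons_from_cds_utr(cds, utrs):
    """Merge CDS and UTR intervals of one transcript into exon intervals.

    cds, utrs: iterables of (start, end) pairs, 1-based inclusive.
    Intervals that overlap or are directly adjacent (end + 1 == next start)
    are assumed to belong to the same exon and are merged.
    """
    segments = sorted(list(cds) + list(utrs))
    exons = []
    for start, end in segments:
        if exons and start <= exons[-1][1] + 1:
            # Overlapping or abutting the previous exon: extend it.
            exons[-1][1] = max(exons[-1][1], end)
        else:
            # A gap precedes this segment, so it starts a new exon.
            exons.append([start, end])
    return [tuple(exon) for exon in exons]


# Hypothetical transcript: 5' UTR, three CDS segments, 3' UTR.
utrs = [(100, 150), (651, 700)]
cds = [(151, 200), (301, 400), (551, 650)]
print(exons_from_cds_utr(cds, utrs))  # [(100, 200), (301, 400), (551, 700)]
```

The resulting exon intervals could then be written out as explicit exon features, which is what a tool like GenomeTools needs before it can add introns.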

score 0 · Answer 3 · 2011-10-10

13.5 years ago

Abhi ★ 1.6k

I have a similar question. If I have a GFF file with CDS and UTR features, how can I find the exon start and end?

Thanks! -Abhi
