I am working with a GFF file and found that GFFutils is a great tool for extracting information from it. However, even after going through the manual, I am unable to extract the coordinates of the last exon of each mRNA.
What have you tried so far? Let's see the code...
Hi liorglic, thanks for suggesting this tool. I have tried it and am able to print the lengths of all exons from all genes. As I mentioned in my previous post, I only need the length information for the first and last three exons and introns.
Here is my code:
import GFFutils

# open the existing GFFutils database
G = GFFutils.GFFDB('dm3.db')
exon1_count = 0
gene_count = 0
for gene in G.features_of_type('gene'):
    gene_exon_count = 0
    print(gene)
    exons = []
    # get all grandchildren, only counting the exons
    for child in G.children(gene.id, 2):
        if child.featuretype == 'exon':
            exons.append(child)
            print(len(child))
            # if len(exons) == 3:
            #     break
            # gene_exon_count += 1
I tried using break and continue to stop at the third exon, but I am unable to get the last three.
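One possible way around the break/continue problem, sketched under assumptions: collect all exons of one transcript first, sort them, and then slice the list from both ends instead of stopping the loop early. The sketch below works on plain (start, end) tuples rather than GFFutils feature objects, and the helper names first_and_last_exons and exon_length are hypothetical, not part of GFFutils. It also assumes that on the minus strand the first exon of the transcript is the one with the highest coordinate.

```python
def first_and_last_exons(exons, strand="+", n=3):
    """Return (first_n, last_n) exons in transcript order.

    exons  -- list of (start, end) tuples, 1-based inclusive as in GFF
    strand -- '+' or '-'; on '-' the transcript begins at the highest coordinate
    n      -- how many exons to keep from each end
    """
    # Sort by coordinate, reversed for minus-strand transcripts
    ordered = sorted(exons, reverse=(strand == "-"))
    # Slicing never raises, so transcripts with fewer than n exons
    # simply return all of their exons in both lists
    return ordered[:n], ordered[-n:]


def exon_length(exon):
    """Length of one exon; GFF coordinates are 1-based and inclusive."""
    start, end = exon
    return end - start + 1
```

Since the question is about each mRNA, the tuples would presumably be gathered per mRNA rather than per gene; otherwise exons from different transcripts of the same gene end up mixed in one list.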
Which GFF are you using? What you want can likely be done outside of GFFutils, e.g., via grep -B, cut, et cetera.
Here is a sample of the GFF I am using, Kevin.
I am quite convinced by GFFutils because I also need to perform some other tasks, which are described in this post. For instance, this GFF file contains no intron information (an intron being the space between two consecutive exons), but GFFutils will calculate that as well. It can be done with basic Unix commands, but the logic would become very complicated.
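To make the intron definition above concrete, here is a minimal sketch that derives introns as the gaps between consecutive exons of one transcript. It is independent of whatever GFFutils provides for this; the function name intron_intervals is hypothetical, and the exons are assumed to be non-overlapping (start, end) tuples with 1-based inclusive coordinates.

```python
def intron_intervals(exons):
    """Return introns as (start, end) gaps between consecutive exons.

    exons -- list of (start, end) tuples from a single transcript,
             assumed non-overlapping, 1-based inclusive as in GFF
    """
    ordered = sorted(exons)
    introns = []
    # Pair each exon with the next one along the chromosome
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Directly adjacent exons leave no gap, so no intron is emitted
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns
```

The length of each intron is then end - start + 1, the same as for exons, and the first or last three introns can be taken by slicing the returned list.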