Get the longest transcript length of each gene from a transcript FASTA file
7.9 years ago
xinl1022 • 0

Hi, everybody,

Now I have a transcript.fasta file. It looks like this:

>FBtr0070000 type=mRNA;loc=X:join(19961689..19961845,19963955..19964071,19964782..19964944,19965006..19965126,19965197..19965511,19965577..19966071,19966183..19967012,19967081..19967223,19967284..19968479); ID=FBtr0070000; name=Nep3-RA; dbxref=FlyBase:FBtr0070000,FlyBase_Annotation_IDs:CG9565-RA,REFSEQ:NM_078693,REFSEQ:NM_078693; score=15; score_text=Strongly Supported; MD5=6c3abf7c6ba8b1392a073c85365b2e75; length=3537; parent=FBgn0031081; release=r6.12; species=Dmel; 
AGAAGTACCACCCCACCACACCACACCACTTCCAAACAGTCCGATTCAAAGTCATAATTTCATTTGCATTTGAATAAATG....

This file contains information on each transcript from different genes, including length=xxx and parent=FBgnXXXXXXX. For transcripts from the same gene (those with the same value in the "parent" field), I want to extract the length of the longest transcript, and finally get a table with the parent gene name and the longest transcript length. Can anyone provide code for this? Thanks!

RNA-Seq R gene • 4.6k views

I see you tagged R, so you want to do this in R?

Do you have any programming experience?

Not really... I used R to do gene expression difference analysis.

Have a look at the script linked by Farbod. If that doesn't work as expected, we could write a fairly simple Python script for this.
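
For reference, here is a minimal sketch of what such a Python script might look like. It assumes every header line carries `length=<number>` and `parent=<gene ID>` fields, as in the example in the question, and that the stated length can be trusted, so the sequence lines are never read. The function name `longest_per_gene` and the script name `longest_transcript.py` are made up for illustration.

```python
import re
import sys

# Assumption: headers look like the FlyBase example in the question,
# i.e. they contain "length=3537;" and "parent=FBgn0031081;" fields.
LENGTH_RE = re.compile(r"\blength=(\d+)")
PARENT_RE = re.compile(r"\bparent=([^;\s]+)")


def longest_per_gene(lines):
    """Return {parent gene ID: longest transcript length} from FASTA lines."""
    longest = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line; the length is taken from the header
        length = LENGTH_RE.search(line)
        parent = PARENT_RE.search(line)
        if not (length and parent):
            continue  # header is missing one of the two fields; skip it
        gene = parent.group(1)
        # Keep the larger of the stored length and this transcript's length
        longest[gene] = max(longest.get(gene, 0), int(length.group(1)))
    return longest


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python longest_transcript.py transcript.fasta > longest.tsv
    with open(sys.argv[1]) as handle:
        table = longest_per_gene(handle)
    for gene, length in sorted(table.items()):
        print(f"{gene}\t{length}")
```

The output is a two-column, tab-separated table (parent gene, longest transcript length) that can be read into R with `read.delim`. If the header lengths cannot be trusted, the sequence lines would have to be counted instead.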

I finally got it done with Perl. Thanks.

Hi,

There is a Perl program here.

I figured it out, thanks
