7.9 years ago
xinl1022
Hi, everybody,
Now I have a transcript.fasta file. It looks like this:
>FBtr0070000 type=mRNA;loc=X:join(19961689..19961845,19963955..19964071,19964782..19964944,19965006..19965126,19965197..19965511,19965577..19966071,19966183..19967012,19967081..19967223,19967284..19968479); ID=FBtr0070000; name=Nep3-RA; dbxref=FlyBase:FBtr0070000,FlyBase_Annotation_IDs:CG9565-RA,REFSEQ:NM_078693,REFSEQ:NM_078693; score=15; score_text=Strongly Supported; MD5=6c3abf7c6ba8b1392a073c85365b2e75; length=3537; parent=FBgn0031081; release=r6.12; species=Dmel;
AGAAGTACCACCCCACCACACCACACCACTTCCAAACAGTCCGATTCAAAGTCATAATTTCATTTGCATTTGAATAAATG....
This file contains information on each transcript from different genes, including length=xxx and parent=FBgnXXXXXXX. For transcripts from the same gene (those with the same value in the "parent" field), I want to extract the length of the longest transcript, and end up with a table of parent gene name and longest transcript length. Can anyone provide me with code for this? Thanks!
I see you tagged R, so you want to do this in R? Do you have any programming experience?
Not really... I used R to do gene expression difference analysis.
Have a look at the script linked by Farbod, and if that doesn't work as expected we could write a pretty easy Python script for this application...
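For reference, here is a rough sketch of what such a Python script could look like. It is not the script linked by Farbod, just one possible approach. It assumes every header line starts with ">" and carries "length=<number>" and "parent=<gene ID>" fields separated by semicolons, as in the example header in the question. The function names longest_per_gene and format_table are made up for this illustration.

```python
import re

# Assumed header layout, taken from the example in the question:
# ">FBtr0070000 type=mRNA; ...; length=3537; parent=FBgn0031081; ..."
LENGTH_RE = re.compile(r"\blength=(\d+)")
PARENT_RE = re.compile(r"\bparent=([^;\s]+)")


def longest_per_gene(lines):
    """Map each parent gene ID to the length of its longest transcript.

    `lines` is any iterable of strings, e.g. an open FASTA file.
    Sequence lines and headers missing either field are skipped.
    """
    longest = {}
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        length = LENGTH_RE.search(line)
        parent = PARENT_RE.search(line)
        if length is None or parent is None:
            continue  # header without the fields we need
        gene = parent.group(1)
        n = int(length.group(1))
        # Keep the largest length seen so far for this gene
        if n > longest.get(gene, 0):
            longest[gene] = n
    return longest


def format_table(longest):
    """Return tab-separated 'gene<TAB>length' rows, sorted by gene ID."""
    return [f"{gene}\t{n}" for gene, n in sorted(longest.items())]
```

To use it, open the file and pass the handle in, e.g. `with open("transcript.fasta") as fh: print("\n".join(format_table(longest_per_gene(fh))))`. It reads the length from the header rather than counting bases, so it relies on the header values being correct.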
I finally got it done in Perl. Thanks!
Hi,
There is a Perl program for this here.
I figured it out, thanks