Identify the transcript that codes for the longest protein from a gene annotation file
5.3 years ago

I have the reference annotation file of Arabidopsis thaliana and I want to identify the transcript that codes for the longest protein isoform and then extract the coordinates of that transcript. For example, gene AT1G01020 contains 6 transcripts (AT1G01020.1, AT1G01020.2, AT1G01020.3, AT1G01020.4, AT1G01020.5, AT1G01020.6). How can I identify the transcript that codes for the longest protein and extract its coordinates?

The reference annotation file

Does it depend on the number of exons, the CDS regions, or the length of the exons?

RNA-Seq Reference annotation file • 1.8k views
5.3 years ago
JC 13k

Use Arabidopsis in BioMart to filter by "Gene stable ID" for your gene, select the "Structures" in "Attributes" and retrieve the values you need.


I am able to select the protein-coding transcripts, but how can I select the transcript that codes for the longest protein? Secondly, if multiple transcripts of variable length code for proteins of the same length, which transcript should I select? For example, gene AT2G27490 contains 4 transcripts of variable length, but all of them code for a protein of 232 aa, so which one should I select?


Select the largest one from the table. If you need to decide automatically, you will have to write some code to query and filter your selection. Which one to use when several have the same length is something you need to define based on what you are trying to do with that information.


The longest transcript doesn't necessarily code for the longest protein, as it can also contain retained introns or parts of introns. How can I find the transcript that codes for the longest protein?


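By CDS (coding sequence) length: sum the lengths of the CDS features of each transcript, and the transcript with the largest total codes for the longest protein.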
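If you want to do this automatically rather than from the BioMart table, here is a minimal Python sketch of the CDS-length idea. It is not a definitive implementation. It assumes a GFF3-style file in which transcripts are `mRNA` or `transcript` features with `ID` and `Parent` attributes, and CDS features point to their transcript through `Parent`. The function names are made up for illustration. The tie-break (lowest transcript ID when CDS lengths are equal) is an arbitrary assumption, so replace it with whatever rule suits your analysis.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def longest_cds_transcripts(gff_lines):
    """Return {gene: (transcript, seqid, start, end, strand, cds_length)}.

    For each gene, the transcript with the largest summed CDS length wins.
    Assumption: ties are broken by the lowest transcript ID.
    """
    gene_of = {}                # transcript ID -> gene ID
    coords = {}                 # transcript ID -> (seqid, start, end, strand)
    cds_len = defaultdict(int)  # transcript ID -> total CDS length in bp

    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines
        seqid, _source, ftype, start, end, _score, strand, _phase, attr = cols
        start, end = int(start), int(end)
        attrs = parse_attributes(attr)
        if ftype in ("mRNA", "transcript") and "ID" in attrs:
            gene_of[attrs["ID"]] = attrs.get("Parent", attrs["ID"])
            coords[attrs["ID"]] = (seqid, start, end, strand)
        elif ftype == "CDS":
            # A CDS segment may list several parent transcripts.
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    cds_len[parent] += end - start + 1  # GFF3 is 1-based, inclusive

    best = {}  # gene ID -> (-cds_length, transcript ID)
    for tid, gene in gene_of.items():
        if tid not in cds_len:
            continue  # transcript has no CDS, so it is not protein coding
        candidate = (-cds_len[tid], tid)
        if gene not in best or candidate < best[gene]:
            best[gene] = candidate

    return {
        gene: (tid, *coords[tid], -neg_len)
        for gene, (neg_len, tid) in best.items()
    }
```

To use it, open your annotation file and pass the file handle to `longest_cds_transcripts`. Each value holds the chosen transcript, its chromosome, start, end, and strand, plus the total CDS length in bp. Ranking by CDS length gives the same order as ranking by protein length, so there is no need to translate anything.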
