5.6 years ago
Javad
Dear all,
I have a normal RNA-seq count table with around 40,000 rows. In my data I would like to separate "protein coding", "long non-coding", "pseudogene", "misc" RNA, etc., so that I can analyse each of them separately. (I have the Ensembl ID corresponding to each row.) Is there a straightforward way to do that? Any suggestion is highly appreciated.
Thanks a lot in advance.
If you get the Ensembl GTF file from here (or if you already have it), that information is encoded in it. I linked the human file, but if you have some other genome then look for that one instead. The information is in the gene_biotype field; the gene_id field has the Ensembl identifiers.
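As a rough illustration of that approach, here is a minimal Python sketch that reads gene_id and gene_biotype from the attribute column of the GTF "gene" rows and then groups the rows of a count table by biotype. The function names, the toy GTF lines, and the toy counts are all made up for this example; your real count table will probably need its own loading step.

```python
import re
from collections import defaultdict

# GTF attributes look like: gene_id "ENSG..."; gene_biotype "protein_coding";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def biotypes_from_gtf_lines(lines):
    """Return a dict gene_id -> gene_biotype built from the 'gene' rows of a GTF."""
    mapping = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only one row per gene is needed
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs and "gene_biotype" in attrs:
            mapping[attrs["gene_id"]] = attrs["gene_biotype"]
    return mapping


def split_counts_by_biotype(counts, mapping):
    """Group count rows (gene_id -> list of counts) by biotype."""
    groups = defaultdict(dict)
    for gene_id, row in counts.items():
        # Some count tables carry a version suffix (e.g. ".5"); strip it for lookup.
        base_id = gene_id.split(".")[0]
        groups[mapping.get(base_id, "unknown")][gene_id] = row
    return dict(groups)


# Toy data (hypothetical IDs and values, for illustration only).
gtf_lines = [
    "#!genome-build toy\n",
    '1\ttoy\tgene\t1\t100\t.\t+\t.\tgene_id "ENSG0001"; gene_name "A"; gene_biotype "protein_coding";\n',
    '1\ttoy\tgene\t200\t300\t.\t-\t.\tgene_id "ENSG0002"; gene_biotype "lncRNA";\n',
    '1\ttoy\texon\t1\t50\t.\t+\t.\tgene_id "ENSG0001"; gene_biotype "protein_coding";\n',
]
counts = {"ENSG0001": [10, 12], "ENSG0002": [0, 3], "ENSG0003": [5, 5]}

mapping = biotypes_from_gtf_lines(gtf_lines)
groups = split_counts_by_biotype(counts, mapping)
for biotype, rows in sorted(groups.items()):
    print(biotype, len(rows))
```

For a real annotation you could pass an open file handle instead of the toy list, for example one returned by gzip.open(path, "rt") if the GTF is gzipped. Genes missing from the GTF end up in an "unknown" group here, which is just one possible way to handle them.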