A Question On Grouping Into RNA, tRNA, rRNA, snRNA, scRNA, snoRNA, lincRNA, etc.
13.1 years ago
Hamilton ▴ 290

Hi,

I have ~28,000 mm9 genes from RNA-seq data, each with RefSeq, Ensembl, and UCSC known gene IDs, a gene symbol, and a description. I am trying to get the percentage of genes in each group: RNA, tRNA, rRNA, snRNA, lincRNAs, etc. Since every gene has a symbol and a description, I can roughly tell which group a particular gene belongs to by looking at it. But I would like exact percentages: how many genes are tRNA, how many are rRNA, snRNA, and so on.

How can I get this information?

next-gen sequencing • 4.2k views

What is mm9? Is this a build of the mouse genome?


the latest mouse genome

13.1 years ago

If you have Ensembl gene ids you can get the 'gene_biotype' for each. There are many ways to access this information from Ensembl. For example:

This GTF file contains the biotypes for 37,681 genes: ftp://ftp.ensembl.org/pub/release-64/gtf/mus_musculus/Mus_musculus.NCBIM37.64.gtf.gz

NCBI Mouse Build 37 is 'mm9' according to UCSC: http://genome.ucsc.edu/FAQ/FAQreleases.html

In this Ensembl mouse release, each gene is classified as one of the following biotypes:

IG_C_gene
IG_D_gene
IG_J_gene
IG_V_gene
Mt_rRNA
Mt_tRNA
lincRNA
miRNA
misc_RNA
polymorphic_pseudogene
processed_transcript
protein_coding
pseudogene
rRNA
snRNA
snoRNA
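
To turn the biotypes into percentages, something along these lines might work. It is only a sketch under a few assumptions: the function names are made up for illustration, the GTF is read as plain tab-separated text, and the biotype is taken from a `gene_biotype` attribute if one is present, otherwise from column 2, which is where I believe older Ensembl GTF releases stored it. Check a few lines of your own file before trusting the fallback.

```python
import gzip
import re
from collections import Counter

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')


def gene_biotypes(lines):
    """Map each gene_id found in GTF lines to one biotype."""
    biotypes = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        gene = GENE_ID_RE.search(fields[8])
        if gene is None:
            continue
        match = BIOTYPE_RE.search(fields[8])
        # Assumption: without a gene_biotype attribute, column 2 holds the
        # biotype. That value is per transcript, so the first one seen for
        # a gene is kept.
        biotype = match.group(1) if match else fields[1]
        biotypes.setdefault(gene.group(1), biotype)
    return biotypes


def biotype_percentages(biotypes, wanted_ids=None):
    """Return {biotype: (count, percent)}, optionally restricted to wanted_ids."""
    if wanted_ids is not None:
        biotypes = {g: b for g, b in biotypes.items() if g in wanted_ids}
    counts = Counter(biotypes.values())
    total = sum(counts.values())
    return {b: (n, 100.0 * n / total) for b, n in counts.most_common()}


def read_gtf(path):
    """Read a gzipped GTF that was downloaded locally (path is hypothetical)."""
    with gzip.open(path, "rt") as handle:
        return gene_biotypes(handle)
```

Passing your own set of Ensembl gene ids as `wanted_ids` restricts the percentages to the ~28,000 genes in your list rather than every gene in the GTF.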
13.1 years ago
Eric Fournier ★ 1.4k

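Get your list of sequences in GenBank format and look for ncRNA features. They should have an /ncRNA_class qualifier, which tells you whether the feature is a snoRNA, an snRNA, etc. Note that tRNAs and rRNAs are annotated with their own tRNA and rRNA feature keys rather than as ncRNA.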
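
A rough sketch of that tally in plain Python, with no parsing library, could look like the following. It assumes the usual flat-file layout (feature keys indented by five spaces, qualifiers on the indented lines that follow); `count_rna_features` is a made-up name. It also counts `tRNA` and `rRNA` feature keys, since those classes are not reported through `/ncRNA_class`.

```python
import re
from collections import Counter

# A feature line: exactly five leading spaces, the feature key, then a location.
FEATURE_RE = re.compile(r'^ {5}(\S+)\s+\S')
# A qualifier line such as:   /ncRNA_class="snoRNA"
NCRNA_CLASS_RE = re.compile(r'^\s+/ncRNA_class="?([^"\s]+)"?')


def count_rna_features(lines):
    """Count ncRNA classes plus tRNA and rRNA features in GenBank flat-file lines."""
    counts = Counter()
    in_ncrna = False
    for line in lines:
        feature = FEATURE_RE.match(line)
        if feature:
            key = feature.group(1)
            in_ncrna = key == "ncRNA"
            if key in ("tRNA", "rRNA"):
                counts[key] += 1
            continue
        if in_ncrna:
            found = NCRNA_CLASS_RE.match(line)
            if found:
                counts[found.group(1)] += 1
                # One class per ncRNA feature; stop looking until the next one.
                in_ncrna = False
    return counts
```

Dividing each count by the sum of all counts gives the percentage per class.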


True, +1. One need not get the sequences in GenBank format per se; it is enough to capture the GenBank features in flat-file or a similar format.
