MiTranscriptome data access
8.9 years ago
Floris Brenk ★ 1.0k

Hi all,

A while ago the paper "The landscape of long noncoding RNAs in the human transcriptome" (http://www.ncbi.nlm.nih.gov/pubmed/25599403) was published. I think it is interesting for almost everyone who works with expression data. However, when exploring the website http://www.mitranscriptome.org, I cannot find a place to view or download the expression levels of ordinary protein-coding genes/transcripts. The download section offers: Assembly GTF File, Assembly BED File, Library Information, LncRNA Expression (FPKM), and LncRNA Expression (Normalized Counts), but not an expression file covering all genes.

Am I not looking properly? Any ideas?

mitranscriptome expression RNA-seq • 5.2k views
8.9 years ago
EagleEye 7.6k

I guess MiTranscriptome is mainly focused on known and novel intergenic long noncoding RNAs. You can find the TCGA expression dataset, which covers almost all protein-coding genes, in the 'synapse' project.

https://www.synapse.org/#!Synapse:syn362400/files/

8.8 years ago
Jordan Anaya ★ 1.1k

MiTranscriptome is supposed to contain all the transcripts that they identified, which includes lncRNAs as well as protein coding genes. For whatever reason they have decided not to make this data available, despite promising to do so. For example, here is a post from their user forum: https://groups.google.com/forum/#!topic/mitranscriptome/ZfxLEq9mCd4

In their forum they have repeatedly said they will make the data available, and yet months and months go by with no updates. I understand incorporating that much data into their online database may be difficult, but you would think they could at least make a file available for download. They have delayed for so long that you have to start to wonder why they aren't releasing the data...

Reply:

Yes, I totally agree! I think it is weird and wrong; nowadays publication should no longer be possible without making the data public...

Reply:

MiTranscriptome includes lncRNAs and transcripts of unknown coding potential (TUCP) which come from their assembly of public data. As EagleEye points out, you can download expression of protein-coding genes from the original publications.

You can get the expression values for lncRNAs and TUCPs either from the download link or by browsing individual genes (and downloading the FPKM expression values for just the gene of interest). The browser also displays a message that protein-coding genes will be added in the beta version of MiTranscriptome.

There are discrepancies between the names in the bulk download and those in the single-gene query, but you can download the BED file to get the genome coordinates for the entries in the downloaded data matrix. More specifically, the IDs from the download link match those in the UCSC Genome Browser for the Entire Assembly, while the single-gene names match those in the UCSC Genome Browser for cancer-specific genes (since the web interface is mostly designed with cancer-specific gene expression in mind).
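For anyone who wants to do that lookup in a script, here is a minimal Python sketch of joining the expression matrix to the BED coordinates by ID. It assumes, without having checked the actual files, that the BED name column (4th field) holds the same IDs as the first column of a tab-delimited expression matrix. The IDs, library names, and values below are made up for illustration; with the real downloads you would pass open file handles instead of the in-memory strings.

```python
import csv
import io


def read_bed_coordinates(bed_handle):
    """Map the BED name column (4th field) to (chrom, start, end, strand)."""
    coords = {}
    for line in bed_handle:
        # Skip blank lines and optional UCSC header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # no name column, so nothing to key on
        chrom, start, end, name = fields[:4]
        strand = fields[5] if len(fields) > 5 else "."
        coords[name] = (chrom, int(start), int(end), strand)
    return coords


def annotate_matrix(matrix_handle, coords):
    """Yield (id, coordinates or None, {library: value}) for each matrix row."""
    reader = csv.reader(matrix_handle, delimiter="\t")
    header = next(reader)
    libraries = header[1:]
    for row in reader:
        if not row:
            continue
        transcript_id = row[0]
        values = dict(zip(libraries, map(float, row[1:])))
        # IDs missing from the BED file get None instead of raising an error.
        yield transcript_id, coords.get(transcript_id), values


# Hypothetical toy data standing in for the downloaded files.
bed_text = (
    "chr1\t1000\t5000\tT000001\t0\t+\n"
    "chr2\t200\t900\tT000002\t0\t-\n"
)
matrix_text = (
    "transcript_id\tlib_A\tlib_B\n"
    "T000001\t1.5\t0.0\n"
    "T000003\t2.0\t3.5\n"
)

coords = read_bed_coordinates(io.StringIO(bed_text))
rows = list(annotate_matrix(io.StringIO(matrix_text), coords))
for transcript_id, location, values in rows:
    print(transcript_id, location, values)
```

Rows whose ID is not found in the BED file come back with `None` as the location, which makes it easy to spot naming mismatches between the two downloads.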
