Question

Comparing TMM and RSEM RNASeq data

9.5 years ago

leshaker • 0

Hi,

I am new to RNA-Seq. I have read a lot on Biostars about different normalization methods, but I couldn't find an answer to the problem I am facing right now:

I have a data set from this paper (dx.plos.org/10.1371/journal.pone.0118528) and as far as I understand it, they used TMM normalization:

Samples were grouped and quantification of transcript abundance was performed on this final read list using Trimmed Means of M-values (TMM) as the normalization method [27]. Output data utilized for all subsequent comparisons was a normalized signal value generated by AvadisNGS.

(p. 5)

Now I want to compare this data with a TCGA dataset that uses RSEM (https://wiki.nci.nih.gov/display/TCGA/RNASeq+Version+2).

I found some posts on similar questions (e.g. TMM normalisation from RSEM raw counts) but I still don't know how to proceed. Can anyone help me out?

Thank you so much!

Max

TMM RSEM RNA-Seq • 4.0k views

updated 23 months ago by Ram 44k • written 9.5 years ago by leshaker • 0

score 1 · Answer 1 · 2015-06-04

9.5 years ago

Ying W ★ 4.3k

I just took a quick look, and it seems like it might be possible to take the raw counts from TCGA and run edgeR on them to get TMM-normalized values.
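For intuition about what TMM does with raw counts, here is a minimal Python sketch of the core idea. It is a simplification, not edgeR's implementation: edgeR's `calcNormFactors` also trims on average expression, weights each gene by its precision, picks the reference sample automatically, and rescales the factors so they multiply to one. The function name `trimmed_mean_log_ratio` and the toy counts are made up for illustration.

```python
import math


def trimmed_mean_log_ratio(sample, reference, trim=0.3):
    """Simplified, unweighted TMM-style scaling factor of `sample` vs `reference`.

    Both arguments are lists of raw counts for the same genes in the same order.
    """
    n_sample = sum(sample)
    n_ref = sum(reference)
    # M-values: log2 ratio of library-size-scaled counts.
    # Genes with a zero count in either sample are skipped (log of zero is undefined).
    m_values = sorted(
        math.log2((y / n_sample) / (r / n_ref))
        for y, r in zip(sample, reference)
        if y > 0 and r > 0
    )
    if not m_values:
        raise ValueError("no gene has a positive count in both samples")
    # Drop the most extreme log-ratios from both ends before averaging.
    cut = int(len(m_values) * trim)
    kept = m_values[cut:len(m_values) - cut]
    return 2 ** (sum(kept) / len(kept))


if __name__ == "__main__":
    reference = [10, 10, 10, 10, 10]
    sample = [10, 10, 10, 10, 1000]  # one gene dominates the library
    factor = trimmed_mean_log_ratio(sample, reference)
    # Effective library size = raw library size * factor.
    print(factor, sum(sample) * factor)
```

In the toy data, a single highly expressed gene inflates the sample's library size. The trimmed mean ignores that gene, so the effective library size comes out close to the reference's, and the four unchanged genes get comparable normalized values.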

written 9.5 years ago by Ying W ★ 4.3k

OK, I tried to do this in R as described in this post: RNA-seq normalization: How to use TMM and rpkm() in EdgeR.
I used the "mRNAseq_raw_counts" and "mRNAseq_median_length_normalized" data files from TCGA.

However, the resulting matrix contains a lot of NaN and Inf values and looks nothing like the original distribution. Any ideas what went wrong?

written 9.5 years ago by leshaker • 0

Double-check how many of the values in _raw_counts and _length_normalized are NaN or 0. Also, have a look at edgeR's user manual; it goes into some detail about what edgeR is doing.
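A quick way to run that check is sketched below in plain Python. It assumes the file is tab-delimited, with one header row of sample IDs and gene IDs in the first column; that layout is an assumption, so adjust it to whatever the downloaded file actually looks like. The function name `summarize_matrix` is just for illustration.

```python
import csv
import io
import math


def summarize_matrix(handle, delimiter="\t"):
    """Count zero, NaN, infinite and unparsable cells in a gene-by-sample table."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader)  # skip the header row (assumed to hold sample IDs)
    tally = {"zero": 0, "nan": 0, "inf": 0, "unparsable": 0, "total": 0}
    for row in reader:
        for cell in row[1:]:  # first column is assumed to be the gene ID
            tally["total"] += 1
            try:
                value = float(cell)
            except ValueError:
                # e.g. "NA" or an empty cell
                tally["unparsable"] += 1
                continue
            if math.isnan(value):
                tally["nan"] += 1
            elif math.isinf(value):
                tally["inf"] += 1
            elif value == 0:
                tally["zero"] += 1
    return tally


if __name__ == "__main__":
    # Toy table standing in for a real file opened with open(path, newline="").
    demo = io.StringIO("gene\tS1\tS2\nA\t0\t12\nB\tNaN\t5\nC\tInf\tNA\n")
    print(summarize_matrix(demo))
```

If the zero or unparsable counts are large in the file used as gene lengths, dividing by those values would be one plausible source of the NaN and Inf entries.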

written 9.5 years ago by Ying W ★ 4.3k

OK, I just found out that these three files downloaded from TCGA have exactly the same content:

LUSC.mRNAseq_median_length_normalized
LUSC.mRNAseq_raw_counts
LUSC.mRNAseq_RPKM

so there seems to be no length information provided...
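One way to confirm that files are byte-for-byte identical is to compare checksums. The sketch below uses Python's `hashlib`; the helper names `file_digest` and `group_identical` are made up, and the demo writes placeholder content under the three file names from this thread rather than reading the real downloads.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical(paths):
    """Return lists of file names that share the same content."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(Path(path).name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    names = [
        "LUSC.mRNAseq_median_length_normalized",
        "LUSC.mRNAseq_raw_counts",
        "LUSC.mRNAseq_RPKM",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name in names:
            path = Path(tmp) / name
            path.write_text("gene\tS1\nA\t12\n")  # placeholder content, not TCGA data
            paths.append(path)
        print(group_identical(paths))
```

With the real downloads, pass the actual file paths to `group_identical`; any group it returns contains files with identical bytes.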

written 9.5 years ago by leshaker • 0