Question

How to compare raw counts from RNA-seq?

11.7 years ago

clausndh ▴ 60

Hi guys,

I have an RNA-seq dataset with control [C], treatment 1 [T1] and treatment 2 [T2], each with 3 replicates (a, b, c). I have already mapped all reads for the 9 samples and then used htseq-count to get raw counts (count table).

My problem is that the (c) replicates were sequenced 9 months before the others, and I'm not sure whether I can work with all 3 replicates (a, b and c) or only with 2 (a and b).
I want to test whether the replicates are similar to each other, so that I can be sure I don't have an "outlier" sample.

Does anyone have a good idea how to do this in R? Thanks for your help.

My count table:

             Ca   Cb   Cc  T1a  T1b  T1c  T2a  T2b  T2c
 AT1G0****    7    1   40    4   19   42    3    5   24
 AT1G0****    0    0    0    0    3    3    0    0    1
 AT1G0****   37   28  118   89   64  174   42   47  151
 AT1G0****   41   36  191   54   50  149   38   43  254

rnaseq counts differential-expression • 4.3k views

ADD COMMENT • link updated 11.7 years ago by Manu Prestat 4.1k • written 11.7 years ago by clausndh ▴ 60

score 3 · Answer 1 · 2013-05-17

11.7 years ago

Hayssam ▴ 280

See the edgeR user manual for examples using the plotMDS function. Roughly, it performs multidimensional scaling based on the top varying genes and shows whether one of your samples departs from the others, as well as how the samples group together.
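To make the idea behind such a plot concrete, here is a minimal sketch in plain Python (not edgeR itself) of the kind of sample-to-sample distance an MDS plot is built from: the root-mean-square log2 fold change over the most different genes for each pair of samples. It uses only the four rows posted in the question, so the numbers mean little; the gene labels g1..g4, the 0.5 offset and `top=2` are assumptions for illustration. On a real table you would use all genes and a much larger `top`.

```python
import math
from itertools import combinations

samples = ["Ca", "Cb", "Cc", "T1a", "T1b", "T1c", "T2a", "T2b", "T2c"]

# The four rows from the question. Gene IDs are masked there,
# so g1..g4 are placeholder labels (an assumption).
counts = {
    "g1": [7, 1, 40, 4, 19, 42, 3, 5, 24],
    "g2": [0, 0, 0, 0, 3, 3, 0, 0, 1],
    "g3": [37, 28, 118, 89, 64, 174, 42, 47, 151],
    "g4": [41, 36, 191, 54, 50, 149, 38, 43, 254],
}

# Library size = column sum (over 4 genes only here; use the full table in practice).
lib_sizes = [sum(row[j] for row in counts.values()) for j in range(len(samples))]


def log_cpm(count, lib_size, offset=0.5):
    """log2 counts-per-million with a small offset to avoid log(0)."""
    return math.log2((count + offset) / (lib_size + 1.0) * 1e6)


# One log-CPM vector per sample.
logcpm = {
    s: [log_cpm(row[j], lib_sizes[j]) for row in counts.values()]
    for j, s in enumerate(samples)
}


def leading_lfc_distance(a, b, top=2):
    """RMS of the `top` largest log2 fold changes between two samples."""
    sq = sorted(((x - y) ** 2 for x, y in zip(a, b)), reverse=True)[:top]
    return math.sqrt(sum(sq) / len(sq))


dist = {
    (s, t): leading_lfc_distance(logcpm[s], logcpm[t])
    for s, t in combinations(samples, 2)
}

# Closest pairs first: replicates of one condition should ideally appear near the top.
for (s, t), d in sorted(dist.items(), key=lambda kv: kv[1]):
    print(f"{s:>3} vs {t:<3} {d:.2f}")
```

If the c replicates consistently sit far from a and b within every condition, that points to a batch effect rather than a single outlier sample.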


score 3 · Answer 2 · 2013-05-17

My first suggestion is similar to @massyah's: there are things you can do to see how similar your data sets are to each other. A naive clustering of your data, for example, would hopefully put the replicates from the same condition next to each other.
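As a rough illustration of what "naive clustering" could look like, the sketch below runs average-linkage hierarchical clustering on a correlation distance (1 minus Pearson correlation of log2(count + 1)). It is plain Python on the four rows from the question, so the result is only a toy; the choice of distance and linkage is an assumption, not something the thread prescribes.

```python
import math

# Per-sample count vectors, taken from the four rows posted in the question.
profiles = {
    "Ca": [7, 0, 37, 41], "Cb": [1, 0, 28, 36], "Cc": [40, 0, 118, 191],
    "T1a": [4, 0, 89, 54], "T1b": [19, 3, 64, 50], "T1c": [42, 3, 174, 149],
    "T2a": [3, 0, 42, 38], "T2b": [5, 0, 47, 43], "T2c": [24, 1, 151, 254],
}
logged = {s: [math.log2(c + 1) for c in v] for s, v in profiles.items()}


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def corr_distance(s, t):
    """Correlation distance between two samples (0 = identical profile shape)."""
    return 1.0 - pearson(logged[s], logged[t])


def average_linkage(names, dist):
    """Agglomerative clustering; returns the merges as (left, right, distance)."""
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                d = sum(dist(a, b) for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Early merges show which samples are most alike.
for left, right, d in average_linkage(list(logged), corr_distance):
    print(f"{'+'.join(left)} | {'+'.join(right)}  d={d:.3f}")
```

With a full count table, the thing to look for is whether the first merges pair up replicates of the same condition or instead pair up the c samples across conditions.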

In addition to the edgeR::plotMDS suggestion, you can look through the vignette for DESeq2. Section 8 of that vignette has some suggestions you can follow to explore the quality of your data.

Also, you might consider including a covariate that indicates the batch of the experiment when doing your linear modeling/differential expression testing, to account for batch effects.
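For this experiment, a batch covariate could look like the design matrix sketched below: an intercept, one indicator per treatment, and one indicator for the c replicates that were sequenced at a different time. This is a plain-Python illustration with assumed column names; in R the same matrix would typically come from `model.matrix` with a formula along the lines of `~ batch + condition`. The rank check confirms that batch is not confounded with condition, which holds here because every condition has one sample in the c batch.

```python
from fractions import Fraction

samples = ["Ca", "Cb", "Cc", "T1a", "T1b", "T1c", "T2a", "T2b", "T2c"]
columns = ["intercept", "T1", "T2", "batch_c"]  # assumed names, for illustration


def design_row(sample):
    """Split e.g. 'T1c' into condition 'T1' and replicate 'c', then encode it."""
    condition, replicate = sample[:-1], sample[-1]
    return [1, int(condition == "T1"), int(condition == "T2"), int(replicate == "c")]


def matrix_rank(rows):
    """Rank by exact Gauss-Jordan elimination on fractions."""
    m = [[Fraction(v) for v in r] for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


design = [design_row(s) for s in samples]
rank = matrix_rank(design)

print("sample", *columns)
for s, row in zip(samples, design):
    print(f"{s:>6}", *row)
print("full rank:", rank == len(columns))
print("residual degrees of freedom:", len(samples) - rank)
```

Keeping all nine samples and adding the batch term costs one degree of freedom, which is usually cheaper than discarding a third of the replicates.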

score 2 · Answer 3 · 2013-05-21

For a quick-and-dirty check, you can upload your count data to Scotty; we'll cluster it for you, and you can see whether you have an outlier replicate. Sometimes tests and controls don't group together, though; it depends on how much of an effect there is.

http://euler.bc.edu/marthlab/scotty/scotty.php

Did all of the library preps use exactly the same protocols? I only ask because things can change a lot in 9 months.

score 1 · Answer 4 · 2013-05-17

11.7 years ago

Ashutosh Pandey 12k

edgeR, DESeq and DEGseq can all take raw counts as input. You can do a pairwise comparison between the replicates to get some idea of how similar they are.
