Question

Are different colonies from the same cell line valid biological replicates for DESeq2?

9.5 years ago

gaelgarcia ▴ 270

Hi all,

I have a question related to this previous post, Technical/Biological Replicates In Rna-Seq For Two Cell Lines, but different in a few ways.

I'd greatly appreciate your help.

I have different cell lines derived from human fibroblasts, which I have grown in vitro and prepared RNA-seq libraries from. I want to find genes differentially expressed between two conditions using DESeq2.

For group 1 (WT/non-disease), I have 4 lines (4 individuals): 1A, 1B, 1C, 1D.

For group 2 (disease), I have just 2 lines (2 individuals): 2A, 2B.

Furthermore, from WT/non-disease lines 1A and 1B, I have 3 different colonies each, grown separately for 30 days (1A-1, 1A-2, 1A-3 and 1B-1, 1B-2, 1B-3).

From WT/non-disease lines 1C and 1D, I only grew one colony from each for 30 days (1C-1, 1D-1).

For the disease lines 2A and 2B, I have 2 different colonies from one, grown separately for 30 days (2A-1, 2A-2), while for the other, I have 3 different colonies (2B-1, 2B-2, 2B-3).

For each colony, I did only one RNA extraction and library prep + sequencing, so I have no strictly technical replicates, for a total of 13 libraries, each from a distinct colony grown separately for 30 days, albeit some from the same human cell line.
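As a sanity check on the layout described above, here is a minimal sketch that writes out the sample sheet and tallies libraries per condition and per cell line. The column order and the "WT"/"disease" labels are only illustrative choices, not something DESeq2 requires.

```python
from collections import Counter

# Number of colonies per cell line, as described in the post.
# The condition labels "WT" and "disease" are illustrative names.
colonies_per_line = {
    "1A": ("WT", 3), "1B": ("WT", 3), "1C": ("WT", 1), "1D": ("WT", 1),
    "2A": ("disease", 2), "2B": ("disease", 3),
}

# One row per library: (sample id, cell line, condition).
samples = [
    (f"{line}-{i}", line, condition)
    for line, (condition, n) in colonies_per_line.items()
    for i in range(1, n + 1)
]

per_condition = Counter(condition for _, _, condition in samples)
per_line = Counter(line for _, line, _ in samples)

# Print a tab-separated sample sheet followed by the tallies.
for sample_id, line, condition in samples:
    print(f"{sample_id}\t{line}\t{condition}")
print("libraries:", len(samples))
print("per condition:", dict(per_condition))
print("per line:", dict(per_line))
```

The tallies make the imbalance explicit: 8 WT libraries from 4 individuals against 5 disease libraries from 2 individuals.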

My gut feeling is that each library should be a biological replicate since they were all derived from separate colonies and are not really technical replicates, but I am also aware that there are two levels of biological variation in my experiment -- some colonies are from different humans, and others are colonies from the same human grown in parallel for 30 days.

Should I collapse the colonies from the same human individuals into a single column? Or is it better to keep each colony as a separate biological replicate given that it is capturing more variation for the condition than expected by just being a technical replicate?

DESeq2 RNA-Seq R

Updated 2.8 years ago by Ram • written 9.5 years ago by gaelgarcia

score 2 · Answer 1 · 2015-07-14

9.4 years ago

dariober 15k

I would keep each colony separate and account for that in the design matrix.

In my opinion the distinction between "biological" and "technical" replicates is more confusing than anything. You should replicate the levels of variation whose effect you want to estimate. If, for example, all your "non-disease" samples come from one person and all the "disease" samples from another person, you can't separate the effect of the disease from the effect of having two different human beings. If you are interested in that difference (you probably are), then you should replicate "person" within "disease" and "non-disease" (i.e. persons with the disease and without). Similarly, if you don't have replicate colonies from the same cell line, you can't tell whether the difference between cell lines is due to the cell line or the colony (and you might not be interested in this). Whether you then call these levels technical or biological can be a useful label, but it is irrelevant for the analysis.
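To make the confounding argument concrete, here is a rough sketch that builds a small design matrix by hand (intercept, a disease indicator, and one indicator per non-reference person) and compares its rank with its number of columns. The helper names `matrix_rank` and `design` are hypothetical and written just for this illustration; this is not how DESeq2 builds its model matrix internally.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a small matrix via Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design(samples, people):
    """Columns: intercept, disease indicator, one indicator per non-reference person."""
    reference = people[0]
    return [
        [1, int(disease)] + [int(person == p) for p in people if p != reference]
        for person, disease in samples
    ]

# Case from the answer: one person per condition, three samples each.
one_each = [("P1", False)] * 3 + [("P2", True)] * 3
x = design(one_each, ["P1", "P2"])
print("columns:", len(x[0]), "rank:", matrix_rank(x))

# Layout from the question: six cell lines, each belonging to a single condition.
layout = (
    [("1A", False)] * 3 + [("1B", False)] * 3 + [("1C", False)] + [("1D", False)]
    + [("2A", True)] * 2 + [("2B", True)] * 3
)
y = design(layout, ["1A", "1B", "1C", "1D", "2A", "2B"])
print("columns:", len(y[0]), "rank:", matrix_rank(y))
```

In both cases the rank is one less than the number of columns. In the first, the disease column is identical to the person column, so the two effects cannot be separated. In the second, the disease column equals the sum of the 2A and 2B columns, which suggests that a fixed per-line term cannot simply be added next to condition when every line belongs to only one condition.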

Thank you so much for your response, Dario. Your reasoning makes sense to me.

Some colleagues are suggesting to me that by keeping the different colonies separate, I am overestimating the statistical power I have and that I should collapse them per individual... but I really feel that this is relevant variation that should be taken into account.
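For reference, collapsing per individual would amount to something like the sketch below, which sums colony-level counts into one column per cell line. The counts are made-up numbers for a single hypothetical gene, and `collapse_by_line` is an illustrative helper. DESeq2 provides `collapseReplicates`, which performs a sum of this kind and is documented for technical replicates.

```python
from collections import defaultdict

# Made-up read counts for one hypothetical gene, one value per library.
toy_counts = {
    "1A-1": 10, "1A-2": 12, "1A-3": 11,
    "1B-1": 9, "1B-2": 14, "1B-3": 10,
    "1C-1": 13,
    "1D-1": 8,
    "2A-1": 20, "2A-2": 25,
    "2B-1": 18, "2B-2": 22, "2B-3": 21,
}

def collapse_by_line(counts):
    """Sum colony-level counts into one value per cell line (the part before '-')."""
    collapsed = defaultdict(int)
    for sample_id, count in counts.items():
        line = sample_id.split("-")[0]
        collapsed[line] += count
    return dict(collapsed)

collapsed = collapse_by_line(toy_counts)
print(len(toy_counts), "libraries ->", len(collapsed), "columns")
print(collapsed)
```

After collapsing, the comparison is 4 WT columns against 2 disease columns, and the colony-to-colony variation within a line is no longer visible to the model.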

I think I will go with keeping them separate. What I am unsure of is whether having an unequal number of samples per individual (cell line) AND per condition is detrimental to the analysis. For two WT lines, I have 3 colonies each, and for the other two WT lines, I have just one colony each. For the disease group, I have one line with 3 colonies and one line with just 2 colonies.

Do you think I should eliminate some samples in order to even out the matrix across conditions/lines?

Thanks again.

9.4 years ago by gaelgarcia ▴ 270

Fantastic description dariober :)

8.7 years ago by John 13k