We are currently planning to perform RNA sequencing on a total of 35 mouse samples. There are 2 factors: treatment and disease status. We would like to identify differential gene expression between the conditions, e.g. Treatment A compared with Treatment B in cases, Treatment A compared with Treatment B in controls, etc.
We only have enough money to sequence 12 samples (1 lane + 12 indexes), so we are planning to pool samples before performing the RNA-seq. Our concern now is: statistically, what is the best pooling strategy? Once samples are pooled, it is more likely than not that the counts will no longer follow the negative binomial distribution assumed by tools like edgeR and DESeq2. The power of these tests is bound to be affected.
Does anyone have experience with pooled RNA-seq analysis? What should be taken into account when performing the analysis? How should we design the pools? Should we use ERCC spike-ins? If so, how should we use them?
Thank you
We were slightly worried that a random selection of samples might be challenged by others. Though now that you mention it, it is more or less how we usually do things: randomly selecting a certain number of samples instead of sequencing the whole population.
As for pooling by litter, we currently have 7, 10, 9 and 9 samples per group. If we count the number of litters per group, we have 3 litters per group. The problem, though, is that litter size differs quite a lot: some litters have as many as 5 samples and some as few as 1.
So do you think pooling should be done by litter, or should we just select 3 samples per group (randomly, or by RIN and concentration) and perform the analysis on those, with the remaining samples used for qPCR?
It's always best to avoid pooling unless absolutely required. If you have 3 litters per group then just take a sample with a nice concentration/RIN per litter. Note that while you can use the other samples for qPCR, they still won't represent true replicates. There's simply too much correlation between littermates (yes, I realize that this results in needing a lot of cage space for experiments, I've been there).
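To make the "one sample per litter" selection concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the assignment of the 7/10/9/9 group sizes to particular groups, the individual litter sizes, the sample IDs and the RIN values are all assumptions, and it ranks by RIN only (concentration could be added as a second criterion in the same way).

```python
import random
from collections import defaultdict

# Hypothetical litter sizes per group (3 litters each; totals 7, 10, 9, 9 = 35).
# Which group has which size is an assumption, as are the RIN values below.
LITTER_SIZES = {
    ("A", "case"): [5, 1, 1],
    ("B", "case"): [5, 3, 2],
    ("A", "control"): [4, 4, 1],
    ("B", "control"): [3, 3, 3],
}

def make_samples(seed=0):
    """Build a made-up sample sheet with a random RIN per animal."""
    rng = random.Random(seed)
    samples = []
    for (treatment, status), sizes in LITTER_SIZES.items():
        for litter_idx, size in enumerate(sizes, start=1):
            for pup in range(1, size + 1):
                samples.append({
                    "id": f"{treatment}_{status}_L{litter_idx}_{pup}",
                    "treatment": treatment,
                    "status": status,
                    "litter": litter_idx,
                    "rin": round(rng.uniform(6.0, 10.0), 1),
                })
    return samples

def pick_one_per_litter(samples):
    """Keep the highest-RIN sample from each (treatment, status, litter)."""
    by_litter = defaultdict(list)
    for s in samples:
        by_litter[(s["treatment"], s["status"], s["litter"])].append(s)
    return [max(group, key=lambda s: s["rin"]) for group in by_litter.values()]

samples = make_samples()
selected = pick_one_per_litter(samples)
print(len(samples), "animals ->", len(selected), "libraries")
for s in selected:
    print(s["id"], s["rin"])
```

With 4 groups and 3 litters per group this yields 12 libraries, which matches the 12 indexes in the budget, so no pooling is needed under these assumptions.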
Thanks Devon, we really want to make sure we have got our experimental design right. Always good to have comments from other people. This helps us a lot!