Hello all,
Apologies for the not-so-technical question.
I am currently analyzing biological replicates from an RNA-seq experiment, 3 for each condition that I am comparing. These 3 replicates are for a specific tissue and cell type.
Having obtained a list of DEGs, I now want to assess the correlation between these genes.
However, even though 3 biological replicates per condition seems to be the norm in these experiments, many papers argue that we need at least 6 replicates per condition, with 12 being ideal.
The papers and articles I read on the subject are linked below.
My question is: does it make sense to run, for example, a Pearson correlation test or WGCNA when I have so few replicates? How is correlation usually inferred in experiments with 3 vs. 3 replicates?
Literature:
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4878611/
- https://www.biorxiv.org/content/10.1101/2023.10.25.563901v1
- https://www.ebi.ac.uk/training/online/courses/functional-genomics-ii-common-technologies-and-data-analysis-methods/rna-sequencing/performing-a-rna-seq-experiment/design-considerations/number-of-replicates/
Thank you
There is no general statement that will hold here. If you have human data that are not paired, you can easily need dozens or hundreds of replicates to reach significance, depending on the true effect size. Likewise, with cell line replicates and large effects, even 2 can be enough to get results. In the end it is a combination of available specimens, money, feasibility, and what you can expect in terms of biological effects.
No. If I recall correctly, the WGCNA FAQ clearly states that running it with fewer than 20 samples is pointless. Check its docs.
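A minimal Python sketch of why sample size matters so much for gene-gene correlation (a simulation under stated assumptions, not something taken from the WGCNA docs). It assumes NumPy is installed; the function name `null_correlations` and the 0.9 cutoff are illustrative choices only. It draws pairs of completely unrelated random "genes" and checks how often they look strongly correlated purely by chance:

```python
import numpy as np


def null_correlations(n_samples, n_pairs, seed=0):
    """Pearson r for n_pairs of independent (truly uncorrelated) random vectors.

    Each vector has n_samples values, mimicking one gene measured
    across n_samples replicates. Hypothetical helper for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_pairs, n_samples))
    y = rng.standard_normal((n_pairs, n_samples))

    # Center each row, then compute Pearson r row by row.
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    numerator = (xc * yc).sum(axis=1)
    denominator = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return numerator / denominator


# Compare a 3-replicate design with larger sample sizes.
for n in (3, 6, 20):
    r = null_correlations(n_samples=n, n_pairs=20000)
    frac_high = np.mean(np.abs(r) > 0.9)
    print(f"n = {n:2d}: fraction of unrelated pairs with |r| > 0.9 = {frac_high:.3f}")
```

With three samples, a sizeable share of unrelated pairs exceed |r| > 0.9 by chance alone, while with 20 samples almost none do. That is the practical reason gene-gene correlation networks need many samples.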
Thank you, ATpoint. My apologies, maybe I should rephrase the question. I have already read the WGCNA docs, and for the Pearson correlation test they say 25 is the minimum per condition. However, I have seen published papers where Pearson correlation is computed with just 3 biological replicates per condition. If there are large differences in gene expression between conditions, can a correlation analysis still be carried out? Or does it always hold that you need a minimum number of samples per condition?
Example paper using Pearson correlation with 3 replicates: https://www.researchgate.net/figure/Pearson-correlation-coefficient-between-the-three-biological-replicates-The-first_fig2_341033248