Hi,
I want to compare the proportion of cells of interest (CD24+ cells here) across different time points (ctrl, DM12, DM24 and DM36) within different cell types (T cell1 and T cell2 here), as shown in the figure, but I am not sure what the optimal approach is. I have 2 replicates per time point, and the proportion of CD24+ cells is very small compared to the overall number of cells.
I assume I cannot use tools like scCODA and compositional analysis: based on the scCODA paper, I would need more replicates for rare cell types such as in my case (8 to 10 samples are recommended if the cell type is rare, depending on the statistical power you are looking for). But how about other tests, like Kruskal-Wallis followed by post-hoc comparisons? Do I have to run any tests in this case?
Thanks!
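On the Kruskal-Wallis question, a rough sketch of what two replicates per group means for rank-based post-hoc comparisons. It assumes the pairwise post-hoc test is an exact two-sided rank-sum (Mann-Whitney) test with no ties; the function name `min_exact_ranksum_p` is made up for this illustration. It enumerates every way the ranks can be split between two groups and reports the smallest p-value the test could ever return.

```python
from itertools import combinations

def min_exact_ranksum_p(n1, n2):
    """Smallest two-sided exact p-value of a rank-sum test (assumes no ties)."""
    ranks = range(1, n1 + n2 + 1)
    # Rank sum of group 1 for every possible assignment of ranks to group 1.
    sums = [sum(c) for c in combinations(ranks, n1)]
    mean = n1 * (n1 + n2 + 1) / 2  # expected rank sum under the null
    most_extreme = max(abs(s - mean) for s in sums)
    # Count assignments at least as extreme as the most extreme one (both tails).
    hits = sum(1 for s in sums if abs(s - mean) >= most_extreme)
    return hits / len(sums)

for n in (2, 3, 4):
    print(n, round(min_exact_ranksum_p(n, n), 4))
# 2 0.3333
# 3 0.1
# 4 0.0286
```

Under these assumptions, a 2 vs 2 pairwise comparison cannot go below p = 1/3 even when the groups are perfectly separated, so such post-hoc tests cannot reach conventional significance with this design.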
Thanks for the link! It is indeed related to what I am looking for. The remaining issue is the number of replicates per group. I know the unofficial rule of thumb for methods like edgeR to be reliable is a minimum of 3 replicates per group. Also, in the OSCA link, there are 3 replicates in the example analysis. Do you think I can still safely use a similar approach with only 2 replicates? I know it has been said that some methods can also handle fewer replicates, but I am not sure I can rely on their ability to control the error rate. Edit: I found this link discussing the number of replicates in these methods. It would still be helpful to know if there is a more optimal approach.
edgeR and similar tools require at minimum 2 vs 1 (two replicates in one group, one in the other). More is of course better, but if you are underpowered, it is better to use a solid framework like this than to cook up custom stats, which might perform poorly. You can gain some power by testing not only the CD24+ cells but all identified cell types / clusters, so that the dispersion estimation becomes more robust. This is just like edgeR normally gaining power from sharing information across many genes, rather than testing every gene in isolation one by one.
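To test all cell types / clusters together, the input would be a label-by-sample count table. A minimal sketch of building one is below; the sample names, labels and counts are all made up, and the helper `abundance_table` is hypothetical. The resulting table would then go into a count-based framework such as edgeR (in R), which is not shown here.

```python
from collections import Counter

# Hypothetical per-cell annotations as (sample, label) pairs. In practice these
# would come from the clustering / annotation step; the numbers are made up.
cells = (
    [("ctrl_rep1", "T cell1 CD24-")] * 5
    + [("ctrl_rep1", "T cell1 CD24+")] * 1
    + [("ctrl_rep1", "T cell2 CD24-")] * 4
    + [("DM12_rep1", "T cell1 CD24-")] * 3
    + [("DM12_rep1", "T cell1 CD24+")] * 2
    + [("DM12_rep1", "T cell2 CD24-")] * 6
)

def abundance_table(cells):
    """Count cells per (label, sample); rows are labels, columns are samples."""
    counts = Counter(cells)
    samples = sorted({sample for sample, _ in cells})
    labels = sorted({label for _, label in cells})
    # Counter returns 0 for label/sample combinations that never occur.
    table = {lab: [counts[(s, lab)] for s in samples] for lab in labels}
    return samples, table

samples, table = abundance_table(cells)
print("label", samples)
for label, row in table.items():
    print(label, row)
```

Keeping every label as a row, not just the CD24+ ones, is what lets the downstream model share information across rows when estimating dispersion.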
This totally makes sense. Thank you!