Hi everyone,
I have multiple single-cell RNA-seq datasets (BD Rhapsody). They contain liver cells from treated mice sampled at different time points. After treatment, each mouse was sacrificed at one time point, its liver was extracted, and the cells were sorted.
So, there is only one sample for each mouse. The metadata for samples looks like this:
MouseNumber | Time(Day) | Experiment |
---|---|---|
M1 | Day3 | Exp1 |
M2 | Day3 | Exp1 |
M3 | Day3 | Exp2 |
M4 | Day5 | Exp3 |
M5 | Day5 | Exp3 |
M6 | Day5 | Exp4 |
M7 | Day18 | Exp5 |
M8 | Day18 | Exp5 |
M9 | Day18 | Exp6 |
Libraries for the samples were prepared together in batches/experiments of 2-3 samples (Experiment column). In each experiment, samples were tagged with SampleTag and sequenced together. For example, for day 3, the libraries from M1 and M2 were prepared in the same experiment and sequenced together, and one FASTQ file was generated for that experiment.
Routine single-cell data analysis with Leiden clustering showed distinct cell populations, but no separation by time point.
However, my colleagues also want to know whether gene expression increased or decreased between time points, especially for the genes they are interested in. So, I want to perform differential gene expression analysis and compare the time points with each other. I plan to do it with MAST (or I might also try a pseudobulk approach).
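For the pseudobulk route, what I have in mind is summing the raw counts over all cells of each mouse, so that each mouse becomes one sample. Here is a minimal sketch in Python with pandas; the `counts` matrix, the gene names, and the cell-to-mouse labels are made-up toy data, not my real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 6 cells x 3 genes of raw counts.
counts = pd.DataFrame(
    np.array([
        [1, 0, 3],
        [2, 1, 0],
        [0, 4, 1],
        [5, 0, 0],
        [1, 1, 1],
        [0, 2, 2],
    ]),
    columns=["GeneA", "GeneB", "GeneC"],
)

# Hypothetical mapping of each cell (row) to the mouse it came from.
cell_mouse = pd.Series(["M1", "M1", "M2", "M2", "M3", "M3"], name="MouseNumber")

# Pseudobulk: sum raw counts over all cells from the same mouse,
# giving one row per mouse and one column per gene.
pseudobulk = counts.groupby(cell_mouse).sum()
print(pseudobulk)
```

In practice I would probably do this separately within each cell population from the clustering, but that part is not shown here.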
However, I am confused about how to construct the design matrix. How should I include Experiment in the design matrix, given that some experiments have only one sample?
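To make the problem concrete, here is a small sketch (Python with pandas and NumPy, only to illustrate the question; the real analysis would be in MAST or a pseudobulk tool). It rebuilds the sample metadata from the table above and compares two candidate designs. The helper `design_matrix` is my own hypothetical function, and the dummy coding it uses (intercept plus drop-first indicators) is an assumption about how the model would be parameterised:

```python
import numpy as np
import pandas as pd

# Sample metadata copied from the table above (one row per mouse).
meta = pd.DataFrame({
    "MouseNumber": [f"M{i}" for i in range(1, 10)],
    "Time": ["Day3"] * 3 + ["Day5"] * 3 + ["Day18"] * 3,
    "Experiment": ["Exp1", "Exp1", "Exp2", "Exp3", "Exp3",
                   "Exp4", "Exp5", "Exp5", "Exp6"],
})

# Number of distinct time points seen in each experiment.
timepoints_per_experiment = meta.groupby("Experiment")["Time"].nunique()
print(timepoints_per_experiment)


def design_matrix(columns):
    """Hypothetical helper: intercept plus drop-first dummy columns."""
    dummies = pd.get_dummies(meta[columns], drop_first=True).astype(float)
    dummies.insert(0, "Intercept", 1.0)
    return dummies


# Compare a Time-only design with a Time + Experiment design.
for cols in (["Time"], ["Time", "Experiment"]):
    X = design_matrix(cols)
    rank = np.linalg.matrix_rank(X.to_numpy())
    print(cols, "columns:", X.shape[1], "rank:", rank)
```

With this metadata every experiment contains only one time point, so the `Time + Experiment` design has 8 columns but only rank 6, i.e. it is not full rank. That is the part I do not know how to handle.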
Thank you for your comments and help.