9.6 years ago
Nicolas Rosewick
Hi,
I've a bunch of sample-matched (T0 and T1) data of uninfected and infected (virus) RNA-Seq samples. So:
            T0      T1
Patient 1   0.4%    40%
Patient 2   0.5%    60%
Patient 3   0.2%    35%
where the percentages represent the proportion of cells infected by the virus. To detect differentially expressed genes (DEGs) induced or repressed by the presence of the virus, how can I take into account the percentage of infected cells, which varies between patients? In general I use DESeq for this type of analysis, but I don't know how to include the percentage of infected cells in the model.
Thanks
I assume that your current model is something of the form ~patient + time. Have you considered changing that to ~patient + infected_percentage? That would be my gut reaction as one method. You might also logit-transform those percentages, though I really don't have a good feeling about how useful that would be. I guess in general the question becomes how variable the percentage is (both at T0 and T1) and how much that correlates with expression changes (presumably a PCA plot or heatmap would be helpful here).

The problem is that I don't know which genes are targeted by the virus (finding them is in fact the aim of the analysis), so it will be difficult to correlate the percentage with expression changes.
Right, but using a PCA plot or something like that would be useful. You don't need to know the DE genes beforehand for that.
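For what it's worth, here is a minimal sketch of the logit transform mentioned earlier in the thread, applied to the percentages from the question. It is plain Python rather than anything DESeq-specific, and the names `infected_pct`, `logit_from_percent` and `logit_pct` are made up for illustration. Whether the transformed values are a better covariate than the raw percentages is an open question, as noted above.

```python
import math

# Infected-cell percentages from the question, as (T0, T1) per patient.
infected_pct = {
    "Patient 1": (0.4, 40.0),
    "Patient 2": (0.5, 60.0),
    "Patient 3": (0.2, 35.0),
}


def logit_from_percent(pct):
    """Return log(p / (1 - p)) for a percentage strictly between 0 and 100."""
    p = pct / 100.0
    if not 0.0 < p < 1.0:
        # The logit is undefined at exactly 0% or 100%.
        raise ValueError("percentage must be strictly between 0 and 100")
    return math.log(p / (1.0 - p))


# Hypothetical container for the transformed covariate, one (T0, T1) pair per patient.
logit_pct = {
    patient: tuple(logit_from_percent(v) for v in values)
    for patient, values in infected_pct.items()
}

for patient, (t0, t1) in logit_pct.items():
    print(f"{patient}: T0 = {t0:.2f}, T1 = {t1:.2f}")
```

The transformed values would then take the place of the raw percentage as the continuous covariate in a design such as ~patient + infected_percentage. Note that the logit stretches the low end of the scale, so small differences between the T0 values (0.2% to 0.5%) become more visible than they are on the raw percentage scale.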