Hi everybody. I have recently started working on time-course RNA-seq data from cancer treatment (my goal is to study how the RNA-seq profiles change over treatment time). Because I'm just starting out, I had a look at Section 4.8 of the edgeR PDF guide. There is something I do not understand. It says that there are "12 embryonic samples collected at 2-hour intervals", resulting in 60 total values (so 5 unique samples?). In my study I have 10 time points (from 0 to 9), but not all patients are present at each of them. This example makes me think that I cannot use all of them, only the patients that are present at every time point I want to study, right? So for example (A, B, C, F are patients):
This is good (t1, t2, t3 are the time points): A-t1 B-t1 C-t1 | A-t2 B-t2 C-t2 | A-t3 B-t3 C-t3 --> t1, t2, t3
This is not good, right? (t1, t2, t3 are the time points): A-t1 B-t1 C-t1 | A-t2 F-t2 C-t2 | A-t3 B-t3 C-t3 F-t3
In my opinion, taking the sum (as they did in Section 4.8 of the edgeR guide) does not make sense when the group sizes differ, and even if I divided by the group size, wouldn't using different patients introduce a bias? In my case, some patients have the full time course and some do not. So I wonder whether I should use only the patients that have all time points from 0 to 9 and ignore the ones with sparse time points (e.g. from 0 to 2 and then nothing more).
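To make the filtering I have in mind concrete, here is a minimal Python sketch (it is not edgeR code). The sample list, the integer time labels, and the function names `complete_patients` and `keep_complete` are all made up for illustration. It only shows which patients would be kept if I restricted the data to complete time courses, using the "not good" layout above as input.

```python
from collections import defaultdict

# Hypothetical sample sheet: (patient, time point) pairs,
# mirroring the "not good" example (F replaces B at t2, F is extra at t3).
samples = [
    ("A", 1), ("B", 1), ("C", 1),
    ("A", 2), ("F", 2), ("C", 2),
    ("A", 3), ("B", 3), ("C", 3), ("F", 3),
]


def complete_patients(samples, required_times):
    """Return the sorted list of patients observed at every required time point."""
    seen = defaultdict(set)
    for patient, time in samples:
        seen[patient].add(time)
    required = set(required_times)
    return sorted(p for p, times in seen.items() if required <= times)


def keep_complete(samples, required_times):
    """Keep only the samples of complete patients, restricted to the required times."""
    keep = set(complete_patients(samples, required_times))
    required = set(required_times)
    return [(p, t) for p, t in samples if p in keep and t in required]


complete = complete_patients(samples, [1, 2, 3])
filtered = keep_complete(samples, [1, 2, 3])
print(complete)   # patients with all of t1, t2, t3
print(filtered)   # the samples that would remain
```

With this input only A and C have all three time points, so B and F would be dropped entirely. That loss of data is what I am unsure about.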
Thanks for your time