Hi,
I have to analyze several RNA-Seq samples. They come from several runs, some unstranded and some stranded, and several samples were sequenced more than once (using a different library kit each time). I used htseq-count to get the read counts and now want to use DESeq to test for differential expression. So I have biological replicates and technical replicates (the same sample sequenced several times using a different library kit; is it correct to call these technical replicates?).
So I made a design table. In my example, A.1 means sample A, sequencing 1; A.2 means sample A, sequencing 2; and so on. So A was sequenced twice (once unstranded, once stranded), B three times (once unstranded, twice stranded), C once (unstranded) and D once (stranded). ReplicateGroup is used to group technical replicates together.
designTable:
Sample  Condition  Stranded  ReplicateGroup
A.1     Ctrl       No        A
B.1     Treated    No        B
C.1     Treated    No        C
A.2     Ctrl       Yes       A
B.2     Treated    Yes       B
B.3     Treated    Yes       B
D.1     Treated    Yes       D
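To make the table reproducible, here is a minimal sketch of how it could be built as a data frame in base R. It assumes the sample names are used as row names (so they can be matched against the column names of countTable) and that all three columns are treated as factors; neither choice is stated above, so take it as one possible encoding.

```r
# Sketch of the design table as a base R data frame.
# Assumption: sample names become row names; all columns are factors.
designTable <- data.frame(
  row.names      = c("A.1", "B.1", "C.1", "A.2", "B.2", "B.3", "D.1"),
  Condition      = factor(c("Ctrl", "Treated", "Treated", "Ctrl",
                            "Treated", "Treated", "Treated")),
  Stranded       = factor(c("No", "No", "No", "Yes", "Yes", "Yes", "Yes")),
  ReplicateGroup = factor(c("A", "B", "C", "A", "B", "B", "D"))
)

# Cross-tabulate to see how sequencing runs are spread over
# replicate groups and conditions.
print(table(designTable$ReplicateGroup, designTable$Condition))
```

The cross-tabulation shows that every replicate group falls into exactly one condition.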
After that I run DESeq. countTable is the read count matrix.
cdsFull = newCountDataSet( countTable, designTable )
cdsFull = estimateSizeFactors( cdsFull )
cdsFull = estimateDispersions( cdsFull )
But now I don't know how to fit a model on Condition, Stranded and ReplicateGroup.
Like this?
fit1 = fitNbinomGLMs( cdsFull, count ~ Condition + Stranded + ReplicateGroup )
fit0 = fitNbinomGLMs( cdsFull, count ~ Condition )
pvalsGLM = nbinomGLMTest( fit1, fit0 )
padjGLM = p.adjust( pvalsGLM, method="BH" )
Is this the right way to analyze technical replicates? I read that I should merge them together, but I don't think that's a good idea because I used different library kits. So I'm stuck...
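To be clear about what I mean by merging: as I understand it, it means summing the counts of all sequencing runs that belong to the same ReplicateGroup. A minimal sketch in base R, where toyCounts, replicateGroup and merged are hypothetical names and the counts are invented for illustration only (they are not my data):

```r
# Toy count matrix: 2 genes x 7 sequencing runs.
# The values are made up purely to illustrate the operation.
toyCounts <- matrix(
  c(10, 20, 30, 12, 22, 18, 40,
     5,  0,  7,  6,  1,  2,  9),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene1", "gene2"),
                  c("A.1", "B.1", "C.1", "A.2", "B.2", "B.3", "D.1"))
)

# Replicate group of each column, in the same order as the columns.
replicateGroup <- c("A", "B", "C", "A", "B", "B", "D")

# rowsum() adds up rows by group, so transpose the matrix, sum the runs
# within each replicate group, then transpose back to genes x samples.
merged <- t(rowsum(t(toyCounts), group = replicateGroup))
print(merged)
```

This leaves one column per sample (A, B, C, D), which also means the Stranded information for A and B is lost.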
Thanks a lot in advance
N.