I am using edgeR to look at differential community composition (bacterial taxonomy counts) - this can be treated just like RNA-Seq data. I have a factor (WaterBody) and a covariate (Latitude) for this dataset. There is an interaction effect based on adonis (and biologically it makes sense).
Extra details on data:
- WaterBody: North_Atlantic, North_Pacific, Mediterranean_Sea
- Latitude: 32, 34, 37, 38, 42, 43, 45, 46, 49, 52, 55, 58, 64, 67
- Not all bodies of water were sampled at every latitude.
Below is what I currently have for my script:
y <- DGEList(counts=Counts_Bact, sample = Sample_Data)
y <- edgeR::calcNormFactors(y, method ="TMMwsp")
design <- model.matrix(~WaterBody + WaterBody:Lat, data=Sample_Data)
rownames(design) <- colnames(y)
disp = estimateDisp(y, design, robust=TRUE)
fit <- glmQLFit(disp, design, robust = TRUE)
At this point I have the following coefficients:
- WaterBodyNorth_Atlantic
- WaterBodyNorth_Pacific
- WaterBodyNorth_Mediterranean_Sea
- WaterBodyNorth_Atlantic:Lat
- WaterBodyNorth_Pacific:Lat
- WaterBodyNorth_Mediterranean_Sea:Lat
So now to reiterate my question. I wanted to compare bacterial taxonomy counts for certain latitudes within a body of water as well as across bodies of water (say which bacteria are differentially present in North Atlantic at 67N vs 45N and then North Pacific vs North Atlantic at 45N).
My understanding is what I have above is incorrect and I am going about it all wrong. It also may just simply not be possible? Was hoping to get input since looking at references below haven't quite gotten me there:
Ref 2: https://www.bioconductor.org/packages/devel/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf
Any advice/insight would be greatly appreciated!