edgeR analysis w/ factor (Body of Water) and covariate data (Latitude) - Can you look at Latitude w/in Body of Water & Across Body of Water?
2.4 years ago
Gina • 0

I am using edgeR to look at differential community composition (bacterial taxonomy counts) - this can be treated just like RNA-Seq data. I have a factor (WaterBody) and a covariate (Latitude) for this dataset. There is an interaction effect based on adonis (and biologically it makes sense).

Extra details on data:

  • WaterBody: North_Atlantic, North_Pacific, Mediterranean_Sea
  • Latitude: 32, 34, 37, 38, 42, 43, 45, 46, 49, 52, 55, 58, 64, 67
  • Not all bodies of water were sampled at every latitude.

Below is what I currently have for my script:

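# Build the DGEList; the sample-annotation argument is named `samples`
y <- DGEList(counts = Counts_Bact, samples = Sample_Data)
# TMMwsp normalization copes better with the many zeros in taxonomy counts
y <- edgeR::calcNormFactors(y, method = "TMMwsp")
# One baseline term per WaterBody plus one Lat slope per WaterBody
design <- model.matrix(~WaterBody + WaterBody:Lat, data = Sample_Data)
rownames(design) <- colnames(y)
# Estimate dispersions, then fit the quasi-likelihood GLM
disp <- estimateDisp(y, design, robust = TRUE)
fit <- glmQLFit(disp, design, robust = TRUE)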

At this point I have the following coefficients:

  • WaterBodyNorth_Atlantic
  • WaterBodyNorth_Pacific
  • WaterBodyMediterranean_Sea
  • WaterBodyNorth_Atlantic:Lat
  • WaterBodyNorth_Pacific:Lat
  • WaterBodyMediterranean_Sea:Lat
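
As a quick check of which coefficients a formula produces, the design matrix can be inspected on its own with base R. The sketch below uses a made-up sample sheet (the Lat values and the factor level order are assumptions, not the real data). It illustrates that a formula with an intercept absorbs the first WaterBody level into "(Intercept)", whereas adding `0 +` gives one baseline column and one Lat slope column per body of water.

```r
# Toy sample sheet (hypothetical values), used only to inspect design columns
Sample_Data <- data.frame(
  WaterBody = factor(
    rep(c("North_Atlantic", "North_Pacific", "Mediterranean_Sea"), each = 3),
    levels = c("North_Atlantic", "North_Pacific", "Mediterranean_Sea")
  ),
  Lat = c(45, 55, 67, 42, 45, 58, 32, 37, 43)
)

# With an intercept: the first level becomes "(Intercept)",
# the other levels become differences from it
design_int <- model.matrix(~ WaterBody + WaterBody:Lat, data = Sample_Data)
print(colnames(design_int))

# Without an intercept: one baseline (value at Lat = 0) and
# one Lat slope for each body of water
design_noint <- model.matrix(~ 0 + WaterBody + WaterBody:Lat, data = Sample_Data)
print(colnames(design_noint))
```

In both cases the baseline terms describe abundance extrapolated to Lat = 0, so they are not very meaningful on their own; centering Lat before building the design is one common way to make them easier to read.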

So now to reiterate my question. I wanted to compare bacterial taxonomy counts for certain latitudes within a body of water as well as across bodies of water (say which bacteria are differentially present in North Atlantic at 67N vs 45N and then North Pacific vs North Atlantic at 45N).
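
For reference, here is a rough sketch of how those two comparisons could be written as contrast vectors under a per-WaterBody slope model. It assumes a no-intercept design (`~ 0 + WaterBody + WaterBody:Lat`) with the coefficient order shown, and it assumes log-abundance changes linearly with latitude within each body of water, which may not hold for these data. `fitted_at` is a hypothetical helper written for this illustration, not an edgeR function.

```r
# Assumed coefficient order for ~ 0 + WaterBody + WaterBody:Lat
coefs <- c("WaterBodyNorth_Atlantic", "WaterBodyNorth_Pacific",
           "WaterBodyMediterranean_Sea", "WaterBodyNorth_Atlantic:Lat",
           "WaterBodyNorth_Pacific:Lat", "WaterBodyMediterranean_Sea:Lat")

# Hypothetical helper: weights that give the fitted log-abundance
# for one body of water at one latitude (baseline + lat * slope)
fitted_at <- function(body, lat) {
  v <- setNames(numeric(length(coefs)), coefs)
  v[paste0("WaterBody", body)] <- 1
  v[paste0("WaterBody", body, ":Lat")] <- lat
  v
}

# North Atlantic at 67N vs 45N: reduces to 22 times the North Atlantic slope
con_within <- fitted_at("North_Atlantic", 67) - fitted_at("North_Atlantic", 45)

# North Pacific vs North Atlantic, both evaluated at 45N
con_across <- fitted_at("North_Pacific", 45) - fitted_at("North_Atlantic", 45)

print(con_within)
print(con_across)

# With a fit from that design, each vector could then be tested with
# glmQLFTest(fit, contrast = con_within)
```

Because the within-body comparison is just a multiple of the slope, its significance is the same as testing that slope against zero; only the reported log-fold-change scales with the latitude gap.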

My understanding is what I have above is incorrect and I am going about it all wrong. It also may just simply not be possible? Was hoping to get input since looking at references below haven't quite gotten me there:

Ref 1: https://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/designmatrices.html#overview-of-models-fitted

Ref 2: https://www.bioconductor.org/packages/devel/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

Any advice/insight would be greatly appreciated!

2.4 years ago

Hey Gina,

This seems like a situation where a compound model term may be the simplest solution. That is, create a group variable that combines WaterBody and Latitude, so that each body of water at each sampled latitude becomes its own level (for example, North_Atlantic_45). Then, your model formula would just be:

~ group

After that, you could just compare different terms on a pairwise basis.
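
A minimal sketch of that idea, using a made-up sample sheet (the rows and the `contrast_for` helper are assumptions for illustration only). Latitude is treated as a category here, so each WaterBody/Latitude combination needs its own replicate samples.

```r
# Toy sample sheet (hypothetical rows): two replicates per combination
Sample_Data <- data.frame(
  WaterBody = c("North_Atlantic", "North_Atlantic", "North_Atlantic",
                "North_Atlantic", "North_Pacific", "North_Pacific"),
  Lat = c(45, 45, 67, 67, 45, 45)
)

# Compound group variable, e.g. "North_Atlantic_45"
Sample_Data$group <- factor(paste(Sample_Data$WaterBody, Sample_Data$Lat, sep = "_"))

# No-intercept design: one column (mean log-abundance) per group
design <- model.matrix(~ 0 + group, data = Sample_Data)
colnames(design) <- levels(Sample_Data$group)

# Hypothetical helper: contrast vector for "group a minus group b"
contrast_for <- function(a, b) {
  v <- setNames(numeric(ncol(design)), colnames(design))
  v[a] <- 1
  v[b] <- -1
  v
}

con_within <- contrast_for("North_Atlantic_67", "North_Atlantic_45")
con_across <- contrast_for("North_Pacific_45", "North_Atlantic_45")
print(con_within)
print(con_across)

# After estimateDisp() and glmQLFit() with this design, each vector could be
# passed as glmQLFTest(fit, contrast = con_within); limma::makeContrasts()
# is the more usual way to build the same vectors by name.
```

In this sketch both comparisons come from the same design, since every group has its own column.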

If you also need a comparison of the form North Pacific vs North Atlantic at 45N, then you could re-work the group variable so that it contains the levels you want to compare, re-fit the model, and run that comparison.

Sometimes trying to get the perfect formula is just impossible and creating interactions can result in over-fitting the model.

Kevin
