I have been working with edgeR and its documentation to analyze RNA-seq data. However, I find it difficult to understand what input should go into several of the edgeR functions.
Starting on page 10 of the documentation, the example uses DGEList to create an object y (a DGEList object) and then uses this object in subsequent analyses. See below:
group <- c(1,1,2,2)
y <- DGEList(counts=x, group=group)
The documentation then goes into filtering (section 2.7) via the following commands:
keep <- filterByExpr(y, group=group)
y <- y[keep, , keep.lib.sizes=FALSE]
Section 2.8.3 (p. 15) also shows normalization of the library sizes with the following:
y <- calcNormFactors(y)
Question 1: Should I always perform this process of filtering and calculating the norm factors before using the DGEList object in any analyses? In other words, there is no need for me to make a y_copy object of the pre-filtered data, correct?
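To make Question 1 concrete, here is a minimal, self-contained sketch of the steps so far. The simulated count matrix x and the name y_raw are placeholders I made up for illustration; they are not from the documentation, and keeping y_raw is only one possible way to hold on to the pre-filtered data if it turns out to be needed.

```r
library(edgeR)

# Hypothetical data: 1000 genes x 4 samples, half lowly and half highly expressed.
set.seed(1)
mu <- rep(c(1, 50), each = 500)
x <- matrix(rnbinom(4000, mu = mu, size = 5), nrow = 1000,
            dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))
group <- c(1, 1, 2, 2)

y <- DGEList(counts = x, group = group)

# Optional: keep an unfiltered copy (assumed name), only if it is needed later.
y_raw <- y

# Filtering and normalization, as in sections 2.7 and 2.8.3.
keep <- filterByExpr(y, group = group)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

dim(y_raw)  # all genes
dim(y)      # only the genes that passed filtering
```

Because each step reassigns y, the filtered and normalized object replaces the original unless a copy such as y_raw is made first.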
These are the functions I am interested in:
estimateDisp - to estimate dispersion (2.11.2 - pg. 21)
exactTest - differentially expressed genes/tags between 2+ groups (2.10.2 - pg. 20)
glmQLFit - for quasi-likelihood F-tests (2.11.3 - pg. 22)
Question 2: Should the above functions all receive the same input object, or should each receive its own distinct but identical copy? For example, after calcNormFactors, would you have the following code:
y_two <- y
y_three <- y
and then use a different y object for each of the above functions?
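As a way of checking this myself, here is a small sketch under stated assumptions: simulated counts stand in for my data, and design, y_before, et, fit and qlf are names I chose for illustration. As far as I understand R, a function returns a new object rather than changing its argument in place, so y should only change when I reassign it (as with estimateDisp below).

```r
library(edgeR)

# Hypothetical data: 1000 genes x 4 samples, two groups of two.
set.seed(1)
mu <- rep(c(1, 50), each = 500)
x <- matrix(rnbinom(4000, mu = mu, size = 5), nrow = 1000,
            dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))
group <- factor(c(1, 1, 2, 2))

y <- DGEList(counts = x, group = group)
y <- y[filterByExpr(y, group = group), , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# estimateDisp returns a DGEList with dispersions added; y changes only
# because the result is assigned back to y.
design <- model.matrix(~group)
y <- estimateDisp(y, design)

# Snapshot of y before running the two tests on the same object.
y_before <- y

et  <- exactTest(y)               # result stored in et, not in y
fit <- glmQLFit(y, design)        # result stored in fit, not in y
qlf <- glmQLFTest(fit, coef = 2)  # quasi-likelihood F-test on the fit

identical(y, y_before)  # checks whether y was altered by either call
class(et)
class(fit)
```

If identical() reports that y is unchanged, then passing the same y to both exactTest and glmQLFit should be equivalent to passing separate copies such as y_two and y_three.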
I originally tried the former approach (the same object for everything), but after all of the genes came back as significant, with sufficiently high log2 fold changes and sufficiently low p-values, I figured that perhaps the analyses should be run independently. However, I wanted to double-check that my deduction was correct in case any of the three analyses above are somehow connected to or dependent on one another.
NOTE: I now realize that exactTest and glmQLFit are, in fact, contingent on estimateDisp (the code throws an error otherwise). However, I still want to confirm that exactTest and glmQLFit are not contingent upon one another.