I am performing a meta-analysis based on 10 different data sets derived from different platforms and technologies (RNA-seq as well as microarray). The DGE analysis was fairly straightforward and mostly performed with the limma package. The experimental setup was nothing special, just a treatment (heat) vs. control design. In the end I have my results, i.e. the logFC values and the corresponding SE values, stored in separate data frames (topTable).
Now I want to perform the meta-analysis, primarily with a random-effects model (REM), but I would also like to compare these results with those of a meta-analysis based on combined p-values (Fisher's method).
For the first step (REM) I used the rma.uni function from the metafor package. So far so good: I got my results for the genes and drew my conclusions from them. But this still remains something of a black box to me, as I don't really understand how the resulting logFC value is calculated. After some research I found out that the measure used for the logFC calculation is called "GEN". I even had a look at the function's code to find out how this measure is calculated, but I have to admit that this piece of code exceeds my R knowledge by far. Can anyone explain to me (or at least share a link that explains) what is going on here?
Here are the values for a randomly chosen example gene, AT4G27670 (please ignore the negative yi value; I will check this afterwards):
yi <- c(8.447936, 8.037122, 9.343436, 14.670613, 9.004982, 9.756535, 6.466615, 12.855800, -7.905433, 9.404006)
sei <- c(0.3903574, 0.3014562, 0.5994067, 0.9482127, 1.1346424, 0.2865533, 0.6341623, 0.6235811, 2.4409382, 1.4918650)
metafor::rma(yi=yi, sei=sei, method="REML")
Model Results:
estimate      se    zval    pval   ci.lb    ci.ub
  8.2515  1.7171  4.8054  <.0001  4.8860  11.6170  ***
How would I get the 'combined' logFC value of 8.2515 by hand?
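To make the question concrete, here is my current understanding as a rough sketch, not a confirmed description of what metafor does internally. I assume the pooled estimate is an inverse-variance weighted mean in which the between-study variance tau^2 is added to each sampling variance, and that tau^2 can be found with a simple fixed-point iteration of the REML estimating equation (metafor itself uses a different numerical routine, so small differences are possible). The names vi, tau2, mu_hat and se_hat are my own.

```r
# Effect sizes (logFC) and standard errors for AT4G27670, as given above
yi  <- c(8.447936, 8.037122, 9.343436, 14.670613, 9.004982,
         9.756535, 6.466615, 12.855800, -7.905433, 9.404006)
sei <- c(0.3903574, 0.3014562, 0.5994067, 0.9482127, 1.1346424,
         0.2865533, 0.6341623, 0.6235811, 2.4409382, 1.4918650)
vi  <- sei^2  # sampling variances

# Fixed-point iteration for the REML estimate of tau^2
# (assumed scheme; not taken from the metafor source code)
tau2 <- 0
for (iter in 1:10000) {
  w   <- 1 / (vi + tau2)                 # random-effects weights
  mu  <- sum(w * yi) / sum(w)            # current pooled estimate
  new <- max(0, sum(w^2 * ((yi - mu)^2 - vi)) / sum(w^2) + 1 / sum(w))
  converged <- abs(new - tau2) < 1e-10
  tau2 <- new
  if (converged) break
}

# Pooled estimate and its standard error with the final tau^2
w      <- 1 / (vi + tau2)
mu_hat <- sum(w * yi) / sum(w)
se_hat <- sqrt(1 / sum(w))
c(tau2 = tau2, estimate = mu_hat, se = se_hat)
```

If this understanding is right, mu_hat and se_hat should come out close to the estimate and se columns of the rma() output above. Is that really all there is to it?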
The second part of the question: as mentioned before, I would also like to perform a combined p-value meta-analysis on the same data, which I have never done before. To get the combined p-value for each gene I used the sumlog function from the metap package. But how would you recommend I get the corresponding logFC values? Simply take the median (or even the mean) of all 10 logFC values from the individual DGE studies? I wasn't able to find any code examples. I only found the esc package, which sounded promising, but I think its calculations are similar to those done by the rma.uni function from the metafor package, which is why I would like to understand what is done there.
A (combined) p-value example for the same gene (AT4G27670) looks like this:
adjp <- c(3.648044e-07, 1.146830e-27, 3.941075e-12, 6.468057e-06, 3.904612e-05, 5.053005e-08, 5.761013e-07, 9.892212e-08, 3.053724e-02, 1.463310e-04)
sumlog(adjp)$p
3.906543e-66
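As far as I understand, sumlog implements Fisher's method: the statistic is -2 times the sum of the log p-values, compared against a chi-squared distribution with 2k degrees of freedom for k studies. A minimal base R sketch of that calculation (stat, df and p_fisher are my own variable names), which I would expect to agree with the sumlog result above:

```r
# Per-study p-values for AT4G27670, as given above
adjp <- c(3.648044e-07, 1.146830e-27, 3.941075e-12, 6.468057e-06,
          3.904612e-05, 5.053005e-08, 5.761013e-07, 9.892212e-08,
          3.053724e-02, 1.463310e-04)

# Fisher's method: -2 * sum(log(p)) follows a chi-squared distribution
# with 2 * k degrees of freedom under the null hypothesis
stat     <- -2 * sum(log(adjp))
df       <- 2 * length(adjp)
p_fisher <- pchisq(stat, df = df, lower.tail = FALSE)
p_fisher
```

Note that this only combines the p-values; it says nothing about the size or direction of the effect, which is why I am asking about the logFC separately.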
One approach I tried for calculating the combined logFC value was this formula (which is basically the one used in the FE model):
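Written as R code, and assuming the usual inverse-variance weighting of the fixed-effect model (w, fe_est and fe_se are just my own names), the calculation I mean is roughly:

```r
# Effect sizes (logFC) and standard errors for AT4G27670, as given above
yi  <- c(8.447936, 8.037122, 9.343436, 14.670613, 9.004982,
         9.756535, 6.466615, 12.855800, -7.905433, 9.404006)
sei <- c(0.3903574, 0.3014562, 0.5994067, 0.9482127, 1.1346424,
         0.2865533, 0.6341623, 0.6235811, 2.4409382, 1.4918650)

# Fixed-effect (inverse-variance) weights: precise studies count more
w <- 1 / sei^2

# Weighted mean of the per-study logFC values and its standard error
fe_est <- sum(w * yi) / sum(w)
fe_se  <- sqrt(1 / sum(w))
c(estimate = fe_est, se = fe_se)
```

Unlike the random-effects version, no between-study variance is added to the weights here, so the studies with the smallest standard errors dominate the result.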
It worked fine, but the results differ slightly from the pooled logFC values obtained with the REML procedure (the REM meta-analysis approach), which is, of course, what I expected. So would you consider this formula a valid way to obtain the logFC values that accompany the Fisher's method meta-analysis?