Hi folks!
I've got two questions about including surrogate variables in expression analyses. In the GTEx documentation, they say:
"A set of covariates identified using the Probabilistic Estimation of Expression Residuals (PEER) method (Stegle et al., PLoS Comp. Biol., 2010 ), calculated for the normalized expression matrices (described below). For eQTL analyses, the number of PEER factors was determined as function of sample size (N): 15 factors for N<150, 30 factors for 150≤ N<250, 45 factors for 250≤ N<350, and 60 factors for N≥350, as a result of optimizing for the number of eGenes discovered. For sQTL analyses, 15 PEER factors were computed for each tissue."
Source: https://www.gtexportal.org/home/documentationPage#staticTextAnalysisMethods
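For reference, the sample-size rule quoted above can be written as a small lookup. This is only a sketch of the thresholds as stated in the GTEx text; the function name `gtex_peer_factors` is hypothetical and does not come from any GTEx or PEER package.

```r
# Hypothetical helper (not part of GTEx or PEER code): number of PEER factors
# the GTEx documentation reports for eQTL analyses at sample size n.
gtex_peer_factors <- function(n) {
  stopifnot(is.numeric(n), all(n > 0), all(is.finite(n)))
  # findInterval uses left-closed intervals: [0,150), [150,250), [250,350), [350,Inf)
  breaks <- c(0, 150, 250, 350, Inf)
  factors <- c(15, 30, 45, 60)
  factors[findInterval(n, breaks)]
}

gtex_peer_factors(c(100, 150, 249, 350, 500))
# [1] 15 30 30 60 60
```

By this rule, a dataset with 500+ samples falls in the top bracket (60 factors).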
What I am wondering is: isn't it excessive to include 60 PEER factors (covariates), even if sample size is > 350?
My second question (not related to PEER but to surrogate variables in general) is this: I ran sva's num.sv function to estimate how many surrogate variables I should include in my analyses (to see whether it would come out anywhere near the GTEx numbers), based on my model of interest and data. Using the leek method, it comes up with 573, whereas with the 'be' method, only 1. So even though I've got 500+ samples, I am thinking of including only 1, as the be method supports this. Do you think this is sensible?
library(sva)
library(DESeq2)
# Normalized counts from DESeq2's dds object, to pass to sva
edata <- counts(dds, normalized = TRUE)
# Full model: known covariates plus the variable of interest
mod1 <- model.matrix(~ Institution + RIN + Sex + PMI_hrs + Age_bins
                     + PC1 + PC2 + PC3 + PC4 + PC5 + Disease_status, data = pd)
# Null model (not used by num.sv; sva() would need it later)
mod0 <- model.matrix(~ 1, data = pd)
n.sv.leek <- num.sv(edata, mod1, method = "leek") # 573
n.sv.be <- num.sv(edata, mod1, method = "be") # 1
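One thing that may be worth checking, offered as an assumption rather than a diagnosis: the code above passes untransformed normalized counts to num.sv, whereas, as far as I understand it, sva's count-oriented function svaseq applies a log transform first. The sketch below uses simulated data only (every object name in it is made up for illustration) to show how both methods could be run on filtered, log2-transformed counts for comparison.

```r
library(sva)

set.seed(42)
n_genes <- 2000
n_samples <- 60

# Simulated design: one known covariate (group) and one hidden batch effect
group <- factor(rep(c("control", "case"), each = n_samples / 2))
hidden_batch <- rep(c(0, 1), times = n_samples / 2)

# Negative-binomial counts; the first 400 genes respond to the hidden batch
base_mean <- rgamma(n_genes, shape = 2, scale = 100)
batch_effect <- c(rep(1, 400), rep(0, n_genes - 400))
log_mu <- outer(log(base_mean), rep(1, n_samples)) + outer(batch_effect, hidden_batch)
counts_sim <- matrix(rnbinom(n_genes * n_samples, mu = exp(log_mu), size = 10),
                     nrow = n_genes)

# Drop low-count genes, then log-transform before estimating the number of SVs
keep <- rowMeans(counts_sim) > 10
log_expr <- log2(counts_sim[keep, ] + 1)

# Model containing only the known covariate
mod <- model.matrix(~ group)

n_be <- num.sv(log_expr, mod, method = "be")
n_leek <- num.sv(log_expr, mod, method = "leek")
print(c(be = n_be, leek = n_leek))
```

On real data, the same two calls could be repeated on the raw and on the log-transformed matrix to see whether the gap between the two estimates narrows; no particular outcome is implied here.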
What are PC1-5?
Hi Asaf! Sorry I didn't clarify: those are the first 5 population covariates (principal components) calculated in PLINK.
The discrepancy between be and leek is, at face value, worrying. However, I am not too familiar with PEER's usage. You could try to contact Oliver [Stegle]. I presented at a conference with him in Milan (Italy) back in 2014 and he is good-natured.