Hi there,
I'm trying to find co-methylated sites using WGCNA. Methylation status was assessed with the Illumina EPIC microarray. When I run WGCNA's pickSoftThreshold() on 810135 sites (after some common preprocessing steps, such as filtering out probes with a bad detection P value), the server kills the process and reports: task 1 failed - "cannot allocate vector of size 30.2Gb".
The code for pickSoftThreshold() is:
sft1 <- pickSoftThreshold(resids2, blockSize = 5000, powerVector = c(seq(4, 10, by = 1), seq(12, 20, by = 2)), networkType = "signed", verbose = 5)
Memory of my server: 220 GiB
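For what it's worth, the 30.2 Gb in the error seems to match a single dense matrix of doubles with one row per probe and one column per probe in the block. This is only back-of-the-envelope arithmetic, assuming 8 bytes per value and that the function builds an all-probes-by-block correlation matrix; the variable names are my own. If several parallel workers each allocate such a matrix, the total could be several times larger.

```r
# Rough memory estimate for one block, assuming an
# (all probes) x (block size) matrix of 8-byte doubles.
n_probes         <- 810135  # probes remaining after filtering
block_size       <- 5000    # blockSize passed to pickSoftThreshold()
bytes_per_double <- 8

block_gib <- n_probes * block_size * bytes_per_double / 1024^3
round(block_gib, 1)  # 30.2, the size reported in the error

# Same estimate for blockSize = 2000, plus the projected run time
# at the 12 minutes per block I measured.
block_gib_2000 <- n_probes * 2000 * bytes_per_double / 1024^3
n_blocks       <- ceiling(n_probes / 2000)
total_minutes  <- n_blocks * 12
total_days     <- total_minutes / (60 * 24)
```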
I have also tested a smaller block size, such as 2000, in this step. It takes about 12 min per block, so roughly 4800 min (more than 3 days) for the whole data set. Since this step takes so much time and memory, how do you deal with it? Maybe with some basic filters?
Or can I pick the soft threshold on data subsets? One study noted that they obtained the power from the scale-free topology criterion calculated on data subsets. However, they didn't give any details about how they selected the subsets (title: Mosaic Epigenetic Dysregulation of Ectodermal Cells in Autism Spectrum Disorder).
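To make the subset idea concrete, here is a minimal sketch of what I have in mind. It is not the method from that paper, which does not describe its selection. The matrix `resids_demo` is simulated data standing in for `resids2` (samples in rows, probes in columns), and `n_keep` is a hypothetical subset size. It shows two possible ways to choose probes, at random or by highest variance, and only calls pickSoftThreshold() if WGCNA is installed.

```r
set.seed(1)

# Simulated stand-in for resids2: 40 samples (rows) x 2000 probes (columns).
resids_demo <- matrix(
  rnorm(40 * 2000), nrow = 40,
  dimnames = list(NULL, paste0("cg", seq_len(2000)))
)

n_keep <- 500  # hypothetical subset size, not taken from any publication

# Option A: a random subset of probes.
random_idx <- sample(ncol(resids_demo), n_keep)

# Option B: the n_keep most variable probes.
probe_var <- apply(resids_demo, 2, var)
top_idx   <- order(probe_var, decreasing = TRUE)[seq_len(n_keep)]

subset_random <- resids_demo[, random_idx, drop = FALSE]
subset_topvar <- resids_demo[, top_idx, drop = FALSE]

# Run the scale-free topology fit on the much smaller matrix.
if (requireNamespace("WGCNA", quietly = TRUE)) {
  sft_sub <- WGCNA::pickSoftThreshold(
    subset_topvar,
    powerVector = c(4:10, seq(12, 20, by = 2)),
    networkType = "signed",
    verbose = 0
  )
  print(sft_sub$fitIndices[, c("Power", "SFT.R.sq", "mean.k.")])
}
```

Repeating Option A over several random draws and checking whether the fit curves agree would be one way to see if a subset-based power is stable, but I don't know whether that is considered acceptable practice.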
Any suggestions?
I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post, that button is in your toolbar.