Dear all,
I am running WGCNA, and as the first step I tried to remove samples with too many missing values using the following code:
gsg = goodSamplesGenesMS(multiExpr, verbose = 3);
gsg$allOK
if (!gsg$allOK)
{
  # Print information about the removed genes:
  if (sum(!gsg$goodGenes) > 0)
    printFlush(paste("Removing genes:", paste(names(multiExpr[[1]]$data)[!gsg$goodGenes],
                                              collapse = ", ")))
  for (set in 1:exprSize$nSets)
  {
    if (sum(!gsg$goodSamples[[set]]))
      printFlush(paste("In set", setLabels[set], "removing samples",
                       paste(rownames(multiExpr[[set]]$data)[!gsg$goodSamples[[set]]], collapse = ", ")))
    # Remove the offending genes and samples
    multiExpr[[set]]$data = multiExpr[[set]]$data[gsg$goodSamples[[set]], gsg$goodGenes];
  }
  # Update exprSize
  exprSize = checkSets(multiExpr)
}
What cutoff does WGCNA use for missing data when deciding whether to remove a sample?
The WGCNA tutorial mentions that samples with too many missing values will be removed, but it does not state the exact cutoff that is used.
I would appreciate any help.
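To make the question concrete, the fraction of missing values per sample can be computed directly in base R. This is only a minimal sketch on a toy matrix: the name toyExpr and the threshold of 0.5 are illustrative assumptions, not the cutoff that WGCNA itself applies.

```r
# Toy expression matrix: rows are samples, columns are genes
# (the same layout as multiExpr[[set]]$data). Values are made up.
toyExpr <- matrix(c(1.2, NA,  3.1, 0.7,
                    NA,  NA,  NA,  2.4,
                    0.9, 1.1, 2.2, 1.8),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("S1", "S2", "S3"),
                                  c("g1", "g2", "g3", "g4")))

# Fraction of missing values in each sample (row).
fracMissing <- rowMeans(is.na(toyExpr))

# Hypothetical threshold, chosen only for illustration.
maxFracMissing <- 0.5

# Samples that would be flagged under this illustrative threshold.
flagged <- names(fracMissing)[fracMissing > maxFracMissing]

print(fracMissing)
print(flagged)
```

Comparing such per-sample fractions with the samples reported by goodSamplesGenesMS would show which level of missingness leads to removal in a given data set.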
Nazanin