Hi,
I'm running WGCNA on my RNA-seq data, following the UCLA tutorial by the authors of the package, but I'm running into a small issue at the step that checks for excessive missing values and identifies outlier samples.
The script in the tutorial is
gsg = goodSamplesGenes(datExpr0, verbose = 3);
gsg$allOK
# if FALSE:
if (!gsg$allOK)
{
  # Optionally, print the gene and sample names that were removed:
  if (sum(!gsg$goodGenes)>0)
    printFlush(paste("Removing genes:", paste(names(datExpr0)[!gsg$goodGenes], collapse = ", ")));
  if (sum(!gsg$goodSamples)>0)
    printFlush(paste("Removing samples:", paste(rownames(datExpr0)[!gsg$goodSamples], collapse = ", ")));
  # Remove the offending genes and samples from the data:
  datExpr0 = datExpr0[gsg$goodSamples, gsg$goodGenes]
}
It prints the genes that are to be removed, but the end of the list contains a lot of "NA" columns that aren't in the original data (datExpr0 - I've checked it).
[...] LOC110812820, LOC110814046, LOC110812396, LOC110814102, LOC110814183, LOC110814197, LOC110814177, LOC110814128, LOC110814152, LOC110812490, LOC110814289, LOC110814288, LOC110814252, LOC110814293, LOC110814359, LOC110814411, LOC110814324, LOC110814352, LOC110814327, LOC110814437, LOC110814361, LOC110813173, LOC110814443, LOC110814441, LOC110814528, LOC110814501, LOC110814526, LOC110814495, LOC110814487, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
So, when I run the last line of the script, the one that actually removes them from the data, I get the following error:
Error in `[.data.frame`(datExpr0, gsg$goodSamples, gsg$goodGenes) :
undefined columns selected
I've found nothing online about this error, so does anybody have a clue about what might be going on?
Thanks in advance,
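One check that might help narrow this down: in base R, both symptoms (trailing NA names and "undefined columns selected") appear when a logical index is longer than the object it indexes. The sketch below uses a made-up toy data frame and a hand-built list standing in for the goodSamplesGenes() result, so it runs without WGCNA. The mismatch between gsg$goodGenes and ncol(datExpr0) is an assumption about what may be happening here, not a confirmed diagnosis.

```r
# Toy stand-in for datExpr0: 3 samples (rows) x 4 genes (columns).
datExpr0 <- data.frame(
  geneA = c(1, 2, 3),
  geneB = c(4, 5, 6),
  geneC = c(7, 8, 9),
  geneD = c(1, 1, 2),
  row.names = c("s1", "s2", "s3")
)

# Hypothetical gsg-like list (NOT produced by WGCNA): goodGenes has 6 flags
# although datExpr0 has only 4 columns, e.g. because datExpr0 was changed
# after the flags were computed.
gsg <- list(
  goodGenes   = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
  goodSamples = c(TRUE, TRUE, TRUE),
  allOK       = FALSE
)

# Check 1: do the flag vectors match the dimensions of the data?
lengthsMatch <- length(gsg$goodGenes) == ncol(datExpr0) &&
  length(gsg$goodSamples) == nrow(datExpr0)
print(lengthsMatch)

# Check 2: a logical index longer than names(datExpr0) yields NA for every
# TRUE position beyond the last column.
removedNames <- names(datExpr0)[!gsg$goodGenes]
print(removedNames)

# Check 3: subsetting with the over-long index fails; capture the message
# instead of stopping the script.
errMsg <- tryCatch({
  datExpr0[gsg$goodSamples, gsg$goodGenes]
  NA_character_
}, error = function(e) conditionMessage(e))
print(errMsg)
```

On the real data, comparing length(gsg$goodGenes) with ncol(datExpr0), and length(gsg$goodSamples) with nrow(datExpr0), right before the removal step would show whether this is the situation. If the lengths differ, re-running goodSamplesGenes() on the current datExpr0 is one thing to try.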
In which step do you have to load the test data?