Hi,
I'm trying to run WGCNA on my data. I have 3 columns:
gene1 expression_value_for_condition_1 expression_value_for_condition_2
I'm trying to follow the commands in the WGCNA tutorial to generate the heatmap and matrix, but I'm getting this error:
Error in goodGenes(datExpr, goodSamples, goodGenes, minFraction = minFraction, :
Too few genes with valid expression levels in the required number of samples.
further:
gsg = goodSamplesGenes(ecvssmc, verbose = 3)
Flagging genes and samples with too many missing values...
..step 1
Error in goodGenes(datExpr, goodSamples, goodGenes, minFraction = minFraction, :
Too few genes with valid expression levels in the required number of samples.
> gsg$allOK
Error: object 'gsg' not found
I have installed the gsg package and loaded it as well.
Is this a case of too few samples?
Why are you running WGCNA on two samples?
Goutham, are you the same guy who was with me at NCBS? I notice you've been making random comments on my posts. Your comment here is misleading and gives others the wrong impression of what my file is like, which is not very helpful. I intended this as a sample of the layout with those titles. Note that I said these are the columns, not the actual samples themselves.
The error is very much related to the number of samples. The error says "Too few genes with valid expression levels in the required number of samples." WGCNA is recommended to run on at least 15 samples, but you have only 2 samples. That is what I was trying to point out.
And gsg is an object you were trying to create, so there is no need to install a gsg package, which might introduce unknown naming conflicts.
OK, yes, I understand what you're trying to say. But like I mentioned, these are the column headers. I have more than 40 samples, so I was not sure why I was getting the error.
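As a small illustration of why gsg$allOK then fails with "object 'gsg' not found": in R, when the right-hand side of an assignment throws an error, the assignment never happens, so the object is simply never created. The sketch below uses a hypothetical stand-in function (failing_check, not part of WGCNA) and a hypothetical object name (gsg_demo) so that it runs in base R without any packages.

```r
# Hypothetical stand-in for a call that errors out, as goodSamplesGenes() did above.
failing_check <- function(x) {
  stop("Too few genes with valid expression levels in the required number of samples.")
}

# The assignment is abandoned when the right-hand side throws,
# so 'gsg_demo' is never created.
outcome <- tryCatch({
  gsg_demo <- failing_check(NULL)
  "assigned"
}, error = function(e) "error before assignment")

print(outcome)
print(exists("gsg_demo", inherits = FALSE))
```

So the second error is only a consequence of the first one; no extra package is involved.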
You need a header field for each sample, i.e. an m×n table as it is commonly called. Every column needs to be explicitly named after the sample from which the (normalized!) counts are derived.
Each of my columns has a header. Here is a sample of what my data looks like:
Edit: apologies, image attached.
What is your starting point? You performed counting using htseq-count or similar?
These are PCR Ct values.
I was under the impression that you had microarray intensities or RNA-seq counts. I'm not sure that qPCR Ct values are a valid data type for WGCNA, and would recommend contacting the WGCNA authors or using the Bioconductor forum.
Since all the fields just run into each other, it's not clear to me what your data looks like. Formatting is important.
Once you read the data into an R object, can you post the output of dim(object) to see if the data has loaded properly? It should not matter what the values are as long as they are numeric. It should work.
Random comments? He addresses exactly the issue you are facing.
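The dim() check suggested above could be extended along these lines. This is only a sketch in base R on a made-up toy table (the column names and values are invented, not taken from the poster's file). It also transposes the table, because goodSamplesGenes() expects samples in rows and genes in columns, which is why the WGCNA tutorial transposes its input; whether orientation is actually the cause of the error in this thread is an assumption.

```r
# Toy table standing in for the real file: the first column holds gene IDs,
# the remaining columns are samples. All values here are made up.
raw <- data.frame(
  gene    = c("gene1", "gene2", "gene3"),
  sample1 = c(21.5, 30.2, 25.1),
  sample2 = c(22.0, 29.8, NA),
  sample3 = c(20.9, 31.0, 24.7),
  sample4 = c(21.7, 30.5, 25.3),
  stringsAsFactors = FALSE
)

# 1. Did the table load with the expected shape (genes x (1 + samples))?
print(dim(raw))

# 2. Are all sample columns numeric? Non-numeric columns often point to a
#    header or separator problem when the file was read.
sample_cols <- setdiff(names(raw), "gene")
is_num <- vapply(raw[sample_cols], is.numeric, logical(1))
print(is_num)

# 3. Drop the ID column and transpose so that rows are samples and
#    columns are genes.
dat_expr <- as.data.frame(t(raw[sample_cols]))
names(dat_expr) <- raw$gene
print(dim(dat_expr))

# 4. Count missing values per gene; genes with many NAs are the ones
#    that a missing-value filter would flag.
print(colSums(is.na(dat_expr)))
```

If dim() on the real object shows the gene ID column still included, or far fewer rows than samples after transposing, that would be the first thing to fix before calling goodSamplesGenes().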