I have also asked this question on stackoverflow: https://stackoverflow.com/q/64187464/14385969
Basically, I am a student in a Bioinformatics class that is using R to determine the top 10 differentially expressed genes and construct a heatmap of our preferred dataset. From the GEO database I chose GSE117588. Our prof spoon-feeds us code to punch into R, and I run into trouble when inputting the following commands:
group <- c(rep("G1",3), rep("G2",3))
counts <- data1
cds <- DGEList( counts , group)
names(cds)
head(cds$counts) # original count matrix
cds$samples # contains a summary of your samples
sum(cds$all.zeros) # How many genes have 0 counts across all samples
cds <- calcNormFactors(cds, method="upperquartile")
cds$samples
In response to cds <- DGEList( counts , group)
it returns the error message 'lib.size' must be numeric. I have downloaded edgeR and am not sure what to change to debug this. I have tried to troubleshoot using a similar question here, where you add row.names=1
to work around the problems R has with the gene IDs (which I'm assuming are the non-numeric part of my dataset that it is having trouble with). Please let me know! I'm new to bioinformatics and coding and am eager to learn how to solve and troubleshoot my assignments better.
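
One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: the second positional argument of DGEList() is lib.size, not group, so DGEList( counts , group) hands the "G1"/"G2" labels to lib.size, which would explain a complaint that it is not numeric. Naming the arguments avoids this. The toy table below (raw, its gene IDs, and its sample columns S1 to S6) is made up to stand in for the real GSE117588 counts, and it assumes the first column holds the gene IDs. Moving that column into the row names has the same effect as passing row.names=1 to read.table() or read.csv() when the file is read.

```r
library(edgeR)

# Hypothetical toy data standing in for the real count table (an assumption):
# the first column holds gene IDs, the remaining six columns hold counts.
raw <- data.frame(
  GeneID = c("geneA", "geneB", "geneC", "geneD"),
  S1 = c(10, 0, 5, 200),
  S2 = c(12, 0, 7, 180),
  S3 = c(9, 0, 4, 210),
  S4 = c(30, 0, 6, 90),
  S5 = c(28, 0, 5, 100),
  S6 = c(35, 0, 8, 95)
)

# Move the gene IDs into the row names so that only numeric columns remain
counts <- as.matrix(raw[, -1])
rownames(counts) <- raw$GeneID
stopifnot(is.numeric(counts))  # fails early if a text column is still present

group <- c(rep("G1", 3), rep("G2", 3))

# Name the arguments: passed positionally, `group` would land in lib.size
cds <- DGEList(counts = counts, group = group)

cds$samples                      # group, lib.size and norm.factors per sample
sum(rowSums(cds$counts) == 0)    # genes with zero counts across all samples
```

If data1 in the original code still contains the gene ID column, is.numeric() on the matrix will be FALSE, which is worth checking before calling DGEList().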