Hi!
I have a question. Can we select two specific disease types from a Large ExpressionSet (Microarray) object to make a plotMDS?
Normalized.data:
I have a data set of 558 microarray samples. The dataset is normalised and is called "normalized.data". One of its phenoData columns is "disease label:ch1", which contains 14 disease types. To list all the disease types I used the following code:
levels(factor(normalized.data$`disease label:ch1`))
Result:
[1] "ATYPICAL_PD" "CBD" "CONTROL" "DRD" "DRD-DYT5"
[6] "GENETIC_UNAFFECTED" "GPD" "HD" "HD_HD_BATCH" "IPD"
[11] "MSA" "PD_DEMENTIA" "PSP" "Vascular dementia"
My aim is to make a plotMDS chart using only the samples that belong to the CONTROL and IPD categories. I do not want the rest of the diseases to appear in the graph.
I have tried the following code but it does not work for me (I include all the code so that it is reproducible with the selected data):
library(GEOquery)  # getGEO, getGEOSuppFiles
library(affy)      # ReadAffy, rma
library(limma)     # plotMDS
GSE99039_RAW <- getGEO('GSE99039', GSEMatrix = TRUE)
varLabels(GSE99039_RAW[[1]])
getGEOSuppFiles("GSE99039")
untar("GSE99039/GSE99039_RAW.tar", exdir = 'data/')
gse <- ReadAffy(celfile.path = "data/", phenoData = phenoData(GSE99039_RAW[[1]]))
normalized.data <- affy::rma(gse)
plotMDS(exprs(normalized.data),
col = c("blue", "red")[as.factor(normalized.data$`disease label:ch1` == "CONTROL" & "IPD")], pch = 19)
Result:
Error in h(simpleError(msg, call)):
error in evaluating the argument 'x' in selecting a method for function 'as.factor': only operations for numeric, complex or logical variables are possible.
I can't get it to select only the CONTROL and IPD samples. Any idea? What is the correct way to select that particular data?
Thank you very much for your help,
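The error quoted above can be reproduced on a toy vector, which may make the cause easier to see. This is only a sketch: the `disease` vector below is made up and stands in for the real phenoData column. The point is that `&` cannot combine a logical vector with the character string "IPD", whereas `%in%` tests membership in a set of labels.

```r
# Toy stand-in for normalized.data$`disease label:ch1` (values are made up)
disease <- c("CONTROL", "IPD", "HD", "CONTROL", "MSA", "IPD")

# `==` returns a logical vector, but `&` cannot combine it with the
# character string "IPD"; tryCatch captures the resulting error message
bad <- tryCatch(disease == "CONTROL" & "IPD",
                error = function(e) conditionMessage(e))

# %in% returns one TRUE/FALSE per sample: TRUE for CONTROL or IPD
keep <- disease %in% c("CONTROL", "IPD")

# Colours for the retained samples only: blue = CONTROL, red = IPD
group <- factor(disease[keep], levels = c("CONTROL", "IPD"))
cols <- c("blue", "red")[group]

# With the real ExpressionSet the same idea would look roughly like this
# (not run here, since it needs the downloaded data):
# keep <- normalized.data$`disease label:ch1` %in% c("CONTROL", "IPD")
# sel  <- normalized.data[, keep]
# plotMDS(exprs(sel), col = cols, pch = 19)
```

Subsetting the samples before calling plotMDS, rather than only changing the colours, is what keeps the other disease types out of the plot.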
Yes, you should be able to perform ordination analyses on this type of data. However, to help us, please provide a minimal reproducible example of the data and the code for each step until the error. All we see is 1 R command, no data, and no error.
Here is a great post about how to write a reproducible example in R. Also, please use markdown to format your posts.
Thanks for the suggestion. I have modified the post to make it clearer. I hope it will help.
Cleaner, yes, but you haven't added any data to try and reproduce your error or a workaround which was the crux of my comment.
Regardless, an MDS plot is just a scatter plot. You can use the vegan::vegdist function to generate a dissimilarity matrix where you treat samples as sites and loci as observations. You can then run an MDS on that matrix (for example with cmdscale or vegan::metaMDS) and take the coordinates from the output object to plot in ggplot2. This is how I generally do it anyway, as ggplot offers so much more control.
The error seems to be related to the data types being fed to the plotMDS function, but it's unclear to me.
Sorry, now I understand what you meant. I have put the code exactly as I ran it, with the dataset I used. I don't quite understand how to do this with ggplot. Could you give me an example? Thank you very much for your help.
The workflow would be as follows:
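The original workflow did not survive in this thread, so what follows is only a minimal sketch of one way it could look, not the answerer's exact code. It runs on a simulated matrix standing in for exprs(normalized.data), and it uses base R's dist and cmdscale rather than vegan::vegdist so that it works without extra packages. The names expr, disease and mds are all made up for illustration. The ggplot2 step is skipped if ggplot2 is not installed.

```r
set.seed(1)

# Simulated stand-in for exprs(normalized.data): 100 probes x 12 samples
expr <- matrix(rnorm(100 * 12), nrow = 100,
               dimnames = list(paste0("probe", 1:100), paste0("s", 1:12)))
# Simulated stand-in for the "disease label:ch1" column
disease <- rep(c("CONTROL", "IPD", "HD"), each = 4)

# 1. Keep only the CONTROL and IPD samples (samples are columns)
keep <- disease %in% c("CONTROL", "IPD")
expr_sub <- expr[, keep]
group <- factor(disease[keep], levels = c("CONTROL", "IPD"))

# 2. Sample-to-sample Euclidean distances (transpose so samples are rows)
d <- dist(t(expr_sub))

# 3. Classical MDS, keeping two dimensions
coords <- cmdscale(d, k = 2)
mds <- data.frame(sample = rownames(coords),
                  MDS1 = coords[, 1],
                  MDS2 = coords[, 2],
                  group = group)

# 4. Scatter plot of the MDS coordinates, coloured by disease group
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(mds, ggplot2::aes(x = MDS1, y = MDS2, colour = group)) +
    ggplot2::geom_point(size = 3) +
    ggplot2::scale_colour_manual(values = c(CONTROL = "blue", IPD = "red"))
  print(p)
}
```

Note that limma's plotMDS computes its distances differently (from the leading log-fold-changes of the top genes), so the layout from this sketch will not match plotMDS exactly.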
Thank you for everything. I will try this workflow.