Error with plotMDS in R
5 months ago
egascon ▴ 60

Hi!

I have a question. Can we select two specific disease types from a Large ExpressionSet (Microarray) object to make a plotMDS?

Normalized.data:

[Image: view of the normalized.data object]

I have a data set of 558 microarray samples. It has been normalised and is stored as "normalized.data". One of its phenoData variables is "disease label:ch1", which contains 14 disease types. To list all the disease types I used the following code:

levels(factor(normalized.data$`disease label:ch1`))

Result:

 [1] "ATYPICAL_PD"        "CBD"                "CONTROL"            "DRD"                "DRD-DYT5"          
 [6] "GENETIC_UNAFFECTED" "GPD"                "HD"                 "HD_HD_BATCH"        "IPD"               
[11] "MSA"                "PD_DEMENTIA"        "PSP"                "Vascular dementia" 

My aim is to make a plotMDS chart using only the samples that belong to the categories CONTROL and IPD. I do not want the other diseases to appear in the graph.

I have tried the following code but it does not work for me (I include all of it so that it is reproducible with the selected data):

library(GEOquery)   # getGEO(), getGEOSuppFiles()
library(affy)       # ReadAffy(), rma()
library(limma)      # plotMDS()

GSE99039 <- getGEO('GSE99039', GSEMatrix = TRUE)   # returns a list of ExpressionSets
varLabels(GSE99039[[1]])
getGEOSuppFiles("GSE99039")
untar("GSE99039/GSE99039_RAW.tar", exdir = 'data/')
gse <- ReadAffy(celfile.path = "data/", phenoData = phenoData(GSE99039[[1]]))
normalized.data <- affy::rma(gse)
# This is the call that fails:
plotMDS(exprs(normalized.data), 
            col = c("blue", "red")[as.factor(normalized.data$`disease label:ch1` == "CONTROL" & "IPD")], pch = 19)

Result:

Error in h(simpleError(msg, call)): 
  error in evaluating the argument 'x' in selecting a method for function 'as.factor': only operations for numeric, complex or logical variables are possible.

I can't get it to select only the CONTROL and IPD samples. Any idea? What is the correct way to select that particular data?

Thank you very much for your help,

plotMDS RStudio • 992 views

Yes, you should be able to perform ordination analyses on this type of data. However, to help us, please provide a minimal reproducible example of the data and the code for each step until the error. All we see is 1 R command, no data, and no error.

Here is a great post about how to write a reproducible example in R. Also, please use markdown to format your posts.


Thanks for the suggestion. I have modified the post to make it clearer. I hope it will help.


Cleaner, yes, but you haven't added any data to try and reproduce your error or a workaround which was the crux of my comment.

Regardless, an MDS plot is just a scatter plot. You can use the vegan::vegdist function to generate a dissimilarity matrix where you treat samples as sites and loci as observations. You can then take the MDS coordinates from the output object to plot in ggplot2. This is how I generally do it anyway as ggplot offers so much more control.

The error seems to be related to data types being fed to the plotMDS function, but it's unclear to me.


Sorry, now I understand. I have added the code exactly as I ran it, with the dataset I used. I don't quite understand how this would be done with ggplot. Could you give me an example? Thank you very much for your help.


The workflow would be as follows:

dists <- vegdist(data, method = "bray")   # data: samples in rows
nmds <- metaMDS(dists, k = 2)             # k (number of axes) is a metaMDS argument
stressplot(nmds)

## The MDS axis coordinates are in nmds$points (or scores(nmds)); use str(nmds) to inspect the object.
## The row order matches the input data, so you can append your metadata for colouring points.

ggplot(merged_meta_nmds, aes(x = mds1, y = mds2, colour = `disease label`)) +
    geom_point()

Thank you for everything. I will try this workflow.

5 months ago
rfran010 ★ 1.3k

If I remember correctly, R cannot handle the shortcut == "A" & "B"; each side of & needs its own full comparison.

I would try:

plotMDS(exprs(normalized.data), 
        col = c("blue", "red")[as.factor(normalized.data$`disease label:ch1` == "CONTROL" & normalized.data$`disease label:ch1` == "IPD")], pch = 19)

If this doesn't solve it, hopefully a different error message appears at least.
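As a rough illustration of where the original error most likely comes from, here is a small base R sketch. The `labels` vector is a hypothetical stand-in for normalized.data$`disease label:ch1`. Because == binds more tightly than &, the original expression is parsed as (labels == "CONTROL") & "IPD", and & does not accept a bare character string as an operand.

```r
# Hypothetical labels standing in for normalized.data$`disease label:ch1`
labels <- c("CONTROL", "IPD", "MSA", "CONTROL", "PSP")

# Left-hand side on its own is a valid logical vector
lhs <- labels == "CONTROL"
print(lhs)

# Right-hand side is just a string, not a comparison
print(is.character("IPD"))

# Combining them with & raises an error; capture the message instead of stopping
res <- tryCatch(
  labels == "CONTROL" & "IPD",
  error = function(e) conditionMessage(e)
)
print(res)
```

The captured message should correspond to the "only operations for numeric, complex or logical variables" error reported in the question.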


Hi,

I tried your code and no error occurred but, in the graph, all samples appear. They are not filtered.

Result:

[Image: the resulting MDS plot, with all samples shown]

Is it possible that this can't be done?


My guess is you need to subset the data argument:
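A minimal sketch of that idea on simulated data: `expr` and `disease` are hypothetical stand-ins for exprs(normalized.data) and normalized.data$`disease label:ch1`. Both the expression matrix and the colour vector are restricted to the CONTROL and IPD samples, so the other samples never reach plotMDS. The limma call is guarded so the sketch still runs where limma is not installed.

```r
set.seed(1)

# Hypothetical stand-ins for the real objects (genes in rows, samples in columns)
disease <- rep(c("CONTROL", "IPD", "MSA", "PSP"), each = 5)
expr <- matrix(
  rnorm(200 * length(disease)), nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("s", seq_along(disease)))
)

# Logical index: TRUE for the samples we want to keep
keep <- disease %in% c("CONTROL", "IPD")

# Subset the columns (samples) and build colours from the same subset
expr_sub  <- expr[, keep]
group     <- factor(disease[keep])          # levels: CONTROL, IPD
point_col <- c("blue", "red")[group]        # CONTROL -> blue, IPD -> red

# Plot only if limma is available
if (requireNamespace("limma", quietly = TRUE)) {
  limma::plotMDS(expr_sub, col = point_col, pch = 19)
}
```

With the real object, the equivalent step would presumably be sub <- normalized.data[, keep] followed by plotMDS(exprs(sub), ...), since an ExpressionSet is indexed as [features, samples].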


I have thought about it. I'm surprised there isn't a simpler way to do that analysis. I thought about taking the normalised matrix, converting it to a data frame and creating a subset of the data with dplyr, selecting only CONTROL and IPD. But my question is: does plotMDS accept a data frame as input? Or do I have to use the exprs() function?


It should allow a data frame. If it specifically needs a matrix, then you can probably use as.matrix().

Also, not sure if you came across this post: "Bioconductor, how to select a subset of samples in an ExpressionSet?"


Also, I think when you subset, you want to use an or operator instead of the and operator, because it looks like you are filtering the same field.
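To make the difference concrete, here is a small base R sketch comparing the two operators on a hypothetical `labels` vector (a stand-in for the real disease column). A single label can never equal both values at once, so the "and" version selects nothing, while the "or" version and %in% select the same samples.

```r
# Hypothetical labels standing in for normalized.data$`disease label:ch1`
labels <- c("CONTROL", "IPD", "MSA", "CONTROL", "PSP")

# AND: a label cannot be both CONTROL and IPD, so every element is FALSE
with_and <- labels == "CONTROL" & labels == "IPD"

# OR: TRUE wherever the label is either CONTROL or IPD
with_or <- labels == "CONTROL" | labels == "IPD"

# %in% is a compact way to write the same OR test
with_in <- labels %in% c("CONTROL", "IPD")

print(with_and)
print(with_or)
print(identical(with_or, with_in))
```

One difference worth keeping in mind: if the column contains NA values, == returns NA for them whereas %in% returns FALSE, so %in% is usually the safer choice for building a subsetting index.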


Thank you for everything. I will report back with the results.
