Hello everybody
Can anyone help me understand the MDS plot for two groups of samples, such as control vs. treated, that we generate from edgeR or DESeq2? I can see that the figure separates the control from the treated samples very well, but if the y-axis is the logFC, I don't get what the x-axis is. What is the x-axis in such a 2D figure?
Thanks
Thanks a lot, Kevin. Can you please give me an example of what dimensions 1 and 2 could be, just to get the idea better?
Thanks.
Hi! You can say that each dimension represents similarity / dissimilarity between your samples of interest, but that's effectively it. The dimensions are not directly measuring any parameter in your data. Euclidean distances are abstract and unitless, and in your situation they are an abstraction of the values supplied in your input data matrix, i.e., gene expression. You should see a very similar separation of your dataset by doing simple hierarchical clustering with Euclidean distance as the distance metric.
Think of it another way: given the expression levels of your input genes, how similar or dissimilar to each other are your samples, based on the expression of these genes?
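To make the distance idea concrete, here is a minimal Python sketch (not what edgeR or DESeq2 run internally). The sample names, gene names, and expression values are made up purely for illustration. It computes the Euclidean distance between every pair of samples across all genes, which is the kind of sample-to-sample dissimilarity that an MDS plot then tries to lay out in two dimensions.

```python
import math

# Hypothetical toy log-expression values (rows = genes, columns = samples).
# All names and numbers are invented for illustration only.
samples = ["ctrl1", "ctrl2", "trt1", "trt2"]
expr = [
    [5.0, 5.2, 8.1, 7.9],  # geneA: higher in treated
    [3.1, 2.9, 3.0, 3.2],  # geneB: roughly unchanged
    [7.0, 7.3, 4.1, 4.0],  # geneC: lower in treated
]

def euclidean(i, j):
    """Euclidean distance between sample columns i and j across all genes."""
    return math.sqrt(sum((row[i] - row[j]) ** 2 for row in expr))

# Full sample-by-sample distance matrix.
n = len(samples)
dist = [[euclidean(i, j) for j in range(n)] for i in range(n)]

for name, row in zip(samples, dist):
    print(name, [round(d, 2) for d in row])
```

In this toy matrix the two controls sit close to each other and far from the treated samples. That pattern of small within-group and large between-group distances is what shows up as separation on the plot, even though no single axis corresponds to a measured quantity.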
For downstream applications, I prefer to do PCA, as you can better quantify the similarities / dissimilarities through the component loadings on each axis / 'dimension' and thus infer which specific genes are responsible for segregation along a particular axis.
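As a rough illustration of what loadings tell you, here is a small pure-Python sketch on made-up data (the gene names, sample names, and values are hypothetical). It uses plain power iteration to approximate the first principal component. That is a stand-in chosen to keep the example dependency-free, not how a PCA package would normally compute it.

```python
import math

# Hypothetical toy log-expression values (rows = genes, columns = samples).
samples = ["ctrl1", "ctrl2", "trt1", "trt2"]
genes = ["geneA", "geneB", "geneC"]
expr = [
    [5.0, 5.2, 8.1, 7.9],  # geneA: higher in treated
    [3.1, 2.9, 3.0, 3.2],  # geneB: roughly unchanged
    [7.0, 7.3, 4.1, 4.0],  # geneC: lower in treated
]
n_genes, n_samples = len(genes), len(samples)

# Centre each gene at zero mean across samples.
centered = [[v - sum(row) / n_samples for v in row] for row in expr]

# Gene-by-gene covariance matrix.
cov = [
    [sum(centered[a][s] * centered[b][s] for s in range(n_samples)) / (n_samples - 1)
     for b in range(n_genes)]
    for a in range(n_genes)
]

# Power iteration: repeatedly multiply by cov and renormalise, which
# converges towards the leading eigenvector, i.e. the PC1 loadings.
loadings = [1.0, 0.0, 0.0]
for _ in range(200):
    new = [sum(cov[a][b] * loadings[b] for b in range(n_genes)) for a in range(n_genes)]
    norm = math.sqrt(sum(x * x for x in new))
    loadings = [x / norm for x in new]

# PC1 score of each sample: projection of its centred profile onto the loadings.
scores = [sum(loadings[g] * centered[g][s] for g in range(n_genes)) for s in range(n_samples)]

for gene, w in zip(genes, loadings):
    print(gene, "loading on PC1:", round(w, 3))
for name, sc in zip(samples, scores):
    print(name, "PC1 score:", round(sc, 3))
```

Here the controls and treated samples land on opposite sides of PC1, and the loadings for geneA and geneC are large in magnitude while the loading for geneB is close to zero. In other words, the loadings point at the genes driving the separation along that axis.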
I trust that this helps further.
Kevin
That's very helpful, thanks a lot, Kevin!
No problem - happy to have helped.