13 months ago
ali
Hello everyone,
I am writing to inquire about my PCA plot following the meta-analysis of three PCOS studies. I am concerned about the proximity of some normal samples to the PCOS samples in my PCA plot. I would like to understand whether this proximity could affect my differentially expressed gene (DEG) results. Should I consider omitting certain samples or taking any other actions to address this issue?
Thank you.
Please provide the PCA plot in the post. We cannot comment without seeing it.
Yes, you're right, thanks. I forgot; I am editing the post now.
Just to confirm, are the points samples taken from the three studies? What did you use to plot the PCA: is this the overall gene expression profile? And how much variation are the PCs explaining?
Are the samples taken from the same or similar treatments among studies?
Yes, I used RMA-normalized gene expression data, and my data are from microarrays. After batch effect removal, I plotted the PCA from the normalized matrix. "And how much variation are the PCs explaining?" Sorry, I don't understand this question.
The eigenvalue of a principal component represents how much of the total variation is explained by that one axis; dividing it by the sum of all eigenvalues gives the proportion of variance explained. So you usually get a rough idea of how much of the variation the PC on the x-axis and the PC on the y-axis are explaining. It may be that PC1 and PC2 aren't explaining very much variation and there is nothing to worry about. See here for a quick explanation.
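To make the "variation explained" idea concrete, here is a minimal sketch in Python with NumPy. It is not the original poster's pipeline: the toy matrix and the function name `explained_variance_ratio` are assumptions for illustration, and the input is assumed to be a samples x genes matrix of normalized expression values.

```python
import numpy as np

def explained_variance_ratio(expr):
    """Fraction of total variance captured by each PC (expr: samples x genes)."""
    centered = expr - expr.mean(axis=0)  # center every gene at zero
    # Squared singular values of the centered matrix, divided by (n - 1),
    # are the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2 / (expr.shape[0] - 1)
    return eigenvalues / eigenvalues.sum()

# Hypothetical toy data: 12 samples, 50 genes
rng = np.random.default_rng(0)
toy = rng.normal(size=(12, 50))
ratios = explained_variance_ratio(toy)
print("PC1 and PC2 explain:", ratios[:2])
```

If the first two ratios are small (say, a few percent each), the two-dimensional plot is showing only a small slice of the data, and apparent overlap between groups on it says little about the rest.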
If the samples represent the entire expression profile of each individual then this is about what I'd expect. I don't work in human health, but I wouldn't expect global expression to change except for more advanced and damaging illnesses. I would have thought you're more likely to see more nuanced changes in gene expression. Say a handful of transcripts being differentially expressed. Do you have any positive controls (i.e., genes you know should be differentially expressed)?
How was this PCA created? Did you correct for batch effects? And have you looked at the PCA for the independent datasets?
Yes, I corrected for batch effects with ComBat and it was okay. But I did not look at the PCA for the independent datasets!
Try coloring the same PCA plot by study; maybe there is still a strong technical effect even after correction. In addition, we do not know what kind of data you applied PCA to (I guess normalized gene expression, but from which technique and which samples?). Did you apply PCA to the whole gene set? Did you try removing the least variable genes? There are a lot of variables that can influence a PCA plot, and you need to give more details.
This plot was drawn using RMA-normalized data after batch effect correction. It is worth noting that three different studies were used for my meta-analysis: two Affymetrix datasets on GPL570 and one on a different chip. All of these data concern PCOS, but they were obtained from two different cell types (granulosa cells and cumulus cells). I think this variation might be related to the two different cell types.
In addition, I performed PCA on all of the normalized genes.
Speaking from personal experience working on oocytes: granulosa cells and cumulus cells show different transcriptomic profiles and react differently to different pathologies of the reproductive system. I strongly recommend against performing a meta-analysis that includes different tissues.
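The two checks suggested above (keeping only the most variable genes, then looking at the PCA grouped by study) could be sketched roughly as follows in Python with NumPy. Everything here is illustrative: the simulated matrix, the study labels, the shift added to study "C", and the helper names `top_variable_genes` and `pca_scores` are assumptions, not the poster's data or code.

```python
import numpy as np

def top_variable_genes(expr, n_keep):
    """Keep the n_keep genes (columns) with the highest variance across samples."""
    variances = expr.var(axis=0, ddof=1)
    keep = np.argsort(variances)[::-1][:n_keep]
    return expr[:, keep]

def pca_scores(expr, n_components=2):
    """Sample coordinates on the first PCs, from the SVD of the centered matrix."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical data: 9 samples x 200 genes from three studies
rng = np.random.default_rng(1)
expr = rng.normal(size=(9, 200))
study = np.array(["A"] * 3 + ["B"] * 3 + ["C"] * 3)
expr[study == "C", :20] += 3.0  # simulated leftover batch shift in study C

scores = pca_scores(top_variable_genes(expr, 100))
for name in np.unique(study):
    # A study whose centroid sits far from the others hints at residual batch effect
    print(name, scores[study == name].mean(axis=0))
```

In a real analysis the same `scores` would be plotted twice, once colored by study and once by condition; if samples group by study rather than by condition, the batch correction has probably not removed the technical effect.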
Those values are very small...and why is it not centered around 0 for PC1? I suspect that you did something wrong. But I also agree with the poster below that I'm not sure you should necessarily expect drastically different expression.
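One possible reason for PC1 not being centered around 0 is that the genes were not mean-centered before the decomposition. The following is only a sketch of that one scenario in Python with NumPy, using simulated log2-like values; it does not diagnose the plot in this thread, and the function name `pc1_scores` is made up for illustration.

```python
import numpy as np

def pc1_scores(matrix, center):
    """PC1 coordinates of each sample, with or without gene-wise centering."""
    data = matrix - matrix.mean(axis=0) if center else matrix
    u, s, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, 0] * s[0]

# Hypothetical log2-scale expression: 10 samples x 30 genes, values around 8
rng = np.random.default_rng(2)
expr = rng.normal(loc=8.0, scale=1.0, size=(10, 30))

pc1_centered = pc1_scores(expr, center=True)
pc1_raw = pc1_scores(expr, center=False)

# With centering, the scores average to zero (up to floating-point error);
# without it, PC1 mostly captures the overall expression level.
print(abs(pc1_centered.mean()), abs(pc1_raw.mean()))
```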
Thanks for your valuable responses.