5.7 years ago
Morteza Hadizadeh
Hi everyone,
I have two GSE accessions, one on breast cancer in men and the other on breast cancer in women. How do I find the genes that differ between men and women from the two combined datasets?
I would be grateful if you could guide me.
Your post is lacking in information. Which GSEs? Are they the same microarray platform? Are the study designs similar (apart from the fact that one relates to male breast cancer, while the other female)?
How do you believe you should conduct such an analysis? Of which sources of bias should you make note?
Dear Dr. Blighe,
Both GSEs (GSE50512 and GSE22133) are microarray datasets. GSE50512 uses one platform (SWEGENE_BAC_32K_Full) and GSE22133 uses three (SWEGENE_BAC_32K_Full, SWEGENE H_v2.1.1 55K and SWEGENE_BAC_33K_Full). Male and female cancer cells responded differently when exposed to a nanoparticle, so we want to find the genes that differ between men and women, which may give us a reason for this difference in behavior.
Well, at least they are all of a similar array type (judging by their names).
I would process and normalise each study independently to produce normalised, log2 expression values. After that, I would standardise these to the Z-scale and then 'unify' (combine) the datasets into a single dataset based on genes that match between them. This is not perfect, though, and batch effects will likely still exist.
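The Z-scaling and 'unify' steps described above can be sketched roughly as follows. This is only a minimal illustration, assuming each study has already been normalised to log2 values and that scaling is done per gene within each study (the reply does not say whether scaling is per gene or per sample). The gene names, values, and the helper names `zscore_rows` and `unify` are hypothetical.

```python
from statistics import mean, stdev

def zscore_rows(expr):
    """Z-scale each gene within one study.

    expr maps gene name -> list of log2 values (one per sample).
    Genes with zero variance are dropped because they cannot be scaled.
    """
    scaled = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation
        if sd == 0:
            continue
        scaled[gene] = [(v - mu) / sd for v in values]
    return scaled

def unify(a, b):
    """Keep only genes present in both studies and concatenate their samples."""
    shared = sorted(set(a) & set(b))
    return {gene: a[gene] + b[gene] for gene in shared}

# Hypothetical log2 values, for illustration only.
study_a = {
    "GENE1": [5.0, 6.0, 7.0],
    "GENE2": [2.0, 4.0, 6.0],
    "GENE3": [1.0, 1.5, 2.0],
}
study_b = {
    "GENE1": [8.0, 10.0],
    "GENE2": [3.0, 3.5],
    "GENE4": [9.0, 9.5],
}

combined = unify(zscore_rows(study_a), zscore_rows(study_b))
for gene, values in combined.items():
    print(gene, [round(v, 3) for v in values])
```

Note that per-gene scaling within each study centres every gene at zero in both studies, so any difference in mean level between the studies is removed along with the batch offset.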
Dear Dr. Blighe,
Thank you so much for your valuable guidance and I would appreciate your immediate attention to this matter.
There is nothing to which to attend. The experimental set-up is questionable. If you tried to publish, likely questions would be asked that relate to the different array types, with possible rejection of the work.
Relating to Wouter's comment, the main issue is that all of your male breast cancer samples are in just 1 study, GSE50512, so, there is some level of confounding. I suggest that you process the datasets independently and then compare the results between male and female. You could think of performing some form of a meta-analysis across the studies, for example.
Thank you for taking the trouble to help me. I do appreciate it.
In each study, all samples are from patients (there is no control group), so I can't process the datasets independently. I am a beginner in meta-analysis. The result of a meta-analysis is the set of common genes, isn't it?
How did the authors process their data if there are no controls? It is possible, but you may want to look at the published manuscript(s) associated with the studies.
Yes, generally, a meta-analysis is about comparing the results from two independent studies in order to see if the results are similar.
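As a very rough illustration of "seeing if the results are similar", the sketch below compares two lists of significant genes, one per study. It assumes each study has already produced such a list from its own analysis. The gene names and the helper name `compare_gene_lists` are hypothetical, and a formal meta-analysis would normally combine effect sizes or p-values rather than only counting overlap.

```python
def compare_gene_lists(study1_hits, study2_hits):
    """Summarise the overlap between two lists of significant genes."""
    s1, s2 = set(study1_hits), set(study2_hits)
    shared = s1 & s2
    union = s1 | s2
    return {
        "shared": sorted(shared),
        "only_study1": sorted(s1 - s2),
        "only_study2": sorted(s2 - s1),
        # Jaccard index: shared genes as a fraction of all genes reported.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Hypothetical hit lists, for illustration only.
hits_study1 = ["GENE_A", "GENE_B", "GENE_C"]
hits_study2 = ["GENE_B", "GENE_C", "GENE_D"]

summary = compare_gene_lists(hits_study1, hits_study2)
print(summary)
```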
As a rule of thumb: no, you cannot compare across two platforms. The only valid comparison is between samples run on the same platform, in the same batch, by the same core/technician, at the same time, with samples treated exactly the same prior to the experiment.
If any of these parameters differs, you cannot distinguish technical artifacts from biological signal. Based on your description, your phenotype of interest is confounded with a technical difference.
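One way to see the confounding before running any analysis is to cross-tabulate the technical grouping against the phenotype. The sketch below is a minimal check under assumed inputs: the sample records, the field names `batch` and `sex`, and the helper names are hypothetical. If every batch contains only one sex, the sex effect cannot be separated from the batch effect.

```python
from collections import Counter

def crosstab(samples):
    """Count samples per (batch, sex) combination."""
    return Counter((s["batch"], s["sex"]) for s in samples)

def sex_is_confounded_with_batch(samples):
    """True when no batch contains more than one sex."""
    sexes_per_batch = {}
    for s in samples:
        sexes_per_batch.setdefault(s["batch"], set()).add(s["sex"])
    return all(len(sexes) == 1 for sexes in sexes_per_batch.values())

# Hypothetical design resembling the one described: all males in one study.
confounded_design = (
    [{"batch": "study_1", "sex": "M"}] * 3
    + [{"batch": "study_2", "sex": "F"}] * 3
)

# Hypothetical design where both sexes appear in each batch.
crossed_design = [
    {"batch": "study_1", "sex": "M"},
    {"batch": "study_1", "sex": "F"},
    {"batch": "study_2", "sex": "M"},
    {"batch": "study_2", "sex": "F"},
]

print(crosstab(confounded_design))
print(sex_is_confounded_with_batch(confounded_design))
print(sex_is_confounded_with_batch(crossed_design))
```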
Dear Dr. DeCoster,
Thank you so much for your valuable guidance. As you said, the technical difference should not be ignored.
But you are ignoring them anyway... See: Remove batch effects in meta-analysis
You cannot correct batch effects if they are confounded by your biological effect.
I really appreciate your concern. My question about meta-analysis was only for learning purposes.