Hello guys,
I'm currently mining 21 microarrays of tumor samples taken after chemotherapy. The response of the tumor to the chemotherapy is graded as Good or Poor depending on the percentage of tumor necrosis. There are 10 "Good" and 11 "Poor" samples. I want to find the differentially expressed genes (DEGs) behind the "good" or "poor" response.
However, after processing the data with the rma function of the affy package in R and testing for DEGs with SAM, no positive genes showed up. The SAM plot looks like this (please see fig.): ![SAM plot][1] [1]: https://ibb.co/nsyV35 There are actually over 600 DEGs with |log1.5(FC)| > 1 if the FDR is ignored. But once the FDR is taken into account, none of them is below 0.05. T_T
I would really appreciate it if someone could tell me what makes a SAM plot look like that. Biological replication? Individual bias? Is there anything I can go back and redo?
Did you try limma instead of SAM? I think limma is kind of the standard tool to use for DEG analysis (not SAM).
The thing is, besides using SAM, I also processed the data in Excel to calculate the averages, fold changes and p-values. For the FDR I used q-value(i) = p(i)*length(p)/rank(p), but no FDR was under 0.05. I don't think changing the algorithm can make such a big difference. But I will try, since I can learn to use limma along the way :)
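For reference, the formula quoted above, p(i)*length(p)/rank(p), is the raw Benjamini-Hochberg scaling. Here is a minimal Python sketch of it, assuming plain lists of p-values; the function name `bh_adjust` and the toy p-values are made up for illustration. The usual procedure also takes a running minimum from the largest p-value downwards so that the adjusted values never decrease with rank, which the spreadsheet formula as written does not do.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values) for a list of p-values."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        raw = pvalues[i] * m / rank          # the p * length(p) / rank(p) step
        running_min = min(running_min, raw)  # enforce monotonicity
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy input: the raw formula alone would give 0.03 for the
# first value, which is larger than the 0.0165 of the second one.
toy = [0.01, 0.011, 0.5]
print(bh_adjust(toy))
```

The running minimum can only lower adjusted values, so if the raw formula already gives nothing below 0.05, adding this step will not rescue any gene either.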
Changing the algorithm can make a HUGE difference.
When this happens you should try with other tools as @b.nota suggested. Some suggestions:
Limma: https://bioconductor.org/packages/release/bioc/html/limma.html
DESeq2: https://bioconductor.org/packages/release/bioc/html/DESeq2.html
edgeR: https://bioconductor.org/packages/release/bioc/html/edgeR.html
With all those replicates, I strongly suspect that your results are skewed by some bias, so perhaps a different algorithm will reveal it.
Besides this, what is the log fold change threshold you're testing against?
OP has microarrays, so only limma is applicable (DESeq2 and edgeR are designed for RNA-seq count data). Another viable option is cyberT: https://www.ncbi.nlm.nih.gov/pubmed/22600740
Thanks @b.nota and Macspider. When the cutoff is log1.5, there are 800 DEGs with p<0.05 but FDR>0.05. If the cutoff were set to log2(FC)>1, there would be 300 DEGs, still with FDR larger than 0.05.
About the SAM plot, do you happen to know how to interpret this plot?
IMO, a SAM plot like that occurs because your data almost perfectly fits the null hypothesis. Could you elaborate a bit more though: are you saying your samples were resected after chemo, or that they were resected prior to chemo and the response to treatment was followed up?
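To illustrate what "fits the null hypothesis" looks like in numbers, here is a rough Python sketch, not an analysis of the actual dataset. It assumes a hypothetical array with 20000 probes and no real group difference at all, so the p-values are simply uniform random numbers; `count_bh_rejections` is a made-up helper that applies the Benjamini-Hochberg step-up rule.

```python
import random


def count_bh_rejections(pvalues, alpha=0.05):
    """Number of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    k_max = 0
    # Find the largest rank k whose sorted p-value is <= k * alpha / m.
    for k, p in enumerate(sorted(pvalues), start=1):
        if p <= k * alpha / m:
            k_max = k
    return k_max


rng = random.Random(0)   # fixed seed so the run is repeatable
n_genes = 20000          # hypothetical number of probes
# Under the null hypothesis, p-values are uniformly distributed on [0, 1].
null_p = [rng.random() for _ in range(n_genes)]

n_nominal = sum(p < 0.05 for p in null_p)  # "DEGs" by raw p-value alone
n_fdr = count_bh_rejections(null_p)        # survivors after FDR control

print("p < 0.05:", n_nominal)
print("FDR < 0.05:", n_fdr)
```

With pure noise, roughly 5% of the probes still get p < 0.05 by chance, while few or none survive the FDR step. Hundreds of nominal hits with no FDR-significant gene is the pattern expected when the two groups do not really differ at the expression level, or when the difference is too small to detect with about 10 samples per group.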
Thanks for the reply russhh,
The samples were osteosarcoma biopsies taken before surgery and after preoperative chemotherapy, and were graded I to IV according to the viable tumor cells; in other words, the less viable tumor, the better the response to the therapy. Actually this dataset is GSE87437, available on GEO. Very rare samples. Like I said, I tried to mine the data and dig out the genes that contribute to a better chemo response. Mission failed =.=
I did try what Devon suggested below; however, limma couldn't save me~ haha.
What you said makes a lot of sense, because the samples are not as clearly distinguished as, say, tumor vs non-tumor. I guess this might be the reason.
Regards