Hi all,
I have to combine two datasets obtained on two different platforms (Illumina and Affymetrix). The combined dataset contains gene expression for 11 cell types. For my purpose, I do not need the genes that are differentially expressed between one cell type and the others; I need the genes that are upregulated in each cell type. To do this, I ranked the ~20000 genes within each sample and selected the genes that ranked in the top 20% of the ~20000 genes in 80% of the replicates of each cell type (all cell types have >=5 replicates). However, I am not sure how to estimate the statistical significance (e.g., FDR) of my selected genes. Any advice is appreciated. Also, does anybody know of any methods that suit my purpose?
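To make the selection rule concrete, here is a minimal sketch of it in Python with NumPy. It is only an illustration under stated assumptions: the function name, the genes-by-samples layout, and reading "80% of the replicates" as "at least 80%" are all assumptions, not details from the original analysis.

```python
import numpy as np

def select_top_ranked_genes(expr, cell_types, top_frac=0.20, min_rep_frac=0.80):
    """Return, per cell type, the row indices of genes that rank in the top
    `top_frac` of a sample in at least `min_rep_frac` of that type's replicates.

    expr       : array of shape (n_genes, n_samples), expression values
    cell_types : length n_samples sequence of cell type labels
    (Hypothetical helper; names and layout are assumptions.)
    """
    expr = np.asarray(expr, dtype=float)
    cell_types = np.asarray(cell_types)
    n_genes = expr.shape[0]

    # Rank genes within each sample; rank 0 is the highest expression.
    # Double argsort converts sort order into ranks (ties broken by row order).
    order = np.argsort(-expr, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")

    # Boolean mask: is this gene in the top fraction of this sample?
    n_top = int(np.ceil(top_frac * n_genes))
    in_top = ranks < n_top

    selected = {}
    for ct in np.unique(cell_types):
        cols = cell_types == ct
        n_rep = int(cols.sum())
        # Assumption: "80% of replicates" means at least 80%, rounded up.
        needed = int(np.ceil(min_rep_frac * n_rep - 1e-9))
        hits = in_top[:, cols].sum(axis=1)
        selected[str(ct)] = np.flatnonzero(hits >= needed)
    return selected
```

Because the ranks are computed within each sample, the rule does not compare raw intensities across platforms, but it also has no built-in null distribution, which is why the significance question remains open.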
Thank you very much.
Wendy
Up-regulated relative to what?
Maybe you can use ComBat from the sva package on Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/sva.html). It can help to merge data sets from different batches with different conditions, and the package also contains functions for p-value calculation. The problem is that you might find it difficult to map the probe IDs to generate the required data structure.
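The probe-mapping step could look roughly like the pandas sketch below. It is not part of sva or ComBat; it only prepares one gene-level matrix plus a batch label per sample, which is the kind of input a batch-correction step needs. The function names, the probe-to-gene annotation Series, averaging probes per gene, and the column suffixes are all assumptions made for illustration.

```python
import pandas as pd

def collapse_to_genes(expr, probe_to_gene):
    """Collapse a probe-level matrix to gene level.

    expr          : DataFrame indexed by probe ID, one column per sample
    probe_to_gene : Series mapping probe ID -> gene symbol, with unique
                    probe IDs in its index (hypothetical annotation table)
    """
    genes = probe_to_gene.reindex(expr.index)
    keep = genes.notna().to_numpy()  # drop probes without a gene annotation
    # Assumption: probes hitting the same gene are averaged; other rules
    # (e.g. keeping the most variable probe) are equally defensible.
    return expr[keep].groupby(genes[keep].to_numpy()).mean()

def merge_platforms(illumina_genes, affy_genes):
    """Join two gene-level matrices on the genes measured by both platforms."""
    merged = illumina_genes.join(
        affy_genes, how="inner", lsuffix="_ilmn", rsuffix="_affy"
    )
    # One batch label per sample column, in the same order as `merged`.
    batch = pd.Series(
        ["illumina"] * illumina_genes.shape[1]
        + ["affymetrix"] * affy_genes.shape[1],
        index=merged.columns,
        name="batch",
    )
    return merged, batch
```

The inner join drops any gene that is missing from either platform, so the merged matrix will usually have fewer rows than either input.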