I am comparing RNA-seq data from the CCLE (Cancer Cell Line Encyclopedia, Broad Institute) with microarray expression data for the NCI-60 cell lines, and the "relative" expression levels of my genes of interest disagree between the two. Is there a "best practice" for pulling data from different datasets to compare expression levels?
Here is one example (it is not an isolated case, but I can't say how common the problem is).
For the microarray data, when a gene has multiple probes, I took the median intensity across those probes as the expression level of the gene.
For cell lines A498 and 786-O (also written 7860 or 786O):
RNA-seq values for MCL1 are 37 and 62, respectively, so expression in 786-O is about 1.7x (roughly 2x) that in A498.
Microarray values for MCL1 are 9.19 and 8.92, respectively, so the two cell lines look comparable. [For A498, the median is the average of probes 214056_a and 200796s; for 786-O, the median is the average of probes 214057_a and 200796s.]
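To make the comparison concrete, here is a minimal sketch of how I would put both platforms on a log2 fold-change scale. It assumes the microarray values are log2 intensities and the RNA-seq values are on a linear scale; I have not confirmed either, and the variable names are just for illustration.

```python
import math

# MCL1 values quoted above for the two cell lines.
rnaseq = {"A498": 37.0, "786O": 62.0}       # ASSUMED linear-scale RNA-seq values
microarray = {"A498": 9.19, "786O": 8.92}   # ASSUMED log2 microarray intensities

# RNA-seq: log2 of the ratio of the linear values.
rnaseq_log2fc = math.log2(rnaseq["786O"] / rnaseq["A498"])

# Microarray: on a log2 scale, the difference is already a log2 fold change.
microarray_log2fc = microarray["786O"] - microarray["A498"]

print(f"RNA-seq    log2FC (786O vs A498) = {rnaseq_log2fc:+.2f} ({2 ** rnaseq_log2fc:.2f}x)")
print(f"Microarray log2FC (786O vs A498) = {microarray_log2fc:+.2f} ({2 ** microarray_log2fc:.2f}x)")
```

Under those assumptions, RNA-seq gives about +0.74 (1.68x higher in 786O), while the microarray gives about -0.27 (0.83x, i.e. slightly lower in 786O), so the two platforms disagree in direction as well as magnitude.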
Any suggestions?