Is there any comparison between microarray and RNA-seq data that would be valid? If I have z-scores for both and they are similar, does this have any meaning? Can anyone point me to good ways to document quantitatively that both types of data seem to show the same general trend? I know I can document them separately, but I am wondering if there is any way to bring the data together.
Wondering if anyone has seen any publications where summary statistics were combined (per gene) across platforms to combine RNAseq and microarray data?
I wouldn't recommend directly combining the data - even if you try to use something like quantile normalization to make the distributions similar, there will still be obvious biases in signal between the two technologies.
Instead, I think it would be more informative to compare differentially expressed genes (or pathways, etc.), assuming that you have both relevant groups in your microarray and RNA-Seq data. For example, you could compare the gene lists using a Venn diagram. Or, as a technical benchmark, you could calculate the correlation between fold-changes and/or p-values from the separate analyses of the RNA-Seq and microarray data.
Have a look at some of the MAQC papers, such as this one. They often needed to compare different sequencing and array technologies to each other, so you can get an idea of which methods have proved useful. In general, it's best to threshold things so you keep only genes that are being accurately measured (i.e., nothing with very low expression) and then compare log2 fold changes or ranks. For ranks, note that the dynamic range of a microarray is lower than that of sequencing. Also, the correspondence of the fold changes will vary by the tool (e.g., you'll get better correspondence with DESeq2 than with DESeq).