Question

Correlating multiple CuffDiff outputs?

9.0 years ago

steve ★ 3.5k

I have two sets of CuffDiff outputs, set up like this:

CuffDiff 1: Sample 1a vs. Sample 1b

CuffDiff 2: Sample 2a vs. Sample 2b

This gives me log2(FoldChange) values for both comparisons, and both comparisons have the same gene lists.

Is there a way to calculate the correlation between the log2(FoldChange) values per gene for the two data sets? Is this comparison meaningful?

Ultimately I would like to be able to plot this. Right now I've made a per-gene scatter plot with the log2(FoldChange) from the Sample 1 comparison on the x axis and the log2(FoldChange) from the Sample 2 comparison on the y axis, but it is messy. I am thinking that a correlation might give a more meaningful and understandable summary.

RNA-Seq R • 1.7k views

updated 9.0 years ago by Devon Ryan 105k • written 9.0 years ago by steve ★ 3.5k

score 1 · Answer 1 · 2016-07-30

I assume you actually had more than 4 samples total. If not, the results are largely useless for anything except determining sequencing depth and replicate number for the real experiment.

Regarding your question: yes, the cor() function will do that, though I would strongly encourage you to make a scatter plot too. Whether such a correlation is meaningful depends on how much you want to read into it. I don't know whether cuffdiff shrinks fold changes (it probably doesn't), so my guess is that the noisy fold changes of lowly expressed genes will correlate very poorly between the two comparisons and completely muck up any interpretation. You'd probably be better off doing some prefiltering based on expression level.
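As a rough sketch of what that could look like in R: the example below uses simulated data in place of the two cuffdiff tables, so every number in it is made up. It assumes each table has the columns gene_id, value_1, value_2 and log2.fold_change. (the names I would expect after read.delim() on a gene_exp.diff file, but check your own files). The min_fpkm cutoff and the prep() helper are hypothetical choices for illustration, not anything cuffdiff prescribes.

```r
# Simulated stand-ins for two cuffdiff tables (hypothetical data).
# With real output you would read the files instead, e.g. with read.delim().
set.seed(42)
n        <- 1000
gene_id  <- paste0("gene", seq_len(n))
expr     <- rexp(n, rate = 1 / 20)      # baseline expression per gene
shared   <- rnorm(n)                    # fold change common to both comparisons
noise_sd <- 2 / sqrt(expr + 0.1)        # lowly expressed genes are noisier

sim_diff <- function() {
  lfc <- shared + rnorm(n, sd = noise_sd)
  data.frame(gene_id = gene_id,
             value_1 = expr,
             value_2 = expr * 2^lfc,
             log2.fold_change. = lfc)
}
d1 <- sim_diff()
d2 <- sim_diff()

# Keep only what is needed: gene, lowest expression in the pair, fold change.
prep <- function(d) {
  data.frame(gene_id  = d$gene_id,
             min_expr = pmin(d$value_1, d$value_2),
             lfc      = d$log2.fold_change.)
}

# Match genes between the two comparisons by ID rather than by row order.
m <- merge(prep(d1), prep(d2), by = "gene_id", suffixes = c("_1", "_2"))

# Prefilter: finite fold changes and a minimum expression level
# in both comparisons (the cutoff is an arbitrary assumption).
min_fpkm <- 5
keep <- is.finite(m$lfc_1) & is.finite(m$lfc_2) &
        m$min_expr_1 >= min_fpkm & m$min_expr_2 >= min_fpkm

r_all    <- cor(m$lfc_1, m$lfc_2)                    # Pearson, all genes
r_filt   <- cor(m$lfc_1[keep], m$lfc_2[keep])        # Pearson, filtered genes
rho_filt <- cor(m$lfc_1[keep], m$lfc_2[keep], method = "spearman")

# Scatter plot of the filtered genes; semi-transparent points reduce clutter.
plot(m$lfc_1[keep], m$lfc_2[keep], pch = 16, col = rgb(0, 0, 0, 0.3),
     xlab = "log2(FoldChange), comparison 1",
     ylab = "log2(FoldChange), comparison 2")
abline(0, 1, lty = 2)                                # identity line for reference
```

Comparing r_all with r_filt gives a feel for how much the lowly expressed genes are driving the result. The Spearman version is there because it is less sensitive to a few extreme fold changes.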