Question

Sample Correlation in RNA-Seq data

10.3 years ago

kaihami • 0

Hello,

This may be a very silly question, so apologies for my ignorance.

Imagine an RNA-seq data set with n samples, where I want to determine how the samples correlate with each other.

After count normalization, we can compute a correlation between samples using Pearson, Spearman, or another method. For a gene-to-gene correlation, I understand how these tests work.

But if I have a huge data set (n samples) with m genes, how does the correlation test work? Can anybody explain, please?

Regards,

Correlation RNA-Seq • 7.8k views

updated 3.1 years ago by Ram 45k • written 10.3 years ago by kaihami • 0

Ram · Answer 1 · 2015-07-29

10.3 years ago

ethan.kaufman ▴ 380

Correlation is a pairwise measure. You can calculate the correlation between two samples (by treating each of the m genes as an independent observation, so each sample is a vector of m values), but not between n samples at once. To get a sense of the overall concordance of your dataset, I would calculate all pairwise correlations, which generates a symmetric n × n correlation matrix and should help identify any outlier samples. The corrplot function in R (from the package of the same name) provides a nice heatmap-style visualization of this.
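To make the "all pairwise correlations" idea concrete, here is a minimal sketch in plain Python (chosen only for illustration; in R, `cor()` on a genes-by-samples matrix does the same job). The helper names `pearson` and `correlation_matrix` and the toy values are made up for this example, and it assumes the counts are already normalized and that genes appear in the same order in every sample.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers.

    Assumes neither list is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(samples):
    """Return the symmetric n x n matrix of pairwise sample correlations.

    `samples` is a list of n samples, each a list of m expression values.
    """
    n = len(samples)
    matrix = [[1.0] * n for _ in range(n)]  # each sample correlates 1.0 with itself
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(samples[i], samples[j])
            matrix[i][j] = r  # fill both halves, since the matrix is symmetric
            matrix[j][i] = r
    return matrix

# Hypothetical toy data: 3 samples x 5 genes (made-up values, not real counts)
samples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
corr = correlation_matrix(samples)
for row in corr:
    print(["%.2f" % v for v in row])
```

A sample whose row contains consistently low values compared with the rest is a candidate outlier. For Spearman, the same loop applies after replacing each sample's values with their ranks.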

updated 3.1 years ago by Ram 45k • written 10.3 years ago by ethan.kaufman ▴ 380

Yep, a really silly question. Thank you, ethan. I don't know why I didn't see it before, lol.

updated 3.1 years ago by Ram 45k • written 10.3 years ago by kaihami • 0