Question

find pairwise transcriptional similarities between samples

3.0 years ago

fifty_fifty ▴ 90

As the result of bulk RNA sequencing, I have a count matrix that has 20 samples. I want to find similarities in transcriptomes between samples 1 and 11, 2 and 12, 3 and 13, etc. What is the best way to show that 1 and 11 are more similar than 1 and 12?

RNA-seq r • 831 views

updated 3.0 years ago by Michael 55k • written 3.0 years ago by fifty_fifty ▴ 90

Maybe some clustering? I'd start with a principal component analysis (PCA) and see whether the samples you expect to be similar cluster close to each other.

3.0 years ago by iraun 6.2k

score 2 · Accepted Answer · 2022-06-14

There are several ways to do this; I see two main lines of approach:

Clustering of the sample-to-sample correlation matrix, which can be visualized as a heatmap (this is easily done in R with functions such as cor, hclust, image, and heatmap).
Multi-dimensional scaling (MDS) or other forms of dimensionality reduction; see, for example, the function plotMDS in the edgeR package. Methods like t-SNE or UMAP, which have become popular for single-cell applications, could be used as well but may be overkill.
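
To make the first approach concrete, here is a minimal sketch of clustering on a correlation-based distance (1 minus Pearson's r) with average linkage. It is written in plain Python only so that it runs without any packages; in R, cor and hclust would do the same job. The sample names and expression values are made-up toy data, and the helper names (pearson, correlation_distance, average_linkage) are my own, not from any library.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_distance(expr):
    """Distance 1 - r for every pair of samples, keyed by the pair."""
    return {
        frozenset((a, b)): 1.0 - pearson(expr[a], expr[b])
        for a, b in combinations(expr, 2)
    }

def average_linkage(expr):
    """Agglomerative clustering; returns the merges in the order they happen."""
    dist = correlation_distance(expr)

    def linkage(c1, c2):
        # Average distance over all cross-cluster sample pairs
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in expr]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical toy data: already normalized, log-scale values for 6 genes
expr = {
    "s1":  [5.0, 2.0, 8.0, 1.0, 4.0, 7.0],
    "s11": [5.2, 2.1, 7.8, 1.3, 4.1, 6.9],
    "s2":  [1.0, 6.0, 3.0, 7.0, 2.0, 5.0],
    "s12": [1.2, 5.8, 3.1, 7.2, 2.3, 4.8],
}

for left, right, height in average_linkage(expr):
    print(left, "+", right, "merged at distance", round(height, 3))
```

If the expected pairs (1 with 11, 2 with 12, and so on) are the first to merge, and at a much lower distance than the later merges, that supports the claim that they are more similar to each other than to the other samples.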

In all cases, use normalized counts (TPM or CPM) rather than raw counts, possibly further transformed (log, square root, or Box-Cox). If you are unsure about the transformation, try several.
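
As a rough illustration of the normalize, transform, then compare workflow, the sketch below scales raw counts to CPM by library size only, takes log2 with a pseudocount, and compares the correlation of sample 1 with 11 against that of sample 1 with 12. The counts are invented, the pseudocount of 1 is an arbitrary choice, and the function names are hypothetical. In practice you would more likely use something like the cpm function in edgeR, which can also account for normalization factors that this sketch ignores.

```python
from math import log2, sqrt

def cpm(counts):
    """Counts per million: scale one sample's counts by its library size."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def log_cpm(counts, pseudocount=1.0):
    """log2(CPM + pseudocount); the pseudocount avoids log2(0)."""
    return [log2(v + pseudocount) for v in cpm(counts)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical raw counts for 5 genes; sample11 is sequenced about twice
# as deeply as sample1 but has a similar expression profile.
raw = {
    "sample1":  [100, 20, 300, 5, 80],
    "sample11": [210, 38, 590, 12, 150],
    "sample12": [30, 250, 40, 120, 60],
}

logged = {name: log_cpm(counts) for name, counts in raw.items()}

r_1_11 = pearson(logged["sample1"], logged["sample11"])
r_1_12 = pearson(logged["sample1"], logged["sample12"])
print("r(1, 11) =", round(r_1_11, 3))
print("r(1, 12) =", round(r_1_12, 3))
print("1 is closer to 11 than to 12:", r_1_11 > r_1_12)
```

Swapping log_cpm for a square-root or other transformation and re-running the comparison is an easy way to check whether the conclusion depends on the transformation.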