Using Leiden algorithm to cluster bulk RNA samples
2.7 years ago
nhaus ▴ 420

Hello,

I am currently analyzing a publicly available dataset with ~200 samples. I want to cluster the samples by expression to identify disease endotypes. I have already tried hclust and NMF. I know that the Leiden algorithm is often used in single-cell analysis and performs quite well there, so my idea was to try it here as well. In practice, I would simply pretend that my bulk RNA-seq samples are "cells" so that I can use Seurat to perform the clustering steps.

However, I did not find any papers that used the Leiden algorithm to cluster bulk RNA-seq samples. This makes me wonder whether I am overlooking something, and whether the Leiden algorithm or my approach (pretending samples are cells so I can use Seurat) is unsuitable.

I would appreciate any insights or comments on this. Thanks!

clustering RNAseq Leiden • 1.5k views
2.7 years ago

As far as I know, Leiden clustering is simply not needed for most bulk RNA-seq experiments, because the usual clustering methods seem to work OK. Why do you feel that hclust, for example, doesn't do it for your data?

That being said, I don't see any obvious reason not to apply graph-based clustering. I would recommend against using Seurat, though, since your data isn't actual single-cell data and some of Seurat's default settings are meant for the sparse, zero-rich matrices of single cells. It's probably better to run the graph-building and community-detection steps (igraph package) separately; you can read more about it here and here.
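To make the graph-building step concrete, here is a minimal, dependency-free sketch of how a shared-nearest-neighbour (SNN) graph over samples could be constructed. Everything in it is an assumption for illustration: the function names (`pearson_distance`, `knn_sets`, `snn_edges`), the toy matrix, the choice of 1 minus Pearson correlation as the distance, and `k=2`. It is not what Seurat does internally, and for ~200 real samples you would compute distances on normalized, log-transformed expression (or on principal components) with a numerical library instead.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 1.0  # assumption: treat a constant profile as uncorrelated
    return 1.0 - cov / (sd_x * sd_y)


def knn_sets(samples, k):
    """For each sample index, return the set of its k nearest neighbours."""
    n = len(samples)
    neighbours = []
    for i in range(n):
        dists = sorted(
            (pearson_distance(samples[i], samples[j]), j)
            for j in range(n)
            if j != i
        )
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def snn_edges(samples, k):
    """Return a weighted edge list (i, j, jaccard) of the SNN graph.

    Two samples are connected if either is among the other's k nearest
    neighbours; the weight is the Jaccard overlap of their neighbourhoods
    (each neighbourhood includes the sample itself).
    """
    nn = knn_sets(samples, k)
    hoods = [nn[i] | {i} for i in range(len(samples))]
    edges = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            if j in nn[i] or i in nn[j]:
                shared = len(hoods[i] & hoods[j])
                union = len(hoods[i] | hoods[j])
                edges.append((i, j, shared / union))
    return edges


# Toy data (made up): 6 samples x 4 genes, two obvious groups of three.
toy = [
    [8, 2, 1, 7], [9, 1, 2, 8], [7, 3, 1, 6],
    [1, 8, 9, 2], [2, 9, 8, 1], [1, 7, 8, 3],
]
edges = snn_edges(toy, k=2)
for i, j, w in edges:
    print(i, j, round(w, 2))
```

The resulting weighted edge list is what you would then hand to a community-detection routine, for example the Leiden implementation in igraph. With bulk data the number of neighbours `k` matters more than in single-cell work, because ~200 samples leave little room: a `k` that is large relative to the expected endotype size will merge small groups.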

This article by Tran et al. (2020) seems to make use of graph-based clustering for bulk RNA-seq data, too (although with a different goal, I believe); they may give some hints for practical considerations.


Thank you for your comment! The resources you linked are very helpful, and I have to say that I had not found them before. hclust worked okay for me, but I thought I might try several algorithms to see whether the identified clusters are "robust", i.e. insensitive to the algorithm I am using.
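One common way to quantify that kind of robustness is the adjusted Rand index (ARI) between the label vectors produced by two algorithms: 1 means identical partitions (regardless of how clusters are named), values near 0 mean agreement no better than chance. Below is a small sketch using only the standard library; the function name and the example label vectors (`hclust_labels`, `leiden_labels`) are made up for illustration and are not results from the dataset discussed here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: nothing to disagree about

    # Number of sample pairs placed together by both, by A, and by B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # both partitions are trivial (e.g. a single cluster each)
    return (together_both - expected) / (maximum - expected)


# Hypothetical labels for six samples from two different algorithms.
hclust_labels = [1, 1, 1, 2, 2, 2]
leiden_labels = ["b", "b", "b", "a", "a", "a"]  # same grouping, different names
ari_same = adjusted_rand_index(hclust_labels, leiden_labels)

# Two partitions that disagree on every pair.
ari_diff = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(ari_same, ari_diff)
```

Computing this for every pair of methods (hclust, NMF, graph-based) gives a quick picture of which cluster assignments are stable and which depend on the algorithm.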
