Genes for clustering
7.1 years ago

Hi everyone, I have a group of samples that is supposed to be biologically homogeneous. I want to cluster the genes to see which are highly expressed and which are lowly expressed across all samples. I tried hierarchical clustering, but it got stuck because there are so many genes. I don't want to use PCA because I want to capture the genes that are uniformly expressed across the samples, not the ones that are most variable. Any suggestions on how to choose the genes to cluster for my purpose? Thanks

clustering hierarchical clustering • 1.8k views

What's the data? What do you mean by hierarchical clustering "got stuck"?


The data is RNA-seq. There were too many genes (I have 30,000) and the program is still running. It runs nicely with fewer genes. Thanks


It should not take that much time. However, you can filter out the non-variable genes and hope that reduces the number.
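
A minimal sketch of such a filter, in plain Python. The function name `filter_genes_by_variance`, the log2(count + 1) transform, the 0.5 threshold, and the toy counts are all assumptions made for illustration, not something taken from the thread. Note that this keeps the variable genes; for the original question (uniformly expressed genes) the comparison would be flipped to keep the low-variance ones instead.

```python
import math
from statistics import pvariance

def filter_genes_by_variance(expr, min_variance):
    """Keep genes whose variance of log2(count + 1) across samples
    is at least min_variance. `expr` maps gene name -> list of counts."""
    kept = {}
    for gene, counts in expr.items():
        logged = [math.log2(c + 1) for c in counts]
        if pvariance(logged) >= min_variance:
            kept[gene] = counts
    return kept

# Hypothetical toy data: three genes measured in four samples.
toy = {
    "geneA": [100, 102, 98, 101],  # high and nearly constant
    "geneB": [0, 1, 0, 0],         # low and nearly constant
    "geneC": [5, 200, 40, 900],    # strongly variable
}

filtered = filter_genes_by_variance(toy, min_variance=0.5)
print(sorted(filtered))
```

On the toy data only `geneC` passes the threshold, so the number of genes handed to the clustering step drops from three to one.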


What's the size of the data, how much RAM does your computer have, and which algorithm and implementation are you using? I presume the data is a 30000 x p matrix. What's p? Even for large p, this shouldn't take long unless your computer is underpowered (i.e. not enough RAM) and/or you are using an inefficient implementation of the algorithm.
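
To make the RAM question concrete, here is a rough back-of-the-envelope sketch. It assumes the implementation stores every pairwise gene-to-gene distance as a 64-bit float in condensed (lower-triangle) form, which is an assumption about the tool being used rather than something stated in the thread. Under that assumption the memory cost depends on the number of genes, not on p.

```python
def condensed_distance_bytes(n_items, bytes_per_value=8):
    """Bytes needed to hold all pairwise distances between n_items
    in condensed form: n * (n - 1) / 2 values."""
    n_pairs = n_items * (n_items - 1) // 2
    return n_pairs * bytes_per_value

n_genes = 30_000  # number of genes mentioned in the thread
size_gb = condensed_distance_bytes(n_genes) / 1e9
print(f"{n_genes} genes -> about {size_gb:.1f} GB for the distance matrix")
```

For 30,000 genes this comes to about 3.6 GB, and roughly twice that if the full square matrix is stored, so a machine with little RAM could start swapping and appear to hang.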
