Clustering large data
6.7 years ago
rishi ▴ 10

I have multiple genomic datasets, each with several thousand rows and 20 columns. Each dataset corresponds to a cell line or tissue type, and each row is a genomic feature.

I wrote my own algorithm that works on the raw data and clusters the genomic feature rows of these large matrices in a supervised way, yielding 7 classes. Now I want to validate this with an unsupervised clustering algorithm, to see whether I get roughly the same 7 classes when following the same approach/idea as in my own clustering.

What are the best ways to do this, and how can I do so?

genome R clustering • 1.4k views
6.7 years ago
arta ▴ 670

You can use consensus clustering or the gap statistic in R to compare your classification with the results of unsupervised clustering methods.
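As a rough illustration of the gap-statistic route, here is a minimal sketch using `clusGap()` and `maxSE()` from the `cluster` package together with base R `kmeans()`. The matrix `x` and the vector `own_labels` are hypothetical stand-ins (simulated data with 3 groups), not the data from the question; k-means is only one possible choice of clustering function. Consensus clustering would need a separate package such as ConsensusClusterPlus from Bioconductor and is not shown here.

```r
library(cluster)  # provides clusGap() and maxSE()

set.seed(1)

# Hypothetical stand-in for one dataset: 300 features x 20 columns,
# simulated as 3 shifted groups so that there is structure to find.
x <- rbind(
  matrix(rnorm(100 * 20, mean = 0), ncol = 20),
  matrix(rnorm(100 * 20, mean = 4), ncol = 20),
  matrix(rnorm(100 * 20, mean = 8), ncol = 20)
)

# Hypothetical stand-in for the classes produced by the custom algorithm.
own_labels <- rep(1:3, each = 100)

# Standardize the columns so no single column dominates the distances.
xs <- scale(x)

# Gap statistic: which number of clusters does k-means support on this data?
gap <- clusGap(xs, FUNcluster = kmeans, K.max = 6, B = 20,
               verbose = FALSE, nstart = 10)
k_hat <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"],
               method = "Tibs2001SEmax")

# Cluster with the suggested k and cross-tabulate against the custom classes.
km <- kmeans(xs, centers = k_hat, nstart = 10)
agreement <- table(own = own_labels, kmeans = km$cluster)

print(k_hat)
print(agreement)
```

With real data, `K.max` would have to be larger than 7 so that the 7-class solution can be evaluated at all. If the suggested number of clusters is close to 7 and the contingency table is dominated by one large cell per row, that would be some evidence that the unsupervised result agrees with the custom classification.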
