Looking to get PATTERN CLUSTERING of gene expression data with multiple time points
7.0 years ago
Hushus ▴ 20

Hello all,

This is the 3rd thread concerning this issue, and we have made steady progress. I sincerely appreciate the input and efforts of cpad0112 and h.mon.

Have:
* RNA expression data for 12 time points, 1 replicate
* List of genes of interest and their expression data: http://m.uploadedit.com/bbtc/1512924884525.txt [sample data, does not contain all 100 genes]

Want: Expression pattern clustering like the example image, where the x axis is time and the y axis is relative expression (or something like it). [image]

Have tried:
* timecourse (R), which apparently does not accept data with only 1 replicate [see thread: Cannot get result from TIMECOURSE (R PROGRAM)]

Question: Do you know of ANOTHER program that can achieve a similar result?

Possible solutions:
1) Fit a polynomial trend line to each gene's expression profile, then cluster the genes manually by the shape of their fits. Problem: I do not know how to get the polynomial equation for each row of data in R.
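A minimal sketch of the per-gene polynomial fit mentioned above, using base R's `lm()` and `poly()`. The data here are synthetic stand-ins (three made-up genes over 12 time points), and the degree of 2 is an assumption chosen only for illustration; the real table would replace `expr`.

```r
# Synthetic stand-in for the real table: rows = genes, columns = 12 time points.
set.seed(1)
time <- 1:12
expr <- rbind(
  geneA = 2 + 0.5 * time + rnorm(12, sd = 0.2),           # roughly linear rise
  geneB = 10 - 0.3 * (time - 6)^2 + rnorm(12, sd = 0.2),  # peak in mid-series
  geneC = 5 + 0.4 * time + rnorm(12, sd = 0.2)            # another linear rise
)

degree <- 2  # assumed polynomial degree; adjust to the data

# Fit y = b0 + b1*t + b2*t^2 to one gene and return the coefficients.
fit_gene <- function(y) coef(lm(y ~ poly(time, degree, raw = TRUE)))

# One row of coefficients per gene.
poly_coefs <- t(apply(expr, 1, fit_gene))
colnames(poly_coefs) <- paste0("b", 0:degree)
print(round(poly_coefs, 3))
```

Each row of `poly_coefs` is the equation of one gene's trend line, so genes with similar coefficients have similar shapes. With `raw = TRUE` the coefficients are the plain polynomial terms; the default orthogonal polynomials fit the same curve but are harder to read as an equation.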

RNA-Seq pattern cluster expression • 3.6k views
7.0 years ago

A simple way to go about it may be k-means clustering, like

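dat <- read.table('http://m.uploadedit.com/bbtc/1512924884525.txt')
set.seed(1)  # k-means starts from random centers; fix the seed for reproducible clusters
km <- kmeans(dat, centers = 3)
km           # prints cluster sizes, assignments and the between_SS / total_SS ratio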

To get a reasonable guess for the number of clusters, look at the between_SS / total_SS statistic that kmeans prints: start with a small number of clusters and increase it until the ratio stops rising sharply.
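The search over cluster numbers can be sketched as a loop. This is only an illustration on synthetic data (90 made-up genes built from three patterns), not the poster's table; `betweenss` and `totss` are the components of the `kmeans` result that make up the printed between_SS / total_SS ratio.

```r
set.seed(42)
# Synthetic stand-in: 90 genes x 12 time points, built from three patterns.
time <- 1:12
patterns <- rbind(up = time, down = 13 - time, peak = 6 - abs(time - 6.5))
dat <- patterns[rep(1:3, each = 30), ] +
  matrix(rnorm(90 * 12, sd = 0.5), nrow = 90)

# Run k-means for a range of k and record between_SS / total_SS for each.
ks <- 2:8
ratio <- sapply(ks, function(k) {
  km <- kmeans(dat, centers = k, nstart = 25)  # nstart: several random starts
  km$betweenss / km$totss
})
print(round(setNames(ratio, ks), 3))
```

On data like this the ratio jumps up to k = 3 and then flattens, which is the "stops increasing sharply" point. Real expression data rarely give such a clean elbow, so treat the result as a starting guess.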

Another useful method may be non-negative matrix factorization as implemented in e.g. the NMF package. The vignettes of NMF are quite useful. An example could be:

library(NMF)
dat <- read.table('http://m.uploadedit.com/bbtc/1512924884525.txt')
xnmf <- nmf(dat, rank = 4)  # Again, you need to "guess" the number of "pseudogenes"
xnmf@fit@H                  # the H (coefficient) matrix; coef(xnmf) returns the same

The H matrix essentially gives you a small set of "pseudogenes" (usually called "metagenes" in the NMF literature) that together describe the much larger full set of genes well.


Oooo thank you for your detailed input. I will definitely try this out.

How would you plot the results?
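One possible way to plot k-means results, as a rough sketch in base R: one panel per cluster, with every member gene drawn as a grey line over time and the cluster center drawn on top. The data are synthetic stand-ins and the choice of `k = 3` is an assumption for the example.

```r
set.seed(7)
# Synthetic stand-in: 60 genes x 12 time points, built from three patterns.
time <- 1:12
patterns <- rbind(time, 13 - time, 6 - abs(time - 6.5))
dat <- patterns[rep(1:3, each = 20), ] +
  matrix(rnorm(60 * 12, sd = 0.5), nrow = 60)

k <- 3  # assumed number of clusters
km <- kmeans(dat, centers = k, nstart = 25)

# One panel per cluster: member genes in grey, cluster center in red.
op <- par(mfrow = c(1, k))
for (i in seq_len(k)) {
  members <- dat[km$cluster == i, , drop = FALSE]
  matplot(time, t(members), type = "l", lty = 1, col = "grey60",
          xlab = "Time point", ylab = "Expression",
          main = sprintf("Cluster %d (n = %d)", i, nrow(members)))
  lines(time, km$centers[i, ], lwd = 2, col = "red")
}
par(op)
```

`matplot` draws one line per column, hence the transpose of the gene-by-time matrix.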

7.0 years ago
Hushus ▴ 20

ANSWER: Download the MeV software and use k-means clustering with the Pearson correlation distance.

You get the following: [image]

Good enough for me.
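For anyone who wants something similar without leaving R, here is a sketch of a common approximation: z-score each gene across time and run ordinary k-means. For z-scored rows the squared Euclidean distance equals 2(n - 1)(1 - r), where r is the Pearson correlation and n the number of time points, so genes are grouped by shape rather than magnitude. This is not MeV's implementation and may not reproduce its clusters exactly; the data are synthetic.

```r
set.seed(3)
time <- 1:12
n <- length(time)

# Synthetic stand-in: 60 genes, two shapes (up/down) at mixed magnitudes.
shapes <- rbind(up = time, down = -time)
shape_id <- rep(1:2, each = 30)
scale_factor <- rep(c(1, 10, 100), times = 20)
dat <- shapes[shape_id, ] * scale_factor +
  matrix(rnorm(60 * n, sd = 0.3), nrow = 60)

# z-score each gene (row) across time: mean 0, sd 1.
z <- t(scale(t(dat)))

# Check the identity on one pair of genes.
d2 <- sum((z[1, ] - z[2, ])^2)
r  <- cor(dat[1, ], dat[2, ])
stopifnot(isTRUE(all.equal(d2, 2 * (n - 1) * (1 - r))))

# Euclidean k-means on z-scores then groups genes by correlation of shape.
km <- kmeans(z, centers = 2, nstart = 25)
print(table(cluster = km$cluster, shape = shape_id))
```

Plotting `z` instead of the raw values also gives the "relative expression" y axis asked for in the original question.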
