7.8 years ago
Kurban
Hello guys, I have transcriptome data from low-temperature-treated samples with different treatment durations, and I got a different number of DEGs for each time point of the stress challenge. Now I want to cluster all of these differentially expressed genes. Some papers do this analysis with a heatmap based on the genes' fold change, and others do it on RPKM values. How can I do the clustering? Are there any good tools and papers? Should I cluster on log2(fold change) or on RPKM values?
You can do both, or even more. I usually get the best hierarchical clustering results using the z-scores of log2 RPKM (or log2 CPM) values.
I use the heatmap.2 function from the R gplots package. You can try different clustering methods (ward.D, for example, is pretty good) or different distance measures if necessary.
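A minimal sketch of that approach, in case it helps. The toy matrix, the gene names, the pseudocount of 1, and the Euclidean distance are all assumptions made for illustration; only base R is needed for the clustering itself, and the heatmap.2 call is left as a comment because it requires gplots to be installed.

```r
# Toy RPKM matrix (made-up numbers): 6 genes x 4 time points.
rpkm <- matrix(
  c( 1,  2,  8, 16,
     2,  5, 15, 40,
    50, 20, 10,  4,
    30, 14,  6,  3,
     5, 20, 20,  6,
     3, 12, 14,  4),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:6]),
                  c("t0", "t1", "t2", "t3"))
)

# log2 transform; the pseudocount of 1 (an assumption) avoids log2(0).
log_rpkm <- log2(rpkm + 1)

# Row-wise z-score: scale() works on columns, so transpose, scale, transpose back.
z <- t(scale(t(log_rpkm)))

# Hierarchical clustering of genes: Euclidean distance, Ward linkage.
hc <- hclust(dist(z), method = "ward.D")

# Optional plot, if gplots is installed. scale = "none" because z is already scaled.
# gplots::heatmap.2(z, Rowv = as.dendrogram(hc), Colv = FALSE,
#                   dendrogram = "row", scale = "none", trace = "none")
```

Note that a gene with identical values at every time point has zero variance and would give NaN z-scores, so such rows would need to be filtered out first.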
Hi @b.nota, I checked heatmap.2 in the gplots package and got a heatmap of my data based on FPKM values. But I have more than a thousand input genes, and I want to extract the clustering result from the heatmap. How can I do that?
Do you mean you want the clusters that are formed after clustering?
Check the previous post about this; you'll need the cutree command for that.
How To Get The Subclusters From The Object Of Hclust() Using Cutree() According To The Order On The Map Produced By Heatmap.2?
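As a rough illustration of the cutree step (not the code from the linked post): the sketch below clusters a small made-up matrix and cuts the tree into k = 3 groups. The data, the gene names, and the choice of k are assumptions; with real data you would pick k (or a cut height h) by looking at the dendrogram.

```r
# Toy RPKM matrix (made-up numbers): 6 genes x 4 time points.
rpkm <- matrix(
  c( 1,  2,  8, 16,
     2,  5, 15, 40,
    50, 20, 10,  4,
    30, 14,  6,  3,
     5, 20, 20,  6,
     3, 12, 14,  4),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:6]),
                  c("t0", "t1", "t2", "t3"))
)

# Same preprocessing as one would use for the heatmap: row-wise z-scores of log2 values.
z  <- t(scale(t(log2(rpkm + 1))))
hc <- hclust(dist(z), method = "ward.D")

# Cut the tree into k groups; returns a named integer vector (gene -> cluster id).
clusters <- cutree(hc, k = 3)

# Gene names per cluster, e.g. for writing out or for enrichment analysis.
genes_by_cluster <- split(names(clusters), clusters)

# Genes in dendrogram order, each with its cluster id.
ordered_clusters <- clusters[hc$order]
```

To keep the clusters consistent with the heatmap, the same hclust object should be used for both the plot and cutree.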
There are some examples of clustering (in R) in DESeq2 tutorial (https://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf), page 26f.
Thank you @e.rempel. DESeq2 does cluster DEGs based on the count data, but it only accepts integer values.
That's because count data are integers. Why isn't your data in integers? If you have used salmon/sailfish, you should have a look at the tximport package for getting your data into DESeq2.
Hi @WouterDeCoster, I know that count data is in integers, but I want to use RPKM/FPKM values for the heatmap.
Check out the Mfuzz R package.