Hi everyone, I recently finished replicating an RNA-seq analysis of genes involved in tumor-educated platelets. At the very end of the article whose analysis I replicated, the authors state: "Unsupervised hierarchical clustering was performed by Ward clustering and Pearson distances. Non-random partitioning, and corresponding p value, of unsupervised hierarchical clustering was determined using a Fisher's exact test." How do I compute this final part? What I have is a count matrix (both non-normalized and normalized) with Cancer and Control samples (285 in total) as columns and 5000 genes as rows, plus a two-level factor (1 = tumor, 0 = HC). I also have a six-level factor that specifies the tissue of origin of each tumor. As far as I understand, this is meant to show that the clustering is significant, but I really don't know how to do it in R.
Ali, check this tutorial on unsupervised gene expression clustering; walking through that post should answer most of your questions.
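For the Fisher's exact test part specifically, here is a minimal sketch of one common reading of the quoted sentence: cut the sample dendrogram into two clusters, cross-tabulate cluster membership against the tumor/HC factor, and test the resulting 2x2 table. It is written in plain Python only to spell out the arithmetic, and it is not necessarily what the authors did. The function names `contingency_2x2` and `fisher_exact_two_sided` and the toy vectors are made up for illustration. In R, the corresponding steps would presumably be `hclust` on `as.dist(1 - cor(mat))` with one of its Ward methods, `cutree(..., k = 2)`, then `fisher.test(table(cluster, group))`.

```python
from math import comb


def contingency_2x2(clusters, labels):
    """Cross-tabulate two 0/1 sequences: rows are clusters, columns are labels."""
    table = [[0, 0], [0, 0]]
    for cluster, label in zip(clusters, labels):
        table[cluster][label] += 1
    return table


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Toy data (hypothetical): cluster membership after cutting the tree at k = 2,
# and the known group of each sample (1 = tumor, 0 = HC).
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0, 0, 1]

table = contingency_2x2(clusters, groups)
p_value = fisher_exact_two_sided(table)
print(table, p_value)
```

A small p value means the split of tumor and HC samples across the two clusters is unlikely under random partitioning. For the six-level tissue factor the table is no longer 2x2; R's `fisher.test` also accepts larger tables, and its `simulate.p.value = TRUE` option can help if the exact computation becomes too heavy.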
Thanks, I will take a look.