I have some gene expression data which I would like to cluster using R.
I have a matrix with exon IDs as rows and tissue names as columns, and I would like to cluster it according to the expression values (the matrix entries). The problem is that each exon ID (e.g. 184049260_184049394_ENSG00000114867) belongs to a specific gene ID (e.g. ENSG00000114867), so I want to constrain the clustering so that exons stay grouped by gene: an exon may only cluster with other exons of the same gene.
I came across an R package called flexclust, which provides a function for k-centroids cluster analysis (kcca) that I thought might be useful. However, the documentation for kcca does not explain the purpose of each argument very well.
Does anyone know if there is a package in R that is suitable for this task? And if so, could you explain how to use it?
Do you want to cluster the tissues based on their expression? Or cluster the exons?
I would like to do both, but I am more interested in clustering the exons.
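For what it's worth, the constraint described in the question (exons may only cluster within their own gene) can be approximated without a dedicated package by splitting the matrix by gene and clustering each block separately. The following is only a minimal sketch in base R, not a flexclust/kcca solution. It assumes the gene ID is the part of the row name after the last underscore, uses made-up toy values and gene IDs, and the helper name `cluster_within_gene`, the choice of average-linkage `hclust` on Euclidean distances, and `k = 2` are all illustrative assumptions.

```r
# Toy expression matrix: rows are exons, columns are tissues.
# IDs and values are invented for illustration only.
set.seed(1)
exon_ids <- c("100_200_ENSG00000000001", "300_400_ENSG00000000001",
              "500_600_ENSG00000000001", "150_250_ENSG00000000002",
              "350_450_ENSG00000000002", "550_650_ENSG00000000002",
              "750_850_ENSG00000000002")
expr <- matrix(rnorm(length(exon_ids) * 4), nrow = length(exon_ids),
               dimnames = list(exon_ids, c("brain", "liver", "heart", "lung")))

# Assumption: the gene ID is everything after the last underscore.
gene_of <- sub("^.*_", "", rownames(expr))

# Hypothetical helper: cluster the exons of a single gene.
# k is capped at the number of exons; single-exon genes get one cluster.
cluster_within_gene <- function(m, k = 2) {
  if (nrow(m) < 2) return(setNames(rep(1L, nrow(m)), rownames(m)))
  tree <- hclust(dist(m), method = "average")
  cutree(tree, k = min(k, nrow(m)))
}

# Split row indices by gene and cluster each block on its own,
# so exons from different genes can never share a cluster.
idx_by_gene <- split(seq_len(nrow(expr)), gene_of)
labels_by_gene <- lapply(idx_by_gene, function(i) {
  cluster_within_gene(expr[i, , drop = FALSE])
})

# Combine into one table; cluster labels are prefixed with the gene ID.
exon_clusters <- do.call(rbind, lapply(names(labels_by_gene), function(g) {
  lab <- labels_by_gene[[g]]
  data.frame(exon = names(lab), gene = g,
             cluster = paste(g, lab, sep = "."),
             stringsAsFactors = FALSE)
}))
rownames(exon_clusters) <- NULL

# Tissues have no such constraint, so they can be clustered on the whole matrix.
tissue_tree <- hclust(dist(t(expr)), method = "average")
```

Calling `print(exon_clusters)` lists each exon with its gene and a gene-specific cluster label, and `plot(tissue_tree)` draws the tissue dendrogram. A fixed `k` per gene is a simplification; in practice the number of clusters would probably need to be chosen per gene.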