Finding co-regulated genes is actually quite a difficult task, using straight clustering like k-means is in my experience not that productive on straight expression data. As I am sure you're aware k-means is by far and away the most commonly used clustering method. Any one clustering method has it's problems and the lack of associated statistics and statistically informed decision making when doing things like picking cluster numbers for the partitioning are big problems.
Now to what I do. Firstly no harm in doing what has been proposed, but what I bet you will find is profiles that are mainly segregated on the basis of variations in expression magnitude rather than expression shape and that is a big problem biologically. When you think about what (I believe) you are looking for, you are trying to find genes that share expression profiles, that is shapes. I often show a simulated example in lectures of why you will often end up co-clustering genes based on magnitude variation rather than similarity in shape if you use the standard approach. The way past this is actually very simple and we have found it incredibly useful in identifying co-regulated genes in Drosophila PNS development.
Take your original expression matrix [genes x conditions] and unitise the vectors, that is make the matrix magnitude invariant. The way to do this is to normalise the rows by the length of their vectors
#for an expression matrix sim_class, get the length sqrt(sum of squares)
norm_factors <- sqrt(apply(sim_class^2,1,sum));
#divide the rows by the norm factor
normalised_sim_class <- sim_class/norm_factors;
Now clustering this expression matrix pulls out genes that have the same shape irrespective of magnitude. Biologically this means you are pulling together genes that might be direct targets of a particular set of transcription factors, with identical profiles but simply a different response scale. This is what you often find in reality, it is more often the case that co-regulated genes are responding in different scales, but in our experience with similar expression profile shapes. I hope that's of some use, even if only to run alongside your current analyses to see the differences this approach produces.
I'm curious how the clustering result from magnitude normalization (using Euclidean distance?) compares with clustering on the raw data using correlation...