In a dataset of protein expression values (log2-transformed), we found 3 clusters using k-means. We would now like to perform an ANOVA followed by a post-hoc test (e.g. Tukey's) to test each pair of clusters for differentially expressed proteins.
Unfortunately, none of the resources I found discuss this rather simple case; they only cover much more complicated designs with multiple group combinations (e.g. two treatments).
With the HybridMTest package, the ANOVA went fine, and I now have the FDR for differentially expressed proteins across the 3 groups (6 samples per group). But I'm stuck on how to calculate the post-hoc test for every protein (= rows, n = 3878) between each pair of the 3 groups.
I could not find an appropriate package or function. Could one of you offer a hint on how to solve this? As a result, I would like to obtain a data frame with the protein_id, the comparison (which pair of groups), the FDR and the log fold change.
Many thanks! (And sorry for describing the data only in words. I don't yet know how to create proper example expression data, but I will look into it.)
Sebastian
Data description:
expression_df: rownames(expression_df) = protein_id, colnames(expression_df) = sample_id
pheno_df: rownames(pheno_df) = sample_id, pheno_df$cluster = cluster group (1, 2, or 3)
anova_results: rownames(anova_results) = protein_id, anova_results$comparison = e.g. "cluster 1 vs 2", anova_results$FDR = FDR-controlled ANOVA result, anova_results$logFC = logFC
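
As a rough placeholder for the layout described above, the two input tables could be mocked with random values along these lines. This is only a sketch: the numbers are simulated (no real clusters or expression differences), the identifiers protein_1, sample_1, etc. are made up, and the mean and sd of the log2 values are arbitrary assumptions.

```r
set.seed(1)  # reproducible random values

n_proteins  <- 3878  # number of proteins (rows), as stated above
n_per_group <- 6     # samples per cluster, as stated above

# Hypothetical identifiers, made up for illustration only
protein_id <- paste0("protein_", seq_len(n_proteins))
sample_id  <- paste0("sample_", seq_len(3 * n_per_group))

# expression_df: proteins in rows, samples in columns, simulated log2 values
# (mean = 20 and sd = 2 are arbitrary assumptions)
expression_df <- as.data.frame(
  matrix(rnorm(n_proteins * length(sample_id), mean = 20, sd = 2),
         nrow = n_proteins,
         dimnames = list(protein_id, sample_id))
)

# pheno_df: one row per sample, cluster assignment (1, 2, or 3) as a factor
pheno_df <- data.frame(
  cluster   = factor(rep(1:3, each = n_per_group)),
  row.names = sample_id
)

# Sanity check: sample order must match between the two tables
stopifnot(identical(colnames(expression_df), rownames(pheno_df)))
```

In the real data the cluster labels come from k-means rather than from a fixed block order, so the sample order check at the end matters more there than it does here.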