Hi, I am working on microbiome analysis and am new to the pheatmap package. I have approximately 160 samples from 2 field trials (2 years) with two treatments (control and pathogen_treated). I calculated the relative abundance (RA) of the top 50 OTUs for each treatment in each year (for example, the top 50 OTUs of the 40 healthy tissues in 2017, and likewise for every other treatment/year combination). I then summed the RA of each OTU over all samples from the same year and treatment, so the 160 samples are now represented by 4 sets (healthy2016, pathogen2016, healthy2017, pathogen2017). Because the top 50 OTUs differ somewhat between sets, the final table holds the summed RA of 91 OTUs across the 4 sets of samples. An excerpt:

| OTU | Healthy_2016 | pathogen_2016 | Healthy_2017 | pathogen_2017 |
|---|---|---|---|---|
| K_Bacteria (OTU92) | 0.0836997 | 0 | 0 | 0 |
| Enterococcus (OTU65) | 0 | 0 | 0.4952662 | 0.0990601 |
| Leuconostoc (OTU37) | 0.1895487 | 0.1126888 | 0 | 0 |
| Lactococcus (OTU106) | 0 | 0 | 0.0425752 | 0 |
| Lactococcus (OTU36) | 0.2293602 | 0 | 1.0179649 | 0.7535034 |
| Exiguobacterium (OTU38) | 0 | 0 | 0.261444 | 0 |
| Sphingobacterium (OTU71) | 0.103809 | 0 | 0.0763031 | 0.1448188 |

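To make the example reproducible, the excerpt above can be rebuilt as a numeric matrix. This is only a sketch: the object name `dat` is taken from the heatmap code in this post, and it assumes the OTU labels are stored as row names (which pheatmap uses as row labels) rather than as a separate column. The real table has 91 rows.

```r
# Excerpt of the summed relative-abundance table (7 of the 91 OTUs).
# Rows are OTUs, columns are the four sample sets.
dat <- matrix(
  c(0.0836997, 0,         0,         0,
    0,         0,         0.4952662, 0.0990601,
    0.1895487, 0.1126888, 0,         0,
    0,         0,         0.0425752, 0,
    0.2293602, 0,         1.0179649, 0.7535034,
    0,         0,         0.2614440, 0,
    0.1038090, 0,         0.0763031, 0.1448188),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("K_Bacteria (OTU92)", "Enterococcus (OTU65)", "Leuconostoc (OTU37)",
      "Lactococcus (OTU106)", "Lactococcus (OTU36)",
      "Exiguobacterium (OTU38)", "Sphingobacterium (OTU71)"),
    c("Healthy_2016", "pathogen_2016", "Healthy_2017", "pathogen_2017")
  )
)
```

If the table is read from a file instead, something like `as.matrix()` on a data frame whose first column was used as row names should give an equivalent object.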
I created a heatmap using the following code:
```{r fig.height=13, fig.width=8}
library(pheatmap)
library(RColorBrewer)  # provides brewer.pal()
# Row-scaled heatmap; `dat` is the numeric table above with OTU names as row names.
# Arguments left at their pheatmap defaults (kmeans_k, breaks, cellwidth, ...) are omitted.
pheatmap(dat, color = colorRampPalette(rev(brewer.pal(n = 7, name = "RdYlBu")))(100),
         border_color = "grey60", scale = "row",
         cluster_rows = TRUE, cluster_cols = TRUE,
         clustering_distance_rows = "euclidean", clustering_distance_cols = "euclidean",
         clustering_method = "complete", cutree_cols = 2,
         # legend breaks now match their labels (previously -2:2 was labelled -1.5 ... 1.5)
         legend = TRUE, legend_breaks = c(-1.5, -1, 0, 1, 1.5),
         legend_labels = c("-1.5", "-1", "0", "1", "1.5"))
```
Next, I need to add two column annotations (annotation_col) to the heatmap. The first is based on treatment: columns 1 and 3 should be marked as healthy tissue, and columns 2 and 4 as pathogen-treated tissue. The second should show the year of the field trial: columns 1 and 2 labeled FieldTrial2016, and columns 3 and 4 labeled FieldTrial2017. I tried several solutions I found online, but none worked. I believe what I need is a data frame of my metadata to pass as the column annotation, for example:

| Samples | Treatment | Year |
|---|---|---|
| Healthy_2016 | HealthyTissue | FieldTrial2016 |
| pathogen_2016 | FG_Tissue | FieldTrial2016 |
| Healthy_2017 | HealthyTissue | FieldTrial2017 |
| pathogen_2017 | FG_Tissue | FieldTrial2017 |

I do not know how to create a data frame from these repeated values for use in annotation_col.
Your help is much appreciated.
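For reference, here is a minimal sketch of how such a metadata data frame could be built and passed to `annotation_col`. The main requirement is that the row names of the annotation data frame equal the column names of the plotted matrix, since pheatmap matches annotations to columns by name. The three-row `dat` below is only a stand-in copied from the excerpt, and the colours in `annotation_colors` are arbitrary choices, not part of the original analysis.

```r
# Stand-in for the real 91 x 4 table: three rows copied from the excerpt.
dat <- matrix(
  c(0.1895487, 0.1126888, 0,         0,
    0.2293602, 0,         1.0179649, 0.7535034,
    0.1038090, 0,         0.0763031, 0.1448188),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("Leuconostoc (OTU37)", "Lactococcus (OTU36)", "Sphingobacterium (OTU71)"),
    c("Healthy_2016", "pathogen_2016", "Healthy_2017", "pathogen_2017")
  )
)

# One row per heatmap column. The row names must equal colnames(dat),
# because pheatmap matches annotation rows to columns by name.
annotation_col <- data.frame(
  Treatment = factor(c("HealthyTissue", "FG_Tissue",
                       "HealthyTissue", "FG_Tissue")),
  Year      = factor(c("FieldTrial2016", "FieldTrial2016",
                       "FieldTrial2017", "FieldTrial2017")),
  row.names = colnames(dat)
)

# Optional: fix one colour per level (colour values are arbitrary assumptions).
annotation_colors <- list(
  Treatment = c(HealthyTissue = "forestgreen", FG_Tissue = "firebrick"),
  Year      = c(FieldTrial2016 = "grey80", FieldTrial2017 = "grey30")
)

# Draw the heatmap only if pheatmap is installed.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(dat, scale = "row",
                     annotation_col = annotation_col,
                     annotation_colors = annotation_colors)
}
```

The same `annotation_col` and `annotation_colors` arguments can be added to the full `pheatmap()` call shown earlier. If the annotation bars do not appear, checking `identical(rownames(annotation_col), colnames(dat))` is a reasonable first step.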