4.2 years ago
ccha97
Hello, I am getting the following 'gpar' error when I run the code for my pheatmap:
Error in check.length("fill") :
'gpar' element 'fill' must not be length 0
I understand it has something to do with matching the row names of the annotation data frame to the names in the matrix, but I am not sure how to go about it in my situation.
Here is my current code for my kmeans clustered heatmap:
resSigind = res[ which(res$padj < 0.05 & res$log2FoldChange > 0), ]
resSigrep = res[ which(res$padj < 0.05 & res$log2FoldChange < 0), ]
resSig = rbind(resSigind, resSigrep)
allSig_genes <- rownames(resSig)
library("pheatmap")
library(cluster)
rld <- rlog(dds)
rld_sign <- assay(rld)[allSig_genes,]
topVarGenes <- head(order(rowVars(rld_sign), decreasing = TRUE), 100)
set.seed(1234)
k <- pheatmap(rld_sign[topVarGenes,], scale="row",kmeans_k = 3)
clusterDF <- as.data.frame(factor(k$kmeans$cluster))
colnames(clusterDF) <- "Cluster"
OrderByCluster <- rld_sign[topVarGenes,][order(clusterDF$Cluster),]
pheatmap(OrderByCluster,
         scale = "row", annotation_row = clusterDF,
         show_rownames = FALSE, cluster_rows = FALSE)
Yes, the DESeqDataSet and results seem to be fine, I have made a similar heatmap using them before.
Output of the following, please:
If I recall correctly, these two objects have to be synchronised, with
rownames(clusterDF) == colnames(OrderByCluster)
I tried to make the rownames equal the colnames, but get this error:
Sorry, my apologies, this is row annotation; so, it would have to be:
rownames(clusterDF) == rownames(OrderByCluster)
Note also the two equals signs, ==, i.e., this is just a conditional statement, not a blind assign command, which we should not do.
Essentially, your clusterDF object needs rownames so that it can be aligned to OrderByCluster. Here is the entry from the pheatmap docs:
It seems to run, but outputs false and leaves me with the same error.
I'm not sure why it isn't working, as it seemed to work fine when I originally used rld instead of rld_sign. For context, my OrderByCluster and clusterDF look like this:
Thanks, so, the rownames of clusterDF need to be set, and they should align with those rownames of OrderByCluster. In your screenshot, the rownames of clusterDF are just 1, 2, 3, 4, 5, et cetera.
This probably occurred in this situation because you just have a single column in your annotation, but I'm not sure.
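To make the point about the two equals signs concrete, here is a minimal base-R sketch on toy data. The objects `mat` and `ann` are made up for illustration and only stand in for OrderByCluster and clusterDF; nothing here comes from the original poster's data.

```r
# Toy stand-ins for OrderByCluster (matrix) and clusterDF (annotation);
# names and values are invented for illustration.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
ann <- data.frame(Cluster = factor(c(1, 2, 1)))

# Without explicit row names, a data frame falls back to "1", "2", "3".
rownames(ann)

# `==` is an element-wise comparison; it changes nothing.
rownames(ann) == rownames(mat)
aligned_before <- identical(rownames(ann), rownames(mat))

# `<-` assigns; this is only safe if the rows are already in the same order.
rownames(ann) <- rownames(mat)
aligned_after <- identical(rownames(ann), rownames(mat))
```

Wrapping the comparison in all() or using identical() gives a single TRUE/FALSE rather than one value per row, which is easier to read for 100 genes.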
If you are 100% certain that clusterDF is already perfectly aligned with OrderByCluster, then, to solve this, you probably just need to do:
rownames(clusterDF) <- rownames(OrderByCluster)
It seems like there are duplicate row names (genes) in my OrderByCluster:
non-unique values when setting 'row.names': ‘100041546’, ‘110257’, ‘110612’, ‘11656’, ‘11826’, ‘12346’, ‘12349’, ‘12516’, ‘14934’, ‘15122’, ‘15129’, ‘17067’, ‘17105’, ‘17523’, ‘18054’, ‘18854’, ‘19746’, ‘20194’, ‘20343’, ‘20533’, ‘21349’, ‘216197’, ‘231507’, ‘238447’, ‘246727’, ‘246730’, ‘27028’, ‘328563’, ‘328780’, ‘435784’, ‘620017’, ‘66141’, ‘668139’, ‘72289’, ‘76933’, ‘777780’
Error in `.rowNamesDF<-`(x, value = value) : duplicate 'row.names' are not allowed
This is strange because these should be the top 100 variable genes. I've tried running unique(rld_sign) and distinct(rld_sign), but the problem still persists (not sure why, considering these duplicates seem to have the same values as one another).
EDIT: I was able to fix the duplicated row name issue (I realised the duplicates were contained in my allSig_genes); however, my issue now is that rows which are not in the same cluster are not grouped together.
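For anyone hitting the same duplicate 'row.names' error, here is a rough sketch of how repeated IDs could be located and dropped before subsetting. The vector `sig_ids` is a hypothetical stand-in for allSig_genes, and the matrix `m` is toy data, not the original rld_sign.

```r
# Placeholder gene IDs standing in for allSig_genes (hypothetical values).
sig_ids <- c("11656", "11826", "11656", "12346", "11826")

# Which IDs occur more than once?
dup_ids <- unique(sig_ids[duplicated(sig_ids)])

# Keep only the first occurrence of each ID.
sig_ids_unique <- sig_ids[!duplicated(sig_ids)]

# Indexing a matrix by a vector of names returns one row per requested name,
# so repeated names in the index produce repeated rows in the result.
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("11656", "11826", "12346"), c("s1", "s2")))
with_repeats    <- m[sig_ids, ]
without_repeats <- m[sig_ids_unique, ]
```

Deduplicating the ID vector before indexing means every downstream object (the matrix, the k-means result, the annotation) starts from unique row names.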
Glad that you got it working at last; however, this issue:
...will require some debugging of your general workflow.
I would just implore that you ensure 100% that your annotation objects are always perfectly aligned with your main expression data, i.e., in both length and order. Many functions are not intelligent enough to align them for you, and may just assume that they are aligned.
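One way to enforce this before plotting is a small guard function. The sketch below is only an illustration: `check_row_alignment` is a hypothetical helper, not part of pheatmap, and the data are invented.

```r
# Hypothetical helper: stop early if the annotation rows do not line up
# with the rows of the expression matrix (length, membership, then order).
check_row_alignment <- function(mat, ann) {
  if (nrow(mat) != nrow(ann)) stop("different number of rows")
  if (!setequal(rownames(mat), rownames(ann))) stop("row names differ")
  if (!identical(rownames(mat), rownames(ann))) {
    stop("row names are in a different order")
  }
  invisible(TRUE)
}

# Toy data: same genes, but one annotation is in a different order.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
ann_ok  <- data.frame(Cluster = factor(c(1, 2, 1)),
                      row.names = c("g1", "g2", "g3"))
ann_bad <- ann_ok[c(2, 1, 3), , drop = FALSE]

ok  <- check_row_alignment(mat, ann_ok)
msg <- tryCatch(check_row_alignment(mat, ann_bad),
                error = function(e) conditionMessage(e))
```

Checking membership separately from order helps tell "wrong genes" apart from "right genes, wrong order", which call for different fixes.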
Hi Kevin, thanks for that - you were right in the sense that they weren't aligned. Upon closer inspection, I realised that when I ran rownames(clusterDF) <- rownames(OrderByCluster), it overwrites the rownames (ENTREZ IDs) of clusterDF in the same order as they are in OrderByCluster.
However, the other column - the cluster number (e.g. 1, 2, 3) does not change accordingly with those rownames (that is, the cluster numbers are associated with the original rownames of clusterDF). Therefore the genes aren't being assigned to the correct cluster (e.g. gene 20343 is meant to be in cluster 1, however after changing rownames is in cluster 2).
I've tried to look up what function to use, e.g. the match function but I don't think it's quite what I need? Another idea is to have rownames of clusterDF and OrderByCluster in the same order (e.g. ascending or descending), but a lot of the forum posts I've been looking at only specify how to order dfs by columns, rather than the actual rowname.
match() should work here, but there are also new functions that you could use in dplyr: https://www.r-bloggers.com/how-to-perform-merges-joins-on-two-or-more-data-frames-with-base-r-tidyverse-and-data-table/.
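As a minimal sketch of the match() approach, assuming the annotation already carries the gene IDs as row names (toy data below; the IDs and cluster assignments are invented): reorder the annotation rows to follow the matrix, rather than overwriting the row names.

```r
# Toy data: the matrix rows are in plotting order, the annotation is not.
mat <- matrix(1:6, nrow = 3,
              dimnames = list(c("20343", "11656", "12346"), c("s1", "s2")))
ann <- data.frame(Cluster = factor(c(2, 1, 1)),
                  row.names = c("11656", "20343", "12346"))

# match() gives, for each matrix row name, its position in the annotation.
idx <- match(rownames(mat), rownames(ann))

# Reorder whole rows so each cluster label travels with its gene ID;
# drop = FALSE keeps a one-column data frame from collapsing to a vector.
ann_aligned <- ann[idx, , drop = FALSE]

# Equivalent shortcut: index the annotation by row name directly.
ann_by_name <- ann[rownames(mat), , drop = FALSE]

aligned <- identical(rownames(ann_aligned), rownames(mat))
```

Because whole rows are moved, the cluster column stays attached to the right gene, which is what assigning new row names on its own cannot guarantee.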