Hi all,
I'm trying to cluster 128 genes from mRNA-seq data so as to see which genes group together based on their expression profiles across different samples. I'm using pvclust (which clusters columns) to do this. So my genes are in the columns and samples in the rows. My input file is a csv file and looks like this:
> Sample GATA3 KMT2E SOX10 CREB3L2 SOX5 ETV6 ATN1 ETS2
> 1 6.73609 7.59656 0.352607 17.8663 1.86339 19.2949 56.9042 11.8808
> 2 18.9784 11.8289 1.00279 34.0411 2.09856 22.2998 56.7117 16.2549
> 3 86.9037 9.12542 5.43191 9.04106 1.94622 28.1369 70.0857 43.5062
> 4 111.871 7.14345 39.377 6.45569 4.96795 58.6333 59.5696 16.3631
> 5 63.4973 13.3015 124.078 6.86142 10.1776 49.313 99.137 13.8555
and my code is:
data=read.table("pvclust_input.csv", sep=",", header=TRUE, fill=TRUE)
data_mod<-data[ ,2:128]
data_matrix<-data.matrix(data_mod)
library(pvclust)
result <- pvclust(data_matrix, method.dist="cor", method.hclust="average", nboot=1000)
When I execute this, it gives me the following error and warnings:
> result <- pvclust(test2_matrix, method.dist="cor", method.hclust="average", nboot=1000)
Bootstrap (r = 0.4)... Done.
Bootstrap (r = 0.6)... Done.
Bootstrap (r = 0.6)... Done.
Bootstrap (r = 0.8)... Done.
Bootstrap (r = 0.8)... Done.
Bootstrap (r = 1.0)... Done.
Bootstrap (r = 1.0)... Done.
Bootstrap (r = 1.2)... Done.
Bootstrap (r = 1.2)... Done.
Bootstrap (r = 1.4)... Done.
Error in solve.default(crossprod(X, X/vv)) :
Lapack routine dgesv: system is exactly singular: U[2,2] = 0
In addition: There were 11 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: inappropriate distance matrices are omitted in computation: r = 0.4
2: inappropriate distance matrices are omitted in computation: r = 0.6
3: inappropriate distance matrices are omitted in computation: r = 0.6
4: inappropriate distance matrices are omitted in computation: r = 0.8
5: inappropriate distance matrices are omitted in computation: r = 0.8
6: inappropriate distance matrices are omitted in computation: r = 1
7: inappropriate distance matrices are omitted in computation: r = 1
8: inappropriate distance matrices are omitted in computation: r = 1.2
9: inappropriate distance matrices are omitted in computation: r = 1.2
10: inappropriate distance matrices are omitted in computation: r = 1.4
11: In lsfit(X, zz, 1/vv, intercept = FALSE) : 'X' matrix was collinear
I don't understand the error. Please help!
Thanks!!
Dear Diana,
Did you eventually find a solution to these errors/warnings?
Please let me know.
Diana, using r = 1.0 could solve the problem (that is the ordinary bootstrap instead of the multiscale bootstrap).
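For what it's worth, here is a rough sketch of how that could look, plus a check worth running first. The "inappropriate distance matrices" warnings are consistent with the correlation distance being undefined (NA) in some bootstrap replicates, and one possible cause is a gene that is constant across the resampled samples. The toy matrix, the zeroed-out SOX10 column and the names `expr`, `constant_genes` and `expr_ok` are made up for illustration; they are not the poster's data. The `r` argument is part of `pvclust()`, but I have not run this on the original file. Also note that AU p-values are derived from fitting across several scales, so with a single scale they are likely not informative and only the BP values should be read.

```r
# Toy stand-in for the real matrix: 6 samples (rows) x 4 genes (columns).
set.seed(1)
expr <- matrix(rexp(24, rate = 0.05), nrow = 6,
               dimnames = list(NULL, c("GATA3", "KMT2E", "SOX10", "SOX5")))
expr[, "SOX10"] <- 0  # hypothetical gene with no variation across samples

# A gene with zero variance has an undefined correlation with every other
# gene, so the correlation distance for it is NA.
gene_sd <- apply(expr, 2, sd, na.rm = TRUE)
constant_genes <- names(gene_sd)[is.na(gene_sd) | gene_sd == 0]
print(constant_genes)

# Drop those genes before clustering.
expr_ok <- expr[, !(colnames(expr) %in% constant_genes), drop = FALSE]

# Ordinary bootstrap (single scale r = 1) as suggested above.
# Only runs if pvclust is installed; nboot is kept small for a quick trial.
if (requireNamespace("pvclust", quietly = TRUE)) {
  fit <- pvclust::pvclust(expr_ok, method.dist = "correlation",
                          method.hclust = "average", nboot = 100, r = 1)
}
```

On the real data the same check would be applied to data_matrix. A gene that is non-zero in only one or two samples can still become constant inside a small bootstrap resample (for example at r = 0.4), so filtering very sparsely expressed genes may also be worth trying.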
Did you try a smaller nboot value first? How much memory do you have?