I am trying to analyze ENCODE ChIP-seq data with DiffBind to find differential peaks. However, I got an error at the dba.count step:
Error in SummarizedExperiment(assays = SimpleList(counts = countData), :
the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
SummarizedExperiment object (or derivative) to construct
Can anyone help me with this? Thank you very much!
What versions are you using (output of sessionInfo())?
What are the steps in your script, from creating a new DBA object up to and including calling dba.count()? Are you doing any type of normalization using spike-ins or parallel factors? Are you specifically requesting the data as a SummarizedExperiment object using DBA_DATA_SUMMARIZED_EXPERIMENT?
Thank you very much for your reply! Please find my script below. I am new to ChIP-seq analysis and the DiffBind package. I haven't done any normalization yet; I am just following the DiffBind manual, where the normalization step comes after counting reads.
ES_Bruce ES_Bruce Diff NT Full-Media 1 bed
ES_Bruce ES_Bruce Diff NT Full-Media 2 bed
CH12 CH12 Diff Diff Full-Media 1 bed
CH12 CH12 Diff Diff Full-Media 2 bed
Computing summits...
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_1.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_2.bam125
Sample: H3K27me3_mm10/bam/CH12_Diff_1.bam125
Sample: H3K27me3_mm10/bam/CH12_Diff_2.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_input_1.bam125
Sample: H3K27me3_mm10/bam/ES_Bruce_input_2.bam125
Sample: H3K27me3_mm10/bam/CH12_input_1.bam125
Sample: H3K27me3_mm10/bam/CH12_input_2.bam125
Re-centering peaks...
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_Diff_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_Diff_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_Diff_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_input_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/ES_Bruce_input_2.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_input_1.bam125
Reads will be counted as Single-end.
Sample: H3K27me3_mm10/bam/CH12_input_2.bam125
Reads will be counted as Single-end.
Error in SummarizedExperiment(assays = SimpleList(counts = countData), :
the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
SummarizedExperiment object (or derivative) to construct
In addition: Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
I also tried setting bSummarizedExperiment = TRUE in the dba step and got the following error:
ESC_CH12_H3K27me3 <- dba(sampleSheet=samples,bSummarizedExperiment = TRUE)
ES_Bruce ES_Bruce Diff NT Full-Media 1 bed
ES_Bruce ES_Bruce Diff NT Full-Media 2 bed
CH12 CH12 Diff Diff Full-Media 1 bed
CH12 CH12 Diff Diff Full-Media 2 bed
Error in SummarizedExperiment(assays = assays, rowRanges = peaks, colData = meta) :
the rownames and colnames of the supplied assay(s) must be NULL or identical to those of the
RangedSummarizedExperiment object (or derivative) to construct
Just so we're on the same page, could you do a BiocManager::update() to update to the current versions?
Assuming the problem persists after the update, could you send me a link to your ESC_CH12_H3K27me3.peak object? I can look at it to see if there is anything unusual. Otherwise I may need access to your bam files to get to the bottom of this.
At first I thought the BED file format might be incorrect, so I tried converting the original ENCODE BED files to BED6, and to a 5-column BED file like the one in the DiffBind manual, but neither worked. I am not familiar with the BAM format. I found other posts (not about DiffBind) where the problem was caused by a BAM file with no header, but I checked and both the ENCODE BAM files and the BAM files in the DiffBind vignette have headers, so I don't know where the problem is.
I really appreciate you taking time to help me! Thank you very much!
Following is the updated sessionInfo() output:
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
It looks like you are using the same SampleID for multiple samples. Two samples share the same ID [ES_Bruce], and the other two share the same ID [CH12]. There is a requirement that each sample have a unique ID.
Try making each SampleID unique in your samplesheet. I'm adding an internal check to throw an error if an attempt is made to use duplicate IDs.
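The check and fix suggested above can be sketched in base R before the sheet is passed to dba(). This is only a minimal illustration: the toy data frame mirrors the four samples printed in the thread, and it assumes the sheet has a Replicate column that can be appended to each ID. Any other scheme that produces unique IDs would do just as well.

```r
# Toy sample sheet mirroring the four samples shown in the thread.
# Only the columns needed to illustrate the duplicate-ID problem are included;
# the Replicate column is an assumption about how the real sheet is laid out.
samples <- data.frame(
  SampleID  = c("ES_Bruce", "ES_Bruce", "CH12", "CH12"),
  Tissue    = c("ES_Bruce", "ES_Bruce", "CH12", "CH12"),
  Replicate = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Report any SampleID that occurs more than once.
dup_ids <- unique(samples$SampleID[duplicated(samples$SampleID)])
if (length(dup_ids) > 0) {
  message("Duplicated SampleIDs: ", paste(dup_ids, collapse = ", "))
}

# One possible fix: append the replicate number to each ID.
samples$SampleID <- paste(samples$SampleID, samples$Replicate, sep = "_")

# Stop early if the IDs are still not unique.
stopifnot(anyDuplicated(samples$SampleID) == 0)

# The corrected sheet would then be passed on as in the original script, e.g.
# ESC_CH12_H3K27me3 <- dba(sampleSheet = samples)
```

Running the same duplicate check on the real sample sheet right after reading it in would show whether the IDs are the problem before any reads are counted.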
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] QuantitativeChIPseqWorkshop_0.1.2 csaw_1.28.0 BiocManager_1.30.16
[4] GreyListChIP_1.26.0 DiffBind_3.4.3 SummarizedExperiment_1.24.0
[7] Biobase_2.54.0 MatrixGenerics_1.6.0 matrixStats_0.61.0
[10] GenomicRanges_1.46.1 GenomeInfoDb_1.30.0 IRanges_2.28.0
[13] S4Vectors_0.32.3 BiocGenerics_0.40.0
loaded via a namespace (and not attached):
[1] amap_0.8-18 colorspace_2.0-2 rjson_0.2.21 hwriter_1.3.2
[5] ellipsis_0.3.2 XVector_0.34.0 rstudioapi_0.13 ggrepel_0.9.1
[9] bit64_4.0.5 AnnotationDbi_1.56.2 fansi_0.5.0 mvtnorm_1.1-3
[13] apeglm_1.16.0 splines_4.1.2 cachem_1.0.6 geneplotter_1.72.0
[17] Rsamtools_2.10.0 annotate_1.72.0 ashr_2.2-47 png_0.1-7
[21] compiler_4.1.2 httr_1.4.2 assertthat_0.2.1 Matrix_1.3-4
[25] fastmap_1.1.0 limma_3.50.0 htmltools_0.5.2 tools_4.1.2
[29] coda_0.19-4 gtable_0.3.0 glue_1.6.0 GenomeInfoDbData_1.2.7
[33] systemPipeR_2.0.5 dplyr_1.0.7 ShortRead_1.52.0 Rcpp_1.0.8
[37] bbmle_1.0.24 vctrs_0.3.8 Biostrings_2.62.0 rtracklayer_1.54.0
[41] stringr_1.4.0 lifecycle_1.0.1 irlba_2.3.5 restfulr_0.0.13
[45] gtools_3.9.2 XML_3.99-0.8 edgeR_3.36.0 zlibbioc_1.40.0
[49] MASS_7.3-54 scales_1.1.1 BSgenome_1.62.0 parallel_4.1.2
[53] RColorBrewer_1.1-2 yaml_2.2.1 memoise_2.0.1 ggplot2_3.3.5
[57] emdbook_1.3.12 bdsmatrix_1.3-4 latticeExtra_0.6-29 stringi_1.7.6
[61] RSQLite_2.2.9 SQUAREM_2021.1 genefilter_1.76.0 BiocIO_1.4.0
[65] caTools_1.18.2 BiocParallel_1.28.3 truncnorm_1.0-8 rlang_0.4.12
[69] pkgconfig_2.0.3 bitops_1.0-7 lattice_0.20-45 invgamma_1.1
[73] purrr_0.3.4 GenomicAlignments_1.30.0 htmlwidgets_1.5.4 bit_4.0.4
[77] tidyselect_1.1.1 plyr_1.8.6 magrittr_2.0.1 DESeq2_1.34.0
[81] R6_2.5.1 snow_0.4-4 gplots_3.1.1 generics_0.1.1
[85] metapod_1.2.0 DelayedArray_0.20.0 DBI_1.1.2 pillar_1.6.5
[89] survival_3.2-13 KEGGREST_1.34.0 RCurl_1.98-1.5 mixsqp_0.3-43
[93] tibble_3.1.6 crayon_1.4.2 KernSmooth_2.23-20 utf8_1.2.2
[97] jpeg_0.1-9 locfit_1.5-9.4 grid_4.1.2 blob_1.2.2
[101] digest_0.6.29 xtable_1.8-4 numDeriv_2016.8-1.1 munsell_0.5.0
Thank you very much for your help!
Thank you very much for your reply and suggestions! I tried updating and then re-ran the analysis, but it is still not working. Here is the link to the "ESC_CH12_H3K27me3.peak" object: https://drive.google.com/file/d/1Fc8nhLqKlCRrDqGxWAmZIz1IVE2S7QHG/view?usp=sharing
The ENCODE BAM file links are below:
ES-Bruce4
https://www.encodeproject.org/files/ENCFF481UTW/@@download/ENCFF481UTW.bam
https://www.encodeproject.org/files/ENCFF513YXS/@@download/ENCFF513YXS.bam
CH12
https://www.encodeproject.org/files/ENCFF672MBQ/@@download/ENCFF672MBQ.bam
https://www.encodeproject.org/files/ENCFF758OWH/@@download/ENCFF758OWH.bam
Following is the updated sessionInfo() output:
Matrix products: default
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
system code page: 65001
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DiffBind_3.4.7 SummarizedExperiment_1.24.0 Biobase_2.54.0
[4] MatrixGenerics_1.6.0 matrixStats_0.61.0 GenomicRanges_1.46.1
[7] GenomeInfoDb_1.30.1 IRanges_2.28.0 S4Vectors_0.32.3
[10] BiocGenerics_0.40.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 RColorBrewer_1.1-2 numDeriv_2016.8-1.1 tools_4.1.2
[5] utf8_1.2.2 R6_2.5.1 irlba_2.3.5 KernSmooth_2.23-20
[9] DBI_1.1.2 colorspace_2.0-2 apeglm_1.16.0 tidyselect_1.1.1
[13] compiler_4.1.2 cli_3.1.1 DelayedArray_0.20.0 rtracklayer_1.54.0
[17] caTools_1.18.2 scales_1.1.1 SQUAREM_2021.1 mvtnorm_1.1-3
[21] mixsqp_0.3-43 stringr_1.4.0 digest_0.6.29 Rsamtools_2.10.0
[25] XVector_0.34.0 jpeg_0.1-9 pkgconfig_2.0.3 htmltools_0.5.2
[29] BSgenome_1.62.0 fastmap_1.1.0 invgamma_1.1 bbmle_1.0.24
[33] limma_3.50.0 htmlwidgets_1.5.4 rlang_1.0.1 rstudioapi_0.13
[37] BiocIO_1.4.0 generics_0.1.2 hwriter_1.3.2 BiocParallel_1.28.3
[41] gtools_3.9.2 dplyr_1.0.7 RCurl_1.98-1.5 magrittr_2.0.2
[45] GenomeInfoDbData_1.2.7 Matrix_1.3-4 Rcpp_1.0.8 munsell_0.5.0
[49] fansi_1.0.2 lifecycle_1.0.1 stringi_1.7.6 yaml_2.2.2
[53] MASS_7.3-54 zlibbioc_1.40.0 gplots_3.1.1 plyr_1.8.6
[57] grid_4.1.2 parallel_4.1.2 ggrepel_0.9.1 bdsmatrix_1.3-4
[61] crayon_1.4.2 lattice_0.20-45 Biostrings_2.62.0 locfit_1.5-9.4
[65] pillar_1.7.0 rjson_0.2.21 systemPipeR_2.0.5 XML_3.99-0.8
[69] glue_1.6.1 ShortRead_1.52.0 GreyListChIP_1.26.0 latticeExtra_0.6-29
[73] BiocManager_1.30.16 png_0.1-7 vctrs_0.3.8 gtable_0.3.0
[77] purrr_0.3.4 amap_0.8-18 assertthat_0.2.1 ashr_2.2-47
[81] ggplot2_3.3.5 emdbook_1.3.12 restfulr_0.0.13 coda_0.19-4
[85] truncnorm_1.0-8 tibble_3.1.6 GenomicAlignments_1.30.0 ellipsis_0.3.2