DiffBind normalization error: invalid argument type (list) - cannot make it work, everything seems correct
5 days ago
buffealo ▴ 130

I am trying to conduct peak calling on a publicly available dataset.

I posted about this before, but whatever I tried (variations on the manual, ChatGPT, forum suggestions, etc.), I could not get DiffBind to work on this dataset, which is almost identical to the one used at the very beginning of the DiffBind documentation.

I have several cell lines, and I want to group them so that I obtain both cell-line-specific and group-specific information. I tried several different metadata files, shown below.

==> formatted_metadata_diffbind_myc.tsv <==
SampleID    Tissue  Factor  Condition   Treatment   Replicate   bamReads    ControlID   Peaks   PeakCaller
M7_1    mcf7_myc_rep1_peaks.narrowPeak  macs    mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam   MCF7in
M7_2    mcf7_myc_rep2_peaks.narrowPeak  macs    mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam   MCF7in
M7_3    mcf7_myc_rep3_peaks.narrowPeak  macs    mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam   MCF7in
T4_1    t47d_myc_rep1_peaks.narrowPeak  macs    t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam   T47Din
T4_2    t47d_myc_rep2_peaks.narrowPeak  macs    t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam   T47Din
T4_3    t47d_myc_rep3_peaks.narrowPeak  macs    t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam   T47Din
M231_1  mdamb231_myc_rep1_peaks.narrowPeakn macs    mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
M231_2  mdamb231_myc_rep2_peaks.narrowPeakn macs    mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
M231_3  mdamb231_myc_rep3_peaks.narrowPeakn macs    mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
BT_1    bt549_myc_rep1_peaks.narrowPeak macs    bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam  BT549in
BT_2    bt549_myc_rep2_peaks.narrowPeak macs    bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam  BT549in
BT_3    BT549   MYC TNBC    Non 3   bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam  BT549in bt549_myc_rep3_peaks.narrowPeak macs

==> metadata_diffbind_fixed.csv <==
"SampleID","Condition","Replicate","Peaks","bamReads","bamControl"
"M7_1","ER+",1,"mcf7_myc_rep1_peaks.narrowPeak","mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"M7_2","ER+",2,"mcf7_myc_rep2_peaks.narrowPeak","mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"M7_3","ER+",3,"mcf7_myc_rep3_peaks.narrowPeak","mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"T4_1","ER+",1,"t47d_myc_rep1_peaks.narrowPeak","t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"T4_2","ER+",2,"t47d_myc_rep2_peaks.narrowPeak","t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"T4_3","ER+",3,"t47d_myc_rep3_peaks.narrowPeak","t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"M231_1","TNBC",1,"mdamb231_myc_rep1_peaks.narrowPeak","mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"M231_2","TNBC",2,"mdamb231_myc_rep2_peaks.narrowPeak","mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"M231_3","TNBC",3,"mdamb231_myc_rep3_peaks.narrowPeak","mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"BT_1","TNBC",1,"bt549_myc_rep1_peaks.narrowPeak","bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"
"BT_2","TNBC",2,"bt549_myc_rep2_peaks.narrowPeak","bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"
"BT_3","TNBC",3,"bt549_myc_rep3_peaks.narrowPeak","bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"

==> metadata_diffbind_myc.csv <==
SampleID,Tissue,Condition,Replicate,Factor,Treatment,PeakCaller,Peaks,bamReads,bamControl,ControlID
M7_1,MCF7,ER+,1,MYC,Non,macs,mcf7_myc_rep1_peaks.narrowPeak,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
M7_2,MCF7,ER+,2,MYC,Non,macs,mcf7_myc_rep2_peaks.narrowPeak,mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
M7_3,MCF7,ER+,3,MYC,Non,macs,mcf7_myc_rep3_peaks.narrowPeak,mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
T4_1,T47D,ER+,1,MYC,Non,macs,t47d_myc_rep1_peaks.narrowPeak,t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
T4_2,T47D,ER+,2,MYC,Non,macs,t47d_myc_rep2_peaks.narrowPeak,t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
T4_3,T47D,ER+,3,MYC,Non,macs,t47d_myc_rep3_peaks.narrowPeak,t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
M231_1,MDAMB231,TNBC,1,MYC,Non,macs,mdamb231_myc_rep1_peaks.narrowPeak,mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
M231_2,MDAMB231,TNBC,2,MYC,Non,macs,mdamb231_myc_rep2_peaks.narrowPeak,mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
M231_3,MDAMB231,TNBC,3,MYC,Non,macs,mdamb231_myc_rep3_peaks.narrowPeak,mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
BT_1,BT549,TNBC,1,MYC,Non,macs,bt549_myc_rep1_peaks.narrowPeak,bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in
BT_2,BT549,TNBC,2,MYC,Non,macs,bt549_myc_rep2_peaks.narrowPeak,bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in
BT_3,BT549,TNBC,3,MYC,Non,macs,bt549_myc_rep3_peaks.narrowPeak,bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in

==> myc_metadata.csv <==
SampleID,Tissue,Factor,Condition,Treatment,Replicate,bamReads,ControlID,bamControl,Peaks,PeakCaller
M7_1,MCF7,MYC,ER+,Non,1,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep1_peaks.narrowPeak,raw
M7_2,MCF7,MYC,ER+,Non,2,mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep2_peaks.narrowPeak,raw
M7_3,MCF7,MYC,ER+,Non,3,mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep3_peaks.narrowPeak,raw
T4_1,T47D,MYC,ER+,Non,1,t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep1_peaks.narrowPeak,raw
T4_2,T47D,MYC,ER+,Non,2,t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep2_peaks.narrowPeak,raw
T4_3,T47D,MYC,ER+,Non,3,t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep3_peaks.narrowPeak,raw
M231_1,MDAMB231,MYC,TNBC,Non,1,mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep1_peaks.narrowPeak,raw
M231_2,MDAMB231,MYC,TNBC,Non,2,mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep2_peaks.narrowPeak,raw
M231_3,MDAMB231,MYC,TNBC,Non,3,mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep3_peaks.narrowPeak,raw
BT_1,BT549,MYC,TNBC,Non,1,bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep1_peaks.narrowPeak,raw
BT_2,BT549,MYC,TNBC,Non,2,bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep2_peaks.narrowPeak,raw
BT_3,BT549,MYC,TNBC,Non,3,bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep3_peaks.narrowPeak,raw

I also tried one with exact paths.
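
A quick structural check of a sample sheet can rule out two common problems before `dba()` is called: rows whose field count differs from the header, and file paths that do not resolve from the working directory. The sketch below uses base R only; `check_sample_sheet` is a hypothetical helper, not a DiffBind function, and it assumes the path columns are named `Peaks`, `bamReads` and `bamControl` as in the sheets above.

```r
# Hypothetical helper (not part of DiffBind): report structural problems in a
# delimited sample sheet before handing it to dba().
check_sample_sheet <- function(path, sep = ",") {
  # Number of fields on each line; the first line is the header.
  n_fields <- count.fields(path, sep = sep, quote = "\"")
  bad_rows <- which(n_fields != n_fields[1])

  sheet <- read.table(path, sep = sep, header = TRUE, fill = TRUE,
                      stringsAsFactors = FALSE, check.names = FALSE,
                      comment.char = "")

  # Columns that hold file paths, if present in this sheet.
  file_cols <- intersect(c("Peaks", "bamReads", "bamControl"), names(sheet))
  files <- unique(unlist(sheet[file_cols], use.names = FALSE))
  files <- files[!is.na(files) & nzchar(files)]

  list(
    bad_rows = bad_rows,                        # line numbers with a wrong field count
    missing_files = files[!file.exists(files)]  # paths not found from getwd()
  )
}
```

For the tab-separated sheet, the call would be `check_sample_sheet("formatted_metadata_diffbind_myc.tsv", sep = "\t")`. Both elements of the result should be empty before going further.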

I got as far as this step:

[Image: peaks heatmap]

But I cannot continue to normalization. I get errors such as this one:

dbObj <- dba.normalize(dbObj)
Error in sum(sapply(pv$peaks, nrow)) : invalid argument 'type' (list)
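
This kind of message can be reproduced in base R, which may help narrow down the cause. `sapply(x, nrow)` returns a plain list rather than a numeric vector when at least one element of `x` is not tabular, because `nrow()` of such an element is `NULL`, and `sum()` then refuses the list. The sketch below uses a toy list as a stand-in for the per-sample peaks; `find_bad_peaksets` is a hypothetical helper, and applying it to `dbObj$peaks` assumes the DBA object exposes its peaks under that name, as the `pv$peaks` in the error suggests.

```r
# Toy stand-in for the per-sample peak list; the second element is deliberately
# not a data frame, mimicking a peak set that was not loaded as a table.
peaks <- list(ok     = data.frame(chr = "chr1", start = 1L, end = 100L),
              broken = NULL)

n_rows <- sapply(peaks, nrow)  # nrow(NULL) is NULL, so sapply() cannot simplify
is.list(n_rows)                # a list here, which sum() rejects

# Hypothetical helper: positions of entries that are neither data frames nor matrices.
find_bad_peaksets <- function(peak_list) {
  ok <- vapply(peak_list,
               function(p) is.data.frame(p) || is.matrix(p),
               logical(1))
  which(!ok)
}

find_bad_peaksets(peaks)
```

On the real object, `find_bad_peaksets(dbObj$peaks)` would point at the samples whose peak sets need a closer look.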

I am getting desperate. Any help would be greatly appreciated. Thank you.

normalization deseq2 chipseq diffbind • 981 views
5 days ago
Rory Stark ★ 2.1k

If your peaks are in narrowPeak format, it may help to set the PeakCaller to "narrow".
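
One way to apply this to an existing sheet is to rewrite the `PeakCaller` column and save a copy. This is a minimal sketch in base R; `set_peak_caller` and the output file name are hypothetical, and the input is assumed to be one of the comma-separated sheets from the question.

```r
# Hypothetical helper: overwrite (or create) the PeakCaller column of a
# comma-separated sample sheet and write the result to a new file.
set_peak_caller <- function(in_path, out_path, caller = "narrow") {
  sheet <- read.csv(in_path, stringsAsFactors = FALSE, check.names = FALSE)
  sheet$PeakCaller <- caller
  write.csv(sheet, out_path, row.names = FALSE)
  invisible(sheet)
}

# Intended use (file names are illustrative):
# set_peak_caller("myc_metadata.csv", "myc_metadata_narrow.csv")
# myc <- dba(sampleSheet = "myc_metadata_narrow.csv")
```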


Thank you so much. Which of the metadata formats above do you think I should proceed with?


Rory Stark, I tried several variations and I am getting the errors below. They also occur with the metadata files I posted above. I have proper .bai index files, and my reads are paired-end.

> myc <- dba.analyze("myc_chip_metadata_rory.csv")
Loading sample sheet...
M7_1 MCF7 MYC ER+ Non 1 narrow
M7_2 MCF7 MYC ER+ Non 2 narrow
M7_3 MCF7 MYC ER+ Non 3 narrow
T4_1 T47D MYC ER+ Non 1 narrow
T4_2 T47D MYC ER+ Non 2 narrow
T4_3 T47D MYC ER+ Non 3 narrow
M231_1 MDAMB231 MYC TNBC Non 1 narrow
M231_2 MDAMB231 MYC TNBC Non 2 narrow
M231_3 MDAMB231 MYC TNBC Non 3 narrow
BT_1 BT549 MYC TNBC Non 1 narrow
BT_2 BT549 MYC TNBC Non 2 narrow
BT_3 BT549 MYC TNBC Non 3 narrow
Applying Blacklist/Greylists...
Genome detected: Hsapiens.UCSC.hg38
Applying blacklist...
Blacklist error: Error in h(simpleError(msg, call)): error in evaluating the argument 'i' in selecting a method for function '[': 'q_groups' must be a CompressedIntegerList object
Unable to apply Blacklist/Greylist.

> myc
12 Samples, 115749 sites in matrix (171059 total):
       ID   Tissue Factor Condition Treatment Replicate Intervals
1    M7_1     MCF7    MYC       ER+       Non         1     79308
2    M7_2     MCF7    MYC       ER+       Non         2     76243
3    M7_3     MCF7    MYC       ER+       Non         3     83461
4    T4_1     T47D    MYC       ER+       Non         1      5966
5    T4_2     T47D    MYC       ER+       Non         2     24230
6    T4_3     T47D    MYC       ER+       Non         3        26
7  M231_1 MDAMB231    MYC      TNBC       Non         1     62073
8  M231_2 MDAMB231    MYC      TNBC       Non         2     80625
9  M231_3 MDAMB231    MYC      TNBC       Non         3     40988
10   BT_1    BT549    MYC      TNBC       Non         1     57094
11   BT_2    BT549    MYC      TNBC       Non         2     50309
12   BT_3    BT549    MYC      TNBC       Non         3     43601
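
The interval counts in this table vary widely, so it may be worth flagging samples with unusually few peaks before counting. The sketch below copies the counts from the table above and marks any sample below a tenth of the median; that cutoff is an arbitrary assumption for illustration, not a DiffBind recommendation.

```r
# Interval counts copied from the summary above.
intervals <- c(M7_1 = 79308, M7_2 = 76243, M7_3 = 83461,
               T4_1 = 5966,  T4_2 = 24230, T4_3 = 26,
               M231_1 = 62073, M231_2 = 80625, M231_3 = 40988,
               BT_1 = 57094, BT_2 = 50309, BT_3 = 43601)

# Arbitrary illustrative cutoff: less than 10% of the median interval count.
cutoff <- 0.1 * median(intervals)
low_samples <- names(intervals)[intervals < cutoff]
low_samples
```

With these numbers the check flags only T4_3 (26 intervals), which is a candidate for re-inspecting the peak file or the peak-calling step for that replicate.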

> myc <- dba(sampleSheet = "myc_chip_metadata_rory.csv")
M7_1 MCF7 MYC ER+ Non 1 narrow
M7_2 MCF7 MYC ER+ Non 2 narrow
M7_3 MCF7 MYC ER+ Non 3 narrow
T4_1 T47D MYC ER+ Non 1 narrow
T4_2 T47D MYC ER+ Non 2 narrow
T4_3 T47D MYC ER+ Non 3 narrow
M231_1 MDAMB231 MYC TNBC Non 1 narrow
M231_2 MDAMB231 MYC TNBC Non 2 narrow
M231_3 MDAMB231 MYC TNBC Non 3 narrow
BT_1 BT549 MYC TNBC Non 1 narrow
BT_2 BT549 MYC TNBC Non 2 narrow
BT_3 BT549 MYC TNBC Non 3 narrow

> myc <- dba.count(myc)
Computing summits...
Re-centering peaks...
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Error: Error processing one or more read files. Check warnings().
In addition: There were 17 warnings (use warnings() to see them)

> warnings()
Warning messages:
1: In mclapply(arglist, fn, ..., mc.preschedule = TRUE, mc.allow.recursive = TRUE) :
  all scheduled cores encountered errors in user code
2:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
3:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
4:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
5:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
6:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
7:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
8:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
9:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
10:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
11:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
12:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
13:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
14:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
15:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
16:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
17:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors



> library(BiocParallel)
> register(SerialParam())
> myc <- dba(sampleSheet = samples)   
M7_1 MCF7 MYC ER+ Non 1 narrow
M7_2 MCF7 MYC ER+ Non 2 narrow
M7_3 MCF7 MYC ER+ Non 3 narrow
T4_1 T47D MYC ER+ Non 1 narrow
T4_2 T47D MYC ER+ Non 2 narrow
T4_3 T47D MYC ER+ Non 3 narrow
M231_1 MDAMB231 MYC TNBC Non 1 narrow
M231_2 MDAMB231 MYC TNBC Non 2 narrow
M231_3 MDAMB231 MYC TNBC Non 3 narrow
BT_1 BT549 MYC TNBC Non 1 narrow
BT_2 BT549 MYC TNBC Non 2 narrow
BT_3 BT549 MYC TNBC Non 3 narrow


> myc <- dba.count(myc)
Computing summits...
Re-centering peaks...
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Reads will be counted as Paired-end.
Error: Error processing one or more read files. Check warnings().
In addition: There were 17 warnings (use warnings() to see them)

> warnings()
Warning messages:
1: In mclapply(arglist, fn, ..., mc.preschedule = TRUE, mc.allow.recursive = TRUE) :
  all scheduled cores encountered errors in user code
2:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
3:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
4:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
5:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
6:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
7:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
8:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
9:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
10:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
11:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
12:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
13:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
14:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
15:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
16:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
17:   error in evaluating the argument 'x' in selecting a method for function 'assay': BiocParallel errors
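
The warnings still mention `mclapply` after `register(SerialParam())`, which suggests that DiffBind's own parallel setting is still in effect. One thing that might be worth trying is to switch it off through the `RunParallel` entry of the object's `config` before counting, so that the underlying per-file error is reported directly. The helper below is a hypothetical convenience wrapper, not a DiffBind function; it assumes the DBA object behaves like a list with a `$config` element.

```r
# Hypothetical helper: turn off DiffBind's parallel execution on a DBA-like
# object by setting config$RunParallel to FALSE, then return the object.
run_serially <- function(dba_obj) {
  dba_obj$config$RunParallel <- FALSE
  dba_obj
}

# Intended use, left commented because it needs DiffBind and the BAM files:
# myc <- dba(sampleSheet = "myc_chip_metadata_rory.csv")
# myc <- run_serially(myc)
# myc <- dba.count(myc)
```

Running serially is slower, but the first failing read file should then surface with its own error message rather than a generic BiocParallel one.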