DiffBind normalization error: invalid argument type (list) - cannot make it work, everything seems correct
4 days ago
buffealo ▴ 130

I am trying to conduct peak calling on a publicly available dataset.

I posted about this before, but nothing I have tried (variations on the manual, ChatGPT, forums, etc.) has made DiffBind work on it. The dataset I am trying to analyze is almost identical to the one used at the very beginning of the DiffBind documentation.

I have several cell lines, and I want to group them for the analysis so that I obtain cell line-specific information as well as group-specific information. I tried several different metadata files, which are shown below.

==> formatted_metadata_diffbind_myc.tsv <==
SampleID    Tissue  Factor  Condition   Treatment   Replicate   bamReads    ControlID   Peaks   PeakCaller
M7_1    mcf7_myc_rep1_peaks.narrowPeak  macs    mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam   MCF7in
M7_2    mcf7_myc_rep2_peaks.narrowPeak  macs    mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam   MCF7in
M7_3    mcf7_myc_rep3_peaks.narrowPeak  macs    mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam   MCF7in
T4_1    t47d_myc_rep1_peaks.narrowPeak  macs    t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam   T47Din
T4_2    t47d_myc_rep2_peaks.narrowPeak  macs    t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam   T47Din
T4_3    t47d_myc_rep3_peaks.narrowPeak  macs    t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam   T47Din
M231_1  mdamb231_myc_rep1_peaks.narrowPeakn macs    mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
M231_2  mdamb231_myc_rep2_peaks.narrowPeakn macs    mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
M231_3  mdamb231_myc_rep3_peaks.narrowPeakn macs    mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam   MDAMB231in
BT_1    bt549_myc_rep1_peaks.narrowPeak macs    bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam  BT549in
BT_2    bt549_myc_rep2_peaks.narrowPeak macs    bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam  BT549in
BT_3    BT549   MYC TNBC    Non 3   bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam  BT549in bt549_myc_rep3_peaks.narrowPeak macs

==> metadata_diffbind_fixed.csv <==
"SampleID","Condition","Replicate","Peaks","bamReads","bamControl"
"M7_1","ER+",1,"mcf7_myc_rep1_peaks.narrowPeak","mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"M7_2","ER+",2,"mcf7_myc_rep2_peaks.narrowPeak","mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"M7_3","ER+",3,"mcf7_myc_rep3_peaks.narrowPeak","mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam","mcf7_input_bowtie_sorted_q20_dupmarked.bam"
"T4_1","ER+",1,"t47d_myc_rep1_peaks.narrowPeak","t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"T4_2","ER+",2,"t47d_myc_rep2_peaks.narrowPeak","t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"T4_3","ER+",3,"t47d_myc_rep3_peaks.narrowPeak","t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam","t47d_input_bowtie_sorted_q20_dupmarked.bam"
"M231_1","TNBC",1,"mdamb231_myc_rep1_peaks.narrowPeak","mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"M231_2","TNBC",2,"mdamb231_myc_rep2_peaks.narrowPeak","mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"M231_3","TNBC",3,"mdamb231_myc_rep3_peaks.narrowPeak","mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam","mdamb231_input_bowtie_sorted_q20_dupmarked.bam"
"BT_1","TNBC",1,"bt549_myc_rep1_peaks.narrowPeak","bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"
"BT_2","TNBC",2,"bt549_myc_rep2_peaks.narrowPeak","bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"
"BT_3","TNBC",3,"bt549_myc_rep3_peaks.narrowPeak","bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam","bt549_input_bowtie_sorted_q20_dupmarked.bam"

==> metadata_diffbind_myc.csv <==
SampleID,Tissue,Condition,Replicate,Factor,Treatment,PeakCaller,Peaks,bamReads,bamControl,ControlID
M7_1,MCF7,ER+,1,MYC,Non,macs,mcf7_myc_rep1_peaks.narrowPeak,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
M7_2,MCF7,ER+,2,MYC,Non,macs,mcf7_myc_rep2_peaks.narrowPeak,mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
M7_3,MCF7,ER+,3,MYC,Non,macs,mcf7_myc_rep3_peaks.narrowPeak,mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam,MCF7in
T4_1,T47D,ER+,1,MYC,Non,macs,t47d_myc_rep1_peaks.narrowPeak,t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
T4_2,T47D,ER+,2,MYC,Non,macs,t47d_myc_rep2_peaks.narrowPeak,t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
T4_3,T47D,ER+,3,MYC,Non,macs,t47d_myc_rep3_peaks.narrowPeak,t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam,t47d_input_bowtie_sorted_q20_dupmarked.bam,T47Din
M231_1,MDAMB231,TNBC,1,MYC,Non,macs,mdamb231_myc_rep1_peaks.narrowPeak,mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
M231_2,MDAMB231,TNBC,2,MYC,Non,macs,mdamb231_myc_rep2_peaks.narrowPeak,mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
M231_3,MDAMB231,TNBC,3,MYC,Non,macs,mdamb231_myc_rep3_peaks.narrowPeak,mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,MDAMB231in
BT_1,BT549,TNBC,1,MYC,Non,macs,bt549_myc_rep1_peaks.narrowPeak,bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in
BT_2,BT549,TNBC,2,MYC,Non,macs,bt549_myc_rep2_peaks.narrowPeak,bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in
BT_3,BT549,TNBC,3,MYC,Non,macs,bt549_myc_rep3_peaks.narrowPeak,bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam,bt549_input_bowtie_sorted_q20_dupmarked.bam,BT549in

==> myc_metadata.csv <==
SampleID,Tissue,Factor,Condition,Treatment,Replicate,bamReads,ControlID,bamControl,Peaks,PeakCaller
M7_1,MCF7,MYC,ER+,Non,1,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep1_peaks.narrowPeak,raw
M7_2,MCF7,MYC,ER+,Non,2,mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep2_peaks.narrowPeak,raw
M7_3,MCF7,MYC,ER+,Non,3,mcf7_myc_rep3_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep3_peaks.narrowPeak,raw
T4_1,T47D,MYC,ER+,Non,1,t47d_myc_rep1_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep1_peaks.narrowPeak,raw
T4_2,T47D,MYC,ER+,Non,2,t47d_myc_rep2_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep2_peaks.narrowPeak,raw
T4_3,T47D,MYC,ER+,Non,3,t47d_myc_rep3_bowtie_sorted_q20_dupmarked.bam,T47Din,t47d_input_bowtie_sorted_q20_dupmarked.bam,t47d_myc_rep3_peaks.narrowPeak,raw
M231_1,MDAMB231,MYC,TNBC,Non,1,mdamb231_myc_rep1_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep1_peaks.narrowPeak,raw
M231_2,MDAMB231,MYC,TNBC,Non,2,mdamb231_myc_rep2_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep2_peaks.narrowPeak,raw
M231_3,MDAMB231,MYC,TNBC,Non,3,mdamb231_myc_rep3_bowtie_sorted_q20_dupmarked.bam,MDAMB231in,mdamb231_input_bowtie_sorted_q20_dupmarked.bam,mdamb231_myc_rep3_peaks.narrowPeak,raw
BT_1,BT549,MYC,TNBC,Non,1,bt549_myc_rep1_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep1_peaks.narrowPeak,raw
BT_2,BT549,MYC,TNBC,Non,2,bt549_myc_rep2_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep2_peaks.narrowPeak,raw
BT_3,BT549,MYC,TNBC,Non,3,bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep3_peaks.narrowPeak,raw

I also tried one with exact paths.

I got as far as this step:

[Figure: peaks heatmap]

But I cannot get past normalization. I am getting errors such as this one:

dbObj <- dba.normalize(dbObj)
Error in sum(sapply(pv$peaks, nrow)) : invalid argument 'type' (list)

I am truly desperate. I would greatly appreciate any help. Thank you.

normalization deseq2 chipseq diffbind • 745 views
4 days ago
Rory Stark ★ 2.1k

If your peaks are in narrowPeak format, it may help to set the PeakCaller to "narrow".
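
A minimal sketch of that change, assuming the sample sheet is read into a data frame and edited before it is passed to `dba()`. The inline two-row sheet is only a stand-in for `myc_metadata.csv`, and the DiffBind calls are left commented out because they need the real peak and BAM files. As far as the DiffBind documentation describes it, `"narrow"` is the PeakCaller value for narrowPeak files, while `"macs"` refers to MACS `.xls` output.

```r
# Stand-in for myc_metadata.csv (two rows only; the real sheet has twelve).
sheet_text <- "SampleID,Tissue,Factor,Condition,Treatment,Replicate,bamReads,ControlID,bamControl,Peaks,PeakCaller
M7_1,MCF7,MYC,ER+,Non,1,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,MCF7in,mcf7_input_bowtie_sorted_q20_dupmarked.bam,mcf7_myc_rep1_peaks.narrowPeak,raw
BT_3,BT549,MYC,TNBC,Non,3,bt549_myc_rep3_bowtie_sorted_q20_dupmarked.bam,BT549in,bt549_input_bowtie_sorted_q20_dupmarked.bam,bt549_myc_rep3_peaks.narrowPeak,raw"

samples <- read.csv(text = sheet_text, stringsAsFactors = FALSE)

# Label every file ending in .narrowPeak as "narrow" so that it is
# read as a narrowPeak file rather than as "raw" or "macs".
is_narrow <- grepl("\\.narrowPeak$", samples$Peaks)
samples$PeakCaller[is_narrow] <- "narrow"

print(samples[, c("SampleID", "Peaks", "PeakCaller")])

# With the real files in place (not run here; the order of the calls
# follows the DiffBind vignette):
# library(DiffBind)
# dbObj <- dba(sampleSheet = samples)
# dbObj <- dba.count(dbObj)
# dbObj <- dba.normalize(dbObj)
```

Editing the data frame rather than the CSV keeps the original sheet untouched, so the same change can be tried on any of the metadata files above.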


Thank you so much. Which metadata format do you think I should proceed with?
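
One way to narrow down the choice might be to check that every row of a sheet has as many fields as its header. In the listings above, most rows of `formatted_metadata_diffbind_myc.tsv` show five fields under a ten-column header. A minimal sketch in base R follows; `check_sheet` is a hypothetical helper name (not part of DiffBind), and the temporary file stands in for a real sheet.

```r
# check_sheet is a hypothetical helper, not part of DiffBind.
# It counts the fields on each line and reports data rows whose
# field count differs from the header's.
check_sheet <- function(path, sep = ",") {
  n <- count.fields(path, sep = sep, quote = "\"")
  bad <- which(n[-1] != n[1])  # indices are data-row numbers (header excluded)
  list(header_fields = n[1], bad_rows = bad)
}

# Temporary file standing in for a real sheet; the second data row
# is deliberately missing three fields.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "SampleID,Condition,Replicate,Peaks,bamReads,bamControl",
  "M7_1,ER+,1,mcf7_myc_rep1_peaks.narrowPeak,mcf7_myc_rep1_bowtie_sorted_q20_dupmarked.bam,mcf7_input_bowtie_sorted_q20_dupmarked.bam",
  "M7_2,mcf7_myc_rep2_peaks.narrowPeak,mcf7_myc_rep2_bowtie_sorted_q20_dupmarked.bam"
), tmp)

res <- check_sheet(tmp)
print(res)
```

For a tab-separated sheet, the same helper can be called with `sep = "\t"`. A sheet that reports no bad rows is at least structurally consistent, which is a reasonable precondition before comparing the column choices themselves.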
