Hello,
I am seeking advice on how to pair a normalization method with spike-in yeast DNA in a time-course CUT&RUN experiment with 5 timepoints and 3 replicates per timepoint. I am using MACS3 to call peaks and DiffBind to test for differential binding (via the DESeq2 workflow) between the 0 hr timepoint and each subsequent timepoint.
I have tried (a) normalizing tag counts by spike-in library size and (b) normalizing by spike-in relative log expression (RLE). In Figure 32 on page 55 of the DiffBind vignette, these two normalization methods produce nearly identical results.
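For concreteness, the two settings can be selected roughly as follows (a sketch, not my exact script; `dbObj` is assumed to be a counted DBA object built from a samplesheet whose `Spikein` column points at the yeast-aligned BAM files):

```r
library(DiffBind)

## Assumed: dbObj is a DBA object after dba.count(), with spike-in reads
## supplied via a "Spikein" column in the samplesheet.

## (a) Scale by total spike-in read counts (library-size normalization)
db_lib <- dba.normalize(dbObj, spikein = TRUE, normalize = DBA_NORM_LIB)

## (b) RLE computed on the spike-in reads
db_rle <- dba.normalize(dbObj, spikein = TRUE, normalize = DBA_NORM_RLE)

## Same downstream analysis in both cases (DESeq2 is the default method)
db_lib <- dba.analyze(dba.contrast(db_lib, design = "~Condition"))
db_rle <- dba.analyze(dba.contrast(db_rle, design = "~Condition"))
```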
However, in my case I am getting different results. Namely, one timepoint always produces 0 differentially bound sites, and the choice of normalization method determines at which timepoint that 0 result occurs:
When I normalize by spike-in library size, I obtain the following results table (Group refers to the timepoint of comparison, Group2 refers to the 0 hr timepoint, and DB.DESeq2 shows the number of differentially bound sites at FDR < 0.05 for that contrast):
But when I normalize by spike-in RLE, I obtain the following results table:
When I look at the normalized tag count profiles for the top 5 sites identified under each normalization setting, the profiles display different patterns, which likely accounts for the differences in the dba.analyze results.
Shown below are my spike-in library sizes per replicate (colored by timepoint). Note that the 0 hr and 24 hr groups have the most variability in the spike-in library size between replicates.
I have two questions that I would like feedback on:
- How do we choose an appropriate normalization method to pair with spike-in data? I understand that library size vs. RLE normalization methods have different underlying assumptions, but I don't know how to assess whether these assumptions are reasonable in the spike-in data. For example, an assumption of RLE normalization is balanced differential expression; this doesn't make any sense to me in the context of a spike-in.
- Why are my results so different between these two normalization methods? Is the variability in spike-in library sizes driving a wedge between these normalization methods somehow?
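To make the two sets of assumptions concrete, here is a base-R sketch of the two factor calculations (median-of-ratios RLE, as DESeq2 computes it internally, versus simple library-size scaling); `counts` is a hypothetical sites-by-samples matrix of spike-in tag counts:

```r
## RLE (median-of-ratios) size factors: assumes most rows are unchanged
## across samples, so the per-sample median ratio reflects sequencing
## depth rather than biology.
rle_size_factors <- function(counts) {
  log_geo_means <- rowMeans(log(counts))        # log geometric mean per site
  apply(counts, 2, function(s) {
    ok <- is.finite(log_geo_means) & s > 0      # drop sites with any zero count
    exp(median(log(s[ok]) - log_geo_means[ok])) # median ratio for this sample
  })
}

## Library-size factors: assume total read count alone reflects depth.
lib_size_factors <- function(counts) {
  colSums(counts) / exp(mean(log(colSums(counts))))
}
```

When spike-in totals vary a lot between replicates, these two calculations can diverge substantially, since RLE down-weights sites with extreme ratios while library-size scaling follows the totals directly.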
I would appreciate any advice! Thanks! -Ethan
Thanks for the thoughtful comment! Here are a few things I wanted to add:
The lists of differentially bound sites from the spike-in library size and spike-in RLE normalizations are highly concordant (Jaccard index = 0.87), but the results of individual contrasts and the tag count profiles differ substantially between the two methods.
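(For reference, a sketch of how an interval-level Jaccard index can be computed from two `dba.report()` GRanges; the exact calculation I used may differ:)

```r
library(GenomicRanges)

## Base-pair Jaccard index between two GRanges of DB sites,
## e.g. gr_lib <- dba.report(db_lib) and gr_rle <- dba.report(db_rle)
jaccard_bp <- function(a, b) {
  a <- reduce(a)  # merge overlapping intervals within each set
  b <- reduce(b)
  sum(width(GenomicRanges::intersect(a, b))) /
    sum(width(GenomicRanges::union(a, b)))
}
```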
DiffBind suggests (in the absence of spike-in or parallel-factor controls) normalizing with RLE over large background bins, in which any local enrichment is diluted out. When I perform background normalization, all contrasts yield non-zero numbers of differentially bound sites (more in line with what I would expect). The tag count profiles for the top 10 genes in each contrast are shown below. Might background normalization be a better strategy in this case?
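The background setting I used looks roughly like this (same `dbObj` assumption as above):

```r
## RLE over large background bins (DiffBind's default bin size is 15 kb),
## so that signal inside peaks contributes little to the size factors.
db_bg <- dba.normalize(dbObj, background = TRUE, normalize = DBA_NORM_RLE)
db_bg <- dba.analyze(dba.contrast(db_bg, design = "~Condition"))
dba.show(db_bg, bContrasts = TRUE)  # DB.DESeq2 column: DB sites per contrast
```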
I would really inspect the different methods using MA plots. I have personally never been an advocate of the background-binning strategy, as you are essentially normalizing on the noise outside of peaks, which I find a little odd. As said, try the different methods, inspect the MA plots, see how each method centers the far-right (high-concentration) part of the plot around zero on the y-axis, and then choose the method that, given your knowledge of the experiment and your expectations of it, fits best.
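That inspection is a one-liner per method (sketch; `db_norm` stands for any analyzed DBA object under a given candidate normalization):

```r
## MA plot of normalized counts for the first contrast; repeat after
## re-normalizing with each candidate method and compare how the
## high-concentration (right-hand) points sit around log fold change 0.
dba.plotMA(db_norm, contrast = 1, bNormalized = TRUE)

## Non-normalized version for reference
dba.plotMA(db_norm, contrast = 1, bNormalized = FALSE)
```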
Many thanks! I appreciate your feedback!