I apply similar code that I used for RNA-seq (DeSeq2) to ATAC-seq to make a heatmap, so I am not use if it still makes sense. The function here is Heatmap(), not pheatmap(). Instead of count matrix for genes, here I use consensus_peaks.mLb.clN.featureCounts.txt, a file from nf-core/ATAC-seq pipeline, which I think count for peaks. There are total 8 samples, 4 conditions and 2 replicates for each condition. Would you please have a comment? Could I say some peaks in column 1 and 2 are closed meanwhile these peaks are opened in column 3 and 4? Thank you so much!
Could I say some peaks in column 1 and 2 are closed meanwhile these peaks are opened in column 3 and 4?
All this really lets you say is that those peaks have different levels of signal between your samples. z-scores can be misleading, as very robust relative changes for stable features (peaks) with a z-score of 2-3 can actually obfuscate very modest absolute differences in signal for any given feature if variability for that feature across samples is low.
That's not to say that's necessarily the case here - z-scored heatmaps just should not be the only way you look at such data to determine if groups of features are meaningfully different. As an aside, removing the rownames from your Heatmap will make it render much more quickly.
For that paper/figure:
We plotted the Log2 fold change for each peak and generated a heatmap from this data to visualize the overall differences in accessibility after insulin treatment. In general, peaks that were more closed in the vehicle-treated samples were more open in the insulin-treated samples (Fig. 1B).
Emphasis mine. Making a heatmap of log2 fold changes doesn't make sense to me, since you could just plot the actual fold changes and have a more clear effect. I expect the text may be inaccurate, but it's confusing at minimum.
I'd take a peak at DiffBind for some additional visualization and analysis methods.
Heatmaps in general are going to run into the same issue when row-scaling is used. They're fine to show, but other plots may give a better indication of the magnitude difference of a given peak(set) between samples or groups. See dba.plotProfile for an example of such a plot. My main point is to explore other visualization options other than just heatmaps.
That heatmap is in the preprint article. The publish paper didn't have that heatmap any more. Sorry but I still don't get what you mean here: "Making a heatmap of log2 fold changes doesn't make sense to me, since you could just plot the actual fold changes and have a more clear effect".
Thank you for your help! Do you think this heatmap from DiffBind do a better job?
Code in the page 20 of the manual: https://bioconductor.org/packages/devel/bioc/vignettes/DiffBind/inst/doc/DiffBind.pdf
Do you think if 2 ATAC-seq samples that is not significant different can make this error? They may pretty similar. Unfortunately, this post haven't had a reply.
Help with DiffBind Generating report-based DBA object... Error: No valid contrasts/methods specified.
Heatmaps in general are going to run into the same issue when row-scaling is used. They're fine to show, but other plots may give a better indication of the magnitude difference of a given peak(set) between samples or groups. See
dba.plotProfile
for an example of such a plot. My main point is to explore other visualization options other than just heatmaps.I will answer your other post separately.
That heatmap is in the preprint article. The publish paper didn't have that heatmap any more. Sorry but I still don't get what you mean here: "Making a heatmap of log2 fold changes doesn't make sense to me, since you could just plot the actual fold changes and have a more clear effect".