Hi all, I have an ATAC-seq data matrix where the rows are consensus peaks and each column represents a sample from either condition A or B. There are a total of 8 columns in the matrix: 4 samples for condition A and 4 samples for condition B. Each entry of the matrix was obtained by converting the raw count to CPM, followed by log2 transformation and then quantile normalization. I am wondering what would be the best way to do the differential peak analysis between conditions A and B?
The differential expression (or peak) analysis programs I know (such as DESeq2) require raw counts as input and not the transformed and normalized data that I have. I did not generate this data and this is all I have.
Thanks a lot in advance for any suggestions. Any help would be greatly appreciated.
Yes, limma-trend, so using `trend=TRUE` in the `eBayes()` (or `treat()`) function. My usual word of caution though: if you do not have the actual raw data at hand, your analysis is, strictly speaking, not reproducible. Also, if you find that the normalization did not perform well, e.g. as diagnosed by MA-plots after the DE analysis, there is little to nothing you can do about it, since you do not have the raw counts. It is your choice whether to go along with that or not.

Thank you both for the suggestion. I will try limma as per your recommendation. I would like to get the actual raw data (I have to get authorization first), but all the ATAC-seq QC and analysis is published. They did not, however, do the differential peak analysis the way I want to do it.
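For anyone landing here later, the limma-trend suggestion above could look roughly like the sketch below. It is only a minimal sketch under assumptions: `logcpm` is a hypothetical name for the peaks-by-samples matrix, filled here with simulated values so the code runs on its own, and the column order (4 x A, then 4 x B) is assumed from the question. Replace the simulated matrix with the real log2-CPM, quantile-normalized one.

```r
library(limma)

# ASSUMPTION: simulated stand-in for the real matrix
# (1000 peaks x 8 samples, values on the log2-CPM scale).
set.seed(1)
logcpm <- matrix(
  rnorm(1000 * 8, mean = 5, sd = 1),
  nrow = 1000,
  dimnames = list(paste0("peak", 1:1000),
                  c(paste0("A", 1:4), paste0("B", 1:4)))
)

# ASSUMPTION: the first 4 columns are condition A, the last 4 are condition B.
condition <- factor(rep(c("A", "B"), each = 4), levels = c("A", "B"))
design <- model.matrix(~ condition)

# Fit a linear model per peak, then moderate the variances with an
# intensity-dependent prior (trend = TRUE), i.e. limma-trend.
fit <- lmFit(logcpm, design)
fit <- eBayes(fit, trend = TRUE)

# B versus A for all peaks, sorted by p-value.
res <- topTable(fit, coef = "conditionB", number = Inf, sort.by = "P")
head(res)

# Alternative: test against a minimum fold change with treat(), e.g.
# fit_treat <- treat(lmFit(logcpm, design), lfc = log2(1.5), trend = TRUE)
# topTreat(fit_treat, coef = "conditionB", number = Inf)
```

Since the input is already normalized, no further normalization is applied here. `plotMD(fit, column = 2)` from limma is one way to draw the MA-style diagnostic plot mentioned above.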