Question

Normalizing a fold change matrix


5 months ago

Guy Shan • 0

Hello friends, my goal is to normalize a fold change matrix so that I can cluster samples together.

My data comes from MeRIP-seq, which is similar to ChIP-seq but detects methylation. I have analyzed the data and used MACS for peak calling. So I now have a [samples * peak positions] matrix, where the values are the fold changes reported by the MACS callpeak command.

I want to try different dimensionality reduction and clustering methods in order to cluster the samples together, but I need to normalize the matrix first.

I have not yet found a method that I feel comfortable with.

I have tried DESeq2, which I know is not meant for this type of data, and indeed the results do not look great.

I also tried fitting my data to different distributions, with no success.

Do you have any suggestions?

CHIP-seq change fold clustering MeRIP-seq • 313 views



The typical input matrix for this kind of analysis contains read counts rather than fold changes. I cannot say that I have ever seen a ChIP-seq-like analysis with fold changes as input. It's also imo not recommended, since fold changes without statistics are meaningless: an FC can be large purely because of noise, without the data actually supporting it. Consider using counts, e.g. obtained with featureCounts.

5 months ago by ATpoint


Thx for the reply.

I will look more into it, but the issue with a read count matrix is that a read count is also meaningless unless it is compared to the read count of the same segment in the input. And every method I know for comparing them yields a fold change.
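To make the IP-versus-input comparison concrete, here is a minimal sketch of one common way to turn paired counts into a fold change: scale each library to counts per million, add a pseudocount, and take the log2 ratio. The function name, the library sizes, the example counts, and the pseudocount of 1 are all assumptions for illustration; this is not what MACS computes internally.

```python
import math

def log2_ip_over_input(ip_counts, input_counts, ip_lib_size, input_lib_size,
                       pseudocount=1.0):
    """Per-peak log2(IP / input) on counts-per-million values.

    The pseudocount (an assumed value) keeps zero counts from producing
    infinite ratios and pulls low-count peaks towards a ratio of 0.
    """
    ratios = []
    for ip, inp in zip(ip_counts, input_counts):
        ip_cpm = ip / ip_lib_size * 1e6
        input_cpm = inp / input_lib_size * 1e6
        ratios.append(math.log2((ip_cpm + pseudocount) / (input_cpm + pseudocount)))
    return ratios

# Hypothetical counts for three peaks in one sample.
ip = [300, 30, 0]
inp = [100, 30, 0]
print(log2_ip_over_input(ip, inp, ip_lib_size=1_000_000, input_lib_size=1_000_000))
```

Starting from counts like this still ends in a ratio, but the counts stay available, so low-count peaks can be filtered or down-weighted before clustering.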

One possible solution I thought of is to convert the fold change matrix into 1, 0, and -1, representing: peak in IP, no peak, and peak in input.
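The 1 / 0 / -1 encoding could be sketched as below. This assumes the matrix holds signed log2 fold changes (positive for enrichment in IP, negative for enrichment in input) and uses a hypothetical threshold of 1; both the function name and the cutoff are illustrative choices, not something taken from MACS.

```python
def ternarize(log2_fc_matrix, threshold=1.0):
    """Map signed log2 fold changes to 1 (peak in IP), -1 (peak in input), or 0 (no peak)."""
    def encode(value):
        if value >= threshold:
            return 1
        if value <= -threshold:
            return -1
        return 0
    return [[encode(v) for v in row] for row in log2_fc_matrix]

# Hypothetical matrix: rows are samples, columns are peak positions.
matrix = [
    [2.3, 0.2, -1.5],
    [0.9, -0.1, -3.0],
]
print(ternarize(matrix))
```

A higher threshold makes the encoding more robust to noisy fold changes at the cost of calling fewer peaks.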

Although I would love to find a way that also incorporates the size of the peaks.

5 months ago by Guy Shan