Hello everyone,
Newbie aspiring bioinformatician here.
For context:
I am currently trying to compare ATAC-seq, DNase-seq and other methods on the same cancer cell line. By differences I mean, for example, where each method finds peaks. The data are taken from the ENCODE database.
I intend to run k-means clustering on a matrix produced by deepTools and then plot a heatmap to see how many distinct groups can be found.
The problem is that the methods all produce signals of different strengths, so they are hard to compare when I plot them in one heatmap.
My question is: how do I scale or normalize the data from the 3 methods, all used on the same cell line, to make them directly comparable? And if I scale the data, should I run k-means before or after the scaling?
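To make the question concrete, here is a minimal sketch of one possible option: z-score each method's block of the matrix separately, then cluster the combined rows. The numbers are made up (not ENCODE data), only two methods are shown for brevity, and the function names `zscore_block` and `kmeans` are hypothetical helpers written for this illustration, not deepTools functions. Since k-means relies on Euclidean distance, the sketch scales first so that the method with the largest raw values does not dominate the clustering. Whether per-method z-scoring is the right normalization for this kind of data is exactly the open question.

```python
from statistics import mean, pstdev

def zscore_block(rows):
    """Z-score one method's block using that block's own mean and SD."""
    values = [v for row in rows for v in row]
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [[0.0 for _ in row] for row in rows]
    return [[(v - mu) / sd for v in row] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels

# Made-up signal: 6 regions x 3 bins per method, on very different scales.
atac = [[90, 120, 95], [100, 130, 105], [95, 125, 100],
        [10, 12, 9], [12, 15, 11], [9, 11, 10]]
dnase = [[0.9, 1.2, 1.0], [1.0, 1.3, 1.1], [0.95, 1.25, 1.05],
         [0.1, 0.12, 0.1], [0.12, 0.15, 0.11], [0.09, 0.11, 0.1]]

# Scale each method separately, THEN join the blocks per region and cluster.
scaled_atac = zscore_block(atac)
scaled_dnase = zscore_block(dnase)
combined = [a + d for a, d in zip(scaled_atac, scaled_dnase)]
labels = kmeans(combined, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

On real data one would more likely use an existing implementation (for example the `--kmeans` option of deepTools `plotHeatmap`, or scikit-learn) rather than a hand-written k-means. The hand-written version is only there to keep the example self-contained.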
If anything is unclear or you need more information, I will gladly provide it.
Thanks in advance.