Question

What is the best way to calculate significant differences in TSS profiles?

1

5.7 years ago

Lila M ★ 1.3k

Hi everybody, I have several samples for which I calculated the coverage over a specific data set. I did it using deepTools, so my final output looks something like this:

s1  1,923575916 1,92340092  1,918008392 1,915392666 ...
s2  1,993446863 1,931309701 1,925363124 1,935658019 ...
s3  2,03417052  2,042601134 2,029136552 1,996391637 ...
s4  2,107394697 2,107865284 2,093484711 2,070557165 ...

I would like to test whether there are differences in coverage between the samples (with a p-value). My first thought was to apply a KS test, but I decided against it because it tests whether two samples come from the same statistical distribution, not whether the coverage itself differs. Has anyone else tried to address this question before? Thanks!
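To make the concern about the KS test concrete, here is a minimal sketch in plain Python. It uses only the first four values of s1 and s2 shown above (the rows are truncated in the post, so this is not the full profile), and the helper names `parse_row` and `ks_statistic` are made up for illustration. It computes the two-sample KS statistic by hand and shows that the statistic only looks at the distribution of values: reversing the order of the bins leaves it unchanged, so the position relative to the TSS plays no role.

```python
def parse_row(line):
    """Parse one row of the matrix above: a label followed by decimal-comma values."""
    label, *fields = line.split()
    return label, [float(field.replace(",", ".")) for field in fields]


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    def ecdf(sample, x):
        # Fraction of values in the sample that are <= x
        return sum(value <= x for value in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Only the four values visible in the post; the remaining bins are not shown there.
_, s1 = parse_row("s1  1,923575916 1,92340092  1,918008392 1,915392666")
_, s2 = parse_row("s2  1,993446863 1,931309701 1,925363124 1,935658019")

# Every visible s2 value is larger than every visible s1 value.
print(ks_statistic(s1, s2))        # 1.0

# Same values in reversed bin order: the KS statistic cannot tell them apart.
print(ks_statistic(s1, s1[::-1]))  # 0.0
```

Note that this only illustrates what the statistic measures. It says nothing about significance, and neighbouring bins of a coverage profile are not independent observations.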

coverage differences tss • 1.4k views

updated 5.5 years ago by Biostar 20 • written 5.7 years ago by Lila M ★ 1.3k

1

What kind of data is this? How did you produce the counts and what is the experimental setup? Are there replicates?

5.7 years ago by ATpoint 85k

0

Hi, there are no replicates. They are ChIP-seq samples from different conditions.

5.7 years ago by Lila M ★ 1.3k

1

Echoing ATpoint for a second: you are trying to test for differences between singletons, which isn't possible base-by-base. You need replicate measurements of coverage in order to do a KS test or something similar. A p-value also isn't a very strong statistic here, since 'coverage' values (based on counts) aren't normally distributed. That said, what you want to look at are 'peak-finding' algorithms (like MACS) more so than adjusted p-values at specific bases. Even then, it is highly recommended to have biological replicates of each condition under study, unless you already have a database of coverage values for this transcription factor from which to draw a distribution and calculate your p-values. Big companies get away with singletons in screening studies because they have those databases to draw from.
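As a rough illustration of why singletons are the problem (a sketch only, using Python's standard `statistics` module): with one value per condition at a given position there is no within-condition variance to estimate, so there is nothing to compare the between-condition difference against. The helper name `sample_variance_or_none` and the replicate values below are hypothetical and not taken from the data in the question.

```python
import statistics


def sample_variance_or_none(values):
    """Return the sample variance, or None when there are too few values to estimate it."""
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        # statistics.variance needs at least two data points
        return None


# One sample per condition: the first-bin coverage of s1 from the question.
singleton = [1.923575916]
print(sample_variance_or_none(singleton))  # None

# Hypothetical replicate values (made up purely for illustration).
hypothetical_replicates = [1.92, 1.95, 1.90]
print(sample_variance_or_none(hypothetical_replicates) is not None)  # True
```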

5.5 years ago by mrals89 ▴ 60

score 2 · Answer 1 · 2019-03-18

For ChIP-seq differential binding, use any of the established tools such as csaw or DiffBind (or others). These will require replicates. Please use Google and the search function to look for posts and opinions on unreplicated data. This question has literally been asked dozens of times before ;-) Do not try to make home-brew statistics; please read the available resources first, e.g. page 44 of the csaw manual, which covers what to do without replicates.