Small percentage of overlapping ChIP-seq peaks
8.2 years ago
atsalaki ▴ 20

I have downloaded ENCODE ChIP-seq peaks for the transcription factor FOXA2 in the HepG2 cell line. I also found this paper, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2794179/#!po=21.4286, and took the FOXA2 ChIP-seq data for HepG2 that they provide. I used a Venn diagram to find the overlapping peaks and, to my surprise, there were far fewer overlapping peaks than I expected. Could this happen? I expected one circle to sit almost entirely inside the other.

[Image: Venn diagram of the two peak lists]

ChIP-Seq • 3.3k views

What are these numbers that you use to overlap the peaks? Is it from one chromosome only? What distance does Venny allow to count a peak as overlapping?


All the chromosomes. No distance is computed: Venny just takes the unique entries from the two lists and compares them to see which ones match.


But what are those numbers? What do they represent? Where do they come from? Peaks should be identified by genomic coordinates.


They used hg18, did you?


In other words, the results are completely meaningless.

8.2 years ago

I don't know this TF and cell line, but a 10% overlap [438 / (1297 + 438 + 2476)] doesn't really surprise me. It's small but not unusual, especially since the data come from different labs (right?).
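To make the arithmetic explicit, here is a minimal sketch that computes the overlap as a fraction of the union (the 10% figure above) and as a fraction of each list separately. The names `list_a` and `list_b` are placeholders: the Venn counts alone don't say which list is ENCODE and which is the paper, so that mapping is left open.

```python
def overlap_fractions(only_a, shared, only_b):
    """Summarise a two-set Venn diagram given its three region counts."""
    total_a = only_a + shared           # all entries in list A
    total_b = only_b + shared           # all entries in list B
    union = only_a + shared + only_b    # entries in either list
    return {
        "jaccard": shared / union,      # shared / union, the ~10% quoted above
        "of_list_a": shared / total_a,  # share of list A that is also in B
        "of_list_b": shared / total_b,  # share of list B that is also in A
    }


# Counts taken from the Venn diagram in the question.
fractions = overlap_fractions(only_a=1297, shared=438, only_b=2476)
for name, value in fractions.items():
    print(f"{name}: {value:.1%}")
# jaccard: 10.4%
# of_list_a: 25.2%
# of_list_b: 15.0%
```

The per-list fractions are higher than the union-based one, so it matters which denominator is reported when comparing overlap figures across studies.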

Also, looking at overlap by number of peaks can be misleading since you give equal weight to all peaks, even those that are at the boundary of significance and might be gained or lost depending on sequencing depth, peak calling sensitivity etc.

(I'm still looking for a good way to assess consistency of peaks between two or preferably more replicates)


Isn't this what the IDR package was made for? https://www.encodeproject.org/software/idr/


Did you find a good way to find consistency between replicates?

8.2 years ago
igor 13k

According to the screenshot, you are using Venny. Venny is used for overlapping discrete values. This is good for things like genes that have specific names. Peaks are usually identified by genomic coordinates and they span regions of different size. If you overlap ranges with Venny, you will not get an overlap unless the regions are identical. If the two peaks are off by even 1 base, Venny will not consider them overlapping.
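As a rough illustration of the difference, the sketch below compares exact matching (what a list-based tool effectively does) with interval overlap on the same two peak lists. The coordinates are made up for the example, the function name `count_overlapping` and its `min_overlap` parameter are hypothetical, and peaks are assumed to be BED-style `(chrom, start, end)` tuples with half-open ends. In practice a dedicated tool such as `bedtools intersect` is the usual choice; this is only meant to show the idea.

```python
from collections import defaultdict


def count_overlapping(peaks_a, peaks_b, min_overlap=1):
    """Count peaks in peaks_a sharing at least min_overlap bases with any peak in peaks_b."""
    # Group list B by chromosome so only same-chromosome peaks are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom[chrom]:
            # Number of shared bases between two half-open intervals.
            shared = min(end, b_end) - max(start, b_start)
            if shared >= min_overlap:
                hits += 1
                break  # count each peak in A at most once
    return hits


# Made-up coordinates, for illustration only.
list_a = [("chr1", 1000, 1200), ("chr1", 5000, 5300), ("chr2", 700, 900)]
list_b = [("chr1", 1001, 1201), ("chr1", 9000, 9200), ("chr2", 700, 900)]

# Exact matching: only the identical chr2 peak is shared.
exact_matches = len(set(list_a) & set(list_b))

# Interval overlap: the chr1 peak shifted by 1 base is also counted.
interval_matches = count_overlapping(list_a, list_b)

print(exact_matches, interval_matches)  # 1 2
```

The nested loop is quadratic per chromosome, which is fine for a demonstration but slow for full peak sets; sorting the intervals or using an interval tree would be the natural next step.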
