Hi All,
I have a set of genomic locations I would like to intersect with the relevant DNase Hypersensitive peaks from ENCODE. My question is simple, but the answer has been oddly hard to come by: what do the scores in column 7 mean? The documentation does not say how the scores are calculated, and a search for the peak caller it mentions (I-Max) came up empty. Here's a sample of lines from one of the files (full file at https://www.encodeproject.org/files/ENCFF001YNU/@@download/ENCFF001YNU.bed.gz).
chr1 3002740 3002890 . 0 . 25 6.60995 -1 -1
chr1 3058240 3058390 . 0 . 15 3.99799 -1 -1
chr1 3085640 3085790 . 0 . 68 60.6165 -1 -1
My first inclination was that the scores indicated the number of cleavage sites in the 150-base windows, but they range up into the thousands, so that is clearly incorrect. Average tag density, maybe? Hoping someone can steer me in the right direction. Thanks!
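For anyone who wants to poke at the same columns, here is a minimal Python sketch of the check described above: confirm the window width and look at the range of column 7. The `Peak` tuple and `parse_peaks` helper are hypothetical names of my own, and the field names `col7`/`col8` are deliberately neutral. The 10-column layout looks like ENCODE's narrowPeak format (where column 7 is labelled signalValue), but treat that as an assumption rather than something the file's documentation confirms.

```python
from typing import Iterable, Iterator, NamedTuple


class Peak(NamedTuple):
    """One peak line; col7/col8 are kept as neutral names on purpose."""
    chrom: str
    start: int
    end: int
    col7: float
    col8: float


def parse_peaks(lines: Iterable[str]) -> Iterator[Peak]:
    """Parse whitespace-delimited peak lines, skipping short or blank ones."""
    for line in lines:
        fields = line.split()
        if len(fields) < 8:
            continue
        yield Peak(fields[0], int(fields[1]), int(fields[2]),
                   float(fields[6]), float(fields[7]))


# The three sample lines from the post.
sample = [
    "chr1 3002740 3002890 . 0 . 25 6.60995 -1 -1",
    "chr1 3058240 3058390 . 0 . 15 3.99799 -1 -1",
    "chr1 3085640 3085790 . 0 . 68 60.6165 -1 -1",
]

peaks = list(parse_peaks(sample))
widths = {p.end - p.start for p in peaks}   # distinct window widths
col7 = [p.col7 for p in peaks]              # the column being asked about

print("window widths:", widths)
print("column 7 range:", min(col7), "to", max(col7))
```

To run this over the full file, download it and pass `gzip.open(path, "rt")` to `parse_peaks` in place of `sample` (with `import gzip` at the top).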
Thanks for the thoughts! These are for the peaks, though, not the hotspots. I did check the Word document, and that is where I found the allusion to I-Max. (That document, by the way, appears to have been copy-pasted straight from the UCSC browser track description.) That said, I think I answered my own question after looking over the tracks corresponding to the files I'm using -- adding my answer below.
I'm also interested in some DNase Hypersensitive peaks from ENCODE whose documentation refers to the "I-Max" peak-finding algorithm. Did you figure out whether this is some sort of obscure allusion or perhaps a typo? (I'm asking because I'm trying to replicate their peak-calling process with another dataset.) Thanks in advance!