5.5 years ago
hivemind
Hello, I am having trouble interpreting DNase enrichment files (BED format) downloaded from ENCODE. The first three columns are straightforward (chrom, start, stop), but I don't understand what the values in the 4th and 5th columns are or how to interpret them.
The bed files look similar to this:
chr1 181393 181399 i 0.000805238
chr1 181399 181401 i 0.000517216
chr1 181401 181403 i 0.000211336
chr1 181403 181408 i 0.000134738
chr1 181408 181411 i 8.5816e-05
chr1 181411 181412 i 3.45057e-05
chr1 181412 181415 i 2.18045e-05
Does anybody know how to interpret the last two columns?
Thanks.
Is the 4th column always "i"? The 5th column is almost certainly p-values or FDRs from peak calling.
It seems to be the case, at least for the files I inspected so far.
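One quick way to check this across a whole file is to count the distinct labels in the 4th column and look at the range of the 5th. The following is only a minimal sketch: `summarize_bed` is a hypothetical helper name, and it assumes whitespace-separated fields with at least five columns per data line, as in the excerpt above.

```python
from collections import Counter
import io

def summarize_bed(handle):
    """Count distinct 4th-column labels and track the min/max of the 5th column."""
    labels = Counter()
    lo, hi = float("inf"), float("-inf")
    for line in handle:
        fields = line.split()
        # Skip blank lines, short lines, and header-style lines.
        if len(fields) < 5 or fields[0].startswith(("#", "track", "browser")):
            continue
        labels[fields[3]] += 1
        value = float(fields[4])
        lo, hi = min(lo, value), max(hi, value)
    return labels, lo, hi

# Three rows copied from the excerpt in the question.
sample = io.StringIO(
    "chr1 181393 181399 i 0.000805238\n"
    "chr1 181399 181401 i 0.000517216\n"
    "chr1 181412 181415 i 2.18045e-05\n"
)
labels, lo, hi = summarize_bed(sample)
print(dict(labels), lo, hi)
```

For a real file, pass an open file handle instead of the `io.StringIO` sample. If every value in the 5th column falls between 0 and 1, that would be consistent with p-values or FDRs, though it does not prove it.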
I was wondering whether I could use the values in the 5th column for further filtering. For that, I need to know what they represent.
They are likely already filtered to some degree, especially since ENCODE tends to use IDR (irreproducible discovery rate) whenever possible.
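If the 5th column does turn out to be a p-value-like score, further filtering could look roughly like the sketch below. It rests on an unconfirmed assumption: that smaller values mean stronger evidence. The helper name `filter_by_score` and the cutoff `1e-4` are purely illustrative and do not come from ENCODE.

```python
def filter_by_score(lines, max_value):
    """Yield (chrom, start, end, label, value) for rows whose 5th column is <= max_value."""
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        value = float(fields[4])
        # Assumption: the value behaves like a p-value/FDR, so smaller is better.
        if value <= max_value:
            yield fields[0], int(fields[1]), int(fields[2]), fields[3], value

# Three rows copied from the excerpt in the question.
rows = [
    "chr1 181393 181399 i 0.000805238",
    "chr1 181403 181408 i 0.000134738",
    "chr1 181408 181411 i 8.5816e-05",
]
kept = list(filter_by_score(rows, max_value=1e-4))
print(kept)
```

If the column turns out to be a signal or enrichment score instead, the comparison would need to be flipped, so it is worth confirming the meaning in the file's ENCODE metadata before relying on any cutoff.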