What are the 4th and 5th columns in ENCODE DNase enrichment files?

5.5 years ago

hivemind ▴ 20

Hello, I am having trouble interpreting the DNase enrichment files (BED format) that I downloaded from ENCODE. The first three columns are straightforward (chrom, start, stop), but I don't understand what the values in the 4th and 5th columns are or how to interpret them.

The bed files look similar to this:

chr1    181393  181399  i   0.000805238
chr1    181399  181401  i   0.000517216
chr1    181401  181403  i   0.000211336
chr1    181403  181408  i   0.000134738
chr1    181408  181411  i   8.5816e-05
chr1    181411  181412  i   3.45057e-05
chr1    181412  181415  i   2.18045e-05

Does anybody know how to interpret the last two columns?

Thanks.

DNase ENCODE bed chromatin accessibility • 1.2k views

Is the 4th column always "i"? The 5th column is almost certainly p-values or FDRs from peak calling.

5.5 years ago by jared.andrews07 ★ 18k

That seems to be the case, at least for the files I have inspected so far.

I was wondering whether I could use the values in the 5th column for further filtering. For that, I guess I need to know what I'm dealing with.
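For what it's worth, here is a minimal Python sketch of how one might check both things: whether the 4th column really is always "i", and what range the 5th column covers before choosing any cutoff. It runs on the seven rows quoted in the question. The function names, the `1e-4` cutoff, and the "smaller is better" direction are all assumptions made for illustration. Nothing in this thread confirms what the 5th column actually measures.

```python
import csv
import io
from collections import Counter

# The seven rows quoted in the question, tab-separated.
SAMPLE = (
    "chr1\t181393\t181399\ti\t0.000805238\n"
    "chr1\t181399\t181401\ti\t0.000517216\n"
    "chr1\t181401\t181403\ti\t0.000211336\n"
    "chr1\t181403\t181408\ti\t0.000134738\n"
    "chr1\t181408\t181411\ti\t8.5816e-05\n"
    "chr1\t181411\t181412\ti\t3.45057e-05\n"
    "chr1\t181412\t181415\ti\t2.18045e-05\n"
)


def read_rows(handle):
    """Yield (chrom, start, end, label, score) from a 5-column BED-like file."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the optional header lines a BED file may carry.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        yield row[0], int(row[1]), int(row[2]), row[3], float(row[4])


def summarize(rows):
    """Count the distinct column-4 labels and report the column-5 range."""
    labels = Counter()
    scores = []
    for _, _, _, label, score in rows:
        labels[label] += 1
        scores.append(score)
    return labels, min(scores), max(scores)


def keep_at_most(rows, cutoff):
    """Keep rows whose column-5 value is <= cutoff.

    ASSUMPTION: smaller values are "better" (as for a p-value or FDR).
    Flip the comparison if the column turns out to be a signal value.
    """
    return [r for r in rows if r[4] <= cutoff]


rows = list(read_rows(io.StringIO(SAMPLE)))
labels, lowest, highest = summarize(rows)
print(labels, lowest, highest)

# Hypothetical cutoff, chosen only to demonstrate the filter.
print(len(keep_at_most(rows, 1e-4)), "rows at or below the cutoff")
```

To run this on a downloaded file, pass an open file handle to `read_rows` instead of the `io.StringIO` wrapper.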

5.5 years ago by hivemind ▴ 20

They are likely already filtered to some degree, especially since ENCODE tends to use IDR (irreproducible discovery rate) whenever possible.

5.5 years ago by jared.andrews07 ★ 18k