I have a set of short sequences from DNase footprinting, and I want to filter out the ones that don't contain a known TF motif. I previously did this with FIMO, but we have decided to use Homer rather than MEME for motif enrichment, so I would like to do the filtering with Homer as well if possible. I used the "homer2 find" command, which is mentioned in the Homer documentation but without any explanation of how to use it or what the output means. Running "homer2 find" without any arguments prints some usage information, which was enough to get it working. The output looks like this:
chr9:104478785-104478824 -12 TBGCACGCAA Arnt:Ahr(bHLH)/MCF7-Arnt-ChIP-Seq(Lo_et_al.)/Homer + -0.357320
chr9:104478678-104478724 11 TBGCACGCAA Arnt:Ahr(bHLH)/MCF7-Arnt-ChIP-Seq(Lo_et_al.)/Homer + 3.356042
chr9:104478785-104478824 6 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + -1.818086
chr9:104478678-104478724 -6 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + 8.129395
chr9:104478503-104478532 -5 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer - -5.348250
chr9:104478095-104478123 -9 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + -1.677003
The last column looks like some kind of significance score, but I'm not sure how to interpret it. Since it takes both positive and negative values, I don't see how it could be something like a log p-value. So I don't know what it actually is, what cut-off might be appropriate, or whether it has anything to do with significance at all. Is anyone familiar with this output, or does anyone know where it is documented?
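In case it helps to see what I am trying to do, here is a minimal Python sketch of the filtering step. It assumes the six columns are sequence ID, offset, matched sequence, motif name, strand and score, which is only my guess from the output above. The function names and the `min_score` parameter are my own, and the value 0.0 used at the end is a placeholder, not a recommended cut-off.

```python
def parse_hits(lines):
    """Yield (seq_id, offset, matched_seq, motif_name, strand, score) per hit.

    Column meanings are assumed from the example output, not from documentation.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        # Prefer tabs if present; otherwise fall back to any whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 6:
            continue  # skip lines that do not look like a hit
        seq_id, offset, matched = fields[0], fields[1], fields[2]
        # Rejoin the middle fields in case the motif name contained spaces.
        motif = " ".join(fields[3:-2])
        strand, score = fields[-2], fields[-1]
        yield seq_id, int(offset), matched, motif, strand, float(score)


def sequences_with_hits(lines, min_score=None):
    """Map each sequence ID to its best-scoring (motif, score) hit.

    Hits scoring below min_score are ignored; None keeps every hit.
    """
    best = {}
    for seq_id, _offset, _matched, motif, _strand, score in parse_hits(lines):
        if min_score is not None and score < min_score:
            continue
        if seq_id not in best or score > best[seq_id][1]:
            best[seq_id] = (motif, score)
    return best


# The six example lines from the output above.
EXAMPLE = """\
chr9:104478785-104478824 -12 TBGCACGCAA Arnt:Ahr(bHLH)/MCF7-Arnt-ChIP-Seq(Lo_et_al.)/Homer + -0.357320
chr9:104478678-104478724 11 TBGCACGCAA Arnt:Ahr(bHLH)/MCF7-Arnt-ChIP-Seq(Lo_et_al.)/Homer + 3.356042
chr9:104478785-104478824 6 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + -1.818086
chr9:104478678-104478724 -6 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + 8.129395
chr9:104478503-104478532 -5 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer - -5.348250
chr9:104478095-104478123 -9 CCAGGAACAG AR-halfsite(NR)/LNCaP-AR-ChIP-Seq(GSE27824)/Homer + -1.677003
""".splitlines()

if __name__ == "__main__":
    # Placeholder threshold of 0.0, only to show how the filter behaves.
    for seq_id, (motif, score) in sequences_with_hits(EXAMPLE, min_score=0.0).items():
        print(seq_id, motif, score)
```

With no threshold, every sequence that appears in the output at all is kept, so the result depends entirely on which hits "homer2 find" chooses to report. That is why I need to understand what the score means before picking a value for `min_score`.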