Hi,
I have two BED files, and I would like to know whether the features in file A are enriched in file B relative to what is expected given the genome size. I found that I can get a p-value for this with Fisher's exact test, which at least gives an idea of the baseline:
bedtools fisher -a testA.bed -b testB.bed -g chrom_hg19.sizes
Now I have three questions:
1. Is this the usual way to test for enrichment of features genome-wide?
2. The features in BED file A do not include the sex chromosomes, while BED file B and chrom_hg19.sizes do. Would this influence the estimates?
3. Is there a way to get a log fold change?
Thanks very much!
A log fold change compared to what? Yes, the sex chromosomes will affect the estimates, but you can remove them if you feel it's appropriate. I don't see anything wrong with this approach; it's more or less the same as any GO/pathway enrichment analysis.
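If the sex chromosomes are removed, it probably makes sense to remove them from every input (both BED files and the genome sizes file) so that observed and expected values refer to the same part of the genome. A minimal sketch of that filtering step, assuming tab-delimited records with the chromosome name in the first column and UCSC-style names (chrX, chrY); the helper name `drop_chroms` and the toy records are hypothetical:

```python
# Hypothetical helper; the excluded set assumes UCSC-style chromosome names.
SEX_CHROMS = {"chrX", "chrY"}

def drop_chroms(lines, excluded=SEX_CHROMS):
    """Keep tab-delimited records whose first column is not in `excluded`."""
    kept = []
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom = line.split("\t", 1)[0].strip()
        if chrom not in excluded:
            kept.append(line.rstrip("\n"))
    return kept

# Toy records for illustration only (not real coordinates or sizes).
bed_b = ["chr1\t100\t200", "chrX\t50\t80", "chrY\t10\t20"]
genome = ["chr1\t1000", "chrX\t500", "chrY\t300"]

bed_b_autosomal = drop_chroms(bed_b)      # BED records without chrX/chrY
genome_autosomal = drop_chroms(genome)    # genome sizes without chrX/chrY
```

The same function works for the BED files and for the genome sizes file because both keep the chromosome name in the first column.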
Hi, thank you Jared.
1. Some of my BED files A include the sex chromosomes and some do not. So, for the ones that do not, should I compute the enrichment of A in B after removing the sex chromosomes from the other files? Will the enrichments of A in B then be comparable across multiple BED files?
2. What tool is normally used for enrichment of features across the genome (bedtools fisher)?
3. By log fold change I mean observed over expected, given the size of the genome. How do I compute that? As far as I can tell, bedtools fisher does not give that, only the p-value...
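One possible reading of "log fold change over expected" is a base-pair-level ratio: the number of bases covered by both A and B, divided by the number expected if the two sets were placed independently in a genome of the given size. The sketch below assumes that definition, which is only one of several in use, and it is not what bedtools fisher computes (that test works on counts of overlapping intervals). All function names here are hypothetical, and intervals are half-open (chrom, start, end) tuples as in BED.

```python
import math

def merge(intervals):
    """Merge overlapping or adjacent half-open (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return merged

def covered_bp(intervals):
    """Number of distinct bases covered by a set of intervals."""
    return sum(end - start for _, start, end in merge(intervals))

def log2_fold_enrichment(a, b, genome_size):
    """log2(observed / expected) shared bases, assuming independent placement."""
    cov_a, cov_b = covered_bp(a), covered_bp(b)
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    observed = cov_a + cov_b - covered_bp(a + b)
    expected = cov_a * cov_b / genome_size
    if observed == 0:
        return float("-inf")  # no shared bases at all
    return math.log2(observed / expected)

# Toy example: 1000 bp genome, two 100 bp features sharing 50 bp.
a = [("chr1", 0, 100)]
b = [("chr1", 50, 150)]
lfc = log2_fold_enrichment(a, b, genome_size=1000)
```

In the toy example 50 shared bases are observed against 100 * 100 / 1000 = 10 expected, so the value is log2(5). The `genome_size` argument should be the total of the same chromosomes that the BED files were restricted to, otherwise the expected value shifts.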