This seems to be a simple question, but I couldn't find an answer anywhere. How much does the accuracy of the effective genome size affect MACS2 output for ChIP-seq data? For example, if I use the total genome size as the effective genome size, or total bases minus Ns, will it change the output substantially, or only trivially?
Thanks for your insights.
-U
Thanks a lot. Actually, I am working on H3K4me3 and H3K27me3 data from some newly sequenced animal genomes. I tried to get the effective genome size using GEM but couldn't get it to run (I'm still new to this kind of analysis), and the GEM documentation is not great. I found in "How do I compute the effective genome size?" that genome size minus Ns could be an option.
Could you please recommend an easy-to-run tool to calculate this statistic? Thanks again for your answer!
Have you read all of the answers to the linked question? They present several good options - have you tried/checked any of them? If your species of interest have their total haploid DNA content determined here, then you can convert that directly to bp to get the effective genome size. If not, you could try to determine that amount experimentally yourself.
Have you tried just removing all Ns from the genome and counting the remaining bases and using that value?
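If it helps, here is a minimal sketch of that "remove the Ns and count what is left" approach in plain Python. It assumes the assembly is a FASTA file (optionally gzipped); the function name `count_non_n_bases` and the file name in the usage comment are made up for illustration. Note that the non-N count ignores mappability, so it is probably best treated as an upper bound on the effective genome size rather than an exact value.

```python
import gzip
from pathlib import Path


def count_non_n_bases(fasta_path):
    """Return (total_bases, non_n_bases) for a FASTA file.

    Header lines (starting with '>') are skipped. Both 'N' and 'n'
    are counted as ambiguous bases, so soft-masked assemblies work too.
    """
    path = Path(fasta_path)
    # Assumption: a '.gz' suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    total = 0
    n_count = 0
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                continue
            seq = line.strip()
            total += len(seq)
            n_count += seq.count("N") + seq.count("n")
    return total, total - n_count


# Example usage (hypothetical file name):
# total, non_n = count_non_n_bases("my_genome.fa")
# print(f"total bases: {total}, non-N bases: {non_n}")
```

The second number can then be passed to MACS2 as the genome size, e.g. `macs2 callpeak -g <non_n> ...`. Running peak calling once with the total size and once with the non-N size would also show directly how sensitive your results are to this parameter.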