Hello everyone!
I was wondering about a very basic yet fundamental question related to ChIP Input and IP library normalization.
In a situation where the scaled read coverage per genomic bin for IP and Input libraries is low (e.g. on average 5 reads per bin), adding a pseudocount of +1 (for example) would significantly alter a downstream log2 ratio calculation.
Whereas if the read count is very high (or extremely low), adding a pseudocount makes little difference, so any change in the log2 ratio can be regarded as negligible. I believe this is standard practice.
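To make the size of the effect concrete, here is a minimal sketch comparing the log2(IP/Input) ratio with and without a +1 pseudocount. The counts (10 vs 5 and 1000 vs 500 reads per bin) and the helper name `log2_ratio` are made up for illustration; they are not from any real dataset or library.

```python
import math

def log2_ratio(ip, inp, pseudocount=0.0):
    """log2 of IP over Input, with the same pseudocount added to both (hypothetical helper)."""
    return math.log2((ip + pseudocount) / (inp + pseudocount))

# Illustrative counts per bin: both pairs have a true 2-fold enrichment.
examples = {"low coverage": (10, 5), "high coverage": (1000, 500)}

for label, (ip, inp) in examples.items():
    raw = log2_ratio(ip, inp)
    shifted = log2_ratio(ip, inp, pseudocount=1.0)
    print(f"{label}: raw={raw:.3f}, with +1={shifted:.3f}, shift={shifted - raw:.3f}")
```

With these numbers the low-coverage ratio drops from 1.0 to roughly 0.87, while the high-coverage ratio barely moves, which is the asymmetry described above.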
So how do you account for a low read count situation? Can one artificially increase the number of reads per bin by, for example, computing coverage per 1 kb or per 10 kb and then adding a pseudocount?
I would be glad to get your input on how to deal with this issue, as I believe this is a critical step before any downstream analysis.
Many thanks in advance!