4.7 years ago
olechnwin
Hi,
Can someone share their experience using IDR on more than two replicates? I know the ENCODE pipeline can do this, but it's really hard for me to see how they do it. If someone can point me to the script ENCODE uses for more than two replicates, that would be great too.
Thank you in advance!
I can't help with the actual question. Still, if you have an input sample, you could use a dedicated replicate-aware peak caller such as PePr: https://github.com/shawnzhangyx/PePr
Whenever you use replicates to call peaks, be it with IDR or any other tool, be sure that you first call peaks on the individual samples and check that they are reasonably comparable. If you have n=3 and one of them is bad (very low peak numbers, a lot of noise, for whatever reason), I would always exclude it. If you keep it, this replicate will reduce the number of reproducible peaks and give many false negatives. Also be sure that sequencing depth is at least somewhat comparable. If one replicate has around 1 million reads and the other two have 50 million, then again the low one will probably reduce the total number of reproducible peaks simply due to sequencing depth (a technical confounder rather than biological variation), even though two of the three replicates may be totally fine and very reproducible. ChIP unfortunately is quite variable, even if the same lab produces multiple samples with the same antibody. Stringent quality control before any downstream analysis is key.
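As a rough illustration of the check described above, here is a minimal Python sketch that flags replicates whose peak count or read depth falls far below the median of the group. The function name `flag_replicates`, the `min_fraction` cutoff of 0.25, and the example numbers are all assumptions made up for illustration, not values from IDR, ENCODE, or any published guideline. You would fill in the counts from your own per-sample peak calls and alignment statistics.

```python
from statistics import median

def flag_replicates(stats, min_fraction=0.25):
    """Flag replicates that look like outliers before running IDR.

    stats: dict mapping replicate name -> {"peaks": int, "reads": int}
    min_fraction: hypothetical cutoff; a replicate is flagged for a metric
        when its value is below min_fraction * median of that metric.
    Returns a dict mapping replicate name -> list of flagged metrics.
    """
    flagged = {}
    for metric in ("peaks", "reads"):
        # The median is not dragged down by one bad replicate when n >= 3.
        med = median(s[metric] for s in stats.values())
        for name, s in stats.items():
            if s[metric] < min_fraction * med:
                flagged.setdefault(name, []).append(metric)
    return flagged

# Illustrative numbers only (not real data): two deep replicates, one shallow.
example = {
    "rep1": {"peaks": 40000, "reads": 50_000_000},
    "rep2": {"peaks": 38000, "reads": 48_000_000},
    "rep3": {"peaks": 3000, "reads": 1_000_000},
}
print(flag_replicates(example))  # {'rep3': ['peaks', 'reads']}
```

A flagged replicate is only a candidate for exclusion. It is still worth looking at the signal in a genome browser before dropping it.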
Appreciate your insights on replicates even if you didn't answer the actual question :-). I noticed how variable ChIP is, and that's why I'm trying to use IDR to help with this.
How did you solve this problem? I also have more than two replicates.