I have broad peak files generated by MACS2 for 68 samples.
I want to check whether the number of new peaks has reached saturation across my samples.
I want to produce a graph similar to the one here -> https://ibb.co/Lnx64xG
Does anyone know how to generate this? Is there an R package for it?
You basically call peaks on 1, 2, 3, (...), up to all samples, and check how many peaks you get each time. To be more robust against outliers, repeat the calling a couple of times for each number of samples, using a different random selection of samples each time. Then plot the obtained peak numbers as a function of the number of samples you used. I do not think there is an R package for this, but it is basically just a loop from 1 to the number of samples, where each iteration randomly selects the desired number of samples. It should not be too difficult to implement. Please try some things out and feel free to come back with specific questions.
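To make the loop concrete, here is a minimal Python sketch. Note that it makes a substitution: instead of re-running MACS2 on pooled samples, it merges the already-called per-sample peak intervals and counts the merged regions, which is only an approximation of calling peaks on the combined data. The function names (`read_peaks`, `merge_intervals`, `saturation_curve`) and the `n_repeats` and `seed` parameters are made up for illustration, and it assumes the peak files are tab-separated with chromosome, start, and end in the first three columns.

```python
import random
from statistics import mean


def read_peaks(path):
    """Read (chrom, start, end) from the first three columns of a BED-like file."""
    peaks = []
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and common header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split("\t")[:3]
            peaks.append((chrom, int(start), int(end)))
    return peaks


def merge_intervals(intervals):
    """Merge overlapping or touching intervals on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous region: extend it instead of adding a new one.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return merged


def saturation_curve(peaks_per_sample, n_repeats=10, seed=0):
    """Return (number of samples, mean merged peak count) for k = 1..all samples.

    Assumption: the union of per-sample peaks stands in for re-calling peaks
    on the pooled samples.
    """
    rng = random.Random(seed)
    curve = []
    for k in range(1, len(peaks_per_sample) + 1):
        counts = []
        for _ in range(n_repeats):
            # Randomly pick k samples without replacement and pool their peaks.
            subset = rng.sample(peaks_per_sample, k)
            pooled = [peak for sample in subset for peak in sample]
            counts.append(len(merge_intervals(pooled)))
        curve.append((k, mean(counts)))
    return curve
```

You would load each file with `read_peaks`, pass the resulting list of peak lists to `saturation_curve`, and plot the second value of each pair against the first with whatever plotting tool you prefer. If the curve flattens towards the right, adding more samples is no longer contributing many new regions.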