Hello, I am a beginner in ChIP-seq data analysis and was just introduced to ChIP-seq data by studying the ENCODE portal documentation (https://www.encodeproject.org/data-standards/terms/#concordance). I am focusing on ChIP-seq for transcription factors. However, I still have some questions, such as:
Why does each experiment ideally include two biological replicates?
How should I choose one BigBed file per experiment (among signal IDR thresholded peaks, conservative IDR thresholded peaks, optimal IDR thresholded peaks)?
Thank you! Also, any recommendations for beginner-friendly ChIP-seq resources are very welcome.
Why does each experiment ideally include two biological replicates?
Because each ChIP-seq experiment is tremendously difficult to perform and carries huge technical variability, which leads to both false-positive and false-negative enrichments in the final data set. Ideally, one would have about 100 replicates to be sure of individual signals, but since that's not feasible for funding reasons, people have settled for 2 replicates.
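As a rough illustration of why even a second replicate helps, here is a toy calculation. It is only a sketch under strong assumptions: the per-replicate false-positive probability (0.05) is a made-up number, spurious peaks are assumed to arise independently in each replicate, and the function name is hypothetical. This is not how ENCODE scores reproducibility; its IDR procedure is considerably more involved.

```python
def prob_in_all_replicates(p_false: float, n_replicates: int) -> float:
    """Chance that one spurious peak appears in every replicate.

    Assumes the peak arises independently in each replicate with the
    same probability p_false (an illustrative simplification).
    """
    if not 0.0 <= p_false <= 1.0:
        raise ValueError("p_false must be a probability between 0 and 1")
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    return p_false ** n_replicates


if __name__ == "__main__":
    # Hypothetical per-replicate false-positive probability.
    p_false = 0.05
    for n in (1, 2, 3):
        print(n, "replicate(s):", prob_in_all_replicates(p_false, n))
```

Under these assumptions, requiring a peak to appear in both of two replicates drops the chance of keeping a spurious peak from p to p squared, which is why a second replicate already removes a large share of false positives.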
How should I choose one BigBed file per experiment (among signal IDR thresholded peaks, conservative IDR thresholded peaks, optimal IDR thresholded peaks)?
What's your goal? Ideally, you should choose the file that is most appropriate for the question at hand.
any beginner-friendly resource recommendation for ChIP-seq is very welcome.
Thank you. I would like to perform motif discovery using MEME.