I am investigating differentially expressed genes and histone modifications, and our in-house data is not suitable for that.
I need ChIP-seq data (the more the better) and RNA-seq data for treated/untreated samples, or at least for somewhat different cell types (or even cell lines).
In mouse ENCODE I found lots of ChIP-seq and RNA-seq data for liver and heart from the Bing Ren lab. However, as expected, these tissues are too different, so DESeq and edgeR call a very large number of differentially expressed genes from the RNA-seq data.
In Roadmap I found plenty of suitable data for H1 ES cells and H1-BMP4 (or hESH1-derived mesendoderm). However, I could not find out what the difference between those two cell lines is, whether they are different enough for an RNA-seq analysis tool to detect the difference, or whether a gold standard of expressed genes exists for them.
In short, I need ChIP-seq and RNA-seq data (the more the better in terms of data types; the sample size can be small, e.g. 2 replicates per condition), either for benchmarking the tools against possible gold-standard genes or just for analysis.
Would simulated data sets be more useful for benchmarking?
I have too little time to simulate ChIP-seq data for several histone marks and RNA-seq data myself, and I doubt that simulated data sets are available in the community. Therefore, I have not considered simulation so far.
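
To make clear what I mean by simulation with a known gold standard, here is a minimal sketch for the RNA-seq side only. It assumes negative binomial counts (the model family DESeq and edgeR are built on), a fixed fold change for the truly differential genes, and two replicates per condition. The function names and all parameter values (`simulate_counts`, `frac_de`, `fold`, `size`) are made up for illustration and not taken from any existing tool. It does not cover ChIP-seq at all.

```python
import math
import random


def nb_count(rng, mean, size):
    """Draw one negative binomial count with the given mean.

    `size` must be a positive integer; the variance is mean + mean**2 / size.
    The count is built as a sum of `size` geometric variates.
    """
    p = size / (size + mean)          # success probability, requires mean > 0
    log_q = math.log(1.0 - p)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(size))


def simulate_counts(n_genes=1000, n_reps=2, frac_de=0.1, fold=4.0, size=5, seed=1):
    """Return (counts, de_genes).

    counts[g] holds n_reps untreated counts followed by n_reps treated counts.
    de_genes is the set of gene indices that are truly differential, i.e. the
    gold standard a tool's calls can be compared against.
    """
    rng = random.Random(seed)
    de_genes = set(rng.sample(range(n_genes), int(n_genes * frac_de)))
    counts = []
    for g in range(n_genes):
        base = rng.lognormvariate(5.0, 1.0)   # hypothetical baseline expression
        if g in de_genes:
            # Half of the differential genes go up, half go down (on average).
            change = fold if rng.random() < 0.5 else 1.0 / fold
        else:
            change = 1.0
        untreated = [nb_count(rng, base, size) for _ in range(n_reps)]
        treated = [nb_count(rng, base * change, size) for _ in range(n_reps)]
        counts.append(untreated + treated)
    return counts, de_genes


if __name__ == "__main__":
    counts, de_genes = simulate_counts()
    print(len(counts), "genes,", len(de_genes), "truly differential")
    print("first gene:", counts[0])
```

The resulting count table could be written to a file and passed to DESeq or edgeR, and the calls compared with `de_genes`. Whether such a simple model is realistic enough for a benchmark is exactly what I am unsure about.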