Does anyone know of any datasets other than the L1000/CMap dataset (https://clue.io/) that have RNA-seq data for cell lines and a large number of compounds/perturbagens? I know there are a lot of "one-off" datasets for particular cell lines/tissues and a handful of compounds, but I am primarily interested in exploring machine learning methods on the biggest dataset I can get my hands on.
Note: I am aware of datasets from the likes of COSMIC, but I am currently a part of a commercial institution, so I would need data that is under a flexible license or is just free.
Have you looked at GTEx?
So far as I can tell, GTEx is just a collection of user-submitted assays (different conditions, tissues, cells, compounds, etc.) that primarily focuses on tissue-level RNA-seq data. Is there a large collection of cellular assays within GTEx?
Lymphoblasts are cell lines, I guess.