I have an RNA-seq dataset obtained using HA ribotag to isolate polysome-associated RNAs. The HA ribotag is expressed in cells that express iCre, which is under the control of the promoter of the gene that we study (in an iCre/WT mouse line). Cells that express our gene will therefore express iCre and be tagged with ribotag. In a given tissue, this gives a mixture of the cell types that normally express the gene. For example, in kidney it would be a mixture of podocytes, fibroblasts, etc.
As I am interested only in the expression levels of the kidney fibroblasts that express our gene, and not the podocytes, is it possible to subtract a podocyte RNA-seq dataset (e.g. from GEO) from the ribotag RNA-seq data, which contains a mixture of podocytes and fibroblasts, in order to obtain a pure dataset for the fibroblasts that express our gene?
I think you are looking for "deconvolution".
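To make the term concrete: deconvolution treats the mixed profile as a weighted sum of reference cell-type profiles and estimates the weights. Below is a minimal sketch for just two cell types, assuming you have a reference profile for each, on a linear scale (e.g. TPM, not log) and normalized the same way as the mixture. The function name `estimate_fraction` and all expression values are made up for illustration; real deconvolution tools handle many cell types, noise, and platform differences.

```python
def estimate_fraction(mixture, ref_a, ref_b):
    """Least-squares estimate of the fraction f of cell type A,
    assuming mixture ~= f * ref_a + (1 - f) * ref_b gene by gene."""
    # Closed-form solution of min_f sum((m - b) - f * (a - b))^2
    num = sum((m - b) * (a - b) for m, a, b in zip(mixture, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference profiles are identical; fraction is undefined")
    # A fraction must lie between 0 and 1, so clip the raw estimate
    return min(1.0, max(0.0, num / den))


# Toy expression values for four hypothetical genes (not real data)
fibro_ref = [100.0, 5.0, 50.0, 20.0]
podo_ref = [10.0, 80.0, 50.0, 40.0]

# Simulate a mixture that is 30% fibroblast, 70% podocyte
true_f = 0.3
mixture = [true_f * a + (1 - true_f) * b for a, b in zip(fibro_ref, podo_ref)]

est = estimate_fraction(mixture, fibro_ref, podo_ref)
print(f"estimated fibroblast fraction: {est:.3f}")
```

Note that this estimates the proportions from known references; it does not by itself give you the fibroblast profile if no fibroblast reference is available.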
I like the idea, but how would you know how large the fibroblast fraction of your dataset is?
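To show why that fraction matters, here is a minimal sketch of the subtraction itself, assuming the mixture is a simple linear combination, mixture = f * fibroblast + (1 - f) * podocyte, with both datasets on the same linear, comparably normalized scale. The function name `subtract_podocytes` and the numbers are hypothetical toy values, not real data.

```python
def subtract_podocytes(mixture, podo_ref, fibro_fraction):
    """Solve mixture = f * fibro + (1 - f) * podo for fibro, gene by gene.
    Negative results are clipped to 0, since expression cannot be negative."""
    if not 0 < fibro_fraction <= 1:
        raise ValueError("fibro_fraction must be in (0, 1]")
    podo_fraction = 1 - fibro_fraction
    return [
        max(0.0, (m - podo_fraction * p) / fibro_fraction)
        for m, p in zip(mixture, podo_ref)
    ]


# Toy values for four hypothetical genes; the mixture was built with f = 0.3
# from a fibroblast profile of [100, 5, 50, 20]
mixture_profile = [37.0, 57.5, 50.0, 34.0]
podocyte_profile = [10.0, 80.0, 50.0, 40.0]

# The recovered "fibroblast" profile changes a lot with the assumed fraction
for assumed_f in (0.2, 0.3, 0.5):
    recovered = subtract_podocytes(mixture_profile, podocyte_profile, assumed_f)
    print(assumed_f, [round(x, 1) for x in recovered])
```

Only the correct fraction (0.3 in this toy case) gives back the original fibroblast values; a wrong guess distorts every gene where the two cell types differ, and can push some genes to zero.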
I don't think a perfect bioinformatic method to fix this exists, but what about single-cell RNA-seq? Or FACS prior to library prep? I think either could get you a pure fibroblast dataset.