I'm trying to compare scRNA-Seq gene-expression data for a specific cell type from different tissues (and from different studies). Is it OK to compare gene expression across these studies, given that they use different platforms and have reads of different lengths (in nt)?
Thanks
You would need to integrate the datasets first. Seurat claims to be able to do that with its recently published anchoring framework, see https://satijalab.org/seurat/v3.1/integration.html.
A different option would be to analyse each dataset independently and then perform a meta-analysis. Given that the integration framework from Seurat is relatively new (and given that each analysis strategy has its upsides but also its individual biases), I would use several analysis strategies and make sure that the core findings agree, so that you can have confidence in your results.
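To make "make sure that the core findings agree" a bit more concrete, here is a minimal sketch in plain Python. It assumes each analysis strategy has already given you a list of marker or differentially expressed genes; the strategy names and gene IDs below are made-up placeholders, and `jaccard` and `core_findings` are hypothetical helpers, not functions from Seurat or any other package.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene lists (1.0 means identical sets)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def core_findings(results, min_support=None):
    """Genes reported by at least `min_support` strategies (default: all of them)."""
    if min_support is None:
        min_support = len(results)
    counts = {}
    for genes in results.values():
        for gene in set(genes):
            counts[gene] = counts.get(gene, 0) + 1
    return sorted(g for g, n in counts.items() if n >= min_support)


# Placeholder gene lists, one per analysis strategy (illustrative only).
results = {
    "integration": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "batch_corrected": ["GENE_A", "GENE_B", "GENE_C", "GENE_E"],
    "per_dataset_meta": ["GENE_A", "GENE_B", "GENE_D", "GENE_C"],
}

# Pairwise agreement between strategies.
pairwise = {
    (a, b): jaccard(results[a], results[b])
    for a, b in combinations(results, 2)
}

# Genes that every strategy agrees on.
core = core_findings(results)

for pair, score in pairwise.items():
    print(pair, round(score, 2))
print("core findings:", core)
```

Genes that only show up under one strategy are the ones I would treat with the most suspicion; lowering `min_support` gives a less strict definition of "core".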
Edit: As geek_y says in his answer, one should be careful about which datasets one compares and which results one can expect. Fewer-cells-higher-depth platforms are intrinsically difficult to compare with many-cells-lower-depth platforms.
Combining droplet-based protocols (e.g. 10x) with well-based protocols (e.g. SMART-Seq) is not a great idea. You have to analyze them separately, because the two kinds of dataset differ in the number of genes captured, in sequencing depth, and in the regions of the mRNA that are covered.
If you are combining datasets from different studies that were generated with a similar approach, applying ComBat to correct for batch effects will work (at least in my experience with the dataset I analyzed). You can always check this with UMAP plots, by looking at whether the cells cluster by dataset or by cell type after correction. Unfortunately there are several ways to do the correction, and you need to try which one suits your dataset best.
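If you want a number to go with the UMAP plot, one rough way to check "by dataset or by cell type" is to look at each cell's nearest neighbours in the embedding and count how many share its label. This is only a sketch under my own assumptions: it is not part of ComBat or any package, `neighbour_purity` is a hypothetical helper, and the coordinates and labels are toy values standing in for a real embedding.

```python
import math


def neighbour_purity(coords, labels, k=3):
    """Mean fraction of each cell's k nearest neighbours that share its label."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        # Distances from cell i to every other cell, closest first.
        dists = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        nearest = [j for _, j in dists[:k]]
        total += sum(labels[j] == labels[i] for j in nearest) / k
    return total / n


# Toy 2D embedding: two well-separated groups of four cells each.
coords = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
]
cell_type = ["T", "T", "T", "T", "B", "B", "B", "B"]
dataset = ["study1", "study2"] * 4  # the two studies are interleaved

type_purity = neighbour_purity(coords, cell_type)
batch_purity = neighbour_purity(coords, dataset)

print("cell-type purity:", round(type_purity, 2))
print("dataset purity:", round(batch_purity, 2))
```

After a successful correction you would hope for high cell-type purity and a dataset purity close to what random mixing would give. If the dataset purity stays near 1, the cells are still clustering by study.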