Hi all, I would appreciate your input on how best to approach an RNA-seq analysis I am currently working on, as I couldn't find any closely related post. I have 5 datasets to compare (4 with UMI counts and 1 with FPKM). I am taking the z-score of each dataset separately before passing them on to Seurat.
My questions are:
- Is this the right direction, or is there a better way?
- If it is the right approach, do I need any normalization or log transformation before merging, and which normalization approach would be best? More generally, how should the datasets be preprocessed to get valuable insight from the analysis?
- Is it possible to convert UMI counts to FPKM and then follow the Seurat Multiple Dataset Integration guide for the comparison?
Thanks
Why not use the recommended workflow? Seurat is designed to work with UMI and FPKM data, not z-scores.
Thanks Igor. Since the datasets are not all in the same units, I thought taking the z-score first would give a common basis for comparison (integration).
In the default workflow, Seurat will perform its own scaling.
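To make "its own scaling" a bit more concrete, here is a minimal sketch of per-gene z-scoring, which is roughly what Seurat's ScaleData step does to my understanding: each gene is centered to mean 0 and scaled to unit standard deviation across cells, with large values capped. The function name `scale_gene`, the example values, and the cap of 10 are assumptions for illustration, not Seurat's actual code.

```python
import statistics

def scale_gene(values, clip=10.0):
    """Center one gene to mean 0 and scale to unit sample SD, capping values at `clip`."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    if sd == 0:
        # A gene with no variation carries no information after centering.
        return [0.0 for _ in values]
    return [min((v - mean) / sd, clip) for v in values]

# Hypothetical normalized expression of one gene across four cells.
gene = [0.0, 1.0, 2.0, 3.0]
scaled = scale_gene(gene)
```

Because this is applied to already normalized values inside the workflow, z-scoring the raw datasets beforehand should not be needed.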
Thanks a lot Igor. I looked into how Seurat does this and I think it is what I need. For the analysis (4 datasets in UMI and 1 in FPKM), I proceeded as below.
With the above, I started the normal data integration steps: FindVariableFeatures, FindIntegrationAnchors (I used "LogNormalize" vs. "SCT" as normalization.method), IntegrateData, ScaleData, RunPCA, etc.
Does this seem like the right approach for comparing datasets that are in different units?
Thanks a lot
SCTransform is for UMI data, so it is not a good fit for the FPKM dataset.
The rest seems fine.
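For reference, a minimal sketch of the arithmetic behind the "LogNormalize" option mentioned above, as I understand it: each value in a cell is divided by that cell's total, multiplied by a scale factor (10,000 by default in Seurat's NormalizeData), and then passed through log1p. The function name `log_normalize` and the example counts are hypothetical, and this is an illustration of the formula rather than Seurat's implementation.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Normalize one cell: divide by the cell total, multiply by scale_factor, apply log1p."""
    total = sum(counts)
    if total == 0:
        # An empty cell stays all zeros instead of dividing by zero.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale_factor) for c in counts]

# Hypothetical UMI counts for four genes in a single cell.
cell = [0, 5, 15, 80]
normalized = log_normalize(cell)
```

The formula only needs non-negative values and a per-cell total, which is why it can be computed for count-like and FPKM-like input alike, whereas SCTransform models UMI counts specifically.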