Hello,
We are trying to analyze a set of single-cell data sets from different sources, but we have a problem. One of the data sets is in TPM and another one is in FPKM format. It is easy to do batch correction with raw counts (with CCA in Seurat or MNN in Scater), but we have no idea how to deal with this problem.
Do you think we can use TPM and FPKM values for batch correction, since they are already normalized? Another option is to convert the values to raw counts, but we do not know how to do that.
Thank you in advance.
I am planning to try the tximport package to convert the FPKM and TPM values to counts. I found the function; I think that is the one you mean.
No I did not. Sorry to say but what you say does not make any sense.
tximport is meant to convert transcript abundance estimates to the gene level while correcting for the different lengths of the transcripts, which influence the abundances (longer transcripts => higher abundances). As I said, neither TPM nor FPKM is suited for inter-sample comparisons. Are these at least two datasets from the same study/lab, or two completely different datasets?
They are completely different data sets from the same biological sample.
Then it is probably impossible to do what you are aiming for. See essentially: C: Comparison between scRNA and bulk RNA, which should cover the main arguments regardless of whether the dataset is bulk or single-cell. Most importantly, points 2 and 4.
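For what it is worth, the one conversion that is well defined here is FPKM to TPM, since TPM is just FPKM rescaled so that each cell sums to one million. Below is a minimal sketch of that rescaling in plain Python. The function name fpkm_to_tpm and the toy values are made up for illustration and do not come from any package. Note that this only puts both datasets in the same unit; it does not recover raw counts and does not make the values comparable across studies.

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one cell's FPKM values so that they sum to one million (TPM).

    fpkm_to_tpm is a hypothetical helper written for this example only.
    """
    total = sum(fpkm_values)
    if total == 0:
        # An empty cell stays empty; avoids dividing by zero.
        return [0.0 for _ in fpkm_values]
    return [value / total * 1e6 for value in fpkm_values]


# Toy FPKM values for three genes in a single cell (invented numbers).
toy_cell_fpkm = [10.0, 30.0, 60.0]
toy_cell_tpm = fpkm_to_tpm(toy_cell_fpkm)
print(toy_cell_tpm)
```

Going the other way, from TPM or FPKM back to raw counts, would need the per-cell library sizes and effective gene lengths, which are usually not available in published tables.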