Hi,
I have been given a normalised single-cell count matrix that looks like this:
       Cell1     Cell2  Cell3     ...
Gene1  0.0000    0      0.0000
Gene2  0.0000    0      0.0000
Gene3  155.8516  0      0.0000
Gene4  0.0000    0      280.9867
I have no access to the raw counts, and I want to create a Seurat object from this matrix. I have seen posts about reading TPM into Seurat and log-transforming it manually; however, I have no experience with this type of data and I'm not sure whether this is TPM or log(TPM+1) data. Is there a way to tell which one it is?
Secondly, after reading this into Seurat, my initial aim is to combine it with my own data (for which I created Seurat objects from raw counts). Do you think it is feasible to integrate Seurat objects created from TPM with ones created from raw counts?
Thank you very much!
Hi António,
Thank you very much for your reply.
When I sum up the columns, I get 760257.6.
So I tried transforming with 2^(log(TPM+1)), and when I then summed up the columns I got 157130.48.
Do you have any idea why this could be happening?
You get this value, 760257.6 or 157130.48, for all the columns?

Just to be clear, when you did the 2^(log(TPM+1)), the log(TPM+1) part stands for the values in your data table. So, to reverse log2(TPM+1), you just plug in, let's say, the value 155.8516 (from Cell1, Gene3) to get 2^(155.8516). Is this how you did it? (Of course, you need to do this for all the entries in your data and only then sum each column independently of the others.)

If so, this means that you have neither TPM nor log2(TPM+1) data, since the columns of TPM data should each sum to one million. Some other transformation/normalization was used.

How did you obtain the data in the first place? You should talk with the person who transformed/normalized the data to find out exactly which transformation/normalization was used.
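
The column-sum check described above can be sketched roughly as follows. This is a minimal Python illustration, not Seurat code; the toy matrix and the helper names `column_sums` and `looks_like_tpm` are made up for the example. It assumes the matrix contains all genes, in which case each cell (column) of true TPM data sums to about one million.

```python
import math

def column_sums(matrix):
    """Sum each column (cell) of a genes-by-cells matrix given as a list of rows."""
    return [sum(col) for col in zip(*matrix)]

def looks_like_tpm(matrix, target=1e6, rel_tol=0.01):
    """Return True if every column sums to roughly `target` (one million for TPM)."""
    return all(math.isclose(s, target, rel_tol=rel_tol) for s in column_sums(matrix))

# Hypothetical toy data: 3 genes x 2 cells, each column summing to one million
toy_tpm = [
    [250000.0, 0.0],
    [750000.0, 400000.0],
    [0.0, 600000.0],
]

print(column_sums(toy_tpm))
print(looks_like_tpm(toy_tpm))
```

A matrix whose columns all sum to some other value, or to different values per cell, would fail this check, which suggests a different normalization was applied.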
António
You need to subtract the pseudocount in order to go from log2(TPM+1) to TPM, by doing (2^log2(TPM+1))-1, where log2(TPM+1) stands for the log2-transformed values of TPM plus a pseudocount of one.

António
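
As a small illustration of that back-transformation, here is a hedged Python sketch (the thread itself works in R/Seurat; the function names `log2_tpm_plus_one` and `back_to_tpm` are invented for this example). It only shows that raising 2 to the stored value and then subtracting the pseudocount recovers the original number.

```python
import math

def log2_tpm_plus_one(tpm):
    """Forward transform: log2 of TPM plus a pseudocount of one."""
    return math.log2(tpm + 1.0)

def back_to_tpm(value):
    """Inverse transform: raise 2 to the value, then subtract the pseudocount."""
    return 2.0 ** value - 1.0

# Round trip on a single hypothetical TPM value
original = 155.8516
stored = log2_tpm_plus_one(original)
recovered = back_to_tpm(stored)
print(stored, recovered)

# Zeros stay zeros, which is the point of using a pseudocount of one
print(back_to_tpm(0.0))
```

Note that applying `back_to_tpm` directly to a value such as 155.8516 gives a number on the order of 10^46, far beyond the one-million column total expected for TPM, which is a quick sign that such values were not stored as log2(TPM+1).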
Sorry António, I was confused. So I got 760257.6 for the columns, and I guess this means it is already TPM?

This is a public dataset, and it was very hard to get a clear answer from the authors.
Thank you for your help :)
If it's public, can you share the link to the data? Then, I can check myself. It's easier this way.
António
Can you share your email address please?
Sorry, is the data not public? If it is, you can just share the link to the paper, database or whatever. I prefer not to share my e-mail (though if you search you might find it!).
If there is no link, just put the data in your Google Drive or Dropbox and share the link (again, assuming that the data is public, you should not be afraid to share it, because it has already been published and is open to anyone).
If the data is not public, I would recommend that you not share it (though I would not do anything with it).
António