I have an expression dataset in TPM values that I want to convert to FPKM. The problem is that I have no idea how to do it and it seems like my google-skills are not good enough to find an answer either.
Does anyone here have any experience with converting TPM values to FPKM?
FPKM does have benefits over TPM.
FPKM values of the same gene in different samples are proportional to the RPM values of that gene, which is very useful for RPM believers, who care more about the mass of a messenger than about its molar concentration when comparing samples. TPM isn't superior to FPKM. Both of them are just estimates that depend on your prior beliefs.
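To make the proportionality concrete, here is a minimal sketch, assuming a single gene of fixed length and made-up read counts (all numbers and function names are illustrative, not from any real data set). Since FPKM is just RPM divided by the gene length in kilobases, the fold change between two samples comes out the same in either unit.

```python
def rpm(count, total_reads):
    """Reads per million mapped reads."""
    return count * 1e6 / total_reads


def fpkm_from_rpm(rpm_value, length_bp):
    """FPKM is RPM divided by the feature length in kilobases."""
    return rpm_value * 1e3 / length_bp


# Hypothetical gene of 2,000 bp measured in two samples (made-up numbers).
length_bp = 2000
rpm_a = rpm(50, 1_000_000)    # sample A: 50 reads out of 1M
rpm_b = rpm(200, 2_000_000)   # sample B: 200 reads out of 2M

fpkm_a = fpkm_from_rpm(rpm_a, length_bp)
fpkm_b = fpkm_from_rpm(rpm_b, length_bp)

# The fold change between samples is identical in RPM and in FPKM.
print(rpm_b / rpm_a, fpkm_b / fpkm_a)
```

The same is not generally true of TPM, because its denominator depends on the length-normalized expression of every other gene in the sample.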
Any conversion will have some error to it. Theoretically, the math is a quick conversion, but realistically, TPM and FPKM data sets are usually produced by different tools with different biases, so the converted values will not match what the other tool would have reported.
Better to try to use both sorts of data, knowing they are simply different.
Maybe log-normalization can get you into similar scales without explicitly worrying about units.
I think the concern is that you may not be able to directly convert them.
For example, I may calculate FPKM from unique alignments from htseq-count and CPM from the sum of transcript counts for the same gene (with estimations that try to correctly assign multi-mapped reads). In that situation, I can't directly convert FPKM from one quantification to match the TPM (or CPM) from the other quantification method.
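It is always harder to convert TPM back to FPKM because TPM has lost information that FPKM still contains: TPM is rescaled so that every sample sums to one million, which discards the sample-specific scaling factor.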
If you really want to do that, you have to find the normalizing constant, i.e., the ratio between FPKM and TPM since these two values are proportional to each other in the same sample.
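As a rough sketch of what that normalizing constant looks like: within one sample, FPKM_i = TPM_i × (sum of all FPKM values) / 10^6. So you need either the sum of FPKM values for that sample or at least one feature whose FPKM and TPM are both known. The function names and numbers below are hypothetical, and this assumes both units come from the same quantification.

```python
def scale_from_fpkm_sum(fpkm_sum):
    """Normalizing constant FPKM/TPM, given the sample's total FPKM."""
    return fpkm_sum / 1e6


def scale_from_reference(tpm_ref, fpkm_ref):
    """Normalizing constant from one feature with both values known."""
    return fpkm_ref / tpm_ref


def tpm_to_fpkm(tpm_values, scale):
    """Rescale every TPM value in one sample by the same constant."""
    return [t * scale for t in tpm_values]


# Made-up sample: TPM sums to 1e6 by definition.
tpm = [500_000.0, 300_000.0, 200_000.0]

# Assumption: the sum of FPKM in this sample is known to be 250,000.
fpkm = tpm_to_fpkm(tpm, scale_from_fpkm_sum(250_000.0))
print(fpkm)
```

The constant differs from sample to sample, so it has to be recovered separately for each one. Without it, TPM alone does not determine FPKM.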
I think everybody has essentially conveyed the right idea: you would typically have those (TPM versus FPKM) with different quantification methods (transcript quantification versus gene quantification).
If you have transcript-level read counts alongside your transcript TPM values, you could sum those counts at the gene level and then calculate FPKM values from them, staying within your transcript quantification. However, those counts involve assigning ambiguous reads between transcripts as well as multi-mapped reads (which could fall between genes). So they won't be the same as FPKM values calculated from unique reads only.
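Here is a minimal sketch of that route, assuming you have estimated counts per transcript and a transcript-to-gene mapping. All identifiers and numbers are made up. Note that the gene length used for FPKM is itself a choice (longest isoform, union of exons, and so on), and the sketch simply takes it as an input.

```python
from collections import defaultdict


def sum_counts_by_gene(transcript_counts, transcript_to_gene):
    """Add up transcript-level counts for each gene."""
    totals = defaultdict(float)
    for transcript, count in transcript_counts.items():
        totals[transcript_to_gene[transcript]] += count
    return dict(totals)


def fpkm_from_counts(gene_counts, gene_lengths_bp):
    """FPKM = count * 1e9 / (length in bp * total counted fragments)."""
    total = sum(gene_counts.values())
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total)
        for gene, count in gene_counts.items()
    }


# Hypothetical inputs.
transcript_counts = {"tx1": 100, "tx2": 300, "tx3": 600}
transcript_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_lengths_bp = {"geneA": 2000, "geneB": 1000}  # assumed gene lengths

gene_counts = sum_counts_by_gene(transcript_counts, transcript_to_gene)
print(fpkm_from_counts(gene_counts, gene_lengths_bp))
```

Here the library size is taken as the sum of the counted fragments. If you use the total number of mapped fragments instead, the values shift by a constant factor within the sample.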
Why would you do this, rather than converting FPKM to TPM? There is no benefit to FPKM over TPM and the FPKM to TPM conversion is trivial.
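For reference, the FPKM to TPM direction is a one-liner per sample: divide each FPKM by the sample's FPKM sum and multiply by one million. A small sketch with made-up values (the function name is just illustrative):

```python
def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm_values)
    return [f / total * 1e6 for f in fpkm_values]


# Made-up FPKM values for three features in a single sample.
tpm = fpkm_to_tpm([10.0, 30.0, 60.0])
print(tpm)
```

This only works if the FPKM values cover every feature in the sample, since the sum is what defines the scaling.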