Dear all, I am new to this, and I'd like suggestions on how I can transform RPM to TPM, or actually to log2(TPM+1), for my whole RNA gene expression file using R.
Here is the header and first row of the data frame, which has 63677 rows and 70 columns (samples subjected to RNA-seq):
Not an answer to your question but a remark: TPM/RPKM/FPKM are often not the best normalization methods... (But I don't know what you are going to do with the data.)
TPM, FPKM, RPKM are calculated from raw read counts. It will be easier to start from raw counts rather than to convert from different calculations.
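To illustrate starting from raw counts, here is a minimal sketch in R of computing TPM and then log2(TPM+1). It assumes you have a count matrix with genes in rows and samples in columns, plus a gene length in base pairs for every gene. The toy numbers, the names `counts` and `gene_length_bp`, and the helper `counts_to_tpm` are all made up for this example.

```r
# Toy raw counts: genes in rows, samples in columns (made-up numbers).
counts <- matrix(
  c(10, 20, 30,   # sample1
     0, 40, 60),  # sample2
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

# Hypothetical gene lengths in base pairs, in the same order as the rows.
gene_length_bp <- c(geneA = 1000, geneB = 2000, geneC = 3000)

# Hypothetical helper: counts -> TPM.
counts_to_tpm <- function(counts, gene_length_bp) {
  stopifnot(nrow(counts) == length(gene_length_bp))
  # Step 1: reads per kilobase (each row is divided by its gene length in kb).
  rate <- counts / (gene_length_bp / 1000)
  # Step 2: scale each sample (column) so that it sums to one million.
  t(t(rate) / colSums(rate)) * 1e6
}

tpm <- counts_to_tpm(counts, gene_length_bp)
log2_tpm <- log2(tpm + 1)
```

Each column of `tpm` sums to one million. If RPM here is simply counts multiplied by a per-sample constant, that constant cancels in step 2, so passing the RPM matrix to the same function with the same gene lengths should give the same TPM values. Gene lengths are required either way.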
If RPM is "reads per million total/mapped reads", it's the same as TPM (transcripts per million) for single-end platforms. For paired-end reads, singletons (only one end is mapped) and paired-mapped read pairs could be treated differently. This depends on how the software handles the numbers. Some tools may treat RPM the same as TPM, some may not. So starting from the raw data is necessary.
If you have mapped read counts from htseq-count or any other tool, you can directly calculate RPKM in R using the following script.
https://github.com/sethiyap/CalculateTpmSingleSample/blob/master/CalculateTpmSingleSample.R
Cheers, Pooja
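For readers who just want the formula, here is a minimal sketch of RPKM from a count matrix. This is not the linked script, only an illustration; the toy numbers and the names `counts`, `gene_length_bp` and `counts_to_rpkm` are invented for the example, and gene lengths in base pairs are assumed to be available.

```r
# Toy raw counts for one sample (made-up numbers), genes in rows.
counts <- matrix(
  c(100, 300, 600),
  ncol = 1,
  dimnames = list(c("geneA", "geneB", "geneC"), "sample1")
)

# Hypothetical gene lengths in base pairs, in the same order as the rows.
gene_length_bp <- c(geneA = 1000, geneB = 2000, geneC = 3000)

# Hypothetical helper: counts -> RPKM.
counts_to_rpkm <- function(counts, gene_length_bp) {
  stopifnot(nrow(counts) == length(gene_length_bp))
  # Library size of each sample, in millions of reads.
  per_million <- colSums(counts) / 1e6
  # Reads per million: divide each column by its library size.
  rpm <- t(t(counts) / per_million)
  # Reads per kilobase per million: divide each row by its length in kb.
  rpm / (gene_length_bp / 1000)
}

rpkm <- counts_to_rpkm(counts, gene_length_bp)
```

Note that here the library size is taken as the column sum of the count matrix; some pipelines use the total number of mapped reads instead, which gives slightly different values.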