We have run a pilot RNA-Seq study and I used the edgeR package to obtain differential expression results. The results contain a gene column along with logCPM, logFC and p-value columns. I have a question about converting the log2-scale values to a linear scale for certain analyses. How can this be accomplished? Is there an R package to do this?
For instance, can I use the base function in R like below:
Linear FC value = 2^(logFC)
Linear CPM value = 2^(logCPM)
rpolicastro, thank you very much for the response. Another point: do I have to include -1 in the formula, i.e. Linear FC value = 2^(logFC-1), if the log2 transformation was performed during the logFC calculation as logFC = log2(matrix+1) to avoid NAs?
No, you absolutely do not need to include -1 in the formula, and it is quite wrong to do that. The logFC values from edgeR are computed using a sophisticated generalized linear model algorithm that takes a lot of things into account. You should not make ad hoc hacks to change the results from edgeR.
The logFC and logCPM values output by edgeR are on a log2 scale, so yes, you can unlog them using the formula you give (as rpolicastro already confirmed in his answer).
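As a minimal sketch of the unlogging step in base R: the data frame `res` below is a made-up stand-in for an edgeR results table (the gene names and values are invented for illustration), and the column names `FC` and `CPM` are my own choice. The last line shows what the proposed -1 would actually do, which is simply to halve every fold change.

```r
# Hypothetical results table with the same logFC / logCPM columns as edgeR output
res <- data.frame(
  gene   = c("geneA", "geneB", "geneC"),
  logFC  = c(1, -2, 0),
  logCPM = c(3, 5, 0)
)

# Unlog by exponentiating with base 2; no extra package is needed
res$FC  <- 2^res$logFC
res$CPM <- 2^res$logCPM
print(res)

# Subtracting 1 in the exponent is the same as dividing the fold change by 2
halved <- 2^(res$logFC - 1)
print(halved)
```

Because `^` is vectorised in R, the same two lines work on a results table of any length.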
Let me say philosophically though that I don't agree that unlogged values can be described as "linear". Expression results are usually best analysed on a log scale and unlogged values do not behave in a linear fashion for most purposes.
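A small base R illustration of that point, using two invented log fold-changes: an up-regulation and a down-regulation of equal size are symmetric around zero on the log scale, but not around 1 once unlogged, so simple averages behave differently on the two scales.

```r
# Two hypothetical log2 fold-changes of equal magnitude and opposite sign
logFC <- c(1, -1)
fc <- 2^logFC                     # unlogged fold changes: 2 and 0.5

mean_log   <- mean(logFC)         # 0: the two changes cancel on the log scale
mean_unlog <- mean(fc)            # 1.25: they do not cancel after unlogging
geo_mean   <- 2^mean(logFC)       # 1: averaging on the log scale, then unlogging

print(c(mean_log = mean_log, mean_unlog = mean_unlog, geo_mean = geo_mean))
```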
Note it is especially important not to try to undo edgeR's logFC moderation, as mediated by the prior.count values, if you are working on an unlogged scale. If you really did want unmoderated fold-changes then you would simply set prior.count=0 in the call to exactTest instead of doing your own ad hoc hacking. I can't imagine how that could be a good idea however. Getting infinite fold-changes from tiny counts (0 vs 1 for example) does not help anyone.
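To see why unmoderated fold-changes are unhelpful for tiny counts, here is a deliberately simplified base R illustration. It is not edgeR's actual calculation, which works through its fitted model and adjusts the prior count for library size; it only adds a small constant to two invented counts, and the value 0.125 is chosen purely for illustration.

```r
# Hypothetical counts for one gene in two groups: 0 versus 1
count_a <- 0
count_b <- 1

# Without any prior count the log fold-change is infinite
raw_logFC <- log2(count_b / count_a)
print(raw_logFC)

# Adding a small constant to both counts keeps the value finite
# (simplified stand-in for moderation, NOT edgeR's algorithm)
prior <- 0.125
moderated_logFC <- log2((count_b + prior) / (count_a + prior))
print(moderated_logFC)
```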
That's correct, you can just exponentiate the values with base 2.