Hello guys,
It may seem like a basic question, but this is causing confusion.
What is the difference between fold change and Log fold change?
Regards,
I find pictures helpful when thinking about these concepts, so I quickly made some visualizations for you. They expand on WouterDeCoster's and h.mon's comments. The figures were made in R with the following code:
#Generate fake data
fake <- data.frame("gene" = as.numeric(c(1:50)), "pre" = as.numeric(seq(1,99,2)), "post"= as.numeric(seq(99,1,-2)))
#Calculate fold change
fake$fold <- fake$post/fake$pre
#Calculate log2 fold change
fake$logfold <- log2(fake$fold)
#Visualize
library(ggplot2)
ggplot(fake, aes(x = gene, y = fold)) + geom_line()
ggplot(fake, aes(x = gene, y = logfold)) + geom_line() + ylab("Log2(FC)")
Let's say you have 50 genes (for the sake of convenience, the genes are called "1", "2", "3", ... , "50") and you measure expression before and after some type of treatment. You then measure the fold change of the genes due to treatment: FC = expression(post) / expression(pre). As russhh indicated, the logFC is simply log2(FC).
In this dataset, half the genes were upregulated and the other half downregulated. Interpreting the untransformed fold change is tricky: it is easy to see that gene 1 had a huge fold change, close to 100, but what about genes 30-50? Their fold changes all lie between 0 and 1, so it is hard to read their values off the plot or to tell by how much they differ from each other.
Performing the log2 transformation scales the data and facilitates the interpretation, as illustrated below.
The differences in expression among genes 30-50 are now much more obvious.
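To put a rough number on that, here is a minimal sketch that rebuilds the same fake data and checks how much of the plotted y-axis range genes 30-50 occupy on each scale. The names `down`, `fold_share` and `logfold_share` are only introduced for this illustration.

```r
# Rebuild the same fake data as in the code above (base R only)
fake <- data.frame(gene = 1:50, pre = seq(1, 99, 2), post = seq(99, 1, -2))
fake$fold <- fake$post / fake$pre
fake$logfold <- log2(fake$fold)

# Keep only genes 30-50 (all downregulated in this fake data)
down <- fake[fake$gene >= 30, ]

# Fraction of the full y-axis range that these genes span on each scale
fold_share <- diff(range(down$fold)) / diff(range(fake$fold))
logfold_share <- diff(range(down$logfold)) / diff(range(fake$logfold))

round(c(fold = fold_share, logfold = logfold_share), 3)
```

On the raw scale these 21 genes are squeezed into well under 1% of the axis, while on the log2 scale they take up close to half of it, which is why they become readable in the second plot.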
If you think that fold-change is the 'expression level' in one set of samples (set A) divided by the 'expression level' in another set (set B), then log-fold-change is the log of that value (typically to base 2).
That is, if FC = A/B, then (using base 2) log_FC = log2(A/B) = log2(A) - log2(B)
and if log_FC = x, then FC = 2^x
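These identities are easy to check in R. The sketch below assumes base 2 and uses two made-up expression levels, `A` and `B`, chosen only so the arithmetic is easy to follow.

```r
# Hypothetical expression levels in set A and set B (made-up numbers)
A <- 80
B <- 20

fc <- A / B          # fold change
log_fc <- log2(fc)   # log2 fold change

# log2(A / B) equals log2(A) - log2(B)
identity_holds <- isTRUE(all.equal(log_fc, log2(A) - log2(B)))

# Raising 2 to the log2 fold change recovers the fold change
round_trip <- isTRUE(all.equal(2^log_fc, fc))

c(fc = fc, log_fc = log_fc)
c(identity_holds = identity_holds, round_trip = round_trip)
```

With these numbers the fold change is 4 and the log2 fold change is 2, and both checks come back TRUE.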
To follow up on the value added by the symmetry in log2-transformed fold changes, you might also look at a volcano plot of features, a common way to represent p-values vs log2-fold change.
For example:
Symmetry makes visual interpretation of this type of plot (comparing and labeling features against fold-change and p-value thresholds) much simpler. One can see immediately which features are both statistically significant and strongly up- or down-regulated.
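If you want to try this yourself, here is a minimal sketch of a volcano plot in base R. Everything in it is simulated: the log2 fold changes and p-values are random numbers, not real results, and the cutoffs (p < 0.05 and |log2FC| >= 1) are just commonly seen example thresholds that I am assuming here.

```r
set.seed(1)

# Simulated data, for illustration only (not real measurements)
n <- 1000
log2fc <- rnorm(n, mean = 0, sd = 1.5)
# Fake test statistic: larger |log2FC| tends to give smaller p-values
z <- log2fc / 0.75 + rnorm(n, sd = 0.5)
pval <- 2 * pnorm(-abs(z))

# Assumed thresholds (pick your own for real data)
p_cut <- 0.05
lfc_cut <- 1

# Classify each feature as up, down or not significant
status <- ifelse(pval < p_cut & log2fc >= lfc_cut, "up",
          ifelse(pval < p_cut & log2fc <= -lfc_cut, "down", "ns"))

# Volcano plot: log2 fold change on x, -log10(p-value) on y
cols <- c(down = "blue", ns = "grey", up = "red")
plot(log2fc, -log10(pval), col = cols[status], pch = 16,
     xlab = "log2(FC)", ylab = "-log10(p-value)")
abline(v = c(-lfc_cut, lfc_cut), h = -log10(p_cut), lty = 2)

table(status)
```

Because the x-axis is log2(FC), the up- and down-regulated cutoffs sit at mirror-image positions (+1 and -1) around zero. On a raw FC axis the same cutoffs would be at 2 and 0.5, and the down-regulated side would be crammed between 0 and 1.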
"fold change" is a bad metric and also the term itself can be ambiguous when speaking to different people, as evidenced by the confused entry in wikipedia https://en.wikipedia.org/wiki/Fold_change
"log2 fold change" is symmetric and unambiguous
Using a log has some advantages here. Without the log, a downregulation gets squeezed between 0 and 1, while an upregulation goes from 1 to infinity. With the log, you get a better scale.
Indeed, centered around zero and symmetrical, instead of centered around one and asymmetrical.
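A quick way to see the symmetry is to take a few fold changes together with their reciprocals and compare how far each sits from "no change" on the two scales. This is only a sketch with hand-picked values. The column names `dist_raw` and `dist_log2` are made up for the example.

```r
# Hand-picked fold changes: each k-fold increase paired with the matching k-fold decrease
fc <- c(1/8, 1/4, 1/2, 1, 2, 4, 8)
log2_fc <- log2(fc)

# Distance from "no change": 1 on the raw scale, 0 on the log2 scale
dist_raw <- abs(fc - 1)
dist_log2 <- abs(log2_fc)

data.frame(fc, log2_fc, dist_raw, dist_log2)

# On the log2 scale, reversing the order gives the same distances
symmetric_log2 <- isTRUE(all.equal(dist_log2, rev(dist_log2)))
symmetric_raw <- isTRUE(all.equal(dist_raw, rev(dist_raw)))
c(symmetric_log2 = symmetric_log2, symmetric_raw = symmetric_raw)
```

An 8-fold decrease and an 8-fold increase end up equally far from zero on the log2 scale (-3 and +3), whereas on the raw scale they are 0.875 and 7 away from one.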
Ty so much WouterDeCoster
Yeah, I think so. If you plot raw FC, large fold changes run off the graph, whereas logFC brings both large and small values onto a scale that is easy to see and understand. If your fold changes are extremely large, you can use a log with a larger base instead of log2.
Log fold change = log(FC)
Usually the log is taken to base 2, so the interpretation is straightforward: a log2(FC) of 1 means expression doubled, and a log2(FC) of -1 means it halved.
See some lengthier discussions here and here.
Thank you so much h.mon, Useful links.
Do you mean fold change?
Hello Ram, sorry!
Yes, Fold change (FC)