Log2 transformation is widely used, but is there a good paper that can be cited as a reference?
11 months ago
JACKY ▴ 160

Applying a log2 transformation to bulk RNA-seq data can achieve a more uniform distribution across samples. This transformation is beneficial because it helps stabilize the variance and compress the range of the data. That reduces the impact of extreme values or outliers, making the data more suitable for various analytical techniques and giving a clearer representation of underlying patterns.
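To make the range compression concrete, here is a minimal sketch. The TPM values are made up for illustration, and the pseudocount of 1 (so that zeros stay defined) is an assumption, a common convention rather than something stated in the post.

```python
import math

def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each non-negative expression value.

    The pseudocount of 1 is an assumed convention that keeps zeros defined.
    """
    return [math.log2(v + pseudocount) for v in values]

# Hypothetical TPM values for one gene across five samples (made-up numbers).
tpm = [0.0, 3.0, 15.0, 255.0, 4095.0]
logged = log2_transform(tpm)

print(logged)                      # [0.0, 2.0, 4.0, 8.0, 12.0]
print(max(tpm) - min(tpm))         # raw range: 4095.0
print(max(logged) - min(logged))   # log2 range: 12.0
```

The largest value no longer dominates the scale: a raw range of 4095 becomes a range of 12 on the log2 scale.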

These are my PI's words. I then checked them with ChatGPT and found that using a log2 transformation for this purpose is indeed a common and valid practice. However, I need a credible academic paper to cite as a reference in my work, and despite my efforts I have not been able to locate a suitable one. Could anyone help me find a reliable publication that I can cite? I would greatly appreciate any help. Thank you!

TPM log2-transformation r • 1.0k views

My personal opinion is that such basic stats knowledge does not require a citation, just as you would not need a citation for the fact that DNA consists of four nucleotides. It is the same level of basic knowledge. If a citation is absolutely necessary, why not take any textbook on basic data analysis, check the section on common data transformation methods, and cite that?


The Earth is flat. Prove me wrong.


Yeah, people have been using log transformations to stabilise variance in heteroskedastic variables for about as long as we've known how to calculate logs. It probably goes back to before the current system of scholarly publishing.
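A small sketch of what variance stabilisation means here, using made-up numbers: two hypothetical genes with the same relative spread but a 100-fold difference in expression level. The function name and the values are illustrative assumptions, not taken from any dataset.

```python
import math
import statistics

def sd_raw_and_log2(values):
    """Return (SD of the raw values, SD of their log2 values); values must be > 0."""
    raw_sd = statistics.stdev(values)
    log_sd = statistics.stdev([math.log2(v) for v in values])
    return raw_sd, log_sd

# Hypothetical replicate measurements: same relative spread, 100x different level.
low = [8.0, 10.0, 12.5]
high = [v * 100 for v in low]

low_raw, low_log = sd_raw_and_log2(low)
high_raw, high_log = sd_raw_and_log2(high)

print(low_raw, high_raw)   # raw SD grows with the expression level
print(low_log, high_log)   # log2 SD is essentially the same for both genes
```

On the raw scale the highly expressed gene has a 100-fold larger standard deviation; on the log2 scale the multiplicative factor becomes an additive shift, so the two standard deviations agree up to floating-point error.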

When it comes to bulk RNA-seq, the papers that are usually cited as the first RNA-seq papers are:

Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008 Jul;5(7):621-8. doi: 10.1038/nmeth.1226. Epub 2008 May 30. PMID: 18516045.

and

Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008 Jun 6;320(5881):1344-9. doi: 10.1126/science.1158441. Epub 2008 May 1. PMID: 18451266; PMCID: PMC2951732.

Both use log-transformed counts without giving it a second thought.


Agreed that you don't need a citation unless you're writing a math paper about a new transformation that outperforms log2. Here are two you can use though:

https://www.jstor.org/stable/3001536?seq=14

https://www.jstor.org/stable/2673623


Box-Cox transformation generally outperforms log2 on skewed datasets, though the differences are often small enough that there is no significant effect on downstream applications. Like in many other areas, the convenience of the tools we use often comes before their objective quality.

Is there ever a good reason to use a log transformation instead of Box-Cox?
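For context on how the two relate: in the standard one-parameter Box-Cox family, the natural log is the special case lambda = 0, and log2 differs from it only by a constant factor. A minimal sketch of that relationship (the function below is a hand-written illustration, not a library API, and the test value is arbitrary):

```python
import math

def box_cox(x, lam):
    """One-parameter Box-Cox transform of a single positive value.

    (x**lam - 1) / lam for lam != 0, and ln(x) for lam == 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

x = 50.0                        # arbitrary positive value for illustration
near_zero = box_cox(x, 1e-6)    # lambda close to 0
natural_log = box_cox(x, 0)     # lambda exactly 0: the natural log
as_log2 = natural_log / math.log(2)

print(near_zero, natural_log)   # nearly identical
print(as_log2, math.log2(x))    # log2 is the lambda = 0 case rescaled
```

So choosing log2 amounts to fixing lambda at 0 rather than estimating it from the data, which also keeps the convenient reading that a difference of 1 is a two-fold change.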
