What is the best way to compare transcriptomes between different conditions?
2
1
6.0 years ago
ch8316f5eyu ▴ 10

My goal is to compare transcriptomes between different conditions. For example, I knocked down (KD) gene A, gene B, and gene C, and I want to know whether the consequence of knocking down gene A is closer to that of gene B or gene C. The first approach I tried was to compare the CPM of KD gene A control, KD gene A, KD gene B control, KD gene B, and so on. But the result was that KD gene A and its control are closest to each other, so I think I should account for the background effect. Next I compared the log2FoldChange values from the DESeq2 results, but that way I lose the p-value information. So, what is the best way to compare transcriptomes from RNA-seq?

RNA-Seq • 1.9k views
2

If you just want to know which of the knockdowns, i.e. B or C, is closer to, say, A, you can do hierarchical clustering on the counts after applying a transform like vst() or rlog() in DESeq2. You can find an example here.
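
To make the idea concrete, here is a minimal sketch of the sample-to-sample distances that hierarchical clustering starts from. It is only an illustration: the counts are made up, the sample names are hypothetical, and log2(CPM + 1) is used as a simple stand-in for vst() or rlog(), which do more than a plain log transform.

```python
from math import log2, sqrt

# Hypothetical raw counts: four genes (same order) in four samples.
counts = {
    "ctrl_A": [500, 120, 30, 900],
    "KD_A":   [100, 130, 35, 880],
    "ctrl_B": [950, 260, 10, 400],
    "KD_B":   [240, 250, 12, 410],
}

def log_cpm(sample_counts):
    """log2(CPM + 1): scale counts to one million reads, then log-transform."""
    library_size = sum(sample_counts)
    return [log2(c / library_size * 1e6 + 1) for c in sample_counts]

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

expr = {name: log_cpm(c) for name, c in counts.items()}

# Distance from KD_A to every other sample; the smallest one is its nearest neighbour.
dist_to_kd_a = {
    name: euclidean(expr["KD_A"], profile)
    for name, profile in expr.items()
    if name != "KD_A"
}
nearest = min(dist_to_kd_a, key=dist_to_kd_a.get)
```

With these toy numbers the knockdown ends up nearest to its own control, which is the pattern described in the question. In R, the equivalent steps would be dist() followed by hclust() on the transformed matrix.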

0

But there is a batch effect: I didn't knock down those genes at the same time. Each KD sample has its corresponding control. Can I just cluster the KD samples without the controls? If I add the control samples, the KD samples cluster with their corresponding controls.

0

I used the first way you mentioned. I'm not confident because I don't know whether it is acceptable. Thank you for your help.

2
6.0 years ago

Then you can either:

  1. Do the hierarchical clustering on the log2FC produced by DESeq2
  2. Batch-correct the entire expression matrix (using sva::ComBat (see section 7 here) or limma::removeBatchEffect (see page 190 here)) and do the hierarchical clustering on the corrected matrix.
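
To illustrate why the second option helps, here is a rough sketch using per-gene, per-batch mean centering. This is a deliberately crude stand-in and not what ComBat or removeBatchEffect actually do (both fit models to the data); the values and sample names are invented, and each batch is assumed to be one experiment containing a knockdown and its matched control.

```python
# Hypothetical log-expression values for three genes (same order in every sample).
batches = {
    "batch_A": {"ctrl_A": [10.0, 8.0, 5.0], "KD_A": [8.0, 8.2, 6.0]},
    "batch_B": {"ctrl_B": [12.0, 9.0, 3.0], "KD_B": [10.2, 9.1, 4.2]},
}

def center_batch(samples):
    """Subtract the per-gene batch mean from every sample in one batch."""
    n_genes = len(next(iter(samples.values())))
    means = [
        sum(profile[g] for profile in samples.values()) / len(samples)
        for g in range(n_genes)
    ]
    return {
        name: [x - m for x, m in zip(profile, means)]
        for name, profile in samples.items()
    }

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

# Centre each batch separately, then pool the corrected samples.
corrected = {}
for samples in batches.values():
    corrected.update(center_batch(samples))

kd_distance_before = euclidean(batches["batch_A"]["KD_A"], batches["batch_B"]["KD_B"])
kd_distance_after = euclidean(corrected["KD_A"], corrected["KD_B"])
kd_to_own_control_after = euclidean(corrected["KD_A"], corrected["ctrl_A"])
```

In this toy case the two knockdowns move much closer together after centering, and KD_A is no longer nearest to its own control. Whether a real batch correction behaves this cleanly depends on the design, so treat it as an illustration of the idea only.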

Btw, for a global comparison of which knockdowns are more or less similar, I would not use p-values (or only the significant features) but rather the entire transcriptome.
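
As a sketch of what a whole-transcriptome comparison of log2FC could look like, the snippet below correlates log2 fold-change vectors between knockdowns. The numbers are made up and cover only six genes; in practice each vector would hold the log2FC of every gene from the corresponding DESeq2 result, in the same gene order. Pearson correlation is just one reasonable choice of similarity here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical log2 fold changes (each KD vs its own control), same genes, same order.
log2fc = {
    "KD_A": [1.2, -0.8, 0.1, 2.0, -1.5, 0.3],
    "KD_B": [1.0, -0.6, 0.0, 1.7, -1.2, 0.4],
    "KD_C": [-0.9, 0.5, 0.2, -0.3, 1.1, -0.2],
}

# Similarity of every other knockdown to KD_A, using all genes.
similarity_to_a = {
    name: pearson(log2fc["KD_A"], values)
    for name, values in log2fc.items()
    if name != "KD_A"
}
closest_to_a = max(similarity_to_a, key=similarity_to_a.get)
```

Because each fold change is already relative to its matched control, the between-experiment batch effect largely cancels out. For clustering, 1 minus the correlation can serve as the distance; in R that would be along the lines of hclust(as.dist(1 - cor(lfc_matrix))), with lfc_matrix being a hypothetical genes-by-knockdowns matrix.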

0
6.0 years ago
bharata1803 ▴ 560

The question in your setting is basically to find which change between treated and control is closer across genes, right? In that case you need to measure the change within each group, and then compare that change across genes. Clustering log2FC is okay, I guess, but I think it will not show any direct relationship, because up- or down-regulation of two genes can be caused by many things.

I think calculating the correlation between the expression of two genes is better. Calculate it using normalized expression, e.g. from the cpm() function in edgeR or the VST from DESeq2.

Why do I think it is better? The correlation between the expression of two genes basically checks whether gene A is affected by gene B or vice versa. If one gene affects another, it will do so in both the control condition and the treatment condition. That means that no matter the condition, there would be an effect of gene A on gene B.

