7 months ago
Fish
Hello, I'm new to RNA-seq data and would like your help with the following.
My goal is to analyze gene-gene co-expression using Pearson correlation between two sample groups (normal vs disease). For example, I want to see if there is a higher correlation between Gene A and Gene B in normal vs disease.
Now my question is:
- Which normalization method should I use to reach this goal? Some forum threads recommend VST, but others discourage it. What is your take, and why?
- If VST works, should I remove genes with low counts before or after VST normalization?
Thank you for your answers in advance!
Please provide links to these discussions. I bet they are about using VST for differential testing, where one should use the raw counts with DESeq2 or edgeR. The idea behind VST is to mitigate the effects of variance: it enables comparison of genes that show different levels of variance (see the quote below from the DESeq2 tutorial). I would use VST for your problem. Having said that, why reinvent the wheel instead of using one of the existing methods/packages?
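To make the comparison in the question concrete, here is a minimal Python sketch of computing the Pearson correlation of one gene pair within each group and then comparing the two coefficients. It assumes the expression values have already been variance-stabilized elsewhere (for example with DESeq2's `vst` in R). The simulated data, the function names `pearson_r` and `compare_correlations`, and the use of a Fisher z-test to compare the two correlations are illustrative assumptions, not something proposed in this thread.

```python
import math
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two expression vectors (one value per sample)."""
    return float(np.corrcoef(x, y)[0, 1])


def compare_correlations(r1, n1, r2, n2):
    """Fisher z-test for the difference between two independent correlations.

    Assumption: the two groups are independent samples and each has more
    than 3 samples. Returns the z statistic and a two-sided p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Simulated, already-transformed expression values (hypothetical data).
rng = np.random.default_rng(0)
n_normal, n_disease = 30, 30

# Normal group: gene B tracks gene A closely.
gene_a_normal = rng.normal(8.0, 1.0, n_normal)
gene_b_normal = gene_a_normal + rng.normal(0.0, 0.3, n_normal)

# Disease group: gene A and gene B are generated independently.
gene_a_disease = rng.normal(8.0, 1.0, n_disease)
gene_b_disease = rng.normal(8.0, 1.0, n_disease)

r_normal = pearson_r(gene_a_normal, gene_b_normal)
r_disease = pearson_r(gene_a_disease, gene_b_disease)
z_stat, p_value = compare_correlations(r_normal, n_normal, r_disease, n_disease)

print(f"r (normal)  = {r_normal:.3f}")
print(f"r (disease) = {r_disease:.3f}")
print(f"Fisher z = {z_stat:.3f}, two-sided p = {p_value:.3g}")
```

With real data, each vector would be one row of the transformed expression matrix restricted to the samples of one group. Testing many gene pairs this way would also call for multiple-testing correction, which is one reason existing co-expression packages may be the more practical route.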