Hi All,
I have a set of five strains affecting a crop, with the corresponding RNA-Seq data and HTSeq counts, from which I have generated the DEGs that are up- and down-regulated in the crop when it is infected with each of these five strains.
For some strains I get a large number of up- and down-regulated genes, and for others only a small number. I want to normalize these values to avoid an interpretation that is biased towards any particular strain.
Is there any literature, or are there any tools, for normalizing the values in this situation so that they are uniform across these five strains?
For your information, I have normalized and removed batch effects prior to calculating DEGs.
Thanks in advance.
Regards, Vijay Narsapuram
How did you remove batch effects, and what evidence did / do you have that batch effects exist? Frequently, I come across researchers who worry too much about batch effects. Attempting to adjust for a batch effect where none exists can mess up your data.
I used ComBat to check for the presence of batch effects and saw that there was no need to perform any batch correction, so I supplied the raw HTSeq counts to the DESeq2 tool in Galaxy.
Okay, so, you did not correct for batch. How does your PCA bi-plot appear? How does the dispersion plot appear? In your own mind, the issue is that there is an imbalance in the number of statistically significantly differentially expressed genes in the comparisons that you have performed, right? How many samples do you have in each group?
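To make the imbalance concrete, it can help to tally the up- and down-regulated genes per comparison under one fixed set of cutoffs. Below is a minimal Python sketch of such a tally. The column names `log2FoldChange` and `padj` follow the DESeq2 results table; the function name, the cutoffs (padj < 0.05, |log2FC| >= 1) and the toy rows are assumptions for illustration only, not values from this thread.

```python
def tally_degs(results, alpha=0.05, min_abs_lfc=1.0):
    """Count up- and down-regulated genes in one comparison.

    results: iterable of dicts with keys "log2FoldChange" and "padj".
    A value of None stands in for NA (e.g. genes filtered out before testing).
    alpha and min_abs_lfc are assumed cutoffs; use the ones from your analysis.
    """
    up = down = 0
    for row in results:
        lfc = row["log2FoldChange"]
        padj = row["padj"]
        # Skip genes without a usable adjusted p-value or fold change.
        if lfc is None or padj is None or padj >= alpha:
            continue
        if lfc >= min_abs_lfc:
            up += 1
        elif lfc <= -min_abs_lfc:
            down += 1
    return {"up": up, "down": down}


# Hypothetical toy table for one strain-vs-control comparison.
toy_results = [
    {"gene": "g1", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "g2", "log2FoldChange": -1.7, "padj": 0.020},
    {"gene": "g3", "log2FoldChange": 0.4, "padj": 0.010},   # significant, small change
    {"gene": "g4", "log2FoldChange": 3.1, "padj": 0.300},   # not significant
    {"gene": "g5", "log2FoldChange": -2.0, "padj": None},   # NA adjusted p-value
]
toy_tally = tally_degs(toy_results)
```

Running the same function on each of the five comparisons gives counts that are at least based on identical thresholds, which is a prerequisite for comparing them.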
Did you follow the standard DESeq2 analysis protocol? You can access the normalized counts from DESeq2 as described here: https://support.bioconductor.org/p/66067/
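For context on what "normalized counts" means here: by default DESeq2 divides each sample's raw counts by a per-sample size factor estimated with the median-of-ratios method. The following is a minimal Python sketch of that idea, not DESeq2's own code; the function names and the toy count matrix are made up for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of raw counts (one per sample).
    Genes with a zero count in any sample are excluded from the estimate,
    because their geometric mean is zero.
    """
    n_samples = len(counts[0])
    kept = [row for row in counts if all(c > 0 for c in row)]
    if not kept:
        raise ValueError("no gene has non-zero counts in every sample")
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in kept]
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's geometric mean.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(kept, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide every count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Toy matrix: 4 genes x 2 samples; sample 2 was sequenced twice as deeply.
counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 3],   # contains a zero, so it is ignored when estimating factors
]
factors = size_factors(counts)
normalized = normalize(counts, factors)
```

Note that this scaling adjusts for differences in sequencing depth between samples; it is not designed to equalize the number of DEGs between comparisons.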
I used the DESeq2 tool in Galaxy (online) to generate my DEGs.