Hello,
I have done a gene expression analysis using DESeq2, and I now have several genes that are differentially expressed between normal and cancer samples. I am trying to work out what causes the differential expression. I also have variant data from exome sequencing of the same samples (I got it from NCBI GEO). To find the cause for a DE gene, I listed all of the TFs for that gene that I could find on the GeneCards website. My method is really simple: I compute the Pearson correlation coefficient between the TF and the target gene, then plot the values in a scatter plot to check for linearity. With this method I found one quite interesting result: one TF correlates strongly with the down-regulation of my target gene (the target gene and the TF gene are both down-regulated). The other TFs didn't show this. I used the assay function on the DESeq2 object to get the normalized read counts, not the raw counts or the log-transformed expression levels.
My question is: is my method acceptable, both biologically and mathematically?
Also, most of the TF and target gene pairs didn't show any strong linear correlation. Do I need to use a non-linear correlation method for them?
Thank you for your suggestion.
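On the question of non-linear correlation, one common option is a rank-based coefficient such as Spearman's, which measures any monotonic relationship rather than only a linear one. The sketch below is only an illustration in plain Python: the expression values are made up, and the names `tf_expr` and `target_expr` are hypothetical rather than taken from the data described above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: the target rises monotonically with the TF, but not linearly.
tf_expr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
target_expr = [2.0 ** v for v in tf_expr]

r_pearson = pearson(tf_expr, target_expr)
r_spearman = spearman(tf_expr, target_expr)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

In this toy case Spearman's coefficient is higher than Pearson's because the relationship is monotonic but curved. A scatter plot is still worth checking, since neither coefficient detects a non-monotonic pattern.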
What did you calculate the Pearson correlation for? That is, which values are you comparing with each other?
I use the values from the DESeq2 object, which I get using the assay function from DESeq2. I use the Pearson correlation coefficient just to check which TFs have a strong linear correlation before I plot them in a scatter plot.
And what value did you use for the TF?
The same. I compare the expression levels from DESeq2 for the target gene and the TF.
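The comparison described in this thread (per-sample expression of each candidate TF against per-sample expression of the target gene) could be sketched roughly as follows. Everything here is an assumption for illustration: the counts are invented, `TF_A` and `TF_B` are hypothetical names, and the log2(x + 1) transform is a swapped-in choice to damp the effect of very large counts, not what was done above, where normalized counts were used directly.

```python
from math import log2, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def log_transform(counts):
    """log2(count + 1); an assumed transform, not the thread's method."""
    return [log2(c + 1) for c in counts]


# Invented normalized counts, one value per sample
# (first three samples normal, last three cancer).
target = [820.0, 790.0, 805.0, 310.0, 295.0, 330.0]
tfs = {
    "TF_A": [410.0, 395.0, 402.0, 150.0, 160.0, 149.0],  # hypothetical TF
    "TF_B": [100.0, 240.0, 90.0, 230.0, 95.0, 250.0],    # hypothetical TF
}

log_target = log_transform(target)
results = {
    name: pearson(log_transform(counts), log_target)
    for name, counts in tfs.items()
}

# Rank the candidate TFs by absolute correlation with the target gene.
ranked = sorted(results.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name}: r = {r:.3f}")
```

Sorting by absolute value keeps strongly negative correlations (a possible repressor) near the top of the list alongside strongly positive ones.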