Hello Everyone,
I have several samples of different types of biological data (mRNA, miRNA and DNA methylation), each with two conditions: normal and tumor.
I would like to run a differential analysis between the two conditions across all samples in order to obtain a p-value for each gene.
I thought of using DESeq, but I could not because my data are already normalized. For example, the mRNA data have been normalized and log2-transformed, as you can see below:
Gene N1 T1 N2 T2
ARHGEF10L 3.3151314 3.2328449 3.2583983 3.4465871
HIF3A 3.0830942 1.9722883 3.2255372 1.5074648
RNF17 -0.7374466 -1.6201573 -1.3785693 -4.2487934
RNF10 3.5662794 3.5837116 3.5824115 NA
From what I read in the DESeq documentation, it needs a table of read counts, which I do not have. So I wonder whether there is a statistical analysis that compares the two conditions (normal and tumor) across all samples in order to obtain a p-value for each gene?
The final result that I would like to obtain is like this:
Gene p-value
ARHGEF10L 0.2342
HIF3A 0.676
RNF17 0.892
RNF10 0.1243
Please, can anyone suggest how to do this in Python?
Thank you very much for all the attention!
Python, Perl, R, whatever: you can do a t-test for each gene to get its p-value, and then you need to adjust the p-values for multiple comparisons with an FDR-controlling method such as Benjamini-Hochberg.
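Since the question asks for Python, here is a minimal sketch of that approach, not a definitive pipeline. It assumes NumPy and SciPy are installed, uses the four example genes from the question, treats the samples as unpaired (Welch's t-test), and applies a hand-written Benjamini-Hochberg adjustment. The helper names `per_gene_ttest` and `benjamini_hochberg` are made up for illustration. If N1/T1 and N2/T2 come from the same patients, a paired test such as `scipy.stats.ttest_rel` would probably be the better choice.

```python
import numpy as np
from scipy import stats

# Log2-normalized values from the question; columns are N1, T1, N2, T2.
genes = ["ARHGEF10L", "HIF3A", "RNF17", "RNF10"]
values = np.array([
    [3.3151314, 3.2328449, 3.2583983, 3.4465871],
    [3.0830942, 1.9722883, 3.2255372, 1.5074648],
    [-0.7374466, -1.6201573, -1.3785693, -4.2487934],
    [3.5662794, 3.5837116, 3.5824115, np.nan],  # NA in the original table
])
is_tumor = np.array([False, True, False, True])


def per_gene_ttest(values, is_tumor):
    """Welch's t-test (normal vs. tumor) for each row; hypothetical helper.

    Missing values are dropped per gene. A gene with fewer than two
    values in either group gets NaN, because no variance can be estimated.
    """
    pvals = np.full(values.shape[0], np.nan)
    for i, row in enumerate(values):
        normal = row[~is_tumor]
        tumor = row[is_tumor]
        normal = normal[~np.isnan(normal)]
        tumor = tumor[~np.isnan(tumor)]
        if len(normal) >= 2 and len(tumor) >= 2:
            pvals[i] = stats.ttest_ind(normal, tumor, equal_var=False).pvalue
    return pvals


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values; NaN entries are left as NaN."""
    pvals = np.asarray(pvals, dtype=float)
    adjusted = np.full(pvals.shape, np.nan)
    ok = ~np.isnan(pvals)
    p = pvals[ok]
    m = p.size
    if m == 0:
        return adjusted
    order = np.argsort(p)
    # p_(k) * m / k for the k-th smallest p-value
    ranked = p[order] * m / np.arange(1, m + 1)
    # Make the adjusted values non-decreasing in rank (running min from the top)
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(ranked, 1.0)
    adjusted[ok] = out
    return adjusted


pvals = per_gene_ttest(values, is_tumor)
padj = benjamini_hochberg(pvals)

print("Gene\tp-value\tadjusted")
for gene, p, q in zip(genes, pvals, padj):
    print(f"{gene}\t{p:.4g}\t{q:.4g}")
```

With only two samples per condition, as in this toy table, the test has very little power, so treat the output as a check that the code runs rather than as a result. On a real dataset you would load the full matrix (for example with pandas) and pass it through the same two steps. If statsmodels is available, `statsmodels.stats.multitest.multipletests(pvals, method="fdr_bh")` does the same adjustment for p-values without NaNs.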
In R, it's straightforward and simple:
Hope this helps.
Once you have the p-values for your data, you may want to look at what Open Targets has in its Platform, e.g. differential RNA expression as one of the pieces of evidence for HIF3A in cancer.