Question

How to get p-value from differential analysis involving normal and tumor samples?

2

8.8 years ago

carlos_marchi ▴ 80

Hello Everyone,

I have several samples for several types of biological data (mRNA, miRNA, and DNA methylation), each with two conditions: normal and tumor.

I would like to run a differential analysis between the two conditions across all samples to obtain a p-value for each gene.

I thought of using DESeq, but I cannot because my data are already normalized. For example, the mRNA data are normalized and log2-transformed, as you can see below:

Gene        N1           T1           N2           T2
ARHGEF10L   3.3151314    3.2328449    3.2583983    3.4465871
HIF3A       3.0830942    1.9722883    3.2255372    1.5074648
RNF17       -0.7374466   -1.6201573   -1.3785693   -4.2487934
RNF10       3.5662794    3.5837116    3.5824115    NA

From what I read in the DESeq documentation, it needs a table of read counts, which I do not have. Is there a statistical test that compares the two conditions (normal and tumor) across all samples and gives a p-value for each gene?

The final result that I would like to obtain is like this:

Gene       p-value
ARHGEF10L  0.2342
HIF3A      0.676
RNF17      0.892
RNF10      0.1243

Can anyone suggest how to do this in Python?

Thank you very much for all the attention!

differential python p-value analysis • 2.0k views

ADD COMMENT • link updated 2.1 years ago by Ram 45k • written 8.8 years ago by carlos_marchi ▴ 80

0

Python, Perl, R, whatever: you can run a t-test for each gene to get a p-value, and then adjust the p-values for multiple comparisons with a method such as FDR (Benjamini-Hochberg).
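Since the question asks for Python, here is a minimal sketch of the per-gene t-test using NumPy and SciPy's `scipy.stats.ttest_ind`. It assumes the column order from the question's table (N1, T1, N2, T2) and uses Welch's unequal-variance test, which is also what R's `t.test` does by default. The names `gene_pvalue`, `normal_cols`, and `tumor_cols` are illustrative, and genes with fewer than two non-missing values in a group get NaN rather than a p-value.

```python
import numpy as np
from scipy import stats

# Log2 expression values copied from the question; columns are N1, T1, N2, T2.
genes = ["ARHGEF10L", "HIF3A", "RNF17", "RNF10"]
values = np.array([
    [3.3151314, 3.2328449, 3.2583983, 3.4465871],
    [3.0830942, 1.9722883, 3.2255372, 1.5074648],
    [-0.7374466, -1.6201573, -1.3785693, -4.2487934],
    [3.5662794, 3.5837116, 3.5824115, np.nan],
])
normal_cols = [0, 2]  # N1, N2
tumor_cols = [1, 3]   # T1, T2


def gene_pvalue(row):
    """Welch t-test p-value for one gene; NaN if a group has < 2 values."""
    normal = row[normal_cols]
    tumor = row[tumor_cols]
    # Drop missing measurements before testing.
    normal = normal[~np.isnan(normal)]
    tumor = tumor[~np.isnan(tumor)]
    if len(normal) < 2 or len(tumor) < 2:
        return float("nan")
    return float(stats.ttest_ind(normal, tumor, equal_var=False).pvalue)


pvalues = {gene: gene_pvalue(row) for gene, row in zip(genes, values)}
for gene, p in pvalues.items():
    print(f"{gene}\t{p:.4g}")
```

With only two samples per group, as in the excerpt, the test has very little power; the real data set would contribute more columns to `normal_cols` and `tumor_cols`.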

In R, it's straightforward and simple:

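# read.delim() assumes a tab-delimited file with a header row
d <- read.delim("mydatafile.txt")

# Columns: 1 = Gene, 2 = N1, 3 = T1, 4 = N2, 5 = T2
pvals <- apply(d, 1, function(x) {
    # Welch t-test: normal (N1, N2) vs. tumor (T1, T2)
    rst <- try(t.test(as.numeric(x[c(2, 4)]),
                      as.numeric(x[c(3, 5)])),
               silent = TRUE)
    # If t.test() fails (e.g. too few non-missing values), report NA
    if (inherits(rst, "try-error")) NA else rst$p.value
})

# Benjamini-Hochberg FDR adjustment
FDRs <- p.adjust(pvals, method = "fdr")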
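For readers working in Python, the FDR step can be sketched with the standard library alone. The function below is a hand-written Benjamini-Hochberg adjustment, which is the procedure R's `p.adjust` applies for `method="fdr"`. The name `bh_adjust` is hypothetical, and excluding missing values (`None`) from the count of tests is a design choice of this sketch. The input uses the illustrative p-values from the question's desired output, not real results.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values; None entries are passed through."""
    indexed = [(p, i) for i, p in enumerate(pvalues) if p is not None]
    m = len(indexed)  # number of tests actually performed
    adjusted = [None] * len(pvalues)
    running_min = 1.0  # also caps adjusted values at 1
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # keeping a running minimum so adjusted values stay monotone.
    for rank, (p, i) in zip(range(m, 0, -1), sorted(indexed, reverse=True)):
        running_min = min(running_min, p * m / rank)
        adjusted[i] = running_min
    return adjusted


genes = ["ARHGEF10L", "HIF3A", "RNF17", "RNF10"]
raw = [0.2342, 0.676, 0.892, 0.1243]  # illustrative values from the question
for gene, p, q in zip(genes, raw, bh_adjust(raw)):
    print(f"{gene}\t{p}\t{q:.4f}")
```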

Hope this helps.

ADD REPLY • link updated 8.8 years ago by Ram 45k • written 8.8 years ago by moxu ▴ 510

0

Once you have the p-values for your data, you may want to look at what the Open Targets Platform offers, e.g. differential RNA expression as one piece of evidence for HIF3A in cancer.

ADD REPLY • link 8.4 years ago by Denise CS ★ 5.2k

score 2 · Answer 1 · 2016-08-22

2

8.8 years ago

Devon Ryan 105k

limma expects normalized, log2-transformed data, so you can use your data with it directly. We've had some luck using it for methylation data as well, though note that you should do the statistics on logit-transformed data.
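As a rough illustration of the logit transform for methylation data, the sketch below converts beta values (proportions between 0 and 1) to the log2 scale. The base-2 logarithm (often called an M-value), the function name `beta_to_m`, and the clipping constant `eps` are assumptions of this sketch; the answer itself only says "logit-transformed".

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Base-2 logit of a methylation beta value, clipped away from 0 and 1."""
    # Clipping avoids log(0) and division by zero for fully (un)methylated sites.
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


for beta in (0.2, 0.5, 0.8):
    print(f"beta={beta}\tM={beta_to_m(beta):.3f}")
```

The transformed values are unbounded and closer to symmetric, which suits t-tests and linear models better than raw proportions.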


score 0 · Answer 2 · 2016-08-24

0

8.8 years ago

carlos_marchi ▴ 80

Hi Devon!

Thank you very much for your ideas! I implemented the t-test and it worked very well! :)
