EdgeR analysis p-value plateau formation
0
0
Entering edit mode
7.6 years ago
Laura • 0

Dear all,

I have performed various sub-phenotype differential expression analysis using EdgeR on RNAseq data. Overall, we had a sample size of 127 samples, the comparison containing the smallest sample sizes contains n= 24 vs n=28 samples.

There is a paper by Ching et al (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4201821/) suggesting that from n=25 in each group EdgeR should exhibit a power of 0.8.

But when looking at the p-values generated by the analysis there is a "plateau formation" see figure below https://postimg.org/image/4suy3i5c7/

Analysis on the same data set just other sub-phenotypes (slightly bigger sample size) works well, see below: https://postimg.org/image/6x62j5645/

Following discussion with the bioinformatics team, the one explanation they could give me is that samples size was the issue but reading up on it more and more, I do not believe this should cause such a clear effect. Although I cannot seem to get to the bottom of what would cause this either.

Has anyone seen this before or know of a different possibility why this might occure? Any help is much appreciated!

Laura

p.s. See p-value distribution plots odd one: https://postimg.org/image/nb4zqi6cb/

good one: https://postimg.org/image/spljz0cct/

EdgeR p-values Power samples size • 2.7k views
ADD COMMENT
0
Entering edit mode

I guess it's due to the multi-testing adjustment. which adjustment method is used by edgeR ?

ADD REPLY
0
Entering edit mode

It uses the Benjamini-Hochberg method. The multiple-testing adjustment method is the same for the 2nd graph though and that worked fine. I performed 3 other analysis using the same script, it is just this one subset that seems to come back odd. Really puzzled by it.

If you manually look at the p-values you see that the all the transcripts have very similar p-values e.g. its not just the q-values that 'level out/plateau'.

ADD REPLY
1
Entering edit mode

could you plot the p-value (before multi-testing correction) distribution for the two analysis.

ADD REPLY
0
Entering edit mode

yes, absolutely. Thank you for your help. I will do so tonight and add the plots.

ADD REPLY
0
Entering edit mode

I have added pictures of the plot the at bottom of the original message (hope this was what you meant anyway).

ADD REPLY
1
Entering edit mode

Sounds like the normalization step might have an error if you have similar p&q values? This is why it forms a plateau. Did you do voom, limma-trend or other type of normalization?

ADD REPLY
0
Entering edit mode

hi, sorry just to clarify the p-q values aren't similar in value they just show a similar trend e.g. a big subset of transcripts had almost identical values. I used TMM to normalize. How could I check if something went wrong there?

Also, I wrote one script and ran all the sub-phenotype analysis against it (with the obvious minor adjustments) so would it make sense that only in one comparison the normalization went wrong but in none of the others? I don't know :-)

ADD REPLY
1
Entering edit mode

In practice, it is a good exercise to evaluate the other available normalization methods since for RNA-seq the library size is a critical piece of the normalization process. For example, voom is used if the library sizes differ by more than 3%. Once you've done this you can choose the best normalization method. To choose the best method you can compare the before and after normalization plots.

ADD REPLY
0
Entering edit mode

Thank you! I will try other normalization methods and see if that 'fixes' the problem! Much appreciate your help!

ADD REPLY

Login before adding your answer.

Traffic: 2078 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6