Question

EdgeR toptags question

10.5 years ago

geneart$$ ▴ 50

Hi all,

I am using edgeR to perform a DE analysis, following this document: essentially edgeR tutorial.pdf.

I ran an exact test on my samples under the common (cmn), tagwise (tgw), and Poisson (poi) dispersion settings.

de.poi <- exactTest( cds , dispersion = 1e-06 , pair = c( "Control" , "Infected" ) )

I also created a topTags table for each of the three: cmn, tgw, and poi.

> resultsTbl.cmn <- topTags( de.cmn , n = nrow( de.cmn$table ) )$table
> resultsTbl.tgw <- topTags( de.tgw , n = nrow( de.tgw$table ) )$table
> resultsTbl.poi <- topTags( de.poi , n = nrow( de.poi$table ) )$table

Here are my questions:

1. My exact test values for cmn and tgw are exactly the same! I plotted a variance plot, and the curves for cmn (solid blue line) and tgw (light blue dots) lie exactly on top of each other, as if superimposed. So I believe I don't have a tagwise dispersion. Has anyone else had this experience?
2. When I try to filter at a significance level of 0.05, I get nothing!

> de.genes.cmn <- rownames( resultsTbl.cmn )[ resultsTbl.cmn$adj.P.Val <= 0.05 ]
> de.genes.tgw <- rownames( resultsTbl.tgw )[ resultsTbl.tgw$adj.P.Val <= 0.05 ]
> de.genes.poi <- rownames( resultsTbl.poi )[ resultsTbl.poi$adj.P.Val <= 0.05 ]
> head(de.genes.cmn)
character(0)
> de.genes.cmn
character(0)
### I do have values in my resultsTbl (see below). Note also that the resultsTbl values for cmn and tgw are exactly the same, as discussed in Q1.
> head(resultsTbl.cmn)
                   logFC logCPM   PValue      FDR
ENSBTAG00000029982  5.80   7.59 5.66e-11 1.30e-08
ENSBTAG00000036418  7.88   5.63 1.56e-06 1.78e-04
ENSBTAG00000036410  1.34  16.34 7.29e-06 5.57e-04
ENSBTAG00000036423  2.29  10.97 7.59e-05 4.34e-03
ENSBTAG00000029762  1.68  10.61 2.66e-04 1.22e-02
ENSBTAG00000037319 -1.92   4.41 3.48e-04 1.33e-02
> de.genes.tgw
character(0)
> head(resultsTbl.tgw)
                   logFC logCPM   PValue      FDR
ENSBTAG00000029982  5.80   7.59 5.66e-11 1.30e-08
ENSBTAG00000036418  7.88   5.63 1.56e-06 1.78e-04
ENSBTAG00000036410  1.34  16.34 7.29e-06 5.57e-04
ENSBTAG00000036423  2.29  10.97 7.59e-05 4.34e-03
ENSBTAG00000029762  1.68  10.61 2.66e-04 1.22e-02
ENSBTAG00000037319 -1.92   4.41 3.48e-04 1.33e-02
> head(resultsTbl.poi)
                    logFC logCPM PValue FDR
ENSBTAG00000036423 1.9671   11.0      0   0
ENSBTAG00000029957 1.1314   15.6      0   0
ENSBTAG00000036410 1.1289   16.3      0   0
ENSBTAG00000029797 0.7093   15.7      0   0
ENSBTAG00000029897 0.3867   15.5      0   0
ENSBTAG00000029804 0.0894   18.3      0   0

Can anyone shed some light on this, please?

Much appreciated,

geneart.

EdgeR Differential-expression • 6.9k views

updated 3.1 years ago by Ram 44k • written 10.5 years ago by geneart$$ ▴ 50

> group <- c(rep("C", 7), rep("T", 8))
> cds <- DGEList( counts , group = group )
> cds <- estimateCommonDisp( cds )

## To estimate the gene-wise (tagwise) dispersions:
## prior.n = 50/(#samples - #groups); for me that is 50/(15-2) = 3.85
## None of the prior.n values work! The error is "unused argument (prior.n)", so I ran it without prior.n.

> cds <- estimateTagwiseDisp( cds )
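For reference, the arithmetic in the comment above can be checked in plain R. This is only a sketch: the relation `prior.df = prior.n * (samples - groups)` and the `prior.df` argument shown in the commented-out call are assumptions about newer edgeR releases, so check `?estimateTagwiseDisp` for the installed version before relying on them.

```r
# Worked arithmetic for the prior.n value quoted above (base R only).
n_samples <- 15   # 7 "C" + 8 "T" samples, as in the group vector above
n_groups  <- 2

resid_df <- n_samples - n_groups   # 13 residual degrees of freedom
prior_n  <- 50 / resid_df          # the tutorial's rule of thumb
round(prior_n, 2)

# Assumption: newer edgeR versions take prior.df rather than prior.n,
# with prior.df = prior.n * (samples - groups).
prior_df <- prior_n * resid_df
prior_df

# Hypothetical call, not run here (needs edgeR and a DGEList named cds):
# cds <- estimateTagwiseDisp(cds, prior.df = prior_df)
```

If that assumption holds, the "unused argument" error would simply mean the installed version no longer accepts `prior.n`.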

## The cds printed after estimateCommonDisp is exactly the same as the one printed after estimateTagwiseDisp. When I ran cdst <- estimateTagwiseDisp( cds ) and printed cdst, it was still the same. The cds is

> de.cmn <- exactTest( cds , pair = c( "C" , "T" ) )
> de.tgw <- exactTest( cds , pair = c( "C" , "T" ) )

# This is what the edgeR tutorial document says: "the codes below give full tables of the adjusted p-values, taking the FDR into consideration". But the exact test table I generated above has no adj-pvalue column.

> resultsTbl.cmn <- topTags( de.cmn , n = nrow( de.cmn$table ) )$table
> resultsTbl.tgw <- topTags( de.tgw , n = nrow( de.tgw$table ) )$table
> resultsTbl.poi <- topTags( de.poi , n = nrow( de.poi$table ) )$table

So as per the suggestion this is what I did:

> de.genes.cmn <- rownames( resultsTbl.cmn )[ resultsTbl.cmn$FDR <= 0.05 ] # command a
> de.genes.tgw <- rownames( resultsTbl.tgw )[ resultsTbl.tgw$FDR <= 0.05 ] # command b

I get 9 gene names, which are exactly the same for both commands a and b above.

When I don't give an FDR cutoff, I get just the first gene name, but repeated over 11 times.

Any suggestions?

Thanks for the quick reply !

geneart

updated 4.9 years ago by Ram 44k • written 10.5 years ago by geneart$$ ▴ 50

de.cmn and de.tgw will always be identical, since you didn't specify which dispersion estimate to use in exactTest().
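A minimal sketch of what specifying the dispersion could look like, assuming edgeR is installed. The counts are simulated purely for illustration (they are not the poster's data), and the gene names, group sizes, and dispersion range are made up. The `dispersion` argument of `exactTest()` accepts `"common"` or `"tagwise"`, so both tests can be run from a single object that holds both estimates.

```r
library(edgeR)

set.seed(1)
n_genes <- 200
group <- factor(rep(c("Control", "Infected"), each = 4))

# Simulated counts (illustrative only): each gene gets its own
# negative-binomial dispersion so the tagwise estimates can differ.
gene_disp <- runif(n_genes, min = 0.05, max = 0.8)
counts <- t(sapply(gene_disp, function(d) {
  rnbinom(length(group), mu = 100, size = 1 / d)
}))
rownames(counts) <- paste0("gene", seq_len(n_genes))

cds <- DGEList(counts, group = group)
cds <- estimateCommonDisp(cds)
cds <- estimateTagwiseDisp(cds)   # the same object now holds both estimates

# Name the dispersion explicitly instead of relying on the default.
de.cmn <- exactTest(cds, pair = c("Control", "Infected"), dispersion = "common")
de.tgw <- exactTest(cds, pair = c("Control", "Infected"), dispersion = "tagwise")

# Compare the p-values obtained under the two dispersion choices.
summary(de.cmn$table$PValue - de.tgw$table$PValue)
```

Calling `exactTest()` twice on the same object with no `dispersion` argument, as in the commands above, runs the same test twice.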

updated 4.9 years ago by Ram 44k • written 10.5 years ago by Devon Ryan 104k

I checked on that, but I still get the same output. Sorry if I am missing something trivial and obvious. Here is what I just did:

> cds <- estimateCommonDisp( cds ) ### the object with the common dispersion is stored as cds
> de.cmn <- exactTest( cds ,  pair = c( "Control" , "Infected" ) )
> resultsTbl.cmn <- topTags( de.cmn , n = nrow( de.cmn$table ) )$table
> de.genes.cmn <- rownames( resultsTbl.cmn )[ resultsTbl.cmn$FDR <= 0.05]
> de.genes.cmn
[1] "ENSBTAG00000029982" "ENSBTAG00000036418" "ENSBTAG00000036410" "ENSBTAG00000036423" "ENSBTAG00000029762" "ENSBTAG00000037319" "ENSBTAG00000029768"
[8] "ENSBTAG00000029918" "ENSBTAG00000029957"

> cdst <- estimateTagwiseDisp( cds )  ## the object with the tagwise dispersions is stored as cdst and used from here on

> de.tgw <- exactTest( cdst , pair = c( "Control" , "Infected" ) )

> resultsTbl.tgw <- topTags( de.tgw , n = nrow( de.tgw$table ) )$table
> de.genes.tgw <- rownames( resultsTbl.tgw )[ resultsTbl.tgw$FDR <= 0.05 ]
> de.genes.tgw
[1] "ENSBTAG00000029982" "ENSBTAG00000036418" "ENSBTAG00000036410" "ENSBTAG00000036423" "ENSBTAG00000029762" "ENSBTAG00000037319" "ENSBTAG00000029768"
[8] "ENSBTAG00000029918" "ENSBTAG00000029957"

However, the outputs are exactly the same.

I don't think I need to use a different name for cds, since it is the DGEList object. Or do I need to change that?

Thanks for your patience :)

geneart

updated 4.9 years ago by Ram 44k • written 10.5 years ago by geneart$$ ▴ 50

Why not just look at cds$tagwise.dispersion and cds$common.dispersion? If they're more or less the same, then things are DE regardless of the dispersion estimate. If cds has tagwise dispersions, then exactTest() will use them unless you tell it otherwise. If the cds you used at the beginning of this reply didn't have tagwise dispersions then, again, the results are just the same regardless.
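As a rough illustration of that check, assuming edgeR is installed, the sketch below builds a DGEList from simulated counts (made-up data, using the 7 vs 8 group layout mentioned earlier in the thread) and then inspects the two dispersion slots side by side.

```r
library(edgeR)

set.seed(2)
n_genes <- 300
group <- factor(rep(c("C", "T"), times = c(7, 8)))

# Simulated counts, for illustration only.
counts <- matrix(rnbinom(n_genes * length(group), mu = 50, size = 4),
                 nrow = n_genes)

cds <- DGEList(counts, group = group)
cds <- estimateCommonDisp(cds)
cds <- estimateTagwiseDisp(cds)

# A single number shared by all genes.
cds$common.dispersion

# One value per gene; NULL here would mean no tagwise estimates are stored.
is.null(cds$tagwise.dispersion)
summary(cds$tagwise.dispersion)

# How far the tagwise values spread around the common value.
range(cds$tagwise.dispersion / cds$common.dispersion)
```

If the ratios in the last line are all close to 1, near-identical test results under the two settings would not be surprising.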

10.5 years ago by Devon Ryan 104k

Adding to the reply I just posted: if everything looks fine based on what I posted, could it be that in my sample set just those genes are differentially expressed under both the tagwise and the common dispersion? I am not sure I phrased that correctly, but I wanted to confirm whether the variability between the cmn and tagwise dispersions is significant in just those genes.

Thanks again

geneart.

updated 3.1 years ago by Ram 44k • written 10.5 years ago by geneart$$ ▴ 50

score 0 · Answer 1 · 2014-06-05

10.5 years ago

Devon Ryan 104k

1. You'll have to post the commands used and the resulting diagnostic images to get feedback. It's likely that you just mistyped something at some point.
2. resultsTbl.cmn$adj.P.Val doesn't exist; you want resultsTbl.cmn$FDR.
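To see why the missing column gives `character(0)` rather than an error, here is a small base-R sketch. It rebuilds the six rows of `resultsTbl.cmn` shown in the question as a plain data frame (no edgeR needed). Asking a data frame for a column that does not exist returns `NULL`, and comparing `NULL` to a number gives an empty logical vector, so the subset is empty.

```r
# First six rows of resultsTbl.cmn, copied from the question.
resultsTbl.cmn <- data.frame(
  logFC  = c(5.80, 7.88, 1.34, 2.29, 1.68, -1.92),
  logCPM = c(7.59, 5.63, 16.34, 10.97, 10.61, 4.41),
  PValue = c(5.66e-11, 1.56e-06, 7.29e-06, 7.59e-05, 2.66e-04, 3.48e-04),
  FDR    = c(1.30e-08, 1.78e-04, 5.57e-04, 4.34e-03, 1.22e-02, 1.33e-02),
  row.names = c("ENSBTAG00000029982", "ENSBTAG00000036418",
                "ENSBTAG00000036410", "ENSBTAG00000036423",
                "ENSBTAG00000029762", "ENSBTAG00000037319")
)

# The column does not exist, so $ quietly returns NULL.
is.null(resultsTbl.cmn$adj.P.Val)

# NULL <= 0.05 is logical(0), which selects no row names at all.
wrong <- rownames(resultsTbl.cmn)[resultsTbl.cmn$adj.P.Val <= 0.05]
wrong

# Filtering on the column that is actually there works as intended.
right <- rownames(resultsTbl.cmn)[resultsTbl.cmn$FDR <= 0.05]
right

# A cheap guard against this kind of silent failure.
stopifnot("FDR" %in% colnames(resultsTbl.cmn))
```

Checking `colnames()` on the table before filtering catches this kind of mismatch early.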

10.5 years ago by Devon Ryan 104k