Question

Which one is better and why: edgeR or DESeq2?

9.1 years ago

kanika.151 ▴ 160

Hello,

There have been several posts comparing edgeR, DESeq2 and cuffdiff2. I wanted to ask: how does one remove the selection bias from it?

Are there any papers that state a preference among these tools after performing selection bias removal?

I do find papers that tell me to use goseq for selection bias removal, but that won't be appropriate for a de novo assembly, as it needs GO terms.

Are there any methods to remove selection bias from edgeR results?

RNA-Seq cuffdiff edgeR DESeq2 • 3.4k views

updated 2.4 years ago by Ram 44k • written 9.1 years ago by kanika.151 ▴ 160

Are you talking about the selection bias of the comparison papers or selection bias of something else? I ask because you start talking about de novo assembled genomes and GO terms (selection bias makes sense when considering GO terms, but largely not elsewhere).

9.1 years ago by Devon Ryan 105k

http://www.biomedcentral.com/1472-6750/15/89

This paper uses goseq to remove selection bias, which doesn't fit well with my data, as I don't have a reliable reference genome or annotation data.

Can I remove selection bias without doing gene enrichment analysis from the results obtained from edgeR, DESeq or cuffdiff?

updated 5.1 years ago by Ram 44k • written 9.1 years ago by kanika.151 ▴ 160

score 5 · Accepted Answer · 2015-12-15

tldr: selection bias isn't relevant to you in your current context.

Selection bias in this context refers to your ability to measure a change in X dependent upon the presence/absence/level of Y. Taking GO enrichment as an example, using all of the genes in a group for testing only makes sense if you're meaningfully measuring all of them...which isn't the case in RNAseq (unless they all happen to be expressed in whatever tissue/condition you're looking at).
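To make the GO example concrete, here is a rough sketch (Python, standard library only) of how the choice of background changes an enrichment p-value. It is not how goseq works internally. It uses a plain one-sided hypergeometric test, and the gene names, set sizes and the `hypergeom_enrichment_p` helper are all made up for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, category_size, de_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    universe_size: number of genes in the background
    category_size: number of background genes annotated to the category
    de_size:       number of differentially expressed (DE) genes
    overlap:       number of DE genes annotated to the category
    """
    upper = min(category_size, de_size)
    total = comb(universe_size, de_size)
    tail = sum(
        comb(category_size, k) * comb(universe_size - category_size, de_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: 200 annotated genes, of which only 50 are expressed
# (and therefore measurable) in the tissue being studied.
all_genes = {f"g{i}" for i in range(200)}
expressed = {f"g{i}" for i in range(50)}

# A category with 20 genes, only 10 of which are expressed.
category = {f"g{i}" for i in range(10)} | {f"g{i}" for i in range(100, 110)}

# 10 DE genes (necessarily all expressed), 5 of them in the category.
de_genes = {f"g{i}" for i in range(5)} | {f"g{i}" for i in range(10, 15)}

overlap = len(de_genes & category)

# Naive background: every annotated gene, measurable or not.
p_naive = hypergeom_enrichment_p(
    len(all_genes), len(category), len(de_genes), overlap
)

# Restricted background: only genes that could have been called DE at all.
p_restricted = hypergeom_enrichment_p(
    len(expressed), len(category & expressed), len(de_genes), overlap
)

print(f"naive background:      p = {p_naive:.4g}")
print(f"restricted background: p = {p_restricted:.4g}")
```

In this toy setup the naive background gives the smaller p-value, i.e. it overstates the enrichment, because it counts genes that could never have been detected as DE.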

For differential gene expression, for which DESeq2/edgeR/cuffdiff2/etc. are typically used, selection bias isn't a coherent concept. There are, however, other biases that may or may not be important to you. One very common bias is sequencing depth, which is why all of these packages use some at least vaguely robust normalization method (e.g., TMM in edgeR). A less common one is GC bias, which can vary by sample and can cause some pretty funky results when it occurs (see the cqn package in R for a method to deal with this). You might also look at the "alpine" Bioconductor package from Mike Love or Salmon from Rob Patro for some other examples of biases in RNAseq data and how to compensate for them (should they be relevant to you).
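As a rough illustration of what depth normalization does, here is a minimal Python sketch of the median-of-ratios idea (the approach associated with DESeq-style size factors). It is not edgeR's TMM and not the actual code of any of these packages; the `size_factors` helper and the toy counts are assumptions made up for the example.

```python
from math import exp, log
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: list of genes, each a list of raw counts, one per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes with non-zero counts in every sample, so logs are defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has non-zero counts in all samples")

    # Log of the per-gene geometric mean across samples (a pseudo-reference).
    log_geo_means = [sum(log(c) for c in row) / n_samples for row in usable]

    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to the pseudo-reference.
        log_ratios = [log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)]
        # The median makes the estimate robust to a few strongly DE genes.
        factors.append(exp(median(log_ratios)))
    return factors


# Toy data: sample 2 was sequenced exactly twice as deeply as sample 1.
toy_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # ignored: contains a zero
]

sf = size_factors(toy_counts)
normalized = [[c / f for c, f in zip(row, sf)] for row in toy_counts]
print("size factors:", sf)
print("depth ratio sample2/sample1:", sf[1] / sf[0])
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so a gene is not called differentially expressed just because one library was sequenced more deeply.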