Selecting the consensus/overlapped genes from the DE study
6.2 years ago
glady ▴ 320

Hello everyone, I have 9 RNA-seq samples (human), each with 3 replicates. I performed read mapping with STAR and quantification with RSEM. The differential expression (DE) analysis was performed with EBSeq, DESeq2 and limma. To keep our downstream analysis as stringent as possible, we decided to select the genes that overlap between these 3 methods. We got a good overlap (57.3%) between them.

My questions are: 1) Can we go ahead with the overlapping genes? Is this scientifically sound?

2) Or should I just select one of the three tools and go ahead with it? If yes, then why?
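For anyone wanting to reproduce the overlap step, here is a minimal sketch. It assumes each tool's significant genes have already been exported as a list of gene identifiers, and it assumes the overlap percentage is defined as shared genes divided by the union of all lists (the post does not say which definition was used). The gene names and the function name `consensus_overlap` are made up for illustration.

```python
def consensus_overlap(gene_lists):
    """Return (genes called by every tool, shared fraction of the union).

    gene_lists maps a tool name to its list of significant gene IDs.
    The fraction is intersection / union, which is an assumed definition.
    """
    sets = [set(genes) for genes in gene_lists.values()]
    shared = set.intersection(*sets)
    union = set.union(*sets)
    fraction = len(shared) / len(union) if union else 0.0
    return shared, fraction


# Toy, hypothetical gene lists (not from the poster's data).
de_genes = {
    "EBSeq": ["TP53", "MYC", "EGFR", "BRCA1"],
    "DESeq2": ["TP53", "MYC", "EGFR", "KRAS"],
    "limma": ["TP53", "MYC", "PTEN", "EGFR"],
}

shared, fraction = consensus_overlap(de_genes)
print(sorted(shared), fraction)
```

Note that the percentage changes a lot depending on the denominator (union, smallest list, or one reference tool), so it is worth stating the definition in the methods.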

RNA-Seq • 2.0k views

DESeq2 and limma-voom are, in my experience, the most reliable tools. Taking the overlap between different methods will mainly select for genes that are more strongly DE.


Is it okay to go ahead with the overlaps? Even though I'm getting a good intersection (65%) between limma & DESeq2, the way read counts are normalized in limma and in DESeq2 is different.

I hope this doesn't create a problem for the reviewers.


Well, if you have genes with substantial changes (in statistical significance and/or expression level), then almost all of the methods will pick them up. The choice of method matters for genes that are in the twilight zone. Some methods are more sensitive for certain kinds of studies, and others for other kinds. glady, look at the manuscripts in your field, see which method is most widely (and effectively) used, and use that. In addition, using different methods is one thing; getting the approach accepted by the scientific community is another.


He has only three biological replicates for each treatment, so there is a good chance a reasonable proportion of his results are in the twilight zone.


Most of the genes are in the twilight zone. The intersection between the three tools is around 58%, while the intersection between limma & DESeq2 is 65%.

6.2 years ago
h.mon 35k

My statistical skills are just rudimentary, so take the advice below with a grain of salt:

Although there are several papers using "ensemble" methods for various tasks and showing they perform better than any single tool, I am not aware of whether this has already been done for RNA-seq. My feeling is that such a method would alter the nominal FDR and the statistical power (if you knew your statistical power beforehand) in non-obvious ways. This may or may not be a problem, depending on what you want to do downstream.

Did you check the literature to see if EBSeq, DESeq2 and limma are good tools, i.e., that they control the false positive rate as reported and have good sensitivity? There is no point in including a tool that calls incorrect results.

My suggestion would be to use one tool, chosen before performing the analysis. Now that you have already run the analysis with three tools, you risk p-hacking by choosing the most "interesting" or "biologically plausible" results. If you want to choose one tool now, either do so randomly or review the literature and choose the best one according to it, not according to your results at hand.


Thank you for your reply.

DESeq2 & limma are good tools; they produce lower rates of false positives compared to the others. This is according to the literature, not from my results. However, I have observed the same in my data as well.


Personally, for RNA-seq, having looked at the methods behind each 'tool', I don't feel comfortable using any method other than DESeq2. It's an intelligent method by an intelligent group of people that does better than any other at modelling biases that exist in RNA-seq. For microarray, I only use limma, which is the supreme method in that realm.

As such, I would just use DESeq2 and set cut-offs for fold change and FDR-adjusted P value accordingly.
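A rough sketch of that cut-off step, assuming the DESeq2 results table has been exported to rows with its usual `log2FoldChange` and `padj` columns. The `gene` key, the function name `filter_de`, the toy rows and the default cut-offs (|log2FC| >= 1, padj < 0.05) are illustrative assumptions, not recommendations from this thread.

```python
def filter_de(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Keep genes passing both an absolute log2 fold change and an FDR cut-off.

    results is a list of dicts with keys 'gene', 'log2FoldChange', 'padj'.
    Rows whose padj is None (exported NA values) are skipped.
    """
    kept = []
    for row in results:
        padj = row["padj"]
        if padj is None:
            continue
        if abs(row["log2FoldChange"]) >= lfc_cutoff and padj < padj_cutoff:
            kept.append(row["gene"])
    return kept


# Toy, hypothetical rows (not real results).
toy_results = [
    {"gene": "G1", "log2FoldChange": 2.3, "padj": 0.001},
    {"gene": "G2", "log2FoldChange": -1.7, "padj": 0.03},
    {"gene": "G3", "log2FoldChange": 0.4, "padj": 0.0001},  # too small a change
    {"gene": "G4", "log2FoldChange": 3.1, "padj": 0.2},     # not significant
    {"gene": "G5", "log2FoldChange": 1.9, "padj": None},    # NA adjusted p-value
]

print(filter_de(toy_results))
```

Whatever cut-offs are chosen, fixing them before looking at the gene lists avoids the bias h.mon describes.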

Having said this, it's not bad science to take the consensus of the lists from different tools. Just make it clear in your methods what you are doing. Also, be aware of your own internal biases when doing this, as h.mon has alluded to.


Are you trying to validate results for further analyses, so you want the most stringent set? Then I believe it is fine to take the intersection of the results from (reliable) tools. If your only concern is the reviewers, then it would be simpler to stick to the results of one tool.


Yes, you are right. I wanted to keep the results as stringent as I can for the downstream analysis.

5.5 years ago

Adding to the previous answer and comments:

1) This paper provides arguments in favour of the consensus approach: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0190152. They benchmarked all three methods that you mention (and a couple more) and show that, given the datasets that they used:

"the consensus among five DEGs identification methods guarantees a list of DEGs with great accuracy, indicating that the combination of different methods can produce more suitable results. The consensus option is also included for use in the available software"

Specifically, see Table 5 for consensus and Table 4 for individual methods.

2) Using just DESeq2 may be the least controversial choice, but this depends on the goal of the study, e.g. which is greater: the "cost" of reporting a false positive (then use approach 1, the consensus) or of a false negative (then use approach 2, a single tool). For example, my intuition is that (1) would be better when looking for biomarkers (if the cost of further testing is high) and (2) would be better for exploratory studies.

Also note that while limma-voom is a very good tool for RNA-seq, the plain old limma (which was designed for microarrays) is not.

PS. I am curious to learn what you choose to do in the end.


As they state in the paper you linked, it is expected that taking the overlap of different tools improves the absolute accuracy measures. As I mentioned in my original comment, you will select genes whose differential expression is simply more significant. However, you can also do that by lowering the FDR threshold within the results of a single tool. I don't know if the paper uses any accuracy measures that take the total number of selected genes into account...
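One way to check this on your own data is to tighten the cut-off in a single tool until it returns as many genes as the consensus list, then compare the two sets. The sketch below assumes a mapping from gene to adjusted p-value for one tool; the function name `top_n_by_padj` and all values are hypothetical toy data, and ties at the cut-off are not handled specially.

```python
def top_n_by_padj(padj_by_gene, n):
    """Return the n most significant genes and the implied padj cut-off.

    The cut-off is the largest adjusted p-value among the selected genes.
    """
    ranked = sorted(padj_by_gene.items(), key=lambda item: item[1])
    top = ranked[:n]
    cutoff = top[-1][1] if top else 0.0
    return [gene for gene, _ in top], cutoff


# Toy, hypothetical adjusted p-values from a single tool.
single_tool_padj = {"A": 0.001, "B": 0.004, "C": 0.02, "D": 0.03, "E": 0.045}
# Toy, hypothetical consensus list from intersecting several tools.
consensus = {"A", "B", "D"}

genes, cutoff = top_n_by_padj(single_tool_padj, len(consensus))
agreement = consensus & set(genes)
print(genes, cutoff, sorted(agreement))
```

If the size-matched single-tool list and the consensus list are nearly identical, the consensus is mostly acting as a stricter threshold; if they differ a lot, the tools disagree about gene ranking, not just about stringency.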


I get the sentiment and agree that in an ideal scenario that would be the case, but I am not convinced that lowering the threshold will always have the same effect, especially if, instead of taking the overlap after thresholding, one chooses to (wisely) combine p-values. What do you think?
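As one illustration of what combining p-values could look like, here is a sketch of Fisher's method for a single gene. Fisher's method is only one possible choice, picked here as an assumption since no method was named. It also assumes independent p-values, which does not hold for several tools run on the same counts, so the combined value will tend to be too optimistic and should be read as a ranking score rather than a calibrated p-value. The function name is made up for illustration.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine p-values with Fisher's method (assumes independence).

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    upper tail has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half = -sum(math.log(p) for p in pvalues)  # equals statistic / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Toy, hypothetical raw p-values for one gene from three tools.
print(fisher_combined_pvalue([0.04, 0.01, 0.20]))
```

The combined values would still need multiple-testing correction across genes, and a method that accounts for dependence between tools would be a safer basis for any reported FDR.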
