In Gene Ontology and KEGG analysis, why are there two cutoff values: the p-value and the -log10(FDR) value?
What is the role of these two cutoffs?
Could a bioinformatician explain this to a wet-lab person?
I'm going to link to another of my previous answers that explains the logic behind enrichment analyses. Both p-values and false discovery rates (FDR) are reported because of the statistical issues that arise from multiple testing. In short, each p-value is computed for a single test in isolation, which can be misleading when a large number of tests (one per GO term or pathway) are actually performed.
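To make the multiple-testing problem concrete, here is a minimal sketch of the arithmetic. It assumes the tests are independent and that no term is truly enriched, which is a simplification (GO terms overlap heavily); the function names are made up for illustration.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the cutoff by chance alone."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical enrichment run testing 1000 terms at a raw p-value cutoff of 0.05
print(expected_false_positives(1000, 0.05))
print(prob_at_least_one_false_positive(20, 0.05))
```

Under these assumptions, testing 1000 terms at p < 0.05 would be expected to yield about 50 "significant" terms even when nothing is really enriched, and with only 20 tests the chance of at least one false positive is already above 60%.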
Multiple test correction methods, FDR among them, try to correct for this issue to give adjusted values that can be compared to whatever threshold (or alpha) you choose, typically 0.05 or 0.01. The chosen threshold is arbitrary, but those values are generally regarded as stringent enough to be relatively believable. The two kinds of value mean different things, though. A raw p-value of 0.05 means that a result at least this extreme would occur in 5% of cases by random chance if there were no real enrichment. An FDR (or q-value) cutoff of 0.05 means that about 5% of all the terms you call significant are expected to be false positives. Using lower alpha values should reduce the false positive rate, with the drawback of increasing the false negative rate. This short paper has a simple example that is fairly easy to follow along with additional context.
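As a rough illustration, the sketch below applies the Benjamini-Hochberg procedure, one common way of controlling the FDR, to a list of made-up p-values. This is an assumption for the example: your GO/KEGG tool may use a different correction. It also shows that -log10(FDR) is only a rescaling of the adjusted value, usually used for plotting, so that smaller FDRs become larger numbers.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for ten enrichment terms (not real data)
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
fdrs = benjamini_hochberg(pvals)

for p, q in zip(pvals, fdrs):
    # -log10 turns a small FDR into a large, easy-to-plot number
    print(f"p={p:.3f}  FDR={q:.3f}  -log10(FDR)={-math.log10(q):.2f}")

raw_hits = sum(p < 0.05 for p in pvals)
fdr_hits = sum(q < 0.05 for q in fdrs)
print(raw_hits, fdr_hits)
```

With these invented values, five terms pass a raw p < 0.05 cutoff but only two pass FDR < 0.05. On the transformed scale, FDR < 0.05 corresponds to -log10(FDR) greater than about 1.3.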
Thank you very much :)