I have performed differential expression testing using FindMarkers in Seurat in R. I was hoping to find out which genes are upregulated in the mutant vs wild type and vice versa.
- The first dilemma I am having is what log fold change to use as my cutoff. Initially, the plan was to use a log2 fold change greater than 1 or less than -1, so I am looking for genes with at least a two-fold change (2^1 = 2). But then my PI preferred that we pick a gene of interest and set our cutoff there for the downregulated list, while the upregulated list would still use LFC > 1.
Is this a valid take? I am worried that the inconsistency in the choices will have people questioning my research.
- The second dilemma I am having is the p-value. I am used to choosing a p-value of less than 0.05 as the threshold for statistical significance, as other researchers do. However, my PI is complaining that there are too many genes, so for the downregulated list he wants to use the adjusted p-value, and for the upregulated list the unadjusted p-value. Again, is this valid? Wouldn't the inconsistency in choices invite questions? What is the difference between the p-value and the adjusted p-value, and which is best to use?
I thank you for taking your time to provide your expertise.
I am going to provide a somewhat different point of view, which can be summarized by my response to this paragraph: within reason. Nobody will argue if you pick abs(logFC) >=1. You will get an argument if you pick abs(logFC)<0.5, no matter what your biological reasoning is.
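For concreteness, here is a minimal base-R sketch of what a symmetric cutoff looks like on a FindMarkers-style table. The data frame is a made-up stand-in, not real output; the column names `avg_log2FC`, `p_val` and `p_val_adj` assume a recent Seurat version (older versions call the first one `avg_logFC`), and the cutoff values are only placeholders.

```r
# Toy stand-in for a FindMarkers() result; all values are invented for illustration.
markers <- data.frame(
  gene       = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE"),
  avg_log2FC = c(2.1, -1.4, 0.7, -0.6, 1.2),
  p_val      = c(1e-8, 2e-6, 1e-4, 3e-2, 4e-2),
  p_val_adj  = c(1e-4, 2e-2, 1, 1, 1),
  stringsAsFactors = FALSE
)

lfc_cutoff  <- 1     # same magnitude for the up and down lists
padj_cutoff <- 0.05  # same adjusted p-value cutoff for both lists

# Apply one significance filter, then split by direction of change.
significant <- markers$p_val_adj < padj_cutoff
up   <- markers[significant & markers$avg_log2FC >=  lfc_cutoff, ]
down <- markers[significant & markers$avg_log2FC <= -lfc_cutoff, ]

print(up$gene)
print(down$gene)
```

Whatever values you end up choosing, the point is that `lfc_cutoff` and `padj_cutoff` are defined once and used for both lists.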
Requiring `abs(logFC) < 0.5` is actually very useful, particularly if coupled with an interval null hypothesis to look for no change (`altHypothesis="lessAbs"` in DESeq2). Or do you mean that you shouldn't have, e.g., `abs(logFC) >= 0.1`? I agree that there are some people who would argue, but I'd argue this can be appropriate in some cases, particularly when you have a lot of samples. With enough samples the null hypothesis `logFC == 0` is always wrong, and will always be rejected for some sufficiently large n. Requiring `abs(logFC) >= 0.1` is effectively saying that my null hypothesis is that logFC is approximately 0, rather than exactly 0. It can also be directly useful if you are, for example, studying ultrasensitive bistable switches.

My main point was that instead of "you can set threshold where you think is reasonable" it should be "you can set threshold where you think is reasonable AND one that reviewers will accept". There are good reasons to select thresholds smaller than `abs(logFC) = 1`, but among those I wouldn't count: 1) I don't have enough differentially expressed genes, so I will lower the threshold; 2) I found an interesting protein at `logFC = 0.7` that I am convinced should be differentially expressed, so now I will pretend that 0.7 was a good threshold all along.
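To illustrate the point about large n and the interval null, here is a rough base-R sketch. It uses a simple Wald-style normal approximation, which is an assumption made for illustration and not a claim about how DESeq2 or Seurat compute their p-values. The function `threshold_pvalue`, the estimate of 0.05, the unit variance and the sample sizes are all hypothetical.

```r
# Wald-style p-value for H0: |logFC| <= tau versus H1: |logFC| > tau.
# With tau = 0 this reduces to the ordinary two-sided test of logFC == 0.
threshold_pvalue <- function(lfc, se, tau = 0) {
  z <- (abs(lfc) - tau) / se
  pmin(1, 2 * pnorm(z, lower.tail = FALSE))
}

lfc_hat <- 0.05               # hypothetical estimated log fold change
n       <- c(20, 200, 20000)  # hypothetical samples (or cells) per group
se      <- sqrt(2 / n)        # standard error assuming unit variance in each group

p_point    <- threshold_pvalue(lfc_hat, se, tau = 0)    # null: logFC exactly 0
p_interval <- threshold_pvalue(lfc_hat, se, tau = 0.1)  # null: |logFC| <= 0.1

print(data.frame(n = n, p_point = p_point, p_interval = p_interval))
```

With the same tiny estimated effect, the point-null p-value shrinks as n grows and becomes very small at the largest n, while the thresholded p-value stays at 1, because an estimate of 0.05 never exceeds the 0.1 threshold.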