Hi, I would like experts' feedback on a question. Can we filter out all the non-differentially expressed proteins before performing differential expression analysis on a proteomics dataset? Let me describe in detail what I am trying to achieve.
We have mass spec global proteomics data of ~5000 proteins for 15 cases: 9 in cond-A and 6 in cond-B. Using the log2-normalized data, I performed differential expression analysis with limma lmFit. About 250 proteins had a p-value < 0.05, but their FDR-adjusted p-values are high because of the p-value distribution across most of the proteins (tests). Hence I decided to filter the input matrix to keep only those ~250 proteins (p-value < 0.05) and ran the differential expression test again, which gave ~150 proteins with adjusted p-value < 0.05.
So my simple question is: is it technically valid to filter out all the non-differentially expressed proteins before performing differential expression analysis on a proteomics dataset?
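To make the two-pass procedure described above concrete, here is a minimal sketch in plain Python rather than limma. It rests on several assumptions that are not from my actual analysis: the adjustment is taken to be Benjamini-Hochberg, the p-values are simulated as uniform draws (i.e. no protein truly differs between conditions), and the second pass reuses the same raw p-values instead of refitting a model, so a real limma rerun would give somewhat different numbers. The names `bh_adjust`, `N_PROTEINS` and `ALPHA` are made up for illustration.

```python
import random


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


random.seed(0)
N_PROTEINS = 5000  # hypothetical size, matching the ~5000 in the post
ALPHA = 0.05

# Assumption: pure-null data, so raw p-values are uniform on [0, 1).
null_pvals = [random.random() for _ in range(N_PROTEINS)]

# Pass 1: adjust across all tests.
hits_first_pass = sum(q < ALPHA for q in bh_adjust(null_pvals))

# Pass 2: keep only raw p < ALPHA, then adjust again on that subset.
kept = [p for p in null_pvals if p < ALPHA]
hits_second_pass = sum(q < ALPHA for q in bh_adjust(kept))

print("kept after filtering:", len(kept))
print("significant, all tests adjusted:", hits_first_pass)
print("significant, filtered subset adjusted:", hits_second_pass)
```

In this simplified setting every protein that survives the filter also passes the second adjustment, because the largest retained p-value is already below the threshold and the Benjamini-Hochberg adjusted values can never exceed it. That happens here even though the simulated data contain no real differences.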