5 weeks ago
kl
Hi all,
I have done a proteome-wide association study (about 7200 proteins). Unfortunately, none of the proteins pass the Bonferroni or FDR threshold. However, there are 450 proteins with a p-value below 0.05. Despite the non-significant results (I have a small sample of about 410 participants, but with longitudinal modeling), I would like to do enrichment analyses using GO terms and KEGG. However, I am not sure which proteins to take forward. I would appreciate advice on this.
Thanks
This is a common situation, and your suggestion is indeed reasonable. It is similar to an earlier publication by Tarca et al (2013) which has the following logic:
In summary, you are free to define the foreground as you like. 200 to 500 genes is fine. Using fewer than 100 proteins is not recommended according to our own testing. What is important is that you define your background list: the set of proteins that are robustly detected by the assay at a level where they have a chance of being detected as differential (see Timmons et al 2015 and Wijesooriya et al 2022). Moreover, there is some disagreement as to whether up- and down-regulated proteins should be considered in separate tests or together. From my experience I think separate tests are better, and there is some agreement in the literature, e.g. Hong et al (2014).
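To make the foreground/background idea concrete, here is a minimal sketch of an over-representation test for a single gene set using a one-sided hypergeometric test. It is only an illustration under stated assumptions, not the method of any particular tool: the function names `hypergeom_sf` and `ora_pvalue` and the protein IDs are made up for the example. The key point is that both the foreground and the gene set are intersected with the background before counting.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K successes, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the exact upper-tail probabilities with integer arithmetic.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def ora_pvalue(foreground, background, gene_set):
    """Return (overlap, p-value) for one gene set.

    Hypothetical helper: everything is restricted to the background,
    i.e. the proteins that were robustly detected by the assay.
    """
    bg = set(background)
    fg = set(foreground) & bg
    gs = set(gene_set) & bg
    overlap = len(fg & gs)
    return overlap, hypergeom_sf(overlap, len(bg), len(gs), len(fg))


# Toy data (made-up IDs): 20 detected proteins, 5 in the foreground.
background = [f"P{i}" for i in range(1, 21)]
foreground = ["P1", "P2", "P3", "P4", "P5"]
gene_set = ["P1", "P2", "P3", "P4", "X"]  # "X" was not detected, so it is dropped

overlap, p = ora_pvalue(foreground, background, gene_set)
print(overlap, p)
```

To test up- and down-regulated proteins separately, you would call `ora_pvalue` twice with the two foregrounds and the same background, then correct the resulting p-values for the number of gene sets tested.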
Thanks for this guidance. When using a nominal p-value < 0.05, is it essential to also filter on fold change?
A fold-change filter is not essential unless you notice that the fold changes are very small.
Also, take into account that you can try methods like GSEA, which can accommodate your whole universe of 7200 proteins, because the input is not a pre-filtered list but rather the full list of differential results.
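As a rough illustration of why a ranked-list method needs no cutoff, here is a simplified sketch of a GSEA-style running-sum enrichment score. It is not the reference GSEA implementation: the ranking metric (signed -log10 p-value), the function names, and the toy data are assumptions for the example, and the permutation step that real GSEA uses to assess significance is left out.

```python
import math


def signed_rank_metric(p_value, effect):
    """Assumed ranking metric: -log10(p) carrying the sign of the effect."""
    return math.copysign(-math.log10(p_value), effect)


def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score over a ranked list.

    ranked: list of (protein, score) pairs sorted by score, descending.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    gs = set(gene_set)
    hit_total = sum(abs(s) ** weight for name, s in ranked if name in gs)
    n_miss = sum(1 for name, _ in ranked if name not in gs)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for name, s in ranked:
        if name in gs:
            running += abs(s) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss  # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (made-up proteins and scores), already sorted descending.
ranked = [("A", 3.0), ("B", 2.0), ("C", -1.0), ("D", -2.0)]
es_top = enrichment_score(ranked, {"A", "B"})
es_bottom = enrichment_score(ranked, {"C", "D"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gets a positive score and a set concentrated at the bottom gets a negative one, so the direction of change is handled without splitting the list.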