Hi everyone. I had to add more nodes to a protein-protein interaction (STRING) network in order to get a better p-value, but I'm not sure whether I should draw conclusions from it, since the main module and hub proteins derived from it come from the added white nodes (and not from my original gene list). What would you do? Discard the network and say that your list of genes did not generate a network, or use the network with the white nodes?
Well, the honest but somewhat unhelpful answer is "it depends". What is your p-value based on? Are you doing some kind of GO enrichment? Where did your initial gene list come from? It's certainly possible that there are key proteins in pathways that aren't captured by typical transcriptomics or proteomics experiments, particularly if you are looking at fold changes. In that case, I would say that you should document what you did to add the extra proteins, note in particular the scores of the edges between those proteins and the genes in your list, and look for biological relevance of the additional proteins.
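If it helps with the documentation step, here is a rough Python sketch of how you could tabulate the edges between the added proteins and your original genes. It assumes you have exported the network edges as (protein 1, protein 2, score) triples; the function name `summarize_added_nodes`, the gene names, and the scores are all made up for illustration, and the score scale depends on how you exported from STRING.

```python
from collections import defaultdict
from statistics import mean


def summarize_added_nodes(edges, original_genes):
    """Summarize, for each added protein, its edges to the original gene list.

    edges: iterable of (protein_a, protein_b, score) triples (assumed format).
    original_genes: the genes you started with; anything else counts as added.
    """
    original = set(original_genes)
    scores = defaultdict(list)
    for a, b, score in edges:
        # Keep only edges that link one added protein to one original gene.
        if a in original and b not in original:
            scores[b].append(score)
        elif b in original and a not in original:
            scores[a].append(score)

    rows = [
        {
            "added_protein": protein,
            "n_links_to_original": len(vals),
            "max_score": max(vals),
            "mean_score": round(mean(vals), 3),
        }
        for protein, vals in scores.items()
    ]
    # Most connected added proteins first, ties broken alphabetically.
    rows.sort(key=lambda r: (-r["n_links_to_original"], r["added_protein"]))
    return rows


# Hypothetical example data, not real STRING output.
original_genes = ["GENE_A", "GENE_B", "GENE_C"]
edges = [
    ("GENE_A", "ADDED_1", 0.9),
    ("GENE_B", "ADDED_1", 0.7),
    ("GENE_C", "ADDED_2", 0.45),
    ("ADDED_1", "ADDED_2", 0.95),  # added-to-added edge, ignored here
    ("GENE_A", "GENE_B", 0.6),     # original-to-original edge, ignored here
]

summary = summarize_added_nodes(edges, original_genes)
for row in summary:
    print(row)
```

An added protein that connects only to other added proteins will not show up in this table at all, which is itself worth noting: a hub like that is supported entirely by nodes you did not measure.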