I have a list of differentially expressed proteins from LC-MS data and have been trying to analyse it for protein-protein interactions (using STRING) and functional classification (using PANTHER), among other things.
I found that neither STRING nor PANTHER could find some of the IDs (UniProt accessions). To be precise, out of the list of 39 proteins, STRING couldn't identify 7 and PANTHER a whopping 22. Why this difference?
After doing some digging, I noticed that the IDs STRING could not identify were unreviewed entries. One of the IDs was P02751-10 (exactly as it appears in the raw data), which is strange in itself. Is it acceptable to run P02751-10 as P02751 (the protein name matches the one in the raw data)? Is there a way to run these unidentified IDs?
The number after the hyphen in P02751-10 indicates the isoform, in this case isoform 10 of protein P02751. Many resources don't care about isoforms and so only use the reference accession number, i.e. the part before the hyphen.
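If you decide to submit the reference accessions instead, stripping the isoform suffix is easy to script. Below is a minimal Python sketch, assuming your IDs are plain strings in which an isoform is written as a hyphen followed by digits at the end of the accession. The function names `to_canonical` and `canonicalize` are made up for illustration and are not part of any STRING or PANTHER tool.

```python
import re

# Assumption: an isoform suffix is a hyphen followed by digits at the end, e.g. "-10".
_ISOFORM_SUFFIX = re.compile(r"-\d+$")


def to_canonical(accession: str) -> str:
    """Return the accession without its isoform suffix (hypothetical helper)."""
    return _ISOFORM_SUFFIX.sub("", accession.strip())


def canonicalize(accessions):
    """Strip isoform suffixes and drop duplicates, keeping the input order."""
    seen = set()
    result = []
    for acc in accessions:
        base = to_canonical(acc)
        if base not in seen:
            seen.add(base)
            result.append(base)
    return result


# Example input: P02751-10 is from the question; Q9Y6K9 is just a placeholder.
ids = ["P02751-10", "P02751", "Q9Y6K9"]
print(canonicalize(ids))  # ['P02751', 'Q9Y6K9']
```

Keep in mind that collapsing isoforms this way discards the isoform-specific information, so it is worth keeping the original IDs alongside the stripped ones in case you need to trace a result back to the raw data.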
Thanks Jean. I will look more into it.
What kind of organism are you working with? Although PANTHER covers a variety of organisms (listed here), it may just be that your protein IDs are not in its database.