Hello everyone! This is my first time posting here so I hope I am doing this the right way.
I'm a bioinformatics student in Québec, Canada, and I just started a summer project with my teacher. I have to predict disordered proteins from a database containing a lot of sequences. I don't know if any of you are familiar with this process, but I'll ask anyway.
I'll use DISOPRED3 for the prediction. The problem is that the results I get on my personal computer are not the same as those I get from the server itself. Before running my scripts on all the files I have, I would like to get the same results as the server. I use UniRef90 (like they do) and I installed the latest version of the program. What could be different and cause the small differences? I already sent an email asking what they are using, but they haven't responded yet.
"2. The next most common question we get regards getting different predictions from our PSIPRED web server from those you get on your own system. There are many reasons why this might occur, but the most obvious one is that we are using a different sequence data bank to the one you are using. Because modern secondary structure prediction methods are based on analysing multiple sequence alignments, if your data bank includes some extra sequences or misses out some sequences compared to our local data bank then you might get a slightly different prediction. Hopefully the differences will be small, but they can be quite significant if the alignment only includes a few sequences. So, if our server can find say just 5 homologous sequences and your system finds 20 homologous sequences to align, then the predictions may be very different indeed. That's just the reality of analysing evolutionary information. If the alignments are different, then the predicted secondary structure will probably be different.
It's also possible that we are running a slightly older version of PSIPRED. Oddly enough we don't update our servers immediately after releasing a new version of PSIPRED as we need to do internal testing first. So, check that you are running the exact same version of PSIPRED as our server is currently running. Even so, it's likely that the real problem is going to be down to differences in the data banks and alignments that are produced."
Also, I am no expert with this software, but some structure modelling programs are not always deterministic and offer a fixed seed parameter for debugging purposes. Others (like Rosetta Antibody) are wonderful but will never run or even compile correctly on your laptop.
Wow, I spent a good part of the day trying to find an answer to my question on their site, and all that time it was in the FAQ... I should have known, haha! I figured it would come down to either the PSIPRED version or the data bank itself. I thought that using the same database they mentioned in the DISOPRED3 paper (UniRef90) and downloading the same version of DISOPRED3 from their site would work. Well, I guess I'll have to try something or just leave it this way. Like it says, this is the reality of analysing evolutionary information.
Thanks a lot for your help! Maybe someone will know exactly what they are using as of right now so I'll wait for other answers but at least now I have a good explanation for my teacher^^
UniRef90 is frequently updated. The DISOPRED3 server is most likely using an older release than the one you downloaded. I don't think it would be simple to match the exact release they have (and I don't see them mention the release number anywhere). As long as your results are pretty close, you should feel confident that DISOPRED3 is working correctly.
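If you want to put a number on "pretty close", here is a minimal Python sketch that compares a local prediction with the server's for the same sequence. It assumes both results are saved as text with one residue per line in the form `position residue label score` (with `*` for disordered and `.` for ordered) and `#` for comment lines; check that against your actual output files. The function names `parse_diso` and `compare_predictions` are just illustrative, not part of DISOPRED3.

```python
def parse_diso(text):
    """Parse per-residue lines of the assumed form: position residue label score."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment/header lines.
        if not line or line.startswith("#"):
            continue
        pos, aa, label, score = line.split()[:4]
        rows.append((int(pos), aa, label, float(score)))
    return rows


def compare_predictions(local_text, server_text):
    """Summarise how far a local prediction is from the server's prediction."""
    local = parse_diso(local_text)
    server = parse_diso(server_text)

    # Both files must describe the same residues in the same order.
    if [(p, a) for p, a, _, _ in local] != [(p, a) for p, a, _, _ in server]:
        raise ValueError("The two predictions do not cover the same sequence")

    score_diffs = [abs(l[3] - s[3]) for l, s in zip(local, server)]
    # Positions where the ordered/disordered call itself differs.
    flipped = [l[0] for l, s in zip(local, server) if l[2] != s[2]]

    n = len(local)
    return {
        "n_residues": n,
        "max_score_diff": max(score_diffs, default=0.0),
        "mean_score_diff": sum(score_diffs) / n if n else 0.0,
        "flipped_positions": flipped,
    }
```

To use it, read both files with `open(path).read()` and pass the two strings to `compare_predictions`. A short list of flipped positions with small score differences would fit the data bank explanation above; large differences on many residues would be a reason to double-check the installation.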