9 months ago
Curious
Hello!
I'm working with a heteromeric protein. In the predicted model, a 200 aa loop was not well modeled. I tried to improve this region using ModLoop and DaReUS-Loop, but neither worked. Can you recommend other tools, please?
NOTE: I've googled other tools, but I haven't found anything that helps me.
Hard to provide advice since we don't know what data you are working with as input, or many details about the protein, or what you used initially to model it. How do you know it's poorly modeled? Do you have ground truth data?
Some generic suggestions: have you tried AlphaFold2 (or ColabFold for an easier interface), or an MSA-free method like ESMFold, OmegaFold, or OpenFold-SoloSeq?
It is a transmembrane protein with 2100 amino acid residues. It has no known homologues and was modeled with AlphaFold2. The only part that was poorly modeled was the loop, which has 200 aa.
I think you have your answer there, unfortunately. If it has no known homologues, then the MSA was likely poor, and it's surprising that only a 200 aa section has low confidence. It might be worth trying one of the protein language model methods that don't need an MSA.
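To pin down exactly which residues are low confidence, here is a minimal sketch that reads pLDDT from an AlphaFold2 PDB file, where it is stored in the B-factor column, and reports contiguous low-confidence stretches. The function names, the threshold of 50, and the minimum run length are my own assumptions for illustration, not part of any AlphaFold tooling.

```python
def plddt_per_residue(pdb_text, chain=None):
    """Return {residue_number: pLDDT} read from the B-factor column of CA atoms.

    Assumes a standard fixed-column PDB file written by AlphaFold2.
    If `chain` is given, only that chain ID is read (useful for heteromers,
    where residue numbers repeat across chains).
    """
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        if chain is not None and line[21] != chain:
            continue
        scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence_runs(scores, threshold=50.0, min_length=10):
    """Return (start, end) residue ranges where pLDDT stays below `threshold`.

    `threshold` and `min_length` are illustrative defaults (assumptions).
    """
    runs = []
    start = prev = None
    for resnum in sorted(scores):
        low = scores[resnum] < threshold
        if low and start is not None and resnum == prev + 1:
            prev = resnum  # extend the current run
        else:
            if start is not None and prev - start + 1 >= min_length:
                runs.append((start, prev))
            start, prev = (resnum, resnum) if low else (None, None)
    if start is not None and prev - start + 1 >= min_length:
        runs.append((start, prev))
    return runs
```

Usage would look something like `low_confidence_runs(plddt_per_residue(open("model.pdb").read(), chain="A"))`, where `model.pdb` is a placeholder file name. If the reported range matches the 200 aa loop and everything else scores well, that at least confirms the problem is localized.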
Also, if you're working on non-model data, AF2 can perform relatively poorly, and there isn't much you can do short of fine-tuning the model or building a custom MSA database; the latter has been shown to be very beneficial. See paper here and preprint here.
I see. I appreciate your help greatly!
It's still worth checking the MSA too. You might get coverage of some regions even though there are no known homologues of the entire sequence. That could explain the result, even if it's not a fix.
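For checking the MSA, a rough sketch along these lines counts how many aligned sequences cover each query position in an A3M file (the format ColabFold writes). It assumes the first sequence is the query, that lowercase letters are insertions relative to the query, and that `-` is a gap; the function names are hypothetical.

```python
def read_a3m(text):
    """Return the sequences in an A3M file as a list of strings (query first)."""
    seqs, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment/header lines
        if line.startswith(">"):
            if current is not None:
                seqs.append("".join(current))
            current = []
        elif current is not None:
            current.append(line)
    if current is not None:
        seqs.append("".join(current))
    return seqs


def coverage_per_position(seqs):
    """Count non-gap residues from the hits at each query position."""
    # Drop lowercase insertions so every sequence aligns to the query columns.
    aligned = ["".join(c for c in s if not c.islower()) for s in seqs]
    query_len = len(aligned[0])
    counts = [0] * query_len
    for seq in aligned[1:]:  # skip the query itself
        for i, c in enumerate(seq[:query_len]):
            if c != "-":
                counts[i] += 1
    return counts
```

If the counts drop to near zero across the loop while the rest of the protein has reasonable depth, that would be consistent with the low confidence there.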
I will do that, thank you very much!