Hi, I have a protein whose available PDB structure shows it as a homotrimer. I'm interested in making a concatenated version of this protein, with 3 units joined by long Gly linkers (i.e. ChainA:ChainA:ChainA vs Chain A-(Gly)n-Chain A-(Gly)n-Chain A).
When I run 3 copies of the protein on ColabFold it works fine (better if I allow it to use templates, since a PDB structure is available). However, when I try to run the linked trimer it fails. In my mind, with a long enough linker, it should give very similar results. Does anyone have any tips for improving structure prediction in this scenario?
I don't want to give away the protein of interest yet, but imagine running this with a streptavidin dimer/tetramer where it's a single protein linked by a 20 or 30x Gly linker.
Many thanks in advance!
What errors are you getting for the failed Gly-linked trimer? Which step is failing? Could you be running out of memory on the GPU you are using?
If it is a memory error, using one of the larger GPUs, something like a V100 or A100, could alleviate the issue.
You could also try running the prediction as a linked homodimer first to check the effect of different linker lengths. That said, the first multimer predictions were made using this linker concept, and this is why AF2-Multimer was developed. To my knowledge, multimer structures are predicted better without a linker.
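If it helps with the linker scan, here is a minimal sketch for generating the inputs. It assumes the ':' chain separator shown in the original post for the unlinked run and plain concatenation with a poly-Gly linker for the linked run. The function names, the placeholder sequence, and the output file name are all hypothetical, so swap in your own.

```python
def linked_construct(subunit: str, copies: int, linker_len: int,
                     linker_residue: str = "G") -> str:
    """Join `copies` of `subunit` into one chain with poly-Gly linkers."""
    linker = linker_residue * linker_len
    return linker.join([subunit] * copies)


def multimer_query(subunit: str, copies: int) -> str:
    """Unlinked query: separate chains joined with ':' (ChainA:ChainA:...)."""
    return ":".join([subunit] * copies)


def write_linker_scan(subunit: str, copies: int, linker_lengths, path: str) -> None:
    """Write one FASTA record per linker length so each can be run in turn."""
    with open(path, "w") as fh:
        for n in linker_lengths:
            seq = linked_construct(subunit, copies, n)
            # Total length goes in the header so oversized inputs are easy to spot.
            fh.write(f">linked_x{copies}_G{n}_len{len(seq)}\n{seq}\n")


if __name__ == "__main__":
    subunit = "MKTAYIAKQR"  # placeholder only, not a real protein of interest
    print(multimer_query(subunit, 3))
    write_linker_scan(subunit, copies=2, linker_lengths=[10, 20, 30],
                      path="linker_scan.fasta")
```

Starting with `copies=2` keeps the total length down, which matters if memory or free GPU time is the limiting factor.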
Thanks, that's a good idea to run the dimer first (kinda obvious now that I think of it). The structures I get are just messy combinations, whereas the actual protein complex has substantial intersubunit contacts with threading loops and threefold symmetry.
I should have added that I'm running this through the Cosmic website. On Google Colab I run out of free GPU time before I can get a homotrimer. I suspect that running a linked dimer/trimer in multimer mode would help (it's an option on Google Colab AlphaFold, not sure about Cosmic). I should just pay for Google Colab I guess; the cost is negligible compared to the kits and time needed to clone these constructs.
Given that there's already a cryo structure of this, is there any way to force the prediction onto it as a template? I'm just trying to get a rough idea of the necessary linker length (the termini already have a few flexible AAs).
I'm new to this, so I'm also struggling to get my head around parameters such as a suitable number of models vs. recycles.
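On the linker-length question, a rough lower bound can be read straight off the existing structure without any prediction: measure the distance from the C-terminal CA of one chain to the N-terminal CA of the next and divide by the span of an extended residue. The sketch below assumes a single-model, PDB-format file with standard fixed-width ATOM columns, and assumes about 3.5 Å per residue for an extended chain. The function names are hypothetical. It ignores the fact that the linker has to route around the protein surface, so treat the number as a minimum only.

```python
import math


def ca_coords_by_chain(pdb_lines):
    """Collect CA coordinates per chain, in file order, from PDB ATOM records."""
    chains = {}
    for line in pdb_lines:
        # Fixed-width PDB columns: atom name 13-16, chain ID 22, x/y/z 31-54.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            chains.setdefault(line[21], []).append(xyz)
    return chains


def min_linker_residues(pdb_lines, from_chain, to_chain, rise_per_residue=3.5):
    """Return (gap in Å, minimum linker residues) between two chain termini.

    rise_per_residue is an assumed value for a fully extended chain.
    A linker of n residues spans n + 1 CA-CA steps, hence the "- 1".
    """
    chains = ca_coords_by_chain(pdb_lines)
    c_term = chains[from_chain][-1]  # last modelled residue of the first chain
    n_term = chains[to_chain][0]     # first modelled residue of the next chain
    gap = math.dist(c_term, n_term)
    return gap, max(0, math.ceil(gap / rise_per_residue) - 1)


# Usage (the file name is a placeholder):
# with open("trimer.pdb") as fh:
#     gap, n = min_linker_residues(fh.readlines(), "A", "B")
#     print(f"termini are {gap:.1f} Å apart, so at least {n} linker residues")
```

Only residues that are actually modelled are counted, so any disordered terminal residues missing from the structure act as extra linker for free.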
Yeah, I suspect you're exceeding the limits of the free allowance and that's the reason it's failing.
Setting up GCP and ColabFold is pretty easy, and for a single complex it'll be cheap if you use the cost-efficient T4s. You can usually get some free credits too, so you may not even have to pay for this project. I think the length limit is ~2500 AAs at the moment.
From experience, I doubt you'll manage to resolve such a complex interaction well. We couldn't replicate the interaction between Roq1 and XopQ, for example. Whilst the prediction was close and the general orientation was correct, there were a lot of fine-scale differences. This is especially true for unstructured loop regions in my work. Good luck!