What to do with a bunch of AlphaFold2 structures?
3.2 years ago
Dunois ★ 2.8k

By happenstance, I've ended up with AlphaFold2-predicted structures of a dozen or so proteins. The proteins are all orthologs. The provenance of the sequences is transcriptomic (de novo assemblies, to be specific).

I'd like to try and do something with these structures, but I am drawing a blank on ideas. I was hoping the Biostars community could perhaps suggest some avenues of inquiry I could explore with these structure predictions.

prediction comparison pymol structure alphafold2 • 3.0k views
3.2 years ago
Jiyao Wang ▴ 380

You can show the domain annotations in iCn3D. Go to https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html, click "File > Open File > PDB file" to load your structure, and then click "Analysis > Seq. & Annotations". For an example, see https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?afid=A0A061AD48&showanno=1


This is really interesting, and will come in very handy!! Thank you for the link!!

3.2 years ago
Mensur Dlakic ★ 28k

Since they are orthologs, the structures are likely to be very similar (if not identical), so for practical purposes it is as if you have only one structure. Unless I knew something about these proteins that can be tested or highlighted by having a structure, I would not do anything beyond making a pretty picture for lab meetings, or to entice inquiring rotation students.

  • Are they enzymes? If so, it may be worth highlighting their catalytic site. If you know the substrate (or have a list of potential substrates), maybe try to dock it in the active site and show the predicted complex.
  • Yet another option is to calculate the electrostatics and solvation, and see if anything interesting comes up.
  • Are there splicing variants? If so, how do predicted structures differ among them?

To add to this: if they are orthologues, they might not be that similar, depending on how far the sequences have diverged. If that is the case, you could do some structural comparison work to build an understanding of the conservation of structure vs sequence (how robust is the structure to sequence change?).

You can do a number of structural comparisons, such as determining exactly how similar the structures are (RMSD, TM-score, etc.). You can also do some nice illustrations mapping the sequence conservation onto the structure itself to show visually where the critical residues are.
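
For the RMSD part, here is a minimal NumPy sketch of a Kabsch superposition. It assumes you have already extracted matched C-alpha coordinates for equivalent residues from both structures (for orthologs of different lengths, that means going through a sequence or structure alignment first). The function name `kabsch_rmsd` is only illustrative. TM-score is usually computed with a dedicated tool such as TM-align rather than by hand.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation (Kabsch).
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))
```

Running this over every pair of structures gives a distance matrix you can cluster or plot as a heatmap.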


Mensur Dlakic, Joe, thank you both for your input!! Very insightful as always!!

I'll look into all three avenues Mensur Dlakic suggested, since all of them are viable. My proteins are indeed enzymes, and there are splice variants (or at least protein isoforms).

"You can do a number of structural comparisons, such as determining exactly how similar the structures are (RMSD, TM-score, etc.). You can also do some nice illustrations mapping the sequence conservation onto the structure itself to show visually where the critical residues are."

Joe, this kind of RMSD analysis was the loose idea I had in mind. I was thinking of comparing each prediction to a crystal structure pairwise and calculating the RMSDs.

Regarding your latter idea, how would I go about that? Or specifically, what tools do I need for that?


To calculate positional residue conservation:

http://prodata.swmed.edu/al2co/al2co.php
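
If you would rather get a quick number locally, a rough sketch of a per-column conservation score is below. This is not AL2CO's implementation, just a simple Shannon-entropy stand-in; the function name `column_conservation`, the gap handling, and the normalisation by log2(20) are all my own assumptions.

```python
import math
from collections import Counter

def column_conservation(aligned_seqs, gap_chars="-."):
    """Per-column conservation in [0, 1] from Shannon entropy (1 = invariant)."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(20)  # assumes the 20 standard amino acids
    scores = []
    for col in zip(*aligned_seqs):
        residues = [c.upper() for c in col if c not in gap_chars]
        if not residues:
            scores.append(0.0)  # all-gap column: treated as unconserved here
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        # Clamp in case non-standard symbols push entropy above log2(20).
        scores.append(max(0.0, 1.0 - entropy / max_entropy))
    return scores

# Toy alignment (made-up sequences, for illustration only).
msa = ["MKTAY", "MKSAY", "MRTA-"]
scores = column_conservation(msa)
```

The input is the list of aligned sequences from your MSA, all of equal length, and the output has one score per alignment column.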

And to map it onto the structure (red is conserved, blue is not):

[Image: conservation mapped onto the structure]


There are various tools out there, but I always use the Render by Conservation tool in UCSF Chimera (basically, you just need an MSA of the sequences that correspond to the structure). If I find a bit of time I might be able to put a walkthrough together.

But basically you end up with something that looks like this:

[Image: PVC1-5-composite]
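
A viewer-agnostic alternative to the Chimera tool (a different technique, not what Render by Conservation does internally) is to write per-residue scores into the B-factor column of the PDB file and then colour by B-factor in whichever viewer you like. The sketch below assumes a single chain and that you have already mapped alignment columns to the residue numbers used in the file; `write_scores_to_bfactor` and `scores_by_resnum` are hypothetical names. Note that AlphaFold stores pLDDT in that column, so work on a copy.

```python
def write_scores_to_bfactor(pdb_lines, scores_by_resnum, default=0.0):
    """Return PDB lines with the B-factor field (columns 61-66) set per residue."""
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            resnum = int(line[22:26])  # residue sequence number, columns 23-26
            score = scores_by_resnum.get(resnum, default)
            # Overwrite only the 6-character B-factor field, keep everything else.
            line = line[:60] + f"{score:6.2f}" + line[66:]
        out.append(line)
    return out
```

Typical use would be to read the file with `open(path).readlines()`, pass in a dict such as `{1: 0.93, 2: 0.41}` (made-up values), and write the returned lines to a new file.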


Mensur Dlakic, Joe, thank you both for the input. I'll try the UCSF Chimera route first.

I guess I'll update the thread again if--and when--I stumble.

2.7 years ago
Jiyao Wang ▴ 380

For AlphaFold structures, iCn3D can also show SNP, ClinVar, and 3D domain annotations in addition to conserved domains. One example, for UniProt ID Q08426, is at this URL: https://structure.ncbi.nlm.nih.gov/icn3d/share.html?AFQBUFHTYvwqQPdx6

2.7 years ago
Jiyao Wang ▴ 380

Now you can align several AlphaFold or PDB structures in iCn3D using the menu "File > Align > Multiple Chains", e.g., https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?chainalign=P69905_A,P01942_A,1HHO_A
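
If you have many chains to compare, the same kind of link can be generated in a script. The sketch below only reproduces the `chainalign` URL pattern from the example above; the helper name `icn3d_chainalign_url` is hypothetical, and it assumes every structure is addressable by a UniProt accession or PDB ID plus chain, as in that example.

```python
from urllib.parse import quote

ICN3D_BASE = "https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html"

def icn3d_chainalign_url(chain_ids):
    """Build an iCn3D link that aligns the given chains, e.g. 'P69905_A', '1HHO_A'."""
    if len(chain_ids) < 2:
        raise ValueError("need at least two chains to align")
    # Escape each ID but keep the literal commas used in the example URL.
    value = ",".join(quote(c, safe="_") for c in chain_ids)
    return f"{ICN3D_BASE}?chainalign={value}"

url = icn3d_chainalign_url(["P69905_A", "P01942_A", "1HHO_A"])
```

Locally predicted structures that are not in a public database would still need to be loaded through the File menu.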

