Hi everyone,
We're excited to share a brand-new package our team has been building for parsing and working with structural and sequence data.
Current features include easy ways to:
- Generate MSAs using HMMER + MAFFT or MMseqs2, with more methods coming in the near future.
- Easily parse structures and complexes.
- Remove waters and non-biopolymers.
- Align and compute RMSDs.
- Find structural features such as salt bridges, disulfide bonds, hydrophobics, and more.
- Easily extract coordinates for all atoms, specific segments of a structure, or even just the backbone.
- Access a dataframe corresponding to the loaded structure (similar to biopandas).
- Generate diverse conformers using RDKit in a compute-friendly way.
- Calculate sequence identity.
- Generate static and animated pseudo 3D plots of your proteins.
- Calculate lDDT between proteins.
- Calculate AlphaFold2-derived metrics such as iPAE.
- And much more!
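For anyone unfamiliar with one of the simpler metrics in the list, here is a minimal sketch of what a sequence identity calculation can look like. It is plain Python and is not the Neurosnap API: the function name `sequence_identity` and the convention used (identical residues divided by aligned columns where neither sequence has a gap) are assumptions made for illustration, and the package may well use a different definition.

```python
def sequence_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Fraction of identical residues over columns where neither sequence has a gap.

    Hypothetical helper for illustration only; other conventions divide by the
    full alignment length or by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    # Avoid dividing by zero when no column has residues in both sequences
    return matches / compared if compared else 0.0


if __name__ == "__main__":
    # Two toy sequences that are already aligned (made-up data)
    print(sequence_identity("MKT-AYIA", "MKTWAYLA"))
```

The inputs are expected to be pre-aligned, for example two rows taken from an MSA.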
We are also planning on actively developing this package, so if you are looking for a feature that hasn't been added yet, definitely let us know, as there's a good chance we can implement it.
Demo Notebook: https://colab.research.google.com/github/NeurosnapInc/neurosnap/blob/main/example_notebooks/protein.ipynb
GitHub: https://github.com/NeurosnapInc/neurosnap
Full disclosure: Neurosnap is a for-profit company, but the collection of tools outlined in this post is intended to be fully open source and free to use for everyone!