Question

Mutation clustering in protein domains

0

5.1 years ago

Lio04 • 0

Hi everyone,

Via targeted resequencing, I have identified missense mutations in my gene of interest. These missense mutations are enriched in patients versus control individuals. I made a lollipop plot to visualise where these mutations are located on the protein and to see whether they occur in protein domains. I now want to know whether certain protein domains are enriched for these mutations, or whether the mutations tend to cluster in any other region of the protein.

What is the correct way to achieve this? Are there any papers or tools that I can look into?

Thanks a lot for your suggestions.

gene protein domain clustering missense mutation • 1.3k views

ADD COMMENT • link updated 5.1 years ago by Jean-Karim Heriche 27k • written 5.1 years ago by Lio04 • 0

score 1 · Answer 1 · 2020-03-31

1

5.1 years ago

Mensur Dlakic ★ 29k

I am not sure whether this is just the way you are describing it, or if your understanding of protein domains is different from mine. In proteins, there is no division between "domains" and "other regions of the protein." Sure, there may be short linkers or unfolded parts in some proteins, but for practical purposes all proteins have domain organization.

If you submit your protein to Pfam, it will generate its domain organization. A domain organization of a random protein is shown here, and you can easily check whether your mutations fall within the same domain.
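
One way to turn "do the mutations fall within a domain" into a number is an exact binomial test. The sketch below assumes a uniform null, in which each mutation lands in the domain with probability equal to the domain's share of the protein length. The protein length, domain boundaries and mutation positions are made-up illustration values, and the function names are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): a one-sided enrichment p-value."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def domain_enrichment(positions, domain_start, domain_end, protein_length):
    """Test whether mutations are over-represented in one domain.

    positions: 1-based residue numbers of the missense mutations.
    domain_start, domain_end: inclusive domain boundaries (e.g. from Pfam).
    Assumption: under the null, every residue is equally likely to be hit.
    """
    n = len(positions)
    k = sum(domain_start <= pos <= domain_end for pos in positions)
    p = (domain_end - domain_start + 1) / protein_length
    return k, n, p, binomial_upper_tail(k, n, p)

# Hypothetical example: 500-residue protein, domain at residues 101-200.
mutations = [110, 125, 131, 150, 162, 177, 190, 320, 415, 480]
k, n, p, pval = domain_enrichment(mutations, 101, 200, 500)
print(f"{k}/{n} mutations in domain, expected fraction {p:.2f}, p = {pval:.4g}")
```

If several domains are tested this way, the p-values would need a multiple-testing correction, and the uniform null ignores any differences in mutability along the sequence.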

What may be more informative is to check whether your mutations cluster together spatially, even if they are some distance apart in the same domain or even in different domains. For that you would need a 3D structure (or 3D model) of your protein, and I don't have enough information to know whether one is available.

ADD COMMENT • link 5.1 years ago by Mensur Dlakic ★ 29k

0

Yes, I already used Pfam to retrieve the domain organization of my protein of interest and mapped the mutations onto it, so I know where they are located. My question is precisely how to test for clustering bioinformatically/statistically, so that I can present the results more rigorously than by visual confirmation.

I was able to find a 3D model via the Swiss-Model Repository. Is there a tool or test I can use, rather than checking for clustering visually?

Thank you very much!

ADD REPLY • link 5.1 years ago by Lio04 • 0

0

If you have a structure (or a model) of your protein, coloring mutated residues differently from the rest should help you visualize whether they are in close spatial proximity. PyMOL can do that easily - an extensive tutorial is here, and you will specifically need Selection commands. If you want to quantify this beyond visualization, PyMOL can also measure distances between residues. You can safely assume that residues closer than 8-10 angstroms in space are part of the same "patch" within a molecule, and it may be appropriate to use an even larger distance.
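
To go from measured distances to a test, one option is a permutation test on the number of mutated residue pairs that fall within the distance cutoff. The sketch below is only an illustration under stated assumptions: C-alpha coordinates are assumed to have been extracted from the model beforehand into a plain dictionary, the 10 angstrom cutoff is taken from the range above, the null is "mutations hit residues uniformly at random", and the toy coordinates and function names are hypothetical.

```python
import math
import random
from itertools import combinations

def close_pairs(residues, coords, cutoff=10.0):
    """Count residue pairs whose C-alpha atoms lie within `cutoff` angstroms."""
    return sum(
        math.dist(coords[a], coords[b]) <= cutoff
        for a, b in combinations(residues, 2)
    )

def spatial_clustering_pvalue(mutated, coords, cutoff=10.0, n_perm=10000, seed=0):
    """Permutation test: are mutated residues closer in 3D than random sets?

    mutated: distinct residue numbers carrying mutations.
    coords: dict mapping residue number -> (x, y, z) of its C-alpha atom.
    """
    rng = random.Random(seed)
    observed = close_pairs(mutated, coords, cutoff)
    all_residues = sorted(coords)
    hits = 0
    for _ in range(n_perm):
        # Draw a random residue set of the same size as the mutated set.
        sample = rng.sample(all_residues, len(mutated))
        if close_pairs(sample, coords, cutoff) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy coordinates (hypothetical): 60 residues spaced 3.8 angstroms apart on a line.
coords = {i: (3.8 * i, 0.0, 0.0) for i in range(1, 61)}
observed, pval = spatial_clustering_pvalue([10, 11, 12, 13, 40], coords, n_perm=2000)
print(observed, pval)
```

With a real model, `coords` would hold one entry per modelled residue, and the cutoff can be varied to check that the conclusion does not hinge on one threshold.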

ADD REPLY • link 5.1 years ago by Mensur Dlakic ★ 29k

0

I will look into it, thank you very much!

ADD REPLY • link 5.1 years ago by Lio04 • 0

0

A word of caution about visual confirmation: mutations can preferentially occur in certain nucleotide sequence contexts (e.g. CpG), which are not necessarily evenly distributed across the CDS of a protein.
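
As a rough illustration of this point, one can count CpG dinucleotides in the part of the CDS that encodes each region and compare those shares with the regions' shares of the protein length. This is a crude proxy only, not a full mutability model; the toy CDS, the region boundaries and the function names below are hypothetical.

```python
def cpg_count(cds, aa_start, aa_end):
    """Count CpG dinucleotides in the codons of residues aa_start..aa_end.

    Residue numbers are 1-based and inclusive. CpGs that straddle the
    boundary between two regions are not counted in either region.
    """
    segment = cds[(aa_start - 1) * 3 : aa_end * 3].upper()
    return sum(segment[i:i + 2] == "CG" for i in range(len(segment) - 1))

def cpg_fractions(cds, regions):
    """Share of the counted CpG sites that falls in each named region."""
    counts = {name: cpg_count(cds, s, e) for name, (s, e) in regions.items()}
    total = sum(counts.values())
    return {name: (c / total if total else 0.0) for name, c in counts.items()}

# Toy 8-codon CDS (hypothetical); regions are residue ranges.
cds = "ATGCGCCGAGCTAAATTTCGGTAA"
regions = {"N-term": (1, 4), "C-term": (5, 8)}
print(cpg_fractions(cds, regions))
```

In this toy case the two regions are the same length but hold different shares of the CpG sites, so a length-based expectation would misjudge where CpG-driven mutations are likely to fall.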

ADD REPLY • link 5.1 years ago by Collin ▴ 1000

score 1 · Answer 2 · 2020-04-01

1

5.1 years ago

Jean-Karim Heriche 27k

This paper titled "Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase" may give you some ideas on how to test for clustering along a sequence.
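
For a sense of what a test for clustering along a sequence can look like, here is a generic permutation sketch. It is not the method of the paper above. It assumes a uniform null over residue positions and uses the median distance from each mutation to its nearest neighbouring mutation as the statistic; the positions, protein length and function names are hypothetical.

```python
import random
from statistics import median

def median_nearest_neighbour_gap(positions):
    """Median distance (in residues) from each mutation to its nearest neighbour.

    Needs at least two distinct positions.
    """
    pos = sorted(positions)
    gaps = []
    for i, p in enumerate(pos):
        left = p - pos[i - 1] if i > 0 else float("inf")
        right = pos[i + 1] - p if i < len(pos) - 1 else float("inf")
        gaps.append(min(left, right))
    return median(gaps)

def linear_clustering_pvalue(positions, protein_length, n_perm=10000, seed=0):
    """Permutation p-value for mutations lying closer together than chance.

    Null model (an assumption): distinct positions drawn uniformly along
    the protein, ignoring sequence-context effects on mutability.
    """
    rng = random.Random(seed)
    observed = median_nearest_neighbour_gap(positions)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(1, protein_length + 1), len(positions))
        if median_nearest_neighbour_gap(sample) <= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical mutation positions in a 500-residue protein.
observed, pval = linear_clustering_pvalue(
    [110, 112, 115, 118, 121, 124, 300], 500, n_perm=2000
)
print(observed, pval)
```

The median is used instead of the mean so that a single isolated mutation (residue 300 here) does not dominate the statistic.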

ADD COMMENT • link 5.1 years ago by Jean-Karim Heriche 27k