Question

Identify unique amino acids in a sequence of interest present in an MSA block

1


8 weeks ago

sushmitapaul007 ▴ 10

Hi,

I wish to identify unique amino acid subsequences in an MSA for a sequence of interest.

Example:

A = Sequence of interest

A:NYTPLUYB
B:NYPNLUYB
C:NYPNLUYB

Here, 'TP' in sequence A is the unique stretch of amino acids: B and C both have 'PN' at those positions. I am looking for a tool/package that can automatically detect such unique patterns. Any suggestions would be highly appreciated.

Multiple-alignment-sequence • 275 views

updated 8 weeks ago by b.contreras.moreira ▴ 310 • written 8 weeks ago by sushmitapaul007 ▴ 10

0


I guess you are after single-copy k-mers in those strings, which in fact don't need to be aligned, right?

8 weeks ago by b.contreras.moreira ▴ 310

score 0 · Answer 1 · 2024-09-25

There is a Python script that will calculate conservation scores from the alignment:

https://github.com/Cantalapiedra/msa_conservation_index

Most people want to detect the conservation rather than a lack of it, but it should still work for your purpose after you pick a conservation threshold.
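
To illustrate what a threshold on conservation looks like, here is a rough sketch. It is not the scoring used by the script linked above (I am not reproducing its method here); it assumes a very simple score, the fraction of rows sharing the most common residue in each column, and the function name `column_conservation` is made up for this example.

```python
from collections import Counter


def column_conservation(seqs):
    """Per-column fraction of rows that share the most common residue.

    Assumes all aligned sequences have the same length.
    """
    scores = []
    for column in zip(*seqs):
        counts = Counter(column)
        scores.append(counts.most_common(1)[0][1] / len(column))
    return scores


aligned = ["NYTPLUYB", "NYPNLUYB", "NYPNLUYB"]
scores = column_conservation(aligned)

threshold = 1.0  # assumed cutoff: anything below strict conservation
low = [i + 1 for i, s in enumerate(scores) if s < threshold]
print(low)  # [3, 4]  (1-based columns that are not fully conserved)
```

A low score only says the column varies; you would still need to check that the differing residue belongs to your sequence of interest.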

What might also work for you is to use an MSA viewer and color the sequences according to strict conservation. In that case, residues that are not conserved will not be colored.
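
For a small alignment like the one in the question, the same column-by-column check can also be sketched in a few lines of Python. This is only an illustration under my own assumptions: the alignment is given as a dict of name to aligned string, `unique_runs` is a hypothetical helper, and a residue counts as unique when no other row has the same residue in that column. Adjacent unique columns are merged into one run.

```python
def unique_runs(msa, target):
    """Return (start, end, residues) runs where msa[target] differs from
    every other row. Columns are 1-based and inclusive."""
    ref = msa[target]
    others = [s for name, s in msa.items() if name != target]
    if any(len(s) != len(ref) for s in others):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None  # 0-based start of the run currently being extended
    for i, res in enumerate(ref):
        # Gaps in the sequence of interest are never reported as unique
        is_unique = res != "-" and all(o[i] != res for o in others)
        if is_unique and start is None:
            start = i
        elif not is_unique and start is not None:
            runs.append((start + 1, i, ref[start:i]))
            start = None
    if start is not None:  # run extends to the last column
        runs.append((start + 1, len(ref), ref[start:]))
    return runs


msa = {"A": "NYTPLUYB", "B": "NYPNLUYB", "C": "NYPNLUYB"}
print(unique_runs(msa, "A"))  # [(3, 4, 'TP')]
```

Reading the alignment from a FASTA or Clustal file is left out here; any parser that gives you equal-length aligned strings would do.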