Hi,
I would like to identify amino acid subsequences in a multiple sequence alignment (MSA) that are unique to a sequence of interest.
Example:
A = Sequence of interest
A:NYTPLUYB
B:NYPNLUYB
C:NYPNLUYB
Here, 'TP' is the stretch unique to sequence A: B and C both have 'PN' at those positions. I am looking for a tool or package that can detect such unique patterns automatically. Any suggestions would be highly appreciated.
I guess you are after single-copy k-mers in those strings, in which case the sequences don't actually need to be aligned, right?
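If that reading is right, a minimal Python sketch of the idea could look like the following. It assumes "single-copy" means a k-mer that occurs in the sequence of interest and in none of the other sequences. The function name unique_kmers and the choice of k=2 are made up for illustration, and gap characters ('-') are simply stripped, so aligned or unaligned input both work.

```python
def unique_kmers(target, others, k):
    """Return (position, k-mer) pairs for k-mers of `target` that occur
    in none of the `others`. Positions are 0-based offsets in the
    ungapped target; each k-mer is reported once, at its first occurrence."""
    # Collect every k-mer present in the background sequences.
    background = set()
    for seq in others:
        seq = seq.replace("-", "")  # drop alignment gaps, if any
        for i in range(len(seq) - k + 1):
            background.add(seq[i:i + k])

    target = target.replace("-", "")
    seen = set()
    hits = []
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in background and kmer not in seen:
            seen.add(kmer)
            hits.append((i, kmer))
    return hits


# The example from the question, with A as the sequence of interest.
sequences = {"A": "NYTPLUYB", "B": "NYPNLUYB", "C": "NYPNLUYB"}
hits = unique_kmers(sequences["A"], [sequences["B"], sequences["C"]], k=2)
print(hits)  # [(1, 'YT'), (2, 'TP'), (3, 'PL')]
```

Note that for k=2 this reports 'YT' and 'PL' as well as 'TP', because every k-mer that overlaps a differing residue is new. If only the differing columns themselves are wanted, comparing the alignment column by column would be a different approach, and that one does need the MSA.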