Multiple sequence alignment (MSA) editing

7.9 years ago

TEman ▴ 10

I want to remove all rare insertions (those occurring in fewer than 5% of the sequences) from a multiple sequence alignment file (Clustal .aln) with 699 sequences.

That is, I have an MSA with many columns that contain a residue in only one or two sequences, while all the other sequences have a gap ("-") there. There are far too many of these columns to remove manually.

Any suggestions how to do this?

R alignment clustal • 2.1k views

Do you specifically want to do this in R?

If you use Biopython, you can create an ungapped consensus sequence with a threshold for inclusion of a particular residue in a column.

ADD REPLY • link 7.9 years ago by Joe 22k
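
For the column filtering described in the question, here is a minimal sketch in plain Python rather than R. It assumes the alignment is already loaded as a list of equal-length strings and that gaps are written as "-" or ".". The function name `drop_rare_insertion_columns` and the `min_fraction` parameter are made up for illustration and do not come from any library.

```python
def drop_rare_insertion_columns(rows, min_fraction=0.05, gap_chars="-."):
    """Drop alignment columns where too few sequences carry a residue.

    rows         -- aligned sequences as equal-length strings (assumption)
    min_fraction -- a column is kept only if at least this fraction of
                    the sequences has a non-gap character in it
    gap_chars    -- characters treated as gaps (assumption: "-" and ".")
    """
    if not rows:
        return []
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(rows)
    keep = []
    for col in range(length):
        # Count the sequences that have a residue (not a gap) in this column.
        occupied = sum(1 for row in rows if row[col] not in gap_chars)
        if occupied / n_seqs >= min_fraction:
            keep.append(col)

    # Rebuild every sequence from the retained columns only.
    return ["".join(row[c] for c in keep) for row in rows]


# Toy example with a 50% threshold: the third column has a residue in
# only 1 of 4 sequences, so it is removed.
toy = ["AC-GT", "AC-GT", "ACTGT", "AC-G-"]
filtered = drop_rare_insertion_columns(toy, min_fraction=0.5)
```

With 699 sequences and the default 5% threshold, a column is dropped when fewer than 35 sequences have a residue in it. If Biopython is installed, the rows could be obtained with `AlignIO.read("input.aln", "clustal")` and `str(record.seq)` for each record, where `input.aln` is a placeholder file name. Note that sequences which carried a residue in a dropped column lose that residue, so the output is no longer a faithful copy of the original sequences.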
