Generate consensus sequence without considering gaps
1
0
Entering edit mode
2.3 years ago
Nathan ▴ 10

Hello.

I have a file of aligned amino acid sequences like the one below. Each sequence is a fragment of the same protein, so they are identical wherever they overlap.

>seq1
-APQALVARPHVTAPSARRSSRPLLMR---
>seq2
-----LVARPHVTAPSARRSSRPLLMRAAG
>seq3
--PQALVARPHVTAPSARRSSRPLLMRA--
>seq4
SAPQALVARPHVTAPSARRSSRPLL-----

I would like to merge all the sequences into a single, longer sequence that covers every position.

I have tried seqinr::consensus in R, but because gaps ("-") are more frequent than amino acids at some positions, I don't get the complete protein:

> consensus
--PQALVARPHVTAPSARRSSRPLLMR---

Can anyone help me with this issue, please?

seqinr R consensus fasta • 752 views
2.3 years ago
Mark ★ 1.6k

Use the EMBOSS suite of tools. I think EMBOSS cons will do this, and it has lots of options: https://www.bioinformatics.nl/cgi-bin/emboss/help/cons
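
If installing EMBOSS is not an option, the same idea can be sketched by hand: go through the alignment column by column, drop the gaps, and take the most common residue among whatever is left. The snippet below is a minimal illustration in Python rather than R, and it is not what cons or seqinr::consensus do internally. The function name gapless_consensus is made up for this example, and it assumes all sequences are already aligned to the same length.

```python
from collections import Counter


def gapless_consensus(seqs, gap="-"):
    """Column-wise majority vote that ignores gap characters.

    `seqs` is a list of aligned sequences of equal length.
    A column made up entirely of gaps stays a gap in the result.
    Ties are resolved in favour of the residue seen first in the column.
    """
    if not seqs:
        raise ValueError("no sequences given")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all have the same aligned length")

    consensus = []
    for column in zip(*seqs):
        # Keep only real residues; gaps do not get a vote.
        residues = [c for c in column if c != gap]
        if residues:
            consensus.append(Counter(residues).most_common(1)[0][0])
        else:
            consensus.append(gap)
    return "".join(consensus)


# The four fragments from the question.
aligned = [
    "-APQALVARPHVTAPSARRSSRPLLMR---",
    "-----LVARPHVTAPSARRSSRPLLMRAAG",
    "--PQALVARPHVTAPSARRSSRPLLMRA--",
    "SAPQALVARPHVTAPSARRSSRPLL-----",
]
print(gapless_consensus(aligned))
```

For the four fragments in the question this prints SAPQALVARPHVTAPSARRSSRPLLMRAAG. Note that a simple vote like this silently picks one residue where overlapping fragments disagree, so it is worth checking such columns separately if the fragments are not expected to be identical.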
