ClustalW alignment - Unexpected result
6.1 years ago

Hi,
I have two sequences:

>seq1
AAAAAAACCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAA
>seq2
TTTTTTTTTTTTTTTTTTTCCCCCCCCCCTTTTTTTTTTTTT

I want to align them like this using ClustalW:

AAAAAAA-------------------CCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAA-------------
-------TTTTTTTTTTTTTTTTTTTCCCCCCCCCC-------------------------TTTTTTTTTTTTT

The exact alignment is not important, but I want only the Cs to be aligned, with every mismatching base placed opposite a gap.

So I set the following scores:

  • match: 1
  • mismatch: -1
  • gap opening: 0
  • gap extending: 0

But what I get is this alignment:

------------AAAAAAACCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAA
TTTTTTTTTTTTTTTTTTTCCCCCCCCCCTTTTTTTTTTTTT------------

I don't understand why this is happening. With gap opening and gap extension both set to 0, my expected alignment should score 10 (10 matches, no mismatches), while the alignment I get should score -10 (10 matches and 20 mismatches: 7 A/T columns before the Cs and 13 after).

I use the following R code:
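To double-check that arithmetic, here is a minimal base R sketch that scores both alignments column by column. `score_alignment()` is a hypothetical helper written for this post, not part of the msa package, and it only applies the simple match/mismatch/gap scheme listed above; it makes no claim about how ClustalW scores alignments internally.

```r
# Hypothetical helper: sum match, mismatch and gap scores over all columns.
score_alignment <- function(a, b, match = 1, mismatch = -1, gap = 0) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  stopifnot(length(x) == length(y))   # both rows must have the same width
  is_gap      <- x == "-" | y == "-"
  is_match    <- !is_gap & x == y
  is_mismatch <- !is_gap & x != y
  sum(is_match) * match + sum(is_mismatch) * mismatch + sum(is_gap) * gap
}

# Expected alignment (74 columns): only the Cs are paired.
expected_a <- paste0(strrep("A", 7), strrep("-", 19), strrep("C", 10),
                     strrep("A", 25), strrep("-", 13))
expected_b <- paste0(strrep("-", 7), strrep("T", 19), strrep("C", 10),
                     strrep("-", 25), strrep("T", 13))

# Alignment actually returned (54 columns).
observed_a <- paste0(strrep("-", 12), strrep("A", 7), strrep("C", 10),
                     strrep("A", 25))
observed_b <- paste0(strrep("T", 19), strrep("C", 10), strrep("T", 13),
                     strrep("-", 12))

score_alignment(expected_a, expected_b)  # 10
score_alignment(observed_a, observed_b)  # -10
```

So under this scheme the expected alignment does score higher than the one I get back.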

library(msa)  # also attaches Biostrings, which provides DNAStringSet()

seqs <- DNAStringSet(c("AAAAAAACCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAA",
                       "TTTTTTTTTTTTTTTTTTTCCCCCCCCCCTTTTTTTTTTTTT"))

# 4 x 4 substitution matrix: 1 on the diagonal (match), -1 elsewhere (mismatch)
bases <- c("A", "C", "G", "T")
smatrix <- matrix(-1, nrow = 4, ncol = 4, dimnames = list(bases, bases))
diag(smatrix) <- 1

# ClustalW is the default method of msa(); it is named here for clarity
aln <- msa(seqs, method = "ClustalW", type = "dna",
           gapOpening = 0, gapExtension = 0, substitutionMatrix = smatrix)
print(aln, show = "complete")

Output:

MsaDNAMultipleAlignment with 2 rows and 54 columns
    aln 
[1] ------------AAAAAAACCCCCCCCCCAAAAAAAAAAAAAAAAAAAAAAAAA
[2] TTTTTTTTTTTTTTTTTTTCCCCCCCCCCTTTTTTTTTTTTT------------
Con ???????????????????CCCCCCCCCC?????????????????????????

Does anyone know why this is happening, and what I have to change to get my expected result?
