Hi,
I have a Python program that generates a ClustalW2 alignment of about 500 sequences from a FASTA file. Each sequence name consists of the organism plus the substrate specificity of that sequence, so quite a few of the names are identical. As a result I get the error message "Error: Multiple sequences found with same name" and no alignment is generated. Is it possible to ignore this error without having to change all the sequence names?
Cheers David
+1 for this. In the past I have just grepped the names and added numbers or more information to make them unique, but I like this idea better.
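For what it's worth, the "add numbers" workaround can be scripted in a few lines of Python. This is only a sketch: the function name `uniquify_fasta_names` is made up for illustration, and it assumes the sequence name is the header text up to the first space.

```python
def uniquify_fasta_names(lines):
    """Return FASTA lines with _2, _3, ... appended to repeated names.

    Assumption: the sequence name is the header text up to the first space.
    """
    used = set()   # names already emitted
    counts = {}    # last suffix tried for each original name
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name, sep, desc = line[1:].partition(" ")
            candidate = name
            # Keep bumping the suffix until the name is really unused,
            # which also covers inputs that already contain e.g. "x_2".
            while candidate in used:
                counts[name] = counts.get(name, 1) + 1
                candidate = f"{name}_{counts[name]}"
            used.add(candidate)
            out.append(">" + candidate + sep + desc)
        else:
            out.append(line)
    return out
```

The first occurrence of a name is left untouched, so only the duplicates change. Writing the returned lines to a new file and aligning that file leaves the original FASTA as it was.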
Agreed. Many phylogenetic programs have problems handling fancy sequence names. The worst case is the PHYLIP format (used by RAxML etc.), which allows only 10 characters per name. So I always rename the sequences as "s1", "s2", "s3"... I don't recommend using 1, 2, 3... because some programs cannot handle purely numeric sequence names.
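A minimal sketch of this renaming in plain Python, in case it is useful. The function name `rename_to_short_ids` and the `prefix` parameter are hypothetical; the returned dictionary maps each short ID back to the original header so the real names can be restored afterwards.

```python
def rename_to_short_ids(lines, prefix="s"):
    """Replace every FASTA header with a short ID (s1, s2, ...).

    Returns (new_lines, mapping), where mapping[short_id] is the original
    header text without the leading ">".
    """
    mapping = {}
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # IDs are assigned in file order, so duplicates get distinct IDs.
            short_id = f"{prefix}{len(mapping) + 1}"
            mapping[short_id] = line[1:]
            out.append(">" + short_id)
        else:
            out.append(line)
    return out, mapping
```

Because the IDs follow file order, identical original names still end up with different IDs, which also sidesteps the duplicate-name error. Saving the mapping (for example as a tab-separated file) makes it possible to put the original names back into the final alignment or tree.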