Hi all, I will be doing a comparative genome analysis of human coronaviruses (including SARS-CoV-2, SARS-CoV and MERS-CoV). The dataset will probably contain more than 1,000, possibly more than 2,000, RNA sequences. Which multiple sequence alignment tool would be suitable in this case? I would prefer a tool that combines high accuracy with short computation time, ideally one available as an online server. So far, MAFFT, MUSCLE and T-Coffee look like good candidates. Could someone kindly explain their pros and cons?
All recommendations and opinions are highly appreciated. Thank you.
There isn't much to choose between these tools in terms of speed and accuracy. MAFFT and MUSCLE are both very good in my experience. The main thing is to make sure you pick appropriate models/methods, but both tools should do this in a mostly automated way anyway.
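As a rough illustration of the automated route, here is a minimal sketch that runs a local MAFFT install from Python. It assumes the `mafft` executable is on your PATH and uses MAFFT's `--auto` option to let it choose a strategy based on the data; the function names and file names are made up for the example.

```python
import subprocess


def build_mafft_command(input_fasta, threads=1):
    """Build the MAFFT command line as a list of arguments.

    --auto lets MAFFT pick an alignment strategy from the input size;
    --thread sets the number of threads to use.
    """
    return ["mafft", "--auto", "--thread", str(threads), input_fasta]


def run_mafft(input_fasta, output_fasta, threads=1):
    """Run MAFFT and save the alignment it prints on stdout.

    Assumes the mafft executable is installed and on PATH.
    Raises subprocess.CalledProcessError if MAFFT exits with an error.
    """
    cmd = build_mafft_command(input_fasta, threads)
    with open(output_fasta, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

For example, `run_mafft("coronaviruses.fasta", "coronaviruses.aln.fasta", threads=4)` would write the aligned sequences to the second file (both file names are placeholders).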
Often it comes down to subjectively deciding which alignment looks better to you. You may also find one tool more user-friendly than another, but to my knowledge there's no technical reason to avoid any of them.
They should all be available online in one form or another, but you may need to watch your input file size. With that many sequences, depending on their length, you may end up with a file that the web server won't let you upload.
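To get a feel for the file size before trying a web server, a small check like the one below may help. It is only a sketch: the function names are invented for this example, and the upload limit is something you would need to look up for the specific server, since it is not assumed here.

```python
import os


def summarize_fasta(path):
    """Return (number of sequences, total residues, file size in bytes)."""
    n_seqs = 0
    n_residues = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_seqs += 1  # each header line starts a new record
            else:
                n_residues += len(line)  # sequence lines may be wrapped
    return n_seqs, n_residues, os.path.getsize(path)


def fits_upload_limit(path, limit_bytes):
    """Check the file against a limit taken from the server's documentation."""
    return os.path.getsize(path) <= limit_bytes
```

As a back-of-the-envelope figure, coronavirus genomes are roughly 30 kb long, so 2,000 complete genomes come to about 60 million characters, or around 60 MB of FASTA before alignment.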