Dear all, many scientists are currently working on problems related to SARS-CoV-2. The viral genomes are huge (~30 kb). Is it correct that, to align several of them, I have to put the FASTA genomes into a single file, add a reference genome to the same file, and run a suitable tool such as MAFFT (as described here: https://www.biostars.org/p/187360/)?
My next question is which MAFFT option I should use. Probably MAFFT-L-INS-i, since «More specifically, in Figure 1 in Le et al. [18], the advantage of MAFFT-L-INS-i (an iterative refinement method) over MAFFT-L-INS-1 (a progressive method) was clearly observed for a small number of sequences but not for thousands of sequences.» Thank you! Natalia Sernova
Unfortunately, some genomes are elsewhere; I had to search other databases to find most of them.
You could use the software/procedure that Nextstrain uses to build their alignments for SARS-CoV-2.
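For the plain MAFFT route asked about in the question (this is not the Nextstrain pipeline itself), a minimal Python sketch might look like the following. It assumes `mafft` is installed and on your `PATH`. The function names and file names are hypothetical, chosen only for illustration. `--auto` lets MAFFT choose a strategy from the input size, and `--localpair --maxiterate 1000` is the documented way to request L-INS-i. Whether L-INS-i is practical for your data depends on how many ~30 kb genomes you have.

```python
import subprocess
from pathlib import Path


def combine_fasta(reference, genomes, combined):
    """Write the reference first, then every genome, into one multi-FASTA file."""
    with open(combined, "w") as out:
        for path in [reference, *genomes]:
            text = Path(path).read_text().strip()
            if not text.startswith(">"):
                raise ValueError(f"{path} does not look like FASTA")
            out.write(text + "\n")
    return combined


def mafft_command(combined, accurate=False):
    """Build the MAFFT command line; accurate=True selects L-INS-i."""
    if accurate:
        # L-INS-i: local pairwise alignment plus iterative refinement.
        return ["mafft", "--localpair", "--maxiterate", "1000", str(combined)]
    # --auto lets MAFFT pick a strategy based on the size of the input.
    return ["mafft", "--auto", str(combined)]


def align(combined, aligned, accurate=False):
    """Run MAFFT, which prints the alignment to stdout, and save it to a file."""
    with open(aligned, "w") as out:
        subprocess.run(mafft_command(combined, accurate), stdout=out, check=True)


# Example usage (hypothetical file names):
# combine_fasta("reference.fasta", ["genome1.fasta", "genome2.fasta"], "combined.fasta")
# align("combined.fasta", "aligned.fasta")
```

Putting the reference first simply keeps it as the first record of the combined file. It does not by itself force the output into reference coordinates.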