I am aligning many similar sequences from a BLAST result and looking for mutations at certain positions. My inclination is that an MSA (Clustal Omega) is the best approach, but my PI is worried about misalignments and believes that pairwise alignments against a reference sequence would be better. Assuming that all the sequences to be aligned are homologs, which method would be more accurate, and why? I need to convince her that I am right, i.e. that more information will produce better alignments. Thanks!
It'd be helpful if you provided more information. Are you performing local realignment around indels in the BLAST results before calling variants (doing this should produce similar-ish results to using MSA on those regions)? How certain are you that the section of the reference that you're interested in matches the sequences you're BLASTing? If this data was derived from a PCR that you strongly believe is specific, then MSA might work OK. If you have much in the way of off-target sequences, however, then you're going to run into problems.
These are sequences extracted from a metagenomic sample targeting a gene of interest using BLASTp with a relatively high bit score and identity cut-off, so all of the sequences to be aligned are very similar. I need to create an alignment to check for mutations based on their position in a reference sequence. So I either do a pairwise alignment of each sequence against the reference, or do an MSA that includes the reference and then use the reference row to identify the MSA columns to iterate over. This is all being done within a script because there are thousands of sequences to examine.
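For what it's worth, here is a rough sketch of the MSA route I have in mind: walk the aligned reference row, count only non-gap characters to get reference positions, and use that to pick out the alignment columns. It assumes the alignment has already been produced (e.g. by Clustal Omega) and loaded as a dict of equal-length gapped strings. The function names, the `"ref"` ID and the toy sequences are all made up for illustration, not taken from my real data.

```python
def ref_position_to_column(aligned_ref, gap="-"):
    """Map 1-based ungapped reference positions to 0-based alignment columns."""
    mapping = {}
    pos = 0
    for col, char in enumerate(aligned_ref):
        if char != gap:
            # Only non-gap characters advance the reference coordinate.
            pos += 1
            mapping[pos] = col
    return mapping


def residues_at(alignment, ref_id, positions, gap="-"):
    """Return {seq_id: {ref_pos: residue}} for the requested reference positions."""
    pos_to_col = ref_position_to_column(alignment[ref_id], gap)
    return {
        seq_id: {p: seq[pos_to_col[p]] for p in positions}
        for seq_id, seq in alignment.items()
        if seq_id != ref_id
    }


# Toy aligned sequences (hypothetical), standing in for an aligned FASTA file.
alignment = {
    "ref":  "MK-TAYIAK",
    "seq1": "MKQTAYIAK",  # insertion relative to the reference at column 2
    "seq2": "MK-TSYIAK",  # substitution at reference position 4 (A -> S)
}

# Residues observed at reference positions 3 and 4 in every non-reference sequence.
calls = residues_at(alignment, "ref", [3, 4])
```

A gap character in a returned residue would mean that sequence has a deletion at that reference position. The same mapping step applies to the pairwise route too, just with one two-row alignment per sequence instead of a single shared one.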