I have a set of sequences for the YPR193C coding sequence from various yeast strains. I would like to get the percent identity matrix from a multiple sequence alignment made with ClustalW, Clustal Omega, or MUSCLE via the Biopython wrappers. According to the documentation this should be possible for ClustalW and Clustal Omega, but I can't figure out how to write (or even print) the percent identity matrix. Additionally, I get a .dnd file from ClustalW but not from Clustal Omega. Here is a minimal example using ClustalW and Clustal Omega:
# TRY CLUSTALW2
from Bio.Align.Applications import ClustalwCommandline
file = 'YPR193C_cds_GatheredSeqs.fasta'
clustalw_exe = r"C:\Program Files (x86)\ClustalW2\clustalw2.exe"
clustalw_cline = ClustalwCommandline(clustalw_exe, infile=file, pim=True)
print(clustalw_cline)
stdout, stderr = clustalw_cline()
# TRY CLUSTAL OMEGA
from Bio.Align.Applications import ClustalOmegaCommandline
clustalo_exe = r"C:\Users\RossLab\Downloads\clustal-omega-1.2.2-win64\clustalo.exe"
clustalo_cline = ClustalOmegaCommandline(clustalo_exe, infile=file, outfile='YPR193C_ClustalO', percentid=True)
print(clustalo_cline)
stdout, stderr = clustalo_cline()
# COMMANDS RESULTING FROM THE PRINT STATEMENTS ABOVE
"C:\Program Files (x86)\ClustalW2\clustalw2.exe" -infile=YPR193C_cds_GatheredSeqs.fasta -pim
C:\Users\Sean\clustal-omega-1.2.2-win64\clustalo.exe -i YPR193C_cds_GatheredSeqs.fasta --percent-id -o YPR193C_ClustalO
I don't get any errors when running these and I get alignment files for both ClustalW and Clustal Omega, as well as a .dnd file for ClustalW only. What am I missing?
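One thing that may be worth checking: as far as I can tell, Clustal Omega's --percent-id flag only changes how distances are reported, and the matrix is only written when a distance-matrix output file is requested with --distmat-out, together with --full. Below is a rough sketch that builds such a command as an argument list. The output name YPR193C_ClustalO.pim and the helper names build_clustalo_command and run_clustalo are made up for illustration, and "clustalo" stands in for the full path to the executable. I believe the Biopython wrapper exposes the same options as distmat_out and distmat_full, but treat that as an assumption to verify against your installed version.

```python
import subprocess


def build_clustalo_command(exe, infile, outfile, distmat_file):
    """Build a Clustal Omega argument list that also requests a distance matrix.

    Assumption: --distmat-out plus --full is what makes clustalo write the
    matrix, and --percent-id converts the distances to percent identities.
    """
    return [
        exe,
        "-i", infile,
        "-o", outfile,
        "--full",                          # full pairwise matrix instead of mBed
        f"--distmat-out={distmat_file}",   # hypothetical output file name
        "--percent-id",
    ]


def run_clustalo(cmd):
    """Run the command and return (stdout, stderr); raises if clustalo fails."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout, result.stderr


cmd = build_clustalo_command(
    "clustalo",  # placeholder for the real path to clustalo.exe
    "YPR193C_cds_GatheredSeqs.fasta",
    "YPR193C_ClustalO",
    "YPR193C_ClustalO.pim",
)
print(" ".join(cmd))
# stdout, stderr = run_clustalo(cmd)  # uncomment once the executable path is set
```

If this works, the matrix should end up in the file passed to --distmat-out rather than in the alignment file or on stdout.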
I have the same issue. Did you ever figure out how to do this? :)
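In case neither tool writes the matrix, a possible fallback is to compute pairwise percent identity directly from the alignment file the aligner already produced. This is only a minimal sketch in plain Python, assuming the alignment is in aligned FASTA format and that identity is defined as matches divided by the number of columns where neither sequence has a gap. The Clustal programs may use a different denominator, so the numbers need not match theirs exactly. The function names and the toy alignment are made up for illustration.

```python
from itertools import combinations


def read_aligned_fasta(lines):
    """Parse aligned FASTA lines into {name: aligned_sequence}."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # keep only the ID before any whitespace
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def percent_identity(a, b, gap="-"):
    """Percent of identical residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(seqs):
    """Return (names, nested dict) holding the symmetric percent identity matrix."""
    names = list(seqs)
    matrix = {n: {n: 100.0} for n in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = percent_identity(seqs[a], seqs[b])
    return names, matrix


# Toy alignment standing in for the real aligner output file.
toy = [">strainA", "ATGC-A", ">strainB", "ATGCTA", ">strainC", "ATCC-A"]
names, matrix = identity_matrix(read_aligned_fasta(toy))
for n in names:
    print(n, " ".join(f"{matrix[n][m]:6.1f}" for m in names))
```

For a real run, replace the toy list with the open alignment file, e.g. pass the file handle to read_aligned_fasta, provided the aligner was asked to write FASTA output.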