Hi All,
I'm using the supermat function in R to generate a supermatrix (i.e. concatenate multiple sequence alignments) and then run RAxML on the concatenated MSA file. The supermat function also produces the partition file. When I pass the partition file to RAxML using the -q parameter, the program gives me the following error:
ERROR: Bad base (I) at site 4 of sequence 1
Printing error context:
400 6099
1 AAMIALKCETDFVAKNADFVALTQ
Problem reading alignment file
However, when I run RAxML without the -q parameter on the same concatenated multiple sequence alignment file, the program starts running without errors. The first few lines of my alignment file, in PHYLIP format, look like this:
400 6099
1 AAMIALKCETDFVAKNADFVALTQAILDAAIANKCQTLDDVKALPM-GSGT----IADAIVERSGITGEKTELDGYFFVSGA-----CTAVYNHMNKNQ-----
and the first few lines of the partition file look like this:
DNA, EF_TS_NCBI400Genes_msa = 1 - 370
DNA, EFG_C_NCBI400Genes_msa = 371 - 463
DNA, EFG_II_NCBI400Genes_msa = 464 - 538
DNA, EFG_IV_NCBI400Genes_msa = 539 - 669
DNA, EFP_N_NCBI400Genes_msa = 670 - 728
Now I notice that it specifies the partitions as DNA, whereas my sequences are amino acids. Is that the only problem? I have tried replacing DNA with protein/PROT/PROTEIN etc., but it still gave the same error. Is there a flag I should use in the supermat function to tell it that these are protein sequences?
Any help is appreciated. Thanks.
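For reference, RAxML reads the first field of each partition line as the data type or model: DNA for nucleotides, or the name of an amino acid substitution matrix (for example WAG, JTT or LG) for proteins. With DNA in that field, a protein alignment is parsed as nucleotides, which would explain the "Bad base" error. Below is a minimal Python sketch that rewrites the partition file. It assumes WAG is an acceptable model for every partition, and the `relabel_partitions` helper is hypothetical, not part of supermat or RAxML:

```python
def relabel_partitions(text, model="WAG"):
    """Replace the leading data-type field of each partition line with `model`.

    `model` is an assumption for illustration; pick the amino acid matrix
    that suits your data.
    """
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # A partition line looks like: "<type>, <name> = <start> - <end>"
        _, sep, rest = line.partition(",")
        if not sep:
            raise ValueError(f"not a partition line: {line!r}")
        out.append(f"{model}, {rest.strip()}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "DNA, EF_TS_NCBI400Genes_msa = 1 - 370\n"
        "DNA, EFG_C_NCBI400Genes_msa = 371 - 463\n"
    )
    print(relabel_partitions(example), end="")
```

The rewritten file would then be passed with -q as before, together with a protein model for -m (for example PROTGAMMAWAG).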
Okay, yes, that did the trick. I should read the manual more carefully. Thank you for your time and answer!
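As a side note on why the error pointed at site 4 and not site 1: A and M are both valid IUPAC nucleotide codes (M stands for A or C), so the first character that cannot be DNA in "AAMI..." is the I. A small Python sketch of that check follows. The character set is the IUPAC nucleotide alphabet plus gap and missing-data symbols, which is an assumption that only approximates what RAxML accepts, and `first_bad_dna_site` is a hypothetical helper:

```python
# IUPAC nucleotide codes plus gap and missing-data symbols.
# Assumption: this approximates, but may not exactly match, RAxML's DNA check.
DNA_CHARS = set("ACGTURYKMSWBDHVN-?")


def first_bad_dna_site(seq):
    """Return (1-based position, character) of the first non-nucleotide
    character in `seq`, or None if every character is a valid DNA code."""
    for pos, ch in enumerate(seq.upper(), start=1):
        if ch not in DNA_CHARS:
            return pos, ch
    return None


if __name__ == "__main__":
    # Start of sequence 1 from the alignment above
    print(first_bad_dna_site("AAMIALKCETDFVAKNADFVALTQ"))
```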