I'm using mugsy to align two genomes and am having some issues. I am running the following command:
mugsy "$large_fasta" "$sample_fasta" --directory "$full_out_path"
I installed and built mugsy from Anaconda. I am aligning 9 fastas one at a time (each with multiple sequences) to a larger fasta (single sequence). 3 of the 9 fastas get killed with this error:
Can't find species <fasta header name> at <path to mugsy>/envs/Mugsy/bin/mugsy line 501
The sequences reported as not found do exist in the fasta (I did a quick grep to check), usually somewhere in the middle of it, so I do not think it is a syntax issue. This only affects a few of the fastas, not all of them, and I cannot find anything unusual about the ones getting killed. The fastas were all built using the same pipeline and are located in the same folder, and the file path is correct. The aligner also recognizes that the sequences are there and prints some progress updates:
Parsing sequences for <fasta 1 name> num_seqs:1
Parsing sequences for <fasta 2 name> num_seqs:166
I will note that the num_seqs value is incorrect for the files that fail. I have already tried stripping trailing whitespace from the file using:
sed -i 's/[[:space:]]*$//' "$sample_fasta"
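For anyone who wants to reproduce the comparison against num_seqs, here is a minimal sketch of the kind of check involved. `fasta_header_report` is a hypothetical helper name, not part of mugsy. The idea that duplicate IDs or unusual characters in an ID could matter to mugsy is only a guess on my part, not something the error message confirms.

```shell
# Hypothetical helper: summarize the header lines of one FASTA file.
fasta_header_report() {
    # Number of header lines; compare this with the num_seqs value mugsy prints.
    printf 'headers: %s\n' "$(grep -c '^>' "$1")"
    # IDs (first whitespace-delimited token of a header) that occur more than once.
    printf 'duplicate_ids: %s\n' \
        "$(grep '^>' "$1" | awk '{print $1}' | sort | uniq -d | wc -l | tr -d ' ')"
    # Headers whose ID contains anything other than letters, digits, '_' or '.'.
    # Assumption: these characters are merely suspects, not a known cause.
    printf 'odd_id_headers: %s\n' \
        "$(grep -c '^>[^[:space:]]*[^A-Za-z0-9_.[:space:]]' "$1")"
}
```

Running `fasta_header_report "$sample_fasta"` on a failing file and on a working one should show whether the header count matches num_seqs and whether the failing files differ in either of the other two counts.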
I would appreciate any help that anyone can provide on this!