Question

Mugsy aligner kills alignment with "Can't find species" error

7 months ago

skoplik • 0

I'm using Mugsy to align two genomes and am having some issues. I am running the following command:

mugsy "$large_fasta" "$sample_fasta" --directory "$full_out_path"

I installed and built Mugsy from Anaconda. I am aligning 9 FASTAs one at a time (each with multiple sequences) to a larger FASTA (single sequence). 3 of the 9 FASTAs get killed with this error:

Can't find species <fasta header name>  at <path to mugsy>/envs/Mugsy/bin/mugsy line 501

The sequences reported as "not found" do exist in the FASTA (I did a quick grep to check), usually somewhere in the middle of the file, so I do not think it is a syntax issue. This only affects a few of the FASTAs, and I cannot find anything unusual or different about the ones that get killed. The FASTAs were all built with the same pipeline and are located in the same folder, and the file paths are correct. The aligner also recognizes that the sequences are there and prints a progress update:

Parsing sequences for <fasta 1 name>  num_seqs:1
Parsing sequences for <fasta 2 name> num_seqs:166
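
A plain grep for a header name also matches substrings, so it can report a hit even when no sequence ID is exactly equal to the name in the error. The sketch below does an exact-ID lookup instead. It runs on a toy FASTA with made-up names, and it assumes the ID is the header text up to the first whitespace, which is a common convention but is not confirmed for Mugsy here.

```shell
# Toy FASTA standing in for "$sample_fasta" (hypothetical contents).
sample_fasta="$(mktemp)"
printf '>contig_1\nACGT\n>contig_10 extra description\nGGCC\n' > "$sample_fasta"

# Hypothetical name, as it would appear in the "Can't find species" message.
missing_name="contig_10"

# Assumed ID convention: text after '>' up to the first whitespace.
ids="$(sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$sample_fasta")"

# -F: fixed string, -x: whole-line match, -c: count of matching IDs.
exact_hits="$(printf '%s\n' "$ids" | grep -cFx -- "$missing_name")"
echo "exact ID matches for $missing_name: $exact_hits"

# Make invisible characters on header lines visible
# (tabs as ^I, carriage returns as ^M, line ends as $).
grep '^>' "$sample_fasta" | cat -vet
```

A count of 0 would mean the name in the error does not match any ID exactly, while a count above 1 would point to duplicated IDs.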

I will note that num_seqs is incorrect for the files that fail. I have already tried removing trailing whitespace from the file using

sed -i 's/[[:space:]]*$//' "$sample_fasta"
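
Since num_seqs looks wrong for the failing files, a few quick header checks may help narrow things down. This is a minimal diagnostic sketch on a toy FASTA with made-up names, not a fix. Two things in it are assumptions: that the ID is the header text up to the first whitespace, and that characters other than letters, digits, and underscores are worth flagging. Some tools handle such characters poorly, but whether that is relevant to Mugsy is not established here.

```shell
# Toy FASTA standing in for "$sample_fasta" (hypothetical contents).
sample_fasta="$(mktemp)"
printf '>seq_a\nACGT\n>seq-b:1\nGGCC\n>seq_a\nTTAA\n' > "$sample_fasta"

# Number of header lines; compare with the num_seqs value printed by the aligner.
header_count="$(grep -c '^>' "$sample_fasta")"
echo "headers: $header_count"

# Assumed ID convention: text after '>' up to the first whitespace.
ids="$(sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$sample_fasta")"

# IDs that occur more than once.
dup_ids="$(printf '%s\n' "$ids" | sort | uniq -d)"
echo "duplicate IDs: ${dup_ids:-none}"

# Count IDs containing anything other than letters, digits, or underscores.
odd_count="$(printf '%s\n' "$ids" | grep -c '[^A-Za-z0-9_]')"
echo "IDs with other characters: $odd_count"
```

Running the same checks on one failing and one working file side by side should show whether the two differ in any of these respects.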

I would appreciate any help that anyone can provide on this!

aligner mugsy WGA • 289 views

updated 7 months ago by Ram 44k • written 7 months ago by skoplik • 0