Briefly, I am trying to automate the process of aligning a query sequence (in FASTA format) to an assembly graph (a FASTG file from a SPAdes assembly). As output I need the sequences of the paths in the assembly graph corresponding to the alignment(s).
More detail: I have used SPAdes to assemble the genome of a diploid yeast from short reads (whole-genome sequencing). Using the wonderful program Bandage, I am then able to BLAST a query sequence against the assembly graph (FASTG file). Because of the diploid nature of the genome, the result looks like this:
There are two paths corresponding to this BLAST alignment. In Bandage I can select the nodes corresponding to a path, and then export that path's sequence to a FASTA file.
Doing this manually gives me exactly what I want (essentially, haplotypes derived from the assembled genome). However, I would very much like to automate this process. What tools should I be looking into?
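To make the goal concrete, here is a minimal Python sketch of the last step only: turning an ordered list of graph edges into one sequence. It does not do the alignment or the path finding, which is the part I am asking about. It assumes SPAdes-style FASTG headers (`>NAME:NEIGHBOUR1,NEIGHBOUR2;`, with a trailing `'` marking the reverse strand) and that adjacent edges overlap by a fixed number of bases, which I believe is the largest k-mer size used in the assembly. The names `parse_fastg`, `path_sequence` and `overlap` are my own, not part of any tool.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def parse_fastg(text):
    """Map edge name -> sequence. Names keep a trailing ' for reverse strands."""
    parts = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header: >NAME:NEIGHBOUR1,NEIGHBOUR2;  (the neighbour list is optional)
            name = line[1:].rstrip(";").split(":")[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}


def path_sequence(seqs, path, overlap):
    """Concatenate the edges in `path`, dropping the assumed overlap between neighbours."""
    pieces = []
    for i, edge in enumerate(path):
        seq = seqs.get(edge)
        if seq is None:
            # Only the other strand is stored in the file: derive this one from it.
            other = edge[:-1] if edge.endswith("'") else edge + "'"
            seq = reverse_complement(seqs[other])
        pieces.append(seq if i == 0 else seq[overlap:])
    return "".join(pieces)


# Toy graph (made-up data): two edges that overlap by 3 bases.
toy_fastg = ">A:B;\nACGTT\n>B;\nGTTCA\n"
toy_seqs = parse_fastg(toy_fastg)
forward = path_sequence(toy_seqs, ["A", "B"], overlap=3)
reverse = path_sequence(toy_seqs, ["B'", "A'"], overlap=3)
```

What I am missing is a tool that produces the `path` list for each alignment automatically, so that I do not have to pick the nodes by hand in Bandage.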
Thank you for providing an answer/closure for the question. It will help someone in the future.