Hi all,
I have a multiple sequence alignment of ~26,000 sequences, each around 16 kb long. I would like to construct a vg graph from this alignment with all sequences embedded as haplotype paths. However, I am finding that
vg construct -M *msa_path*
is too slow with this input size. I am able to construct the graph without the embedded paths, however. I can also split the input into smaller subsets of sequences and construct a graph for each subset of the MSA, but in that case I have not been able to merge the smaller graphs by overlapping their nodes. Is there a simple way to achieve this, or is it simply infeasible to construct a graph in this way?
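For reference, here is a minimal sketch of the splitting step, assuming the MSA is an aligned FASTA file. The function names, the chunk size, and the output file naming are all illustrative, not part of vg. Each subset keeps the original alignment columns, so the subsets share one coordinate system, although a subset may end up with columns that contain only gaps.

```python
import sys
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, parts = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif line:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_msa(records, chunk_size):
    """Yield lists of at most chunk_size records, preserving input order."""
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def write_fasta(records, path):
    """Write (header, sequence) pairs to path, one sequence per line."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n{seq}\n")


if __name__ == "__main__":
    # Hypothetical usage: python split_msa.py msa.fasta 1000
    msa_path, chunk_size = sys.argv[1], int(sys.argv[2])
    with open(msa_path) as handle:
        for i, chunk in enumerate(split_msa(read_fasta(handle), chunk_size)):
            write_fasta(chunk, f"subset_{i:04d}.fasta")
```

Each subset file can then be passed to vg construct -M in the same way as the full alignment.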
Thank you!