I have 151 FASTA assemblies (contigs) of a 50 Mb fungal genome, including the reference sequence, and I would like to build a pangenome so that I end up with a single FASTA sequence combining the 151 strains.
First I mapped each of these assemblies to the reference sequence, then I extracted the BAM file containing the unmapped reads for each strain. Now I am stuck on the next steps.
How can I build a single reference sequence based on the pangenome of these sequences?
Thank you, Kamel
You are reaching into the realm of genome graphs: a single space representing multiple genomes. Look into the vg toolkit, Cactus alignment, and pggb. You will need new whole-genome alignments to generate the graph (with Cactus or pggb); vg can then use that graph in pangenome analyses.
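As a possible first step toward the graph, here is a minimal sketch of how the 151 assemblies could be merged into one FASTA before running pggb, which takes a single FASTA as input and works best when sequence names follow the PanSN convention (`sample#haplotype#contig`). The function names `pansn_header`, `rename_fasta` and `combine_assemblies` are hypothetical helpers written for this example, not part of pggb or vg. The sketch assumes uncompressed FASTA files, one file per strain, with the strain name being everything before the first dot in the file name, and haplotype 1 for every strain.

```python
from pathlib import Path
from typing import Iterable, Iterator


def pansn_header(sample: str, contig: str, haplotype: int = 1, delim: str = "#") -> str:
    """Build a PanSN-style name: sample#haplotype#contig."""
    if delim in sample:
        raise ValueError(f"sample name {sample!r} must not contain {delim!r}")
    return f"{sample}{delim}{haplotype}{delim}{contig}"


def rename_fasta(lines: Iterable[str], sample: str) -> Iterator[str]:
    """Yield FASTA lines with headers rewritten to PanSN names.

    Only the first word of each header is kept as the contig name;
    blank lines are dropped.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header without a sequence name")
            yield ">" + pansn_header(sample, fields[0])
        elif line:
            yield line


def combine_assemblies(fasta_paths: Iterable[str], out_path: str) -> int:
    """Write all assemblies into one FASTA; return the number of sequences."""
    n_sequences = 0
    with open(out_path, "w") as out:
        for path in fasta_paths:
            # Assumption: the strain name is the file name up to the first dot.
            sample = Path(path).name.split(".")[0]
            with open(path) as handle:
                for line in rename_fasta(handle, sample):
                    out.write(line + "\n")
                    if line.startswith(">"):
                        n_sequences += 1
    return n_sequences
```

For example, `combine_assemblies(["strainA.fasta", "strainB.fasta"], "all_strains.fa")` would write one file in which a contig originally named `contig_1` in `strainA.fasta` becomes `strainA#1#contig_1`. The combined file would then typically be compressed with bgzip and indexed with `samtools faidx` before being passed to pggb.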