Hello,
I am interested in selection statistics (nucleotide diversity π and Tajima’s D) for a few Gardnerella vaginalis genes. To estimate these well, I want as many G. vaginalis genomes as possible, so I have generated G. vaginalis MAGs from vaginal metagenomes (https://virgo.igs.umaryland.edu/).
The next step is where I am unsure.
My goal is to create individual alignments of my genes of interest from my MAGs.
Example:
geneA.fasta
>MAG1
actgactg
>MAG2
actaactg
>MAG3
actgactg
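Once I have an alignment like this, I think the statistics themselves are the easy part. Here is a minimal Python sketch of what I would compute. The function names are my own and not from any library. It assumes the sequences are already aligned to equal length, skips columns containing gaps or ambiguous bases, and uses the constants from Tajima's 1989 definition of D:

```python
from itertools import combinations
from math import sqrt


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: lowercase sequence}."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line.lower()
    return seqs


def pi_and_tajimas_d(seqs):
    """Return (pi per site, segregating sites S, Tajima's D) for an alignment.

    D is None when it is undefined (no segregating sites, or fewer than
    four sequences, where the variance term is zero).
    """
    rows = list(seqs.values())
    n = len(rows)
    if n < 2 or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("need at least two aligned sequences of equal length")

    # Keep only columns made up entirely of a/c/g/t.
    cols = [c for c in zip(*rows) if set(c) <= set("acgt")]
    if not cols:
        raise ValueError("no usable alignment columns")

    S = sum(1 for c in cols if len(set(c)) > 1)
    n_pairs = n * (n - 1) / 2
    # k = mean number of pairwise differences across all sequence pairs.
    k = sum(sum(a != b for a, b in combinations(c, 2)) for c in cols) / n_pairs
    pi = k / len(cols)

    if S == 0 or n < 4:
        return pi, S, None

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    D = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return pi, S, D


example = ">MAG1\nactgactg\n>MAG2\nactaactg\n>MAG3\nactgactg\n"
pi, S, D = pi_and_tajimas_d(read_fasta(example))
# For the toy alignment above: pi = 2/24 (about 0.083), S = 1,
# and D is None because only three sequences are present.
print(pi, S, D)
```

So the real difficulty is producing the per-gene alignment, not the statistics.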
A challenge with Gardnerella is that the genes I’m interested in are not consistently annotated. Some annotation tools label them as hypothetical, and others give them different names.
Do you know of good tools/approaches to make gene alignments from metagenomes?
My first thought was to use Roary (https://sangerpathogens.github.io/Roary/), a pan-genome pipeline that clusters genes across annotated genomes and can produce a core gene alignment. I can feed it the GFF files from a Prokka annotation of each of my MAGs, and it should generate alignments for each gene.
Another idea I had is to somehow align the MAGs to a reference, but I’m not sure if that would work.
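To make the reference idea more concrete, one version I can imagine is searching a reference copy of each gene against every MAG and pulling out the best-matching region, which would sidestep the annotation names entirely. Below is a rough Python sketch under several assumptions: the hits come from a BLAST-style tabular file with the standard 12 columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), the MAG contigs are already loaded into a dict, and the function names and the 80% identity cutoff are placeholders of my own:

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (a/c/g/t in either case)."""
    return seq.translate(str.maketrans("acgtACGT", "tgcaTGCA"))[::-1]


def extract_best_hit(blast_tab, contigs, min_pident=80.0):
    """Return the contig region of the highest-bitscore hit, or None.

    blast_tab: text of a 12-column tabular hit table (reference gene as
               query, MAG contigs as subjects).
    contigs:   dict of {contig name: sequence} for one MAG.
    """
    best = None
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid, pident = fields[1], float(fields[2])
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        if pident < min_pident:
            continue
        if best is None or bitscore > best[0]:
            best = (bitscore, sseqid, sstart, send)

    if best is None:
        return None

    _, sseqid, sstart, send = best
    lo, hi = min(sstart, send), max(sstart, send)
    region = contigs[sseqid][lo - 1:hi]  # coordinates are 1-based, inclusive
    # Minus-strand hits are reported with sstart > send.
    return reverse_complement(region) if sstart > send else region


hits = "geneA\tcontig1\t100.0\t8\t0\t0\t1\t8\t5\t12\t1e-5\t16.0\n"
mag = {"contig1": "ttttactgactgtttt"}
print(extract_best_hit(hits, mag))
```

The extracted sequences from each MAG would still need to go through a multiple sequence aligner before computing any statistics, and I don't know how well this handles genes that are split across contigs or only partially assembled.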