Tool to align metagenomic data to reference genome
18 months ago
Madde ▴ 20

I have 100 metagenomic samples and a complete reference genome for Gardnerella vaginalis (14018).

I want a whole genome alignment of Gardnerella vaginalis for each metagenome, with the assumption that each alignment represents the population of Gardnerella vaginalis in each metagenome.

What tool would you recommend for creating this FASTA alignment by aligning metagenomic reads to a reference genome? And how can I be confident that each read maps to the correct species (i.e., only G. vaginalis reads map to the G. vaginalis reference, rather than Lactobacillus reads mapping by accident)?

I have experience using bwa-mem2.

Thank you

metagenomics alignment • 711 views
3 months ago

We wrote a pipeline for doing this here: https://github.com/MHH-RCUG/nf_wochenende

Check the installation docs. You can download a reference set covering many bacteria, which should include your species of interest.

When doing metagenomics you are working with mixed samples rather than isolates, so I would always try to include as many genomes in your reference as possible. The flip side is that you get problems with read attribution, mapping quality, and quantification if you include, say, 1000 almost identical E. coli genomes in your reference. Try out our carefully constructed reference and see how you get on.
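To illustrate why a multi-genome reference helps with read attribution, here is a minimal Python sketch (not part of the pipeline above) that filters SAM records after mapping against a combined reference. It assumes that reads matching several genomes equally well receive a low mapping quality from the aligner, as bwa-mem does, so a MAPQ threshold drops ambiguous reads. The contig names `Gvag_14018` and `Lcrispatus_1` and the threshold of 30 are hypothetical choices for illustration only.

```python
def parse_sam_line(line):
    """Extract the fields needed for filtering from one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    return {
        "qname": fields[0],      # read name
        "flag": int(fields[1]),  # bitwise SAM flag
        "rname": fields[2],      # reference contig the read mapped to
        "mapq": int(fields[4]),  # mapping quality
    }


def keep_read(record, target_contigs, min_mapq=30):
    """Keep primary alignments to a target contig with sufficient MAPQ."""
    flag = record["flag"]
    if flag & 0x4:                     # read is unmapped
        return False
    if flag & 0x100 or flag & 0x800:   # secondary or supplementary alignment
        return False
    return record["rname"] in target_contigs and record["mapq"] >= min_mapq


def filter_sam(lines, target_contigs, min_mapq=30):
    """Return names of reads confidently assigned to the target contigs."""
    kept = []
    for line in lines:
        if line.startswith("@"):       # skip SAM header lines
            continue
        record = parse_sam_line(line)
        if keep_read(record, target_contigs, min_mapq):
            kept.append(record["qname"])
    return kept


if __name__ == "__main__":
    # Hypothetical records: contig names and reads are made up for this example.
    example = [
        "@HD\tVN:1.6",
        "read1\t0\tGvag_14018\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read2\t0\tGvag_14018\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
        "read3\t0\tLcrispatus_1\t50\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(filter_sam(example, {"Gvag_14018"}))
```

In practice the same filter is usually applied with existing tools on the BAM file; the sketch only shows the logic of keeping confidently placed reads for one species out of a mixed reference.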

3 months ago
Jason ▴ 10

Suggest BBTools and minimap2.

With BBTools you can also provide multiple references to BBSplit, which is meant more for dealing with a mixed sample but might work for your use case.

With minimap2 you can set high stringency with -x asm5 (most stringent) or -x asm20 (least stringent of these settings):

asm5/asm10/asm20 - asm-to-ref mapping, for ~0.1/1/5% sequence divergence
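As a rough sketch of how those presets could be chosen and passed to minimap2, the Python snippet below builds the command line without running it. The divergence limits are taken from the help text quoted above; the file names `gvag_14018.fasta` and `sample01.fastq` are placeholders, and `-a` (SAM output), `-x` (preset) and `-t` (threads) are standard minimap2 options. Note that minimap2 also ships an `sr` preset intended for short reads, which may be worth comparing for Illumina data.

```python
# Approximate sequence divergence (%) each preset is described as suited to.
ASM_PRESETS = {"asm5": 0.1, "asm10": 1.0, "asm20": 5.0}


def pick_preset(expected_divergence_pct):
    """Return the most stringent preset that still covers the expected divergence."""
    for name, limit in sorted(ASM_PRESETS.items(), key=lambda item: item[1]):
        if expected_divergence_pct <= limit:
            return name
    raise ValueError("divergence exceeds what the asm presets are described for")


def minimap2_command(reference, reads, preset, threads=4):
    """Build the argument list for a minimap2 run that writes SAM to stdout."""
    if preset not in ASM_PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return ["minimap2", "-a", "-x", preset, "-t", str(threads), reference, reads]


if __name__ == "__main__":
    # Placeholder file names; pass the list to subprocess.run to execute it.
    cmd = minimap2_command("gvag_14018.fasta", "sample01.fastq", pick_preset(0.5))
    print(" ".join(cmd))
```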
