Question

vg paired end short read mapping: Best practice?

2

4.1 years ago

christian.kubica ▴ 20

I'm trying to map short reads to a genome graph constructed from multiple whole-genome alignments of A. thaliana. What is the best-practice method to produce the alignments in a reasonable time frame? I've been mapping just a subset of 5000 pairs to a single-chromosome graph. vg map takes >24h to complete and enormous resources (up to 500G of RAM on a single core). Mapping a full sequencing run to the full genome graph took over 7 days to complete. Any help would be appreciated. Thanks!

mapping short read genome graph vg • 1.8k views

updated 2.7 years ago by Jordan M Eizenga ▴ 650 • written 4.1 years ago by christian.kubica ▴ 20

0

Hello, could I have your feedback on this question, please? What worked best for you?

2.7 years ago by bouchenak.chuxi • 0

1

Speed is still an issue with vg map. If your graph is complicated, mapping read pairs with vg map can be intractable. Sometimes the reads can still be mapped as single-ended reads, although at the expense of mapping rate. The alignment algorithm can also be swapped out for a faster (but less accurate) one with --xdrop-alignment.
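As a rough sketch of those two fallbacks, the commands might look like the ones printed below. The file names (graph.xg, graph.gcsa, reads_1.fq) and the helper function map_single_cmd are placeholders made up for illustration, and the script only prints the commands so they can be checked against vg map --help for your vg version before anything is run.

```shell
#!/bin/sh
# Placeholder index file names (assumptions, not from the thread).
XG=graph.xg
GCSA=graph.gcsa

# Print a single-ended vg map command for one FASTQ file.
# Any extra arguments (e.g. --xdrop-alignment) are appended unchanged.
map_single_cmd() {
    reads=$1
    shift
    cmd="vg map -x $XG -g $GCSA -f $reads"
    if [ $# -gt 0 ]; then
        cmd="$cmd $*"
    fi
    echo "$cmd"
}

# Single-ended mapping of one mate file with the default aligner.
map_single_cmd reads_1.fq

# The same mapping with the faster, less accurate X-drop aligner.
map_single_cmd reads_1.fq --xdrop-alignment
```

To actually run one of the printed commands, paste it into the shell and redirect standard output to a file, since vg map writes its alignments (GAM by default) to stdout.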

These days you might have better luck with the vg giraffe mapping tool instead of vg map. vg giraffe tends to be much faster (closer to the speed of bwa mem). However, some of the indexes in vg giraffe can be difficult to build on very complicated graphs.
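For comparison, a paired-end vg giraffe call could look roughly like the command printed below. This is a sketch under assumptions: the index file names (graph.gbz, graph.min, graph.dist) and the helper giraffe_cmd are placeholders, the indexes are assumed to have been built already, and the exact set of required indexes can differ between vg releases, so check vg giraffe --help first.

```shell
#!/bin/sh
# Placeholder index file names (assumptions, not from the thread).
GBZ=graph.gbz     # graph in GBZ format
MIN=graph.min     # minimizer index
DIST=graph.dist   # distance index

# Print a paired-end vg giraffe command for two FASTQ files.
giraffe_cmd() {
    echo "vg giraffe -Z $GBZ -m $MIN -d $DIST -f $1 -f $2"
}

giraffe_cmd reads_1.fq reads_2.fq
```

As with vg map, the alignments go to stdout, so the printed command would normally be run with its output redirected to a .gam file.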

2.7 years ago by Jordan M Eizenga ▴ 650

score 0 · Answer 1 · 2020-10-18

0

4.1 years ago

xwwang ▴ 20

In my experience, you can speed things up by using multiple threads via the -t option: -t $nthread

where $nthread is the number of threads, e.g. 10.

This makes it much faster. It took me one day to map 200 million paired-end reads to a large graph.
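Put together, a multi-threaded paired-end call might look like the command printed below. The index and read file names and the helper map_paired_cmd are placeholders chosen for illustration; only the -t option and the example value of 10 come from the post.

```shell
#!/bin/sh
# Number of threads; 10 is the example value from the post.
nthread=10

# Print a paired-end vg map command with the thread count passed via -t.
# graph.xg and graph.gcsa are placeholder index names (assumptions).
map_paired_cmd() {
    echo "vg map -x graph.xg -g graph.gcsa -f $1 -f $2 -t $nthread"
}

map_paired_cmd reads_1.fq reads_2.fq
```

A sensible value for nthread depends on how many cores the machine or job allocation actually provides.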

4.1 years ago by xwwang ▴ 20