vg paired-end short-read mapping: best practice?
4.1 years ago

I'm trying to map short reads to a genome graph constructed from multiple whole-genome alignments of A. thaliana. What is the best-practice method for producing the alignments in a reasonable time frame? So far I've been mapping just a subset of 5000 read pairs to a single-chromosome graph. vg map takes >24h to complete and uses enormous resources (up to 500G of RAM on a single core). Mapping a full sequencing run to the full genome graph took over 7 days to complete. Any help would be appreciated. Thanks!

mapping short read genome graph vg • 1.8k views

Hello, could I have your feedback on this question, please? What worked best for you?


Speed is still an issue with vg map. If your graph is complicated, mapping read pairs with vg map can be intractable. In that case the reads can often still be mapped as single-ended reads, although at the expense of mapping rate. The alignment algorithm can also be swapped out for a faster (but less accurate) one with --xdrop-alignment.
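As a rough sketch of that fallback, single-ended mapping with the faster aligner might look like the following. The file names are hypothetical, and the xg and GCSA indexes are assumed to have been built already; check `vg map --help` for the flags your installed version supports.

```shell
# Hypothetical file names; substitute your own indexes and reads.
xg=graph.xg
gcsa=graph.gcsa
reads=reads_1.fq.gz   # a single FASTQ, so the reads are mapped single-ended
threads=4

# Assemble the command (assumes paths without spaces).
cmd="vg map -x $xg -g $gcsa -f $reads --xdrop-alignment -t $threads"
echo "$cmd > single_end.gam"

# Run the mapping only when vg is installed and the inputs are present.
if command -v vg >/dev/null 2>&1 && [ -f "$xg" ] && [ -f "$gcsa" ] && [ -f "$reads" ]; then
  $cmd > single_end.gam
fi
```

Passing the second FASTQ with another `-f` switches back to paired-end mapping.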

These days you might have better luck with the vg giraffe mapping tool instead of vg map. vg giraffe tends to be much faster (closer to the speed of bwa mem). However, some of the indexes that vg giraffe needs can be difficult to build for very complicated graphs.
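A paired-end vg giraffe run could be sketched roughly like this. It assumes a recent vg release that takes a GBZ graph plus minimizer and distance indexes, all already built; the file names are hypothetical, and older releases expect a different set of index files, so check `vg giraffe --help` first.

```shell
# Hypothetical file names; the three index files are assumed to exist already.
gbz=graph.gbz      # graph plus haplotype index
min=graph.min      # minimizer index
dist=graph.dist    # distance index
r1=reads_1.fq.gz
r2=reads_2.fq.gz
threads=16

# Assemble the command (assumes paths without spaces).
cmd="vg giraffe -Z $gbz -m $min -d $dist -f $r1 -f $r2 -t $threads"
echo "$cmd > giraffe.gam"

# Run the mapping only when vg is installed and the indexes are present.
if command -v vg >/dev/null 2>&1 && [ -f "$gbz" ] && [ -f "$min" ] && [ -f "$dist" ]; then
  $cmd > giraffe.gam
fi
```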

4.1 years ago
xwwang ▴ 20

In my experience, you can speed up mapping by using multiple threads via the -t option: -t $nthread

where $nthread is the number of threads, e.g. 10

This makes it much faster. It took me one day to map 200 million paired-end reads to a large graph.
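For example, a multithreaded paired-end vg map call might look roughly like this. The file names are hypothetical and the xg and GCSA indexes are assumed to exist already.

```shell
# Hypothetical file names; substitute your own indexes and reads.
nthread=10
xg=graph.xg
gcsa=graph.gcsa
r1=reads_1.fq.gz
r2=reads_2.fq.gz

# Two -f arguments give paired-end mapping; -t sets the thread count.
cmd="vg map -x $xg -g $gcsa -f $r1 -f $r2 -t $nthread"
echo "$cmd > mapped.gam"

# Run the mapping only when vg is installed and the indexes are present.
if command -v vg >/dev/null 2>&1 && [ -f "$xg" ] && [ -f "$gcsa" ]; then
  $cmd > mapped.gam
fi
```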
