Question

Mapping contigs to a reference genome

1

5.6 years ago

SpamChop ▴ 10

Hi!

I have ~163,000 contigs (in a single FASTA file) and I want to map them to a reference genome (also a FASTA file). Is there a way to do this? I tried bowtie2 but could not work out its nuances.

Any help appreciated!

Thanks

alignment genome • 4.8k views

updated 5.6 years ago by evelyn ▴ 230 • written 5.6 years ago by SpamChop ▴ 10

3

Try minimap2 with preset -x:

asm5/asm10/asm20: asm-to-ref mapping, for ~0.1/1/5% sequence divergence
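As a rough illustration of the suggestion above, here is a minimal sketch that runs minimap2 with the asm5 preset and summarises the resulting PAF file. The file names and the `paf_summary` helper are hypothetical, chosen only for this example; the helper relies on the standard PAF column order (query name, query length, query start, query end, strand, target name, target length, target start, target end, matching bases, alignment block length, mapping quality).

```shell
# Hypothetical file names; substitute your own.
ref="ref.fa"
contigs="contigs.fa"

# Hypothetical helper: for each PAF record on stdin, print the contig name,
# the reference sequence name, the fraction of the contig covered by the
# alignment, and the identity (matching bases / alignment block length).
paf_summary() {
    awk -F'\t' '{ printf "%s\t%s\t%.3f\t%.3f\n", $1, $6, ($4 - $3) / $2, $10 / $11 }'
}

# Run the alignment only if minimap2 and both input files are available.
if command -v minimap2 >/dev/null 2>&1 && [ -s "$ref" ] && [ -s "$contigs" ]; then
    # asm5 suits ~0.1% divergence; use asm10 or asm20 for ~1% or ~5%.
    minimap2 -x asm5 "$ref" "$contigs" > contigs_vs_ref.paf
    paf_summary < contigs_vs_ref.paf > contigs_vs_ref.summary.tsv
fi
```

A contig can produce several PAF records, so the summary may list the same contig more than once. Adding `-a` to the minimap2 call writes SAM instead of PAF, in which case the helper above no longer applies.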

5.6 years ago by AK ★ 2.2k

2

Is this a closely related reference (i.e. do you expect high homology)? If so, you could use blat or even blast+. If the contigs are very large, a program like LASTZ would also be useful.

5.6 years ago by GenoMax 148k

1

5.6 years ago

Corentin ▴ 610

Hi,

You can use D-Genies, which aligns two FASTA files and produces a dot plot (http://dgenies.toulouse.inra.fr/).

Alternatives are:

If you are more interested in ordering your contigs, you could try "show-tiling" from MUMmer.

0

5.6 years ago

evelyn ▴ 230

You can try nucmer from MUMmer:

nucmer -p output_prefix ref.fa example-contigs.fa
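The delta file that nucmer writes can then be filtered and summarised with other MUMmer utilities. The following is a minimal sketch, assuming MUMmer is installed and on the PATH; the file names are placeholders matching the command above, and the chosen filter and report options are only one reasonable combination.

```shell
# Placeholder names, matching the nucmer command above.
prefix="output_prefix"
ref="ref.fa"
contigs="example-contigs.fa"
delta="${prefix}.delta"   # nucmer writes its alignments to <prefix>.delta

# Run only if nucmer and both input files are available.
if command -v nucmer >/dev/null 2>&1 && [ -s "$ref" ] && [ -s "$contigs" ]; then
    nucmer -p "$prefix" "$ref" "$contigs"
    # Keep one-to-one alignments, allowing for rearrangements.
    delta-filter -1 "$delta" > "${prefix}.filtered.delta"
    # Tabular coordinates sorted by reference (-r), with percent
    # coverage (-c) and sequence lengths (-l).
    show-coords -rcl "${prefix}.filtered.delta" > "${prefix}.coords"
    # Tiling path of the contigs along the reference.
    show-tiling "$delta" > "${prefix}.tiling"
else
    echo "nucmer or input files not found; skipping" >&2
fi
```

The `.coords` file is the easiest place to check which contigs aligned and where; the `.tiling` file is the one to look at if the goal is ordering contigs along the reference.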

score 2 · Accepted Answer · 2019-06-06

Bowtie2 is a short-read mapper and, although it can be used to map long sequences, it probably won't do it well, especially if the query and reference are even moderately divergent.

In addition to the suggestions by SMK and GenoMax, there is also LAST.

Another option to consider is QUAST, an assembly evaluation tool. QUAST uses minimap2 to align one or more query assemblies to a reference genome and, in addition to the alignments, reports a number of metrics comparing each query to the reference.
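
A QUAST run along these lines might look like the sketch below. It assumes QUAST is installed with `quast.py` on the PATH; the input file names and the output directory are placeholders.

```shell
# Placeholder names; substitute your own.
ref="ref.fa"
contigs="contigs.fa"
outdir="quast_out"
report="${outdir}/report.txt"   # plain-text summary written by QUAST

# Run only if quast.py and both input files are available.
if command -v quast.py >/dev/null 2>&1 && [ -s "$ref" ] && [ -s "$contigs" ]; then
    # -r gives the reference genome, -o the output directory.
    quast.py -r "$ref" -o "$outdir" "$contigs"
    cat "$report"
else
    echo "quast.py or input files not found; skipping" >&2
fi
```

With many short contigs it is worth checking the `--min-contig` option, since QUAST ignores contigs below a length threshold by default.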