Mapping contigs to a reference genome
5.5 years ago
SpamChop ▴ 10

Hi!

I have ~163,000 contigs (in a single FASTA file) and I want to map them to a reference genome (also a FASTA file). Is there any way to do it? I have tried bowtie2, but could not work out its nuances.

Any help appreciated!

Thanks

alignment genome • 4.7k views

Try minimap2 with one of its assembly-to-reference presets (selected with -x):

asm5/asm10/asm20: asm-to-ref mapping, for ~0.1/1/5% sequence divergence
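For illustration, a minimal sketch of such a run. The file names (ref.fa, contigs.fa, contigs_vs_ref.paf) are placeholders I made up, and asm5 is only a guess at the right preset; pick asm10 or asm20 if you expect more divergence.

```shell
# Hypothetical file names; replace with your own paths.
ref="ref.fa"
contigs="contigs.fa"
preset="asm5"              # asm5/asm10/asm20, depending on expected divergence
out="contigs_vs_ref.paf"

if command -v minimap2 >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$contigs" ]; then
    # minimap2 writes PAF to stdout by default; add -a to get SAM instead.
    minimap2 -x "$preset" "$ref" "$contigs" > "$out"
    # Column 1 of PAF is the query name, so this counts contigs with >= 1 alignment.
    cut -f1 "$out" | sort -u | wc -l
else
    echo "minimap2 or the input files were not found; nothing was run" >&2
fi
```

Comparing the printed count with the total number of contigs gives a quick idea of how many mapped at all.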


Is this a closely related reference (i.e. do you expect high homology)? If so, you could use BLAT or even BLAST+. If the contigs are very large, a program like LASTZ would also be worth trying.

5.5 years ago
h.mon 35k

Bowtie2 is a short-read mapper and, although it can be used to map long sequences, it probably won't be good at it, especially if the query and reference are even moderately divergent.

In addition to the suggestions by SMK and genomax, there is also LAST.

Another option to consider is QUAST, an assembly evaluation tool. QUAST uses minimap2 to align one (or more) query genomes to a reference genome and, in addition to the alignment, it provides a number of metrics comparing the query to the reference.
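A rough sketch of a QUAST run against a reference, in case it helps. The file names and the output directory (quast_out) are placeholders of mine, not anything from the original question.

```shell
# Hypothetical file names; replace with your own paths.
ref="ref.fa"
contigs="contigs.fa"
outdir="quast_out"

if command -v quast.py >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$contigs" ]; then
    # -r gives the reference genome, -o the output directory.
    quast.py "$contigs" -r "$ref" -o "$outdir"
    # The plain-text summary of the metrics is written to report.txt.
    cat "$outdir/report.txt"
else
    echo "quast.py or the input files were not found; nothing was run" >&2
fi
```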

5.5 years ago
Corentin ▴ 610

Hi,

You can use D-GENIES, which aligns two FASTA files and produces a dot plot (http://dgenies.toulouse.inra.fr/).

Alternatively, if you are more interested in ordering your contigs, you could try "show-tiling" from MUMmer.
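As a sketch of how that might look: show-tiling works on the delta file produced by nucmer, so the alignment is run first. The file names and the prefix (contigs_vs_ref) are made-up placeholders.

```shell
# Hypothetical file names and prefix; replace with your own.
ref="ref.fa"
contigs="contigs.fa"
prefix="contigs_vs_ref"

if command -v nucmer >/dev/null 2>&1 && command -v show-tiling >/dev/null 2>&1 \
        && [ -f "$ref" ] && [ -f "$contigs" ]; then
    # nucmer writes its alignments to <prefix>.delta.
    nucmer -p "$prefix" "$ref" "$contigs"
    # show-tiling prints a tiling path of the contigs along the reference to stdout.
    show-tiling "$prefix.delta" > "$prefix.tiling"
else
    echo "MUMmer tools or the input files were not found; nothing was run" >&2
fi
```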

5.5 years ago
evelyn ▴ 230

You can try nucmer from MUMmer:

nucmer -p output_prefix ref.fa example-contigs.fa
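The command above writes output_prefix.delta, which is not very readable on its own. One possible follow-up, sketched here under the assumption that you want a single best placement per contig, is to filter the delta file and convert it to a coordinate table. The output file names are my own placeholders.

```shell
# Assumes the nucmer command above has already produced output_prefix.delta.
prefix="output_prefix"
delta="$prefix.delta"
filtered="$prefix.filtered.delta"
coords="$prefix.coords"

if command -v delta-filter >/dev/null 2>&1 && command -v show-coords >/dev/null 2>&1 \
        && [ -f "$delta" ]; then
    # -q keeps, for each query position, its best hit in the reference.
    delta-filter -q "$delta" > "$filtered"
    # -r sorts by reference, -c adds percent coverage, -l adds sequence lengths.
    show-coords -rcl "$filtered" > "$coords"
else
    echo "MUMmer tools or $delta were not found; nothing was run" >&2
fi
```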