Hi!
I have ~163000 contigs (a single FASTA file) and I want to map them to a reference genome (also a FASTA file). Is there any way to do it? I have tried bowtie2, but could not work out its nuances.
Any help appreciated!
Thanks
Bowtie2 is a short-read mapper and, although it can be used to map long sequences, it probably won't be good at it, especially if the query and reference are even moderately divergent.
In addition to the suggestions by SMK and genomax, there is also LAST.
Another option to consider is QUAST, an assembly evaluation tool. QUAST uses minimap2 to align one (or more) query genomes to a reference genome and, in addition to the alignment, it provides a number of metrics comparing the query to the reference.
Hi,
You can use D-GENIES, which aligns two FASTA files and produces a dot plot (http://dgenies.toulouse.inra.fr/)
Alternatives are:
If you are more interested in ordering your contigs, then you could try "show-tiling" from MUMmer
You can try nucmer from MUMmer:
nucmer -p output_prefix ref.fa example-contigs.fa
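To make the nucmer output easier to inspect, a follow-up step could look roughly like the sketch below. It assumes MUMmer is installed and on your PATH. The function name align_with_nucmer is made up for illustration; show-coords is the MUMmer utility that turns the .delta file into a readable table.

```shell
# Hypothetical helper: align contigs to a reference with nucmer,
# then convert the resulting .delta file to a coordinate table.
align_with_nucmer() {
    # $1 = reference FASTA, $2 = contigs FASTA, $3 = output prefix
    nucmer -p "$3" "$1" "$2" &&
    # -r sorts by reference, -c adds percent coverage, -l adds sequence lengths
    show-coords -rcl "$3.delta" > "$3.coords"
}

# Usage, with the file names from the command above:
# align_with_nucmer ref.fa example-contigs.fa output_prefix
```

The .coords table lists each aligned block with its position on the reference and on the contig, which is usually enough to see where a contig lands.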
Try minimap2 with a preset (the -x option).
Is this a related reference (i.e. you expect high homology)? If so, you could use blat or even blast+. If these are very large contigs, a program like LASTZ would also be valuable.
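For the minimap2 route, a minimal sketch might look like the following. The preset value is not given in this thread, so asm5 here is my assumption: it is one of minimap2's assembly-to-reference presets, intended for closely related sequences, and asm10 or asm20 may suit more divergent ones. The function and file names are made up for illustration.

```shell
# Hypothetical helper: map contigs to a reference and write PAF output.
# Assumption: asm5 is used as the preset; pick the one that matches
# how divergent your contigs are from the reference.
map_contigs_minimap2() {
    # $1 = reference FASTA, $2 = contigs FASTA, $3 = output PAF file
    minimap2 -x asm5 "$1" "$2" > "$3"
}

# Count how many distinct contigs have at least one alignment.
# PAF is tab-separated and column 1 holds the query (contig) name.
count_mapped_contigs() {
    awk -F'\t' '!seen[$1]++ { n++ } END { print n + 0 }' "$1"
}

# Usage (file names are placeholders):
# map_contigs_minimap2 ref.fa contigs.fa contigs_vs_ref.paf
# count_mapped_contigs contigs_vs_ref.paf
```

Comparing that count with the ~163000 input contigs gives a quick idea of how much of the assembly maps at all.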