Hey everyone,
I'm working on combining contigs from two WGS projects into one scaffold set, as outlined here. I've just gotten CISA, which was recommended in a related post, to work in the MyPro virtual machine described on the CISA page.
Essentially, the total size of the CISA output is half of the genome size I specified. I've run it twice using different sets of contigs. It completed both times, but in each case the final scaffold set totalled half of the size I specified.
I've gotten similar results using the Roche and Velvet assemblers. It looks like the contigs I'm using are too large for these programs.
In each case, I don't have raw reads or quality values, only sequences from NCBI (the assembly statistics reports are also available on NCBI, but I haven't used them).
Which program should be used for mapping sets of large contigs to a reference genome?
If there isn't an appropriate program for this, how can I manually fill the gaps in each sequence with contigs that are unique to the other sequence?
Thanks for your time,
Ronan
bwa-mem can help. Find the contigs that align to unique regions, then use them to fill in the gaps.
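As a rough sketch of one way to start: assuming you have aligned one contig set against the other with something like `bwa index setA.fa` followed by `bwa mem setA.fa setB.fa > aln.sam` (the file names are placeholders), the snippet below lists the contigs from the second set that have no alignment at all, i.e. candidates that are unique to that set. The function name `unmapped_contigs` is made up for illustration, and it only looks at the SAM unmapped flag (0x4); partially aligned contigs would need a closer look at the CIGAR strings.

```python
def unmapped_contigs(sam_lines):
    """Return names of query contigs that have no mapped SAM record.

    sam_lines: any iterable of SAM text lines (e.g. an open file).
    Order of first appearance in the SAM input is preserved.
    """
    has_hit = {}  # contig name -> True if at least one record is mapped
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[0]        # QNAME: the contig that was aligned
        flag = int(fields[1])   # FLAG: bitwise flags
        mapped = not (flag & 0x4)  # bit 0x4 set means "segment unmapped"
        has_hit[name] = has_hit.get(name, False) or mapped
    return [name for name, hit in has_hit.items() if not hit]


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python unmapped_contigs.py aln.sam
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for contig in unmapped_contigs(handle):
                print(contig)
```

The names it prints can then be pulled out of the second FASTA file and checked by hand before you place them into gaps.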