Hi all!
I'm trying to get a complete draft genome assembly of a bacterium from NGS Proton data. Here are my first steps: read cleaning with prinseq, alignment against hg19 (with bowtie2) to remove the reads that align, de novo assembly with Mira, and scaffold building with abacas against a closely related reference genome. I got some gaps (NNNN sequences) in my consensus sequence. My next steps would be gap closing, iterative mapping and then genome annotation. I'm struggling with the gaps: do you have any tips for closing the gaps in my consensus sequences? I tried GMcloser, gapfiller and IMAGE, but they are not specifically designed for Proton technology... Any feedback on these tools, or on others I missed?
Thank you very much!
Have you considered the possibility that the "missing" sequence is missing from your data (either never sequenced or removed by the post-processing you did), or is perhaps absent from your particular strain?
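One way to start checking this is to list where the gaps actually sit in the scaffold, so that each one can be compared with the read coverage and with the reference annotation at that position. Below is a minimal sketch in plain Python, assuming the consensus is an ordinary FASTA file in which gaps are written as runs of N. The function names (`read_fasta`, `find_gaps`, `report_gaps`) and the file name in the note afterwards are made up for illustration and do not come from any of the tools mentioned above.

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_gaps(sequence, min_len=1):
    """Return (start, end, length) for each run of N (1-based, inclusive)."""
    gaps = []
    for match in re.finditer(r"[Nn]+", sequence):
        length = match.end() - match.start()
        if length >= min_len:
            gaps.append((match.start() + 1, match.end(), length))
    return gaps


def report_gaps(path, min_len=1):
    """Print one tab-separated line per gap: name, start, end, length."""
    with open(path) as handle:
        for name, seq in read_fasta(handle):
            for start, end, length in find_gaps(seq, min_len):
                print(f"{name}\t{start}\t{end}\t{length}")
```

Calling `report_gaps("scaffold.fasta")` (a hypothetical file name) prints the gap coordinates. If a gap coincides with a region where no reads map at all, not even the reads set aside during the hg19 filtering step, then a gap-closing tool has nothing to work with there, whatever sequencing technology it was designed for.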