I have a genome assembled from PacBio reads. However, I have noticed many frameshift mutations in the assembly, and because of them some genes are not annotated even though they are present in the genome. How can I fix this?
What kind of genome is it, and which assembler did you use? Normally, a PacBio-only assembly should not contain many errors, since assemblers polish the assembly before it is finalized.
At any rate, you can polish the assembly using Racon (https://github.com/isovic/racon), or the built-in polisher of the Flye assembler (https://github.com/fenderglass/Flye).
Yes, both work fine with long reads only.
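For reference, one Racon polishing round could look roughly like the sketch below. It only builds the two command lines (map the reads back to the assembly with minimap2, then run Racon on reads, overlaps and assembly) so you can inspect them before running anything. The file names, the thread count and the helper names `racon_round_commands` and `as_shell` are placeholders I made up; `map-pb` is the minimap2 preset for PacBio CLR reads, so adjust it if your data are different.

```python
import shlex


def racon_round_commands(assembly, reads, out_fasta,
                         overlaps="overlaps.paf", threads=4):
    """Return (command, output_file) pairs for one Racon polishing round.

    All file names are hypothetical placeholders.
    """
    # Step 1: map the long reads back to the draft assembly (PAF on stdout).
    map_cmd = ["minimap2", "-x", "map-pb", "-t", str(threads), assembly, reads]
    # Step 2: Racon takes <sequences> <overlaps> <target sequences>
    # and writes the polished FASTA to stdout.
    polish_cmd = ["racon", "-t", str(threads), reads, overlaps, assembly]
    return [(map_cmd, overlaps), (polish_cmd, out_fasta)]


def as_shell(commands):
    """Render the pairs as shell lines with stdout redirected to a file."""
    return [f"{shlex.join(cmd)} > {shlex.quote(out)}" for cmd, out in commands]


if __name__ == "__main__":
    for line in as_shell(
        racon_round_commands("assembly.fasta", "reads.fastq", "polished.fasta")
    ):
        print(line)
```

To run a second round, feed the polished FASTA back in as the assembly. Whether extra rounds help depends on the data, so compare the annotation after each one.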
Why are you so sure these are in fact errors? Did you look at the read alignments in a genome browser, e.g. IGV? Visual inspection alone can often be misleading - you need a pileup and some sort of summary (loading a BAM with its BAI index into IGV helps with this nicely - variants show up in the coverage track right away).
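As a rough idea of what such a summary could be, here is a minimal sketch that counts how many aligned reads support each insertion or deletion, based only on the SAM `POS` and `CIGAR` fields. The function names and the toy alignments are made up for illustration; with real data you would pull `(pos, cigar)` pairs out of the BAM with whatever reader you prefer.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_events(pos, cigar):
    """Yield (ref_pos, kind, length) for each indel in one alignment.

    pos is the 1-based leftmost reference position (SAM POS).
    An insertion is reported at the reference base that follows it,
    a deletion at the first deleted reference base.
    """
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "I":
            yield (ref, "I", n)      # consumes read only
        elif op == "D":
            yield (ref, "D", n)
            ref += n                 # consumes reference only
        elif op in "MN=X":
            ref += n                 # advances along the reference
        # S, H and P do not advance along the reference


def indel_support(alignments):
    """Count how many alignments carry each (ref_pos, kind, length) event."""
    counts = Counter()
    for pos, cigar in alignments:
        counts.update(indel_events(pos, cigar))
    return counts


if __name__ == "__main__":
    toy = [(100, "10M1D10M"), (95, "15M1D20M"), (100, "5M2I15M")]
    for event, n in sorted(indel_support(toy).items()):
        print(event, n)
```

An indel in the reads relative to the assembly that is carried by most reads at a position points to an assembly error that polishing should fix. If the reads agree with the assembly at the frameshift, the frameshift may be real.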
You can use Quiver (if you have old RS reads) or Arrow (Sequel reads) to polish your assembly. You can also give Pilon a try. The newest version supports long reads.