Hi,
We sequenced a single fish genome using PacBio HiFi (highly accurate long-read sequencing).
I annotated the mitogenomes (3 circular replicons) using MITOS:
1st replicon - 17. kb
Split/duplicated genes: cox1, nad1, nad2, nad5, trnD
2nd replicon - 17. kb
Split/duplicated genes: cox3, cob, nad1, nad3, nad4, nad5, nad6, rrnL
3rd replicon - 17. kb
Split/duplicated genes: cox1, cob, atp6, nad2, nad4, nad4l, nad5, nad6, rrnL
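To compare the three lists side by side, a short Python tally may help. This is only a sketch: the gene names are copied from the MITOS lists above, while the variable names (`split_by_replicon`, `per_replicon`, `replicons_per_gene`, `flagged_everywhere`) are my own and not part of MITOS.

```python
from collections import Counter

# Split/duplicated genes per replicon, copied from the MITOS lists above.
split_by_replicon = {
    "replicon_1": ["cox1", "nad1", "nad2", "nad5", "trnD"],
    "replicon_2": ["cox3", "cob", "nad1", "nad3", "nad4", "nad5", "nad6", "rrnL"],
    "replicon_3": ["cox1", "cob", "atp6", "nad2", "nad4", "nad4l", "nad5", "nad6", "rrnL"],
}

# How many genes are flagged in each replicon.
per_replicon = {name: len(genes) for name, genes in split_by_replicon.items()}

# In how many replicons each gene is flagged.
replicons_per_gene = Counter(
    gene for genes in split_by_replicon.values() for gene in genes
)

# Genes flagged in every replicon.
flagged_everywhere = sorted(
    gene for gene, n in replicons_per_gene.items() if n == len(split_by_replicon)
)

print(per_replicon)
print(flagged_everywhere)
```

A gene that is flagged in every replicon is a reasonable first candidate to inspect by hand, since a systematic annotation problem would be expected to recur across replicons.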
What could be the reason for having more duplicated genes in the 2nd and 3rd replicons (mitogenomes)?
How do I troubleshoot this issue?
Is it because of problems with the assembly program?
Are duplicated genes common in mitogenomes?
Suggestions appreciated.
With HiFi reads, I imagine that your observed duplications are likely real and not assembly artifacts: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5015259/
Artifacts are still possible, though.
Genes like nad and rrnL each had one sequence split into two separate sequences.
The second sequence of each of these genes doesn't code for anything, as confirmed by BLAST.
The splitting of one sequence into two subsequences is due to improper annotation.
Annotating with different pipelines clears up the doubt.
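For anyone wanting to repeat that comparison, here is a minimal Python sketch of the idea, assuming you have already pulled the gene names out of each pipeline's output into plain lists. The function names (`split_genes`, `compare_annotations`) and both example lists are hypothetical and made up for illustration; they are not real MITOS output or any tool's API.

```python
from collections import Counter


def split_genes(gene_names):
    """Return the set of gene names that appear more than once in one annotation."""
    counts = Counter(gene_names)
    return {gene for gene, n in counts.items() if n > 1}


def compare_annotations(pipeline_a, pipeline_b):
    """Group genes by which pipeline reports them as split/duplicated."""
    split_a = split_genes(pipeline_a)
    split_b = split_genes(pipeline_b)
    return {
        "split_in_both": sorted(split_a & split_b),
        "split_only_in_a": sorted(split_a - split_b),
        "split_only_in_b": sorted(split_b - split_a),
    }


# Hypothetical gene lists for one replicon from two annotation pipelines.
pipeline_a = ["cox1", "nad1", "nad1", "nad5", "nad5", "rrnL", "rrnL", "cob"]
pipeline_b = ["cox1", "nad1", "nad5", "rrnL", "cob"]

report = compare_annotations(pipeline_a, pipeline_b)
print(report)
```

Genes that are split in only one pipeline point towards an annotation issue, whereas genes split in both are better candidates for a real duplication or an assembly problem.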