Question

Is it possible to have a mitochondrial genome of size 30-40kbp in insects?

4 months ago

Meeran ▴ 10

Hi all,

I am working on assembling the mitochondrial genome of my insect parasitoid, Microctonus aethiopoides. There is a reference mitochondrial genome of Microctonus brassicae available in NCBI (accession number: OU953852) with a size of 40kbp. Using this as a reference, I assembled the Nanopore long-read data with Flye in meta mode and obtained a contig of 31kbp with coverage >600x, and the assembly is circular according to the Flye info file. (Note: I used this method because none of the mitochondrial genome assembly tools I tried (NOVOPlasty, MitoFinder, MitoZ, mitoflow) could produce a circular contig.)

For additional confirmation, I blasted this contig against the NCBI nucleotide database (nt), and the top hits were mitochondrial genomes, confirming that it is indeed the mitochondrial genome.

Here are my questions:

Is it common for insect mitochondrial genomes to be as large as 30-40kbp? I was initially shocked to see the reference in NCBI with 40kbp, as I understood that mitochondrial genomes typically range from 15-25kbp.

Could the large size be due to many repeat regions? When visualizing the mapped regions in IGV, I noticed many reads with low mapping quality, which might indicate repeat regions. How should I deal with these repeats?

Annotation issues: After I annotated the assembled genome, the annotation contained the mitochondrial genes but also many OH (origin of heavy-strand replication) regions. Is this normal, and how should I address this in my analysis?

Any suggestions or insights would be greatly appreciated!

mitochondria genome assembly • 470 views

score 1 · Answer 1 · 2024-07-18

I've never worked with insects or their parasitoids, but in fungi, mitochondrial genome sizes vary wildly. Mitochondria are organelles, and key genes can sit in the mitochondrial genome, in the main (nuclear) genome, or in both. So the key question for judging whether the assembly is correct is not "Should gene X be here, based on common knowledge?" but rather "Does the assembler indicate that this contig is circular, according to the data? And is the coverage similar across the contig?"

There are papers reporting mitochondrial genomes in the megabase range in some clades. I'm not sure whether that's true, but I wouldn't worry about a mitochondrial genome being 15kb vs 40kb if the assembler shows strong long-read support for circularity. Some genes can move back and forth between the mitochondrial and main genomes. You might check whether the unexpected extra size comes from genes that are absent from your organism's main genome but are found in the main genome (and not the mitochondrial genome) of related organisms with smaller mitochondrial genomes.
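As a rough illustration of those two checks, here is a minimal Python sketch. It assumes the Flye info file is the tab-separated `assembly_info.txt` with the columns seq_name, length, cov. and circ. in that order (check this against your Flye version), and that per-base depths come from something like the third column of `samtools depth -a` output. The function names, the window size and the 0.5x/2x thresholds are arbitrary choices for illustration, not established cut-offs.

```python
import csv
import io
from statistics import median


def parse_flye_info(text):
    """Parse the text of a Flye assembly_info.txt file.

    Assumed layout (tab-separated): seq_name, length, cov., circ., ...
    Header lines start with '#'.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue
        rows.append({
            "name": fields[0],
            "length": int(fields[1]),
            "coverage": float(fields[2]),
            "circular": fields[3] == "Y",
        })
    return rows


def flag_uneven_windows(depths, window=1000, low=0.5, high=2.0):
    """Return (window_start, window_median) for windows whose median depth
    is below low * overall median or above high * overall median."""
    overall = median(depths)
    flagged = []
    for start in range(0, len(depths), window):
        m = median(depths[start:start + window])
        if m < low * overall or m > high * overall:
            flagged.append((start, m))
    return flagged
```

A contig reported as circular with no flagged windows is consistent with a clean assembly. Windows with roughly double the depth could hint at a collapsed repeat, and windows with very low depth at a misjoin, though neither is proof.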

Mito genomes are under pressure to stay small. It seems unlikely to me that a small mito genome would have a lot of repeats, and particularly not super-long repeats that are too long for PacBio/Nanopore reads to span; I've never seen that.
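If you want a quick number for how repetitive the contig actually is, one simple approach is to count the fraction of k-mers that occur more than once. The sketch below is only that, a sketch: the function name and the default k=21 are assumptions made for illustration, it looks at the forward strand only (so it misses inverted repeats), and it treats the sequence as circular by wrapping around the end.

```python
from collections import Counter


def repeated_kmer_fraction(seq, k=21, circular=True):
    """Fraction of k-mer positions whose k-mer occurs more than once.

    Forward strand only. For a circular contig, the first k-1 bases are
    appended so that k-mers spanning the junction are counted too.
    """
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    ext = seq + seq[:k - 1] if circular else seq
    kmers = [ext[i:i + k] for i in range(len(ext) - k + 1)]
    counts = Counter(kmers)
    repeated = sum(1 for km in kmers if counts[km] > 1)
    return repeated / len(kmers)
```

A value near zero would suggest the extra length is mostly unique sequence, while a high value would point to repeats and make the low-mapping-quality reads seen in IGV less surprising.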