Question

Organelle genome assembly did not circularize and I am not sure why

24 months ago

Арсений • 0

Hello and thank you for visiting this question!

Basically, I have two sets of reads from two different sunflower lines and am working on producing complete chloroplast genomes from them. Both sets contain about 20M reads, went through similar preprocessing (polyG-tail trimming and adapter trimming), and showed similar FastQC statistics at every preprocessing step.

After preprocessing, the paired reads were fed into NOVOPlasty, a tool designed specifically for de novo assembly of organelle genomes. Identical configs were used for both sets of reads.

Here's where the issue emerged. One genome was successfully assembled and circularized, with a length matching the reference sequence. The other was assembled into a single contig that did not circularize and did not reach the expected length (144.5 kbp vs. 151 kbp).

My question is: why might that be, and what are possible solutions? This is my first time assembling a genome, and I'm scratching my head over why two nearly identical read sets would produce different outputs.

Assembly • 1.2k views

updated 24 months ago by shelkmike ★ 1.7k • written 24 months ago by Арсений • 0

score 0 · Answer 1 · 2023-08-10

There are many possible causes of this problem. For example, the fraction of the plastid DNA (with respect to the nuclear and mitochondrial DNA) could have been lower in the second sample, leading to a lower plastid genome coverage. Have you checked the average coverage of plastid genomes of these samples?
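One way to check the coverage, as a rough sketch: map each read set to the reference plastid genome, export per-position depth (for example with `samtools depth -a`, which writes tab-separated reference name, position, and depth), and summarize it. The function name `summarize_depth` and the `low=10` cutoff below are my own illustrative choices, not part of any tool.

```python
from typing import Iterable, Tuple


def summarize_depth(lines: Iterable[str], low: int = 10) -> Tuple[float, float]:
    """Summarize per-position depth records.

    Each record is expected to be tab-separated: reference name, position, depth
    (the layout written by `samtools depth`).
    Returns (mean depth, fraction of positions with depth below `low`).
    The `low` cutoff is an arbitrary illustrative value.
    """
    total = 0
    n_positions = 0
    n_low = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        depth = int(line.split("\t")[2])
        total += depth
        n_positions += 1
        if depth < low:
            n_low += 1
    if n_positions == 0:
        raise ValueError("no depth records found")
    return total / n_positions, n_low / n_positions


# Tiny in-memory demo; in practice, pass an open file handle instead.
demo = ["cp_ref\t1\t10", "cp_ref\t2\t30"]
mean_depth, frac_low = summarize_depth(demo, low=20)
print(f"mean depth = {mean_depth:.1f}, fraction below cutoff = {frac_low:.2f}")
```

If the sample that failed to circularize shows a clearly lower mean depth, or a stretch of low-depth positions, that would support the coverage explanation.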

You can try the following:
1) Use GetOrganelle instead of NOVOPlasty. GetOrganelle often assembles plastid genomes better.
2) Try a general-purpose genome assembler such as ABySS or SPAdes, then find the contigs that correspond to the plastid genome. There is a chance that a general-purpose assembler will be able to assemble the region that NOVOPlasty couldn't.
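
For option 2, one possible heuristic for shortlisting plastid contigs (a sketch, not the only way): in total-DNA sequencing, plastid contigs usually have much higher coverage than nuclear ones, and SPAdes writes length and k-mer coverage into contig names of the form `NODE_<n>_length_<len>_cov_<cov>`. The function name and the `min_cov` / `min_len` thresholds below are hypothetical and would need tuning for your data. Candidates should still be confirmed by aligning them to the reference plastid genome.

```python
import re
from typing import Iterable, List, Tuple

# SPAdes-style contig name: NODE_<index>_length_<bp>_cov_<k-mer coverage>
HEADER_RE = re.compile(r"^>?NODE_(\d+)_length_(\d+)_cov_([0-9.]+)")


def high_coverage_contigs(
    headers: Iterable[str],
    min_cov: float = 100.0,  # hypothetical threshold, tune per dataset
    min_len: int = 1000,     # hypothetical threshold, tune per dataset
) -> List[Tuple[str, int, float]]:
    """Return (name, length, coverage) for contigs passing both thresholds,
    sorted by coverage, highest first. Headers that do not match the
    SPAdes naming pattern are skipped."""
    hits = []
    for header in headers:
        match = HEADER_RE.match(header.strip())
        if match is None:
            continue
        length = int(match.group(2))
        cov = float(match.group(3))
        if cov >= min_cov and length >= min_len:
            hits.append((f"NODE_{match.group(1)}", length, cov))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Made-up headers for illustration only.
example_headers = [
    ">NODE_1_length_85000_cov_812.4",
    ">NODE_2_length_500000_cov_12.3",
    ">NODE_3_length_400_cov_900.0",
    ">NODE_4_length_26000_cov_1600.5",
]
for name, length, cov in high_coverage_contigs(example_headers):
    print(name, length, cov)
```

Mitochondrial contigs can also show elevated coverage, so a coverage filter alone will not separate them from plastid contigs.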