Complete chloroplast genome assembly
9.5 years ago

Hi. I have about 12 GB of Illumina whole-genome sequencing data. I tried to assemble 50% of the data de novo (SOAPdenovo) with the Geneious software, but I couldn't get large contigs; the largest contig was about 70 kb. Since I want to assemble the complete chloroplast and mitochondrial sequences, this does not seem to be the right approach. Do you have any ideas about what I should change in my assembler (SOAPdenovo) to get larger contigs? Thanks.

next-gen Assembly blast sequence • 4.7k views

Did you taxon-annotate your contigs? Is this 70 kbp contig from your chloroplast? I would say 70 kbp is a good contig for a chloroplast, given that the inverted repeats (IR) are rarely assembled correctly from NGS short reads. By the way, which insert size and read length did you use?

Did you try the suggestions from this thread?


First, thanks for your answer. I did not annotate the contigs. Is that necessary to get the complete chloroplast or mitochondrial genome sequence?

Yes, the 70 kbp contig is from the chloroplast of the plant I sequenced; I confirmed this by BLASTing it against NCBI. But shouldn't the complete chloroplast come to nearly 120-150 kbp?

For this sequencing I used Illumina MiSeq 2x150, mid-length.

9.4 years ago

I developed a de novo assembler specifically for chloroplasts and mitochondria. It can assemble a chloroplast into one contig within one hour from whole-genome Illumina data. I compared the results with MIRA and MITObim; the quality seems higher, and it always assembles the whole chloroplast and only the chloroplast. The output is two FASTA files whose only difference is the orientation of the region between the inverted repeats. If you BLAST both files against a reference, you can select the correct orientation. I can send the link where you can download the tool once it is available online.
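To illustrate the last step (choosing between the two output orientations), here is a minimal Python sketch. It is not the assembler's code and it does not run BLAST: as a stand-in, it counts strand-specific k-mers shared with a reference (trying both strands of the reference) and keeps the candidate that shares more. The function names, k=21 and the toy sequences are all assumptions made for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def stranded_kmers(seq, k):
    """All k-mers of seq exactly as written (no canonicalisation)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_with_reference(candidate, reference, k=21):
    """Number of shared k-mers, taking the better of the two reference strands."""
    cand = stranded_kmers(candidate, k)
    forward = len(cand & stranded_kmers(reference, k))
    reverse = len(cand & stranded_kmers(revcomp(reference), k))
    return max(forward, reverse)


def pick_orientation(option_1, option_2, reference, k=21):
    """Return whichever candidate shares more k-mers with the reference."""
    score_1 = shared_with_reference(option_1, reference, k)
    score_2 = shared_with_reference(option_2, reference, k)
    return option_1 if score_1 >= score_2 else option_2


# Toy data (hypothetical): random strings standing in for the large
# single-copy region, one inverted repeat and the small single-copy region.
rng = random.Random(0)
lsc, ir, ssc = ("".join(rng.choice("ACGT") for _ in range(n)) for n in (400, 80, 120))

option_1 = lsc + ir + ssc + revcomp(ir)           # one orientation
option_2 = lsc + ir + revcomp(ssc) + revcomp(ir)  # region between the repeats flipped
reference = revcomp(option_2)                     # reference given on the other strand

chosen = pick_orientation(option_1, option_2, reference)
print("option_2" if chosen == option_2 else "option_1")
```

Strand-specific k-mers are used on purpose: with canonical (strand-agnostic) k-mers shorter than the repeat, both candidates would score the same.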


Hi,

Sorry, I forgot to post the link: https://github.com/ndierckx/NOVOPlasty

It works with Illumina paired-end reads derived from whole-genome data (no capture DNA). It is best not to trim or filter the reads; use the raw data!

I will upload the tool very soon, but if you mail me through GitHub, I will send it to you the same day.

If the chloroplast doesn't have many repetitive regions, it should assemble into one circular contig (I have assembled over 80 chloroplast genomes, each in one circular contig and within 30 min).

9.5 years ago
Brice Sarver ★ 3.8k

Some chloroplast assemblies can be tricky because of an inverted and duplicated region. I'm not sure if this applies to your system, but I've heard complaints about it from botanists wading into HTS. Different assemblers will produce different results, so you might need to try a few and mess around with the settings a bit. Have you tried mapping your contigs back to a reference of some sort? Perhaps differential coverage is breaking up your assembly into multiple pieces and things are more-or-less okay; perhaps other things are going on.

The Milkweed Genome Project has a pipeline that passes arguments to several programs and might help out. Alternatively, I'd try using a reference-guided assembly if you can use a not-too-distant reference.

I also recommend ARC for breaking down the complex genome assembly problem into a more manageable one focusing on just reads that share similarity with the chloroplast. I've had great success with this for mammalian mitochondrial genomes, and others have used it for plastids.

9.5 years ago
h.mon 35k

You have a mix of nuclear, mitochondrial and chloroplast contigs (and possibly other stuff), and you need to sort them out. You may use blobology or simply BLAST to assign the contigs to the nuclear genome, mitochondrion or plastid. After you do that, follow brice.sarver's suggestions to evaluate and/or improve your assemblies.
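As a rough illustration of the "simply BLAST" route, the Python sketch below assigns each contig to the compartment of its best-scoring hit. It assumes you already have BLAST tabular output (-outfmt 6, where column 12 is the bit score) from a search against reference organelle sequences, plus a lookup table from subject ID to compartment. The contig names, subject IDs and scores are made up for illustration, and a contig without a hit is only possibly nuclear.

```python
def best_hits(blast_tsv):
    """Map each query ID to its (subject ID, bit score) with the highest bit score."""
    best = {}
    for line in blast_tsv.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def assign_compartments(contig_ids, blast_tsv, subject_to_compartment):
    """Label every contig by the compartment of its best hit, if it has one."""
    hits = best_hits(blast_tsv)
    assignment = {}
    for contig in contig_ids:
        if contig in hits:
            subject = hits[contig][0]
            assignment[contig] = subject_to_compartment.get(subject, "unknown")
        else:
            assignment[contig] = "no organelle hit (possibly nuclear)"
    return assignment


# Hypothetical example rows in -outfmt 6 column order:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
rows = [
    ["contig_1", "ref_cp", "99.1", "70000", "630", "0", "1", "70000", "1", "70000", "0.0", "120000"],
    ["contig_1", "ref_mt", "91.0", "800", "72", "0", "1", "800", "1", "800", "1e-100", "900"],
    ["contig_2", "ref_mt", "97.5", "15000", "375", "0", "1", "15000", "1", "15000", "0.0", "25000"],
]
blast_tsv = "\n".join("\t".join(row) for row in rows)
subject_to_compartment = {"ref_cp": "chloroplast", "ref_mt": "mitochondrion"}

assignment = assign_compartments(
    ["contig_1", "contig_2", "contig_3"], blast_tsv, subject_to_compartment
)
for contig, compartment in assignment.items():
    print(contig, compartment)
```

Taking only the single best hit is a simplification; a contig that hits both organelles over long stretches deserves a closer look before it is assigned.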

However, it is highly unlikely you will be able to assemble the whole chloroplast, due to the inverted repeats. You may get a small number of contigs covering most of it, though.
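A toy Python sketch of why the inverted repeats get in the way: once the k-mers an assembler works with are shorter than the repeat, the two possible orientations of the region between the repeats contain exactly the same canonical k-mers, so nothing in those k-mers says which arrangement is right. The region lengths, random sequences and k values are assumptions chosen for illustration; real chloroplast regions are far longer than these toy ones, and the genome is treated as linear here for simplicity.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Set of k-mers, each represented by the smaller of itself and its reverse complement."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, revcomp(kmer)))
    return kmers


# Toy regions (hypothetical sizes): large single-copy, inverted repeat, small single-copy.
rng = random.Random(1)
lsc, ir, ssc = ("".join(rng.choice("ACGT") for _ in range(n)) for n in (300, 60, 80))

isomer_a = lsc + ir + ssc + revcomp(ir)
isomer_b = lsc + ir + revcomp(ssc) + revcomp(ir)  # region between the repeats flipped

short_k = 31            # shorter than the 60 bp toy repeat
long_k = len(ir) + 20   # long enough to span the whole repeat plus both flanks

same_short = canonical_kmers(isomer_a, short_k) == canonical_kmers(isomer_b, short_k)
same_long = canonical_kmers(isomer_a, long_k) == canonical_kmers(isomer_b, long_k)

print("identical k-mer sets at k =", short_k, ":", same_short)
print("identical k-mer sets at k =", long_k, ":", same_long)
```

In this toy model the two arrangements only become distinguishable when a single k-mer reaches from one side of the repeat to the other, which short reads cannot do for a repeat of realistic size.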
