11.6 years ago
justinhaselbach
Hi there. New to bioinformatics!
Results are as follows. Ion Torrent data for a 5 Mbp microbial genome: acquired 3,435,101 high-quality filtered reads with an average read length of 180 bp, assembled using MIRA / CLC Assembler, and got: 350 contigs (> 500 bp cutoff), mean contig size 17 kbp, 110X coverage, contig N50 32 kbp, largest scaffold 89 kbp, 6000 ORFs/CDSs.
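As a quick sanity check on the numbers above, here is a minimal Python sketch that estimates raw coverage from read count, mean read length and genome size, and computes N50 from a list of contig lengths. The function names are made up for illustration, and the 5 Mbp genome size is the approximate figure from the post, not a measured value.

```python
def estimated_coverage(n_reads, mean_read_len, genome_size):
    """Raw sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len / genome_size


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Figures from the post; the genome size is approximate (assumption).
raw_depth = estimated_coverage(3_435_101, 180, 5_000_000)
```

With these inputs the raw estimate comes out at roughly 124X, which is in the same range as the 110X reported by the assembler; some difference is expected, since not every read ends up in the assembly.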
What looks unreasonable / unacceptable / gross? Please feel free to advise and comment! I just wish to check the results and find flaws.
Thanks, Justin
It really depends on what you are trying to achieve with your data.
To publish a draft genome! What would you suggest in that case? Thanks.
If you have time, I would do two more things: 1) Reduce the coverage, as assembly software can have (and often does have) problems with coverage as high as 110X. I would downsample it to, let's say, one half (55X). Maybe you will get better results with fewer reads :)) 2) I would also try another assembler, such as Trinity, and compare the number of contigs (maybe even check how much the assemblies differ). In my experience, MIRA gave me an assembly that was very different from the one produced by Trinity, so it's worth a try. Good luck :-)
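The downsampling step in point 1 could be sketched roughly like this in plain Python. This is only an illustration, not a replacement for a dedicated tool: it assumes single-end reads in an uncompressed FASTQ file with exactly four lines per record (no line-wrapped sequences), and the function name and file names are hypothetical.

```python
import random


def downsample_fastq(in_path, out_path, fraction, seed=42):
    """Keep each FASTQ record with probability `fraction`; return the number kept.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        while True:
            record = [src.readline() for _ in range(4)]
            if not record[0]:  # end of file
                break
            if rng.random() < fraction:
                dst.writelines(record)
                kept += 1
    return kept


# Example call (hypothetical file names): halve ~110X to ~55X.
# downsample_fastq("reads.fastq", "reads.half.fastq", fraction=0.5)
```

Because each read is kept independently, the output contains approximately, not exactly, the requested fraction of reads; for an assembly test at about half the coverage that is usually close enough.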
Wow! Thanks a lot! Trying Trinity with half the coverage sounds worthwhile. Will do so. Thanks again.