5 months ago
Umer
Hi,
Background: Fusarium oxysporum genome sequenced.
- Estimated genome size: ~60 Mb
- Coverage: 100x
- Platform: Illumina NovaSeq X (paired-end, 150 bp)
Goal: De novo genome assembly
Assembler: SPAdes genome assembler v4.0.0
Command Used:
(time $spades -o $out_assembly -1 $fastp_out/ILL02_1_qc.fq.gz -2 $fastp_out/ILL02_2_qc.fq.gz --careful --threads 14 --memory 240 -k 21 33 55 77 99 111 127 ) 2>&1 | tee -a $out_assembly/spades_denovo.log
Question: The assembly ran to completion without any errors, but I received the warning below:
=== Error correction and assembling warnings:
* 0:18:54.165 2271M / 7869M WARN General (launcher.cpp : 180) Your data seems to have high uniform coverage depth. It is strongly recommended to use --isolate option.
As I understand it, the --isolate option is preferred for bacterial genomes. What should I do about this warning? Should I use the --isolate option and reassemble the genome, or just ignore the warning?
Thank you.
Use the recommended option or, better yet, downsample your data to something in the 30x-40x range. You can do this using seqtk or reformat.sh from the BBMap suite (check the sampling option in the in-line help). Having too much data can be a bad thing for assemblies.
I have 45 samples to assemble. Right now I am test-running everything on one sample. Should I use the --isolate option for all samples? Every sample is at 100x coverage.
Try the assemblies both ways. Downsampling may produce a better assembly. Then decide the option to use.
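To make the downsampling suggestion concrete, here is a minimal sketch that derives the sampling fraction from the coverage figures in this thread (100x down to roughly 35x) and passes it to seqtk. The output file names and the seed value are hypothetical placeholders, and the script skips the seqtk step if the tool or the input files are missing.

```shell
#!/usr/bin/env bash
# Sketch: compute a subsampling fraction and apply it with seqtk.
# Coverage values come from the thread; output names and seed are placeholders.
current_cov=100   # reported sequencing depth
target_cov=35     # middle of the suggested 30x-40x range

# Fraction of read pairs to keep: target / current (35 / 100 = 0.35)
fraction=$(LC_ALL=C awk -v t="$target_cov" -v c="$current_cov" \
    'BEGIN { printf "%.2f", t / c }')
echo "Sampling fraction: $fraction"

r1="ILL02_1_qc.fq.gz"
r2="ILL02_2_qc.fq.gz"
seed=100  # use the same seed for both files so read pairs stay in sync

# Only run seqtk when it is installed and both input files exist.
if command -v seqtk >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
    seqtk sample -s"$seed" "$r1" "$fraction" | gzip > "ILL02_1_sub.fq.gz"
    seqtk sample -s"$seed" "$r2" "$fraction" | gzip > "ILL02_2_sub.fq.gz"
else
    echo "seqtk or input files not found; skipping subsampling"
fi
```

reformat.sh should be able to do the same job through its sampling-rate setting (I believe the parameter is samplerate, but confirm in the in-line help), and it handles both files of a pair in one call.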
Previously I ran the assemblies with the --careful option.
I am now running them with the --isolate option.
I will share comparative results.
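For anyone setting up the same comparison across many samples, the following is a rough dry-run sketch that prints one SPAdes command per sample and mode. The helper name build_spades_cmd, the sample ID ILL01, and the directory names are hypothetical; the threads, memory, and k-mer list are copied from the original command. As far as I recall from the SPAdes manual, --isolate cannot be combined with --careful, which is why the two modes are kept as separate runs.

```shell
#!/usr/bin/env bash
# Sketch: print SPAdes commands for --careful and --isolate runs per sample.
# build_spades_cmd, the sample IDs, and the directories are placeholders.

build_spades_cmd() {
    local sample="$1" mode="$2" reads_dir="$3" out_dir="$4"
    # echo joins its arguments with single spaces, giving one command line.
    echo "spades.py -o ${out_dir}/${sample}_${mode}" \
         "-1 ${reads_dir}/${sample}_1_qc.fq.gz" \
         "-2 ${reads_dir}/${sample}_2_qc.fq.gz" \
         "--${mode} --threads 14 --memory 240" \
         "-k 21 33 55 77 99 111 127"
}

for sample in ILL01 ILL02; do
    for mode in careful isolate; do
        # Dry run: only prints the command; pipe the output to bash to execute.
        build_spades_cmd "$sample" "$mode" fastp_out assemblies
    done
done
```

Writing each mode to its own output directory keeps the resulting contigs and logs separate, so assembly statistics can be compared afterwards.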
I may be wrong here, but I don't think --isolate is _specific_ to microbes; instead, it's trying to capture whether the genome has come from an isolated organism, or whether the assembler should expect there to be contaminating reads from other species.
If the tool says use the option, I'd just use the option.