Hi everyone.
I am trying to assemble a SARS-CoV-2 genome from paired-end (forward and reverse) FASTQ reads. I used SPAdes, and the best result I have managed to get is a FASTA file with 38 scaffolds. What should I do to get a single, complete sequence in one FASTA?
It is possible that you simply have far too much data, considering how small the SARS-CoV-2 genome is (about 30 kb). You can normalize or downsample your data and try again. A tool like bbnorm.sh from the BBMap suite can normalize the data. Since there are so many SARS-CoV-2 genomes available, you may simply want to align your reads to a reference instead of doing an assembly.
Use some long-read sequencing and perform a hybrid assembly, or use a reference sequence and do a reference-guided alignment.
You are unlikely to ever achieve a complete genome with short reads no matter what assembler you use.
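To make the downsampling suggestion above concrete, here is a minimal Python sketch that estimates sequencing depth and randomly subsamples read pairs. Note that this is plain random subsampling, not the k-mer depth normalization that bbnorm.sh performs. The function names, file paths, and the 100x target in the usage note are hypothetical and chosen only for illustration; the genome length is that of the reference NC_045512.2.

```python
import gzip
import random

GENOME_SIZE = 29_903  # length of the SARS-CoV-2 reference NC_045512.2, in bases


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped 4-line records)."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            return
        yield record


def estimate_coverage(path, genome_size=GENOME_SIZE):
    """Return total sequenced bases in one FASTQ file divided by the genome size.

    For paired-end data, add the values for the forward and reverse files.
    """
    with open_fastq(path) as fh:
        total_bases = sum(len(rec[1].strip()) for rec in read_records(fh))
    return total_bases / genome_size


def subsample_pairs(r1_in, r2_in, r1_out, r2_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`; return the number kept.

    Both files are read in step so that mates stay paired in the output.
    """
    rng = random.Random(seed)
    kept = 0
    with open_fastq(r1_in) as f1, open_fastq(r2_in) as f2, \
            open(r1_out, "w") as o1, open(r2_out, "w") as o2:
        for rec1, rec2 in zip(read_records(f1), read_records(f2)):
            if rng.random() < fraction:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept
```

For example, if estimate_coverage() on the forward and reverse files adds up to roughly 5000x, calling subsample_pairs() with fraction=100/5000 would leave about 100x for the assembler. Whether 100x is a good target for your data is an assumption worth testing with a few different values.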