Hi all, I am doing a de novo genome assembly of a Chlamydomonas species. I have 100 bp paired-end Illumina reads, and I estimated the coverage using
C = LN/G
C = coverage
G = genome size (I used that of Chlamydomonas reinhardtii)
L = read length (100 bp)
N = number of reads
This gives about 50x coverage. I used fastp for quality trimming of the reads, then ran SOAPdenovo, Velvet, ABySS and SPAdes for de novo assembly, each with k-mer lengths 57, 67, 77 and 87, which gives 16 assemblies in total. But when I run BUSCO to assess the quality of the assemblies using
--lineage_dataset chlorophyta_odb10 -m genome --cpu 16 --augustus_species chlamydomonas
I get a very low BUSCO score for every assembly, for example
C:18.2%[S:2.6%,D:15.6%],F:0.1%,M:81.7%,n:1519
I cannot work out where the problem is. Please help.
Thanks
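For reference, the coverage estimate C = LN/G from the question can be sketched in Python as below. This is only a minimal sketch: the function name, the read count and the genome size of roughly 111 Mb are assumptions for illustration, not values from the question. It also assumes N counts every read, i.e. both mates of each pair.

```python
def coverage_depth(read_length: int, n_reads: int, genome_size: int) -> float:
    """Expected sequencing depth C = L * N / G."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_length * n_reads / genome_size


# Assumed genome size in bp (roughly 111 Mb); substitute the value you trust.
ASSUMED_GENOME_SIZE = 111_000_000

# Hypothetical read count; for paired-end data count both mates.
n_reads = 55_500_000

depth = coverage_depth(100, n_reads, ASSUMED_GENOME_SIZE)
print(f"Estimated coverage: {depth:.1f}x")
```

If the read count is taken before trimming, the effective coverage after fastp will be somewhat lower, so it may be worth recomputing with the post-trimming read count and mean read length.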
Even with the complexity of these species' genomes, that is a surprisingly low BUSCO score... As noted below, an important stat would be your assembly size, to see whether a lot of material is missing, as the BUSCO analysis suggests. Also, depending on how distant your species is from other Chlamydomonas species, you could try aligning your assembly to one of their genomes and seeing how much of it is covered.
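The assembly size check suggested above can be done with a few lines of Python. This is a rough sketch, not a replacement for a dedicated assembly statistics tool; the function names and the toy sequences are made up for illustration.

```python
from typing import Dict, Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = 0
    in_record = False
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if in_record:
                lengths.append(current)
            current = 0
            in_record = True
        elif line and in_record:
            current += len(line)
    if in_record:
        lengths.append(current)
    return lengths


def assembly_stats(lengths: List[int]) -> Dict[str, int]:
    """Contig count, total assembled bases, and N50."""
    total = sum(lengths)
    running = 0
    n50 = 0
    # N50: length of the contig at which the running sum of the
    # longest contigs first reaches half of the total assembly size.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {"contigs": len(lengths), "total_bp": total, "n50": n50}


# Toy example with made-up sequences; for a real assembly, pass an open
# file handle instead, e.g. contig_lengths(open("contigs.fasta")).
toy = [">c1", "ACGTACGT", "ACGT", ">c2", "ACGT"]
print(assembly_stats(contig_lengths(toy)))
```

Comparing `total_bp` with the expected genome size shows quickly whether a large part of the genome is missing from the assembly. A total far below the expected size would be consistent with the high fraction of missing BUSCOs.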