I am looking for an optimal method for de novo assembly of fungal isolates and came across this nice Nature paper.
They use a combination of methods, and in two of them (workflows 2 and 3) they mention normalization of the data:
To overcome MDA-generated differences in coverage across the genome, the second workflow normalized raw reads to average 100X before assembling using SPADES
and again:
A third assembly was created using SPADES40 after combining raw reads from 24 nuclei followed by normalization to 100X
I am struggling to understand what they mean by this. Could anyone help explain to me what they are doing here?
Check the guide for bbnorm.sh, which is the tool used for read normalization.
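To give an idea of what "normalization to 100X" means in practice: reads from regions that MDA over-amplified are thrown away until the estimated depth there drops to about the target, while reads from low-coverage regions are all kept. Below is a minimal toy sketch of that idea in Python. It is a simplified, diginorm-style filter based on the median k-mer count of each read, not BBNorm's actual implementation, and the function names, the tiny k, and the toy reads are all made up for illustration.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return all overlapping k-mers of a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize(reads, k=5, target=3):
    """Toy depth normalization (hypothetical helper, not BBNorm itself).

    A read is kept only if the median count of its k-mers, among the
    reads kept so far, is still below the target depth.
    """
    counts = Counter()
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            # Read shorter than k: no depth estimate possible, so drop it.
            continue
        depth = median(counts[km] for km in ks)
        if depth < target:
            kept.append(read)
            counts.update(ks)
    return kept


# Toy data: one region sequenced 50 times, another only twice.
high = ["ACGTACGGTCA"] * 50
low = ["TTGACCATGCA"] * 2

kept = normalize(high + low, k=5, target=3)
print(len(kept), "reads kept out of", len(high + low))
```

In this toy run the over-represented region is cut down to the target depth of 3, while both reads from the low-coverage region survive. With real data you would use a realistic k-mer size and a target of 100, which is what the paper's "normalization to 100X" refers to, and then pass the normalized reads to the assembler.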