Normalization of raw Illumina reads.

I am looking for an optimal method for de novo assembly of fungal isolates and came across this nice Nature paper.

They use a combination of methods, and in two of them (workflows 2 and 3) they mention normalization of the data:

To overcome MDA-generated differences in coverage across the genome, the second workflow normalized raw reads to average 100X before assembling using SPADES

and again:

A third assembly was created using SPADES after combining raw reads from 24 nuclei followed by normalization to 100X

I am struggling to understand what they mean by this. Could anyone help explain to me what they are doing here?

Check the guide for bbnorm.sh, which is the tool they used for read normalization.

ADD REPLY
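For reference, the basic invocation from the BBNorm guide looks roughly like this (reads.fq and normalized.fq are just placeholder file names):

    bbnorm.sh in=reads.fq out=normalized.fq target=100 min=5

Here target=100 sets the desired average k-mer depth, and min=5 discards reads whose apparent depth is so low that they are most likely sequencing errors.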
Ram 44k

In the methods section, they explain how they used bbnorm from BBMap to normalize the reads:

Assembly workflow 2: Each set of reads was normalized using bbnorm of BBMap v. 38.08 with a target average depth of 100×. Normalized data were assembled individually into 24 assemblies using SPADES, and a consensus assembly was generated with Lingon, with the same sequence motifs as for assembly 1.

Normalizing reads before de novo assembly is a common strategy for reducing computational load in cases where the qualitative content of the data (which sequences are present) matters more than the quantitative information (how deeply each one was sampled). In this paper it also serves to even out the very uneven coverage that MDA produces across the genome.
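As a rough sketch, workflow 2 for a single nucleus could look something like the commands below. This is an illustration only, not the paper's exact command line: the file names, thread count, and the min= error-depth cutoff are assumptions on my part.

    # Digital normalization: downsample over-represented regions to ~100x average depth;
    # reads with apparent k-mer depth below min= are treated as probable errors and dropped
    bbnorm.sh in=nucleus01_R1.fq.gz in2=nucleus01_R2.fq.gz \
        out=nucleus01_norm_R1.fq.gz out2=nucleus01_norm_R2.fq.gz \
        target=100 min=5

    # Assemble the normalized reads with SPAdes
    spades.py -1 nucleus01_norm_R1.fq.gz -2 nucleus01_norm_R2.fq.gz \
        -o spades_nucleus01 -t 16

Because normalization throws away excess reads from over-amplified regions, the assembler sees a much flatter coverage profile, which is exactly what you want after MDA.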
