Dear all,
I am running a de novo assembly of Newcastle disease virus (NDV), whose genome is about 15 kb. What is the optimal number of Illumina paired-end short reads to start the de novo assembly with, for 250 bp reads? And for 150 bp reads?
Many thanks, Esmaeil
I suggest you try assembling with various levels of coverage to see what comes out best. Normally 40x-150x coverage is sufficient for a good assembly, depending on the assembler. I would expect a 15 kbp virus to assemble in a single contig with 60x coverage, though rapidly-mutating viruses seem to be hard to assemble in a single contig even when they are that small. Just check the genome size after assembly to make sure it is reasonable.
For 60x coverage of a 15 kb genome, how many paired short reads should I use with 250 bp reads? And how many with 150 bp reads? Which formula converts the length and number of short reads to coverage, based on genome size?
Thanks
For a rough (and imperfect) estimate, use the formula:
C = R x L / G
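Where C is coverage, R is total number of reads, L is read length, and G is genome size. Rearranged, R = C x G / L. So for a 15 kb genome with 250 bp reads, 60X coverage needs about 60 x 15,000 / 250 = 3,600 reads; with 150 bp reads, about 60 x 15,000 / 150 = 6,000 reads. If R counts individual reads, that is roughly 1,800 and 3,000 read pairs respectively.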
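The same arithmetic can be sketched in a few lines of Python. This is only an illustration of the formula above, not part of any assembler; the function names `reads_needed` and `pairs_needed` are made up for this example, and it assumes R counts individual reads (so one pair contributes two reads) and that every read is full length.

```python
def reads_needed(coverage, genome_size_bp, read_length_bp):
    """Solve C = R x L / G for R, rounding up to a whole read."""
    total_bases = coverage * genome_size_bp
    # Integer ceiling division: smallest R such that R * L >= C * G.
    return -(-total_bases // read_length_bp)


def pairs_needed(coverage, genome_size_bp, read_length_bp):
    """Number of read pairs, assuming each pair contributes two reads."""
    reads = reads_needed(coverage, genome_size_bp, read_length_bp)
    return -(-reads // 2)


if __name__ == "__main__":
    genome_size = 15_000  # ~15 kb NDV genome, as stated in the question
    for read_length in (250, 150):
        r = reads_needed(60, genome_size, read_length)
        p = pairs_needed(60, genome_size, read_length)
        print(f"{read_length} bp reads: {r} reads ({p} pairs) for 60x")
```

Changing the coverage argument (for example to 40 or 150) gives read counts for subsampling at the other coverage levels suggested earlier in the thread.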