If we ran 384 multiplexed human samples on a NovaSeq 6000 with S4 chemistry (2x150 reads), what would be the expected coverage for each sample?
Theoretically, if all samples are pooled at exactly the same concentration, you should get PF_clusters/384 clusters per sample. That would equate to PF/384 single-end reads or 2 * (PF/384) paired-end reads. You can calculate the expected coverage by dividing the bases sequenced per sample by the human genome size.
In practice, this all depends on the quality of your libraries, the efficiency of pooling and how well the run works.
10,000,000,000 PF clusters (max spec) / 384 = 26,041,667 clusters per sample
26,041,667 * 2 * 150 = ~7.8125e9 bases
7.8125e9 / 3,000,000,000 = ~2.6x coverage
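The same arithmetic can be written as a small Python function, which makes it easy to try other sample counts or read lengths. This is only a sketch: the function name `expected_coverage` and its parameters are made up for illustration, and the 3,000,000,000 bp genome size is the round figure used above, not an exact reference length.

```python
def expected_coverage(pf_clusters, n_samples, read_length,
                      paired=True, genome_size=3_000_000_000):
    """Idealised per-sample coverage, assuming perfectly even pooling.

    genome_size defaults to the rounded 3 Gb human genome used above.
    """
    # Each sample gets an equal share of the passing-filter clusters.
    clusters_per_sample = pf_clusters / n_samples
    # A paired-end cluster yields two reads, a single-end cluster one.
    reads_per_cluster = 2 if paired else 1
    bases_per_sample = clusters_per_sample * reads_per_cluster * read_length
    return bases_per_sample / genome_size


# 10 billion PF clusters (max spec), 384 samples, 2x150 reads
coverage = expected_coverage(10_000_000_000, 384, 150)
print(f"{coverage:.2f}x")  # prints 2.60x
```

Setting `paired=False` gives the single-end case, which is half the paired-end coverage.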
A NovaSeq S4 flow cell yields 8-10 billion clusters according to the specs. Taking 10,000 million clusters / 384 samples gives roughly 26 million clusters per sample. Using the coverage calculator from Stephen Turner, setting the read length to 150 bp in paired-end mode for the human genome and specifying 26 million read pairs gives roughly 3x coverage. This is of course very optimistic, as it assumes 0% loss of data due to data quality, clustering efficiency or duplicates. In any case, the coverage will be low.
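To get a feel for how optimistic the 0% loss assumption is, here is a rough Python sketch that scales the ideal coverage by a usable fraction of the data. It assumes a per-flow-cell range of 8 to 10 billion PF clusters and a 3 Gb genome; the `usable_fraction` values (1.0, 0.9, 0.8) are purely hypothetical placeholders, not measured loss rates, and the function name is made up for this example.

```python
GENOME_SIZE = 3_000_000_000   # rounded human genome size (assumption)
N_SAMPLES = 384
BASES_PER_CLUSTER = 2 * 150   # paired-end, 150 bp per read


def per_sample_coverage(pf_clusters, usable_fraction=1.0):
    """Coverage per sample after discarding a share of the data.

    usable_fraction is a hypothetical knob standing in for losses from
    low-quality reads, duplicates and similar effects.
    """
    bases = pf_clusters / N_SAMPLES * BASES_PER_CLUSTER * usable_fraction
    return bases / GENOME_SIZE


# Lower and upper end of the assumed flow-cell yield
for pf_clusters in (8_000_000_000, 10_000_000_000):
    for usable_fraction in (1.0, 0.9, 0.8):  # hypothetical values
        cov = per_sample_coverage(pf_clusters, usable_fraction)
        print(f"{pf_clusters / 1e9:.0f}B clusters, "
              f"{usable_fraction:.0%} usable: {cov:.2f}x")
```

Even the best case in this grid stays under 3x per sample, which matches the conclusion that coverage will be low.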
Also, as this question is quite similar to your last one, in future please follow up on the existing question rather than opening a new one when the topic is this similar:
Is it possible to multiplex more than 384 samples on Novaseq 6000 for a very low pass (0.5-10x) WGS?