The NovaSeqX™ Plus sequencing system is the latest in next-generation sequencing technology and builds on Illumina’s well-established sequencing workflow for higher throughput and accuracy, enabling larger-scale genomic studies. The NovaSeqX Plus platform can produce up to 16 Tb output or up to 52 billion reads per dual flow cell run due to the ultra-high density of the flow cell.
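The two headline figures are consistent with each other, which a short calculation can show. This is a minimal sketch that assumes each of the 52 billion reads is a cluster sequenced as 2 x 150 bp paired-end; the function name `run_output_bases` and the read length are illustrative assumptions, not figures from the article.

```python
def run_output_bases(reads: int, read_length: int, paired_end: bool = True) -> int:
    """Total bases in a run: reads x read length x (2 if paired-end, else 1)."""
    ends = 2 if paired_end else 1
    return reads * read_length * ends


# Up to 52 billion reads per dual flow cell run (figure quoted in the text).
reads_per_run = 52_000_000_000

# ASSUMPTION: 2 x 150 bp paired-end sequencing.
total_bases = run_output_bases(reads_per_run, read_length=150, paired_end=True)
terabases = total_bases / 1e12

print(f"{terabases:.1f} Tb")  # prints "15.6 Tb", i.e. roughly the quoted 16 Tb
```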
The NovaSeqX series uses XLEAP-SBS™ chemistry, Illumina’s most advanced sequencing by synthesis (SBS) chemistry to date. This chemistry provides a 2× faster cycle and 3× greater accuracy than standard SBS, thanks to its stable reagents and the high-fidelity polymerase used to incorporate the fluorescent nucleotides throughout the SBS workflow. The basic principles of this advanced SBS system are the same as those used by previous Illumina systems and follow the established steps of library preparation, cluster generation, sequencing, and data analysis, albeit with optimised performance metrics and output. Here we will discuss these principles and how they have been enhanced to bring the NovaSeqX series to life.
Library Preparation
The purpose of library preparation is to make the DNA fragments that are to be sequenced compatible with binding to the flow cell. To do this, the extracted DNA is mechanically or enzymatically fragmented to produce fragments short enough to be sequenced by the Illumina instrument. Adaptors are then added to either end of the generated fragments to allow them to bind to the flow cell and be sequenced.
Figure 1. The composition of a library. Image courtesy of Illumina.
The adaptors added to the fragments consist of three elements on either end – a P5 or P7 binding region, a read sequencing primer binding site, and an index sequence.
- The P5 and P7 binding regions are complementary to the flow cell oligos and allow for hybridisation and cluster generation.
- The index regions are short sequences of 6-10 base pairs that allow for multiplexing, i.e. sequencing multiple samples in one run and identifying them afterwards. This example has an index on either end and is therefore a “dual index” library.
- The read 1 primer binding site is where the forward read primer for sequencing binds and the read 2 primer binding site is where the reverse read primer binds. A “single-end” library will contain only the read 1 primer binding site, whereas a “paired-end” library will contain both read 1 and read 2 primer binding sites. This allows for forward and reverse sequencing for higher coverage.
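To illustrate how index sequences let pooled samples be separated after a run, here is a minimal sketch of demultiplexing by exact match on the dual index pair. The sample names, index sequences and the `demultiplex` helper are hypothetical; production demultiplexing software also tolerates index mismatches, which this sketch does not.

```python
from collections import defaultdict

# HYPOTHETICAL sample sheet: (i7 index, i5 index) -> sample name.
SAMPLE_SHEET = {
    ("ATCACGTT", "AGGCTATA"): "sample_A",
    ("CGATGTTT", "GCCTCTAT"): "sample_B",
}


def demultiplex(reads, sample_sheet):
    """Group insert sequences by the exact (i7, i5) index pair read with them."""
    bins = defaultdict(list)
    for i7, i5, insert in reads:
        # Index pairs not listed in the sample sheet are collected separately.
        sample = sample_sheet.get((i7, i5), "undetermined")
        bins[sample].append(insert)
    return dict(bins)


# Each tuple is (i7 index read, i5 index read, insert read); values are made up.
reads = [
    ("ATCACGTT", "AGGCTATA", "ACGTACGT"),
    ("CGATGTTT", "GCCTCTAT", "TTGACCAA"),
    ("ATCACGTT", "GCCTCTAT", "GGGGCCCC"),  # valid indices, but not a listed pair
]

result = demultiplex(reads, SAMPLE_SHEET)
```

Matching on both indices together, rather than on either one alone, is what lets a dual index library flag reads whose index pair does not correspond to any sample.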
Cluster Generation
A cluster is a group of about a thousand clonal copies of the same DNA library bound very closely together on the flow cell. Multiple clones in close proximity are required to produce a signal that will be detected by the optical system of the sequencing instrument. Each cluster begins as a single DNA library and is amplified to produce clonal copies, all of which will be imaged during the sequencing cycle.
- Prepared libraries are denatured as they must be single stranded to hybridise to the flow cell. Once hybridised, a polymerase begins 3’ extension, with the flow cell oligo acting as the primer for extending a complementary strand the length of the library.
- The newly generated double strand is denatured, and the template strand is washed away. This results in a library strand covalently linked to the flow cell.
- The molecule flips over and forms a bridge by hybridising to an adjacent complementary flow cell oligo and 3’ extension begins to produce a double-stranded DNA molecule. This is known as the “bridge amplification” step of cluster generation.
- The double stranded DNA produced is then denatured, and the fragments flip again for another round of amplification.
- The cycle repeats until around a thousand copies are produced from the first fragment, resulting in millions of these clusters across the flow cell.
Figure 2. The steps involved in cluster generation. Image courtesy of Illumina.
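The figure of "around a thousand copies" can be related to the number of amplification rounds with a simple doubling model. This is a rough sketch that assumes every strand is copied once per round unless an efficiency below 1 is given; the function names and the efficiency parameter are illustrative assumptions rather than instrument specifications.

```python
def cycles_to_reach(target_copies: int) -> int:
    """Doubling rounds needed to grow one strand into at least target_copies."""
    copies, cycles = 1, 0
    while copies < target_copies:
        copies *= 2  # ideal case: every strand is copied in each round
        cycles += 1
    return cycles


def copies_after(cycles: int, efficiency: float = 1.0) -> float:
    """Expected strands after `cycles` rounds when a fraction `efficiency`
    of strands is successfully copied in each round."""
    return (1.0 + efficiency) ** cycles


print(cycles_to_reach(1000))  # prints 10, since 2**10 = 1024
print(copies_after(10))       # prints 1024.0
```

With less than perfect efficiency, more rounds are needed to reach the same cluster size, which `copies_after(10, efficiency=0.8)` illustrates.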
Sequencing
All Illumina platforms use sequencing by synthesis, where the complementary fluorescently labelled nucleotide is incorporated into the growing strand and detected to produce a read. Three detection methods exist – 4-channel, 2-channel and 1-channel. The NovaSeqX series uses the 2-channel system.
The 2-channel system uses green and blue fluorophores. Thymine is green, adenine is blue, cytosine is labelled with both blue and green, and guanine is dark, i.e. it is not labelled with either fluorophore and does not emit a signal when incorporated into the growing strand.
Figure 3. The steps involved in sequencing by synthesis. Image courtesy of Illumina.
- The cycle begins with nucleotides and other reagents being washed over the flow cell.
- A fluorescently labelled base complementary to the next base in the template strand is incorporated into the growing strand. Unused bases are washed away.
- The imaging step begins when light excites the fluorophore attached to the incorporated base and the fluorophore emits a signal that is captured by the optical system.
- The fluorescent label is washed away to allow room for the next nucleotide in the sequence to be incorporated.
- The cycle continues with one base added at a time which is detected by the signal emitted from the attached fluorophore.
- A read complementary to the template strand is produced.
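The cycle above, combined with the 2-channel scheme, can be sketched as a lookup from the two channel signals to a base at each cycle. This is a simplified illustration only: it uses the dye-to-base assignment as stated in this article, a fixed hypothetical threshold, and made-up intensities, whereas real base calling relies on per-cluster intensity models. The table is kept separate so it can be changed to match the instrument documentation.

```python
# (green_on, blue_on) -> base, following the assignment given in this article.
TWO_CHANNEL_CODE = {
    (True, False): "T",   # green only
    (False, True): "A",   # blue only
    (True, True): "C",    # both green and blue
    (False, False): "G",  # dark: no signal in either channel
}


def call_base(green: float, blue: float, threshold: float = 0.5) -> str:
    """Call one base from normalised channel intensities.

    ASSUMPTION: a single fixed threshold separates "on" from "off".
    """
    return TWO_CHANNEL_CODE[(green >= threshold, blue >= threshold)]


def call_read(cycles, threshold: float = 0.5) -> str:
    """Call one base per sequencing cycle and join them into a read."""
    return "".join(call_base(g, b, threshold) for g, b in cycles)


# Made-up (green, blue) intensities for one cluster over four cycles.
intensities = [(0.9, 0.1), (0.1, 0.8), (0.95, 0.9), (0.05, 0.02)]
read = call_read(intensities)
print(read)  # prints "TACG"
```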
Two sequencing strategies exist: single-end and paired-end sequencing. Libraries for single-end sequencing contain only one read primer binding site and are sequenced from one end of the library only. Paired-end libraries contain two read primer binding sites and are read from both ends of the library to increase coverage. In both cases the read length can be chosen, e.g. 150 bp, and with paired-end sequencing the read length can be increased to produce an overlap between the forward and reverse reads.
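Whether the forward and reverse reads overlap depends on the read length relative to the length of the sequenced fragment. The sketch below works this out under the assumption that both reads have the same length and that adaptor read-through is ignored; `read_overlap` and the example insert lengths are illustrative.

```python
def read_overlap(insert_length: int, read_length: int) -> int:
    """Number of insert bases covered by both reads of a pair (0 if none)."""
    # A read cannot cover more of the insert than the insert's own length.
    effective = min(read_length, insert_length)
    return max(0, 2 * effective - insert_length)


print(read_overlap(insert_length=250, read_length=150))  # prints 50
print(read_overlap(insert_length=400, read_length=150))  # prints 0 (100-base gap)
print(read_overlap(insert_length=100, read_length=150))  # prints 100 (full overlap)
```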
The NovaSeqX series uses a Dual Index Reverse Complement workflow.
- The library is bound to the flow cell at the P7 adaptor end. Read 1 primer is introduced to the flow cell. It hybridises to its binding site and read 1 is generated and detected as in Figure 3. The read is then washed away.
- Index read 1 is produced when the i7 index sequencing primer is washed into the flow cell. This is detected and then washed away.
- Paired-end turnaround: during this step, the complementary strand is synthesised, and the template strand is washed away. This leaves a strand now bound to the flow cell at the P5 adaptor end.
- The i5 index sequencing primer is introduced to the flow cell and index read 2 is produced and washed away.
- Read 2 primer is then added to produce the second read of the fragment in this paired-end strategy.
Figure 4. The dual index reverse complement sequencing workflow. Image courtesy of Illumina.
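Because index read 2 is collected after the paired-end turnaround, the i5 index is read from the resynthesised strand and so appears as the reverse complement of the sequence that would be read before the turnaround. The helper below sketches that conversion; the example index is illustrative, and whether a sample sheet needs i5 sequences converted depends on the analysis software, which is an assumption to check against its documentation.

```python
# Translation table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Order of reads in the workflow described above.
READ_ORDER = ["Read 1", "Index read 1 (i7)", "Paired-end turnaround",
              "Index read 2 (i5)", "Read 2"]

i5_forward = "AGGCTATA"                      # illustrative i5 index sequence
i5_as_read = reverse_complement(i5_forward)  # orientation after the turnaround
print(i5_as_read)  # prints "TATAGCCT"
```

Applying the function twice returns the original sequence, which is a quick sanity check when converting index lists.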
Many of the advances within the NovaSeqX series that set it apart from previous instruments are related to imaging and base calling algorithms. Illumina’s Real-Time Analysis (RTA) image analysis software has been updated to complement optimisations in flow cell architecture, optics, and chemistry. The imaging hardware has also been updated to include a custom high-speed camera sensor and blue-green optics with an ultra-high numerical aperture which enables the optical system to resolve clusters at high density.
The NovaSeqX Plus system is available at Novogene. Find out more here and contact us today to access the latest in NGS technology.