Question

Illumina Terminology Clarification

4.1 years ago

joe_genome ▴ 50

Hello community,

I would like to clarify the following terms related to Illumina sequencing and data processing. They are quite confusing in the manuals, so I am hoping someone can explain them in more general terms:

Clusters
Trimmed Bases
Yield (or YieldQ30) per Lane
Tiles
Demultiplexing

Thanks :)

genomics • 1.2k views

updated 4.1 years ago by Mensur Dlakic ★ 29k • written 4.1 years ago by joe_genome ▴ 50

score 4 · Accepted Answer · 2021-03-25

See if this helps:

Clusters form on the flowcell when a library molecule binds to its surface and is then amplified in place by bridge amplification. The clonal copies (hundreds) of that molecule make up one cluster. Each cluster produces one read in the final sequence file (one read pair for paired-end runs), as long as it passes QC during post-processing.
Trimmed bases refers to adapter trimming. With Illumina libraries, a short insert causes the sequencer to read through the insert and into the adapter at the other end of the library molecule. It is these adapter bases that are generally removed, and the trimmed sequence is what remains in the final file. In Illumina's demultiplexing reports, the figure is typically the number of bases removed this way.
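Yield is the total amount of sequence data (reads and bases) that you get per lane. YieldQ30 is the part of that yield made up of bases with a Q score of 30 or more, i.e. an estimated error probability of 1 in 1,000 or lower. It is often also reported as a percentage of all bases.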
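Tiles: see the explanation here - Answer: Significance of Tile in Sequencing? In short, a tile is one of the small areas into which each flowcell lane is divided for imaging.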
Demultiplexing bioinformatically separates a pool of multiple samples into individual files, i.e. it bins the reads according to their index (also called a barcode by some, though I generally prefer the term barcode for in-line indexes). There are two indexing adapters that can be used when preparing libraries. A sample may be indexed with a single index (1D, on the p7 adapter) or with a combination of two (2D, on both the p7 and p5 adapters). Using combinatorial indexes (2D, both p7/p5) allows one to mix a large number of samples in the same pool, as long as the index sequences are separated by a certain edit distance (generally a distance of at least 2, to allow for sequencing errors).
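
If it helps to see the Yield/Q30 and demultiplexing points as code, here is a minimal Python sketch. It is not how Illumina's software works internally. The function names, the "Undetermined" bin label, and the toy indexes are all made up for illustration, and it assumes Phred+33 quality encoding and index sequences of equal length. For dual indexing, the index string could simply be the two indexes joined together.

```python
from collections import defaultdict


def yield_and_q30(quality_strings, phred_offset=33):
    """Return (total bases, bases with Q >= 30) for FASTQ quality strings.

    Assumes Phred+33 encoding (an assumption of this sketch).
    """
    total = 0
    q30 = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - phred_offset >= 30:
                q30 += 1
    return total, q30


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_index_distance(indexes):
    """Smallest pairwise distance within a set of index sequences."""
    seqs = list(indexes)
    return min(
        hamming(a, b) for i, a in enumerate(seqs) for b in seqs[i + 1:]
    )


def demultiplex(reads, sample_indexes, max_mismatches=0):
    """Bin reads by index.

    reads: iterable of (index_read, sequence) tuples.
    sample_indexes: dict mapping sample name -> expected index sequence.
    Reads matching no sample, or more than one, go to "Undetermined"
    (a hypothetical label chosen for this sketch).
    """
    bins = defaultdict(list)
    for index_read, seq in reads:
        hits = [
            sample
            for sample, idx in sample_indexes.items()
            if hamming(index_read, idx) <= max_mismatches
        ]
        sample = hits[0] if len(hits) == 1 else "Undetermined"
        bins[sample].append(seq)
    return dict(bins)


# Toy example: two samples, four reads, one mismatch tolerated.
sample_indexes = {"sampleA": "ACGTAC", "sampleB": "TGCATG"}
reads = [
    ("ACGTAC", "read1"),
    ("TGCATG", "read2"),
    ("ACGTAA", "read3"),  # one mismatch relative to sampleA
    ("GGGGGG", "read4"),  # matches neither sample
]
binned = demultiplex(reads, sample_indexes, max_mismatches=1)
```

The distance requirement is why the index set matters: if two indexes differ at only one or two positions, a read with a single sequencing error can sit equally close to both, and this sketch then sends it to the undetermined bin rather than guessing. As a general rule, tolerating one mismatch without ambiguity needs indexes that differ at three or more positions.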