What Exactly Does A Read Represent?
11.2 years ago
rborges ▴ 50

If I understand correctly, Illumina sequencing works by amplifying fragmented molecules, then synthesizing the complementary strand of these amplified molecules and measuring the fluorescence emitted during synthesis. But then, what exactly does a read represent? Is it a consensus of the fluorescence emitted by many of these amplified molecules at once, or does a single read represent just one of these amplified molecules being synthesized and emitting fluorescence? I hope this is clear!

reads sequencing illumina • 2.7k views
11.2 years ago
seidel 11k

The base calls of an Illumina read represent a consensus of the fluorescence of many molecules that were amplified to form a small patch of identical molecules on the glass surface (often called a polony, for polymerase colony). The quality scores that come with Illumina reads represent the confidence in each consensus base call.
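
To make the quality scores concrete, here is a minimal sketch of how a FASTQ quality string maps to per-base error probabilities. It assumes the Phred+33 encoding (the offset is an assumption about your files; older Illumina pipelines used a different one), and the function names and the example string are made up for illustration.

```python
from typing import List


def decode_quality_string(qual: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to Phred scores (assumes Phred+33 by default)."""
    return [ord(ch) - offset for ch in qual]


def phred_to_error_prob(q: int) -> float:
    """Phred definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# Hypothetical quality string for a 4-base read.
example_quals = decode_quality_string("II5#")
example_probs = [phred_to_error_prob(q) for q in example_quals]

for q, p in zip(example_quals, example_probs):
    print(f"Q{q}: estimated error probability {p:.4f}")
```

A high score means the signal from the cluster was clean at that cycle; a low score means the consensus call was ambiguous.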

"or does a single read represent one single of these amplified molecules..." That is what is referred to as single-molecule sequencing, and it's close but not quite ready for prime time yet.

11.2 years ago

Right, I'll just add that the Illumina term is 'cluster'. One DNA molecule sticks to the flow cell, and a PCR-like amplification takes place, making a grove of identical DNA strands around that original one. This boosts the fluorescence signal.

So if you have a non-clonal locus in your sample (for example, a heterozygous site), you will see that some clusters have one allele and other clusters have the other. You will not see a mixed call from a single cluster; when you do see that, it is technical error only.
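
As a rough illustration of that point, the sketch below tallies the base that each read (that is, each cluster) reports at one position. The calls are invented toy data, not real output, and the function name is hypothetical; in practice you would pull these from a pileup of your alignments.

```python
from collections import Counter
from typing import Dict, Iterable


def allele_fractions(base_calls: Iterable[str]) -> Dict[str, float]:
    """Fraction of reads supporting each base at a single position."""
    counts = Counter(base_calls)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Toy data: 100 reads over a two-allele site. Each read carries exactly one base;
# the two 'T' calls stand in for sequencing error.
toy_calls = ["A"] * 48 + ["G"] * 50 + ["T"] * 2
toy_fractions = allele_fractions(toy_calls)

for base, frac in sorted(toy_fractions.items()):
    print(f"{base}: {frac:.2f}")
```

The mixture shows up across reads, never inside a single read.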


So if I have a depth of 100X in a certain region, which means this region is covered by 100 reads, does this mean that 100 different individual molecules from different cells in the sample stuck to the flow cell? Can one make this cell number assumption?


Not necessarily. It could mean that one molecule was amplified by PCR into 100 copies, and those 100 all stuck to the flow cell. Most Illumina protocols include PCR amplification, and unless you are explicitly told otherwise, you should assume that yours does too, so PCR artifacts are always a possibility.

However, if 100 reads cover your region and they all have different alignment coordinates (start and end positions), then yes, they did come from different molecules.
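
The coordinate check can be sketched as follows. This is only a toy version of the idea, with a hypothetical `Read` record and invented coordinates; it assumes the reads are already aligned and that you know each fragment's start, end and strand. Established tools such as Picard MarkDuplicates or `samtools markdup` apply more careful versions of the same logic.

```python
from typing import Iterable, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal record for an aligned fragment."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str


def count_distinct_fragments(reads: Iterable[Read]) -> int:
    """Count unique (chrom, start, end, strand) combinations among the reads."""
    return len({(r.chrom, r.start, r.end, r.strand) for r in reads})


# Toy data: r2 and r3 share coordinates, so they may be PCR copies of one molecule.
toy_reads = [
    Read("r1", "chr1", 1000, 1150, "+"),
    Read("r2", "chr1", 1020, 1170, "+"),
    Read("r3", "chr1", 1020, 1170, "+"),
    Read("r4", "chr1", 1035, 1185, "-"),
]
n_distinct = count_distinct_fragments(toy_reads)
print(f"{len(toy_reads)} reads, at least {n_distinct} distinct molecules")
```

The count is a lower bound: two independent molecules can share coordinates by chance, especially at very high depth, so identical coordinates suggest PCR duplication but do not prove it.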

11.2 years ago
SES 8.6k

There are a lot of good resources on the Illumina website, such as documentation and videos describing their technologies. In particular, I recommend you check out the video describing their sequencing technology in their online courses (link; it's the second one from the top). Seeing how the reactions work through diagrams and video may be more informative. The Illumina website also offers plenty of other free training material, such as tutorials, webinars and various courses.
