Hello, I am learning about expression quantification, which according to my slides means "finding the amount of sequenced reads assigned to a specific gene/transcript". But how is it possible that more than one read is assigned to a specific gene? How are specific genes expressed more than once in a reading cycle? Best wishes, Mairena
Chalk it up to technology. Only in rare instances do we prepare sequencing libraries without any amplification. In most instances library preparation includes a shearing step and an amplification step; the amplification is needed to increase the amount of material so that it becomes measurable/detectable. One ends up with overlapping library fragments in the process. This explains multiple reads aligning to a specific gene (in addition to the "multi-mapping" that can result from sequence similarity between domains etc.). If one is worried about finding the original number of molecules one started with, there are ways to add unique molecular identifiers (UMIs) that allow one to keep track of molecules as they go through amplification.
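To make the difference between counting reads and counting molecules concrete, here is a minimal Python sketch. The read records, gene names, and the helpers `count_reads` and `count_molecules` are all made up for illustration, and the sketch assumes that reads sharing the same gene, alignment start, and UMI are amplification copies of one original molecule. Real deduplication tools also tolerate sequencing errors in the UMI, which this sketch does not.

```python
from collections import Counter

# Hypothetical aligned reads as (gene, alignment start, UMI); values are invented.
reads = [
    ("GeneA", 100, "ACGT"),
    ("GeneA", 100, "ACGT"),  # amplification duplicate of the read above
    ("GeneA", 100, "TTAG"),  # same position, different UMI: a distinct molecule
    ("GeneA", 250, "ACGT"),  # a different fragment of GeneA
    ("GeneB", 40, "GGCA"),
    ("GeneB", 40, "GGCA"),   # amplification duplicate
]


def count_reads(reads):
    """Raw quantification: every read assigned to a gene counts once."""
    return Counter(gene for gene, _, _ in reads)


def count_molecules(reads):
    """UMI-aware quantification: collapse reads that share gene, start, and UMI."""
    unique_fragments = set(reads)  # assumption: identical triples are duplicates
    return Counter(gene for gene, _, _ in unique_fragments)


raw = count_reads(reads)
dedup = count_molecules(reads)

for gene in sorted(raw):
    print(gene, "reads:", raw[gene], "molecules:", dedup[gene])
```

Even after collapsing duplicates, GeneA keeps more than one count here, because several distinct fragments map to the same gene.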