I noticed a long time ago that the true read length may not exactly match the read length promised in the analysis name; e.g., 2x100bp Illumina data frequently consists of 101bp reads. Today I looked into a dataset that systematically has 97 or 96 bp reads instead of the promised 2x100bp, across many files. Why does this happen, and should it be interpreted in any way?
Thanks in advance for the comments!
To add to this: some sequencing facilities run n+1 cycles to get you n cycles of data with reliable quality. This may be more critical for the tag reads than the main reads.
Sorry, I tried to find it but failed - what is the difference between tag reads and main reads?
Main reads contain sequence from your own samples. Tag/index reads are the short oligos used to "code" individual samples (during library prep) so multiple samples can be pooled together and run in one lane. Based on the sequence of these tag reads, it is possible to bin reads belonging to individual samples after the run is done. In Illumina technology, tag reads are sequenced independently of the main reads.
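To make the binning step concrete, here is a minimal Python sketch of the idea, not how any particular demultiplexing tool actually does it. The function names, the `max_mismatches` parameter, the index sequences, and the sample names are all made up for illustration. It assumes each read arrives as an (index sequence, main sequence) pair.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, index_to_sample, max_mismatches=0):
    """Bin main reads by sample, based on their tag/index read.

    reads: iterable of (index_seq, main_seq) tuples (hypothetical input format).
    index_to_sample: dict mapping each known index sequence to a sample name.
    Reads whose index matches no sample, or more than one, go to "undetermined".
    """
    bins = defaultdict(list)
    for index_seq, main_seq in reads:
        hits = [
            sample
            for known, sample in index_to_sample.items()
            if len(known) == len(index_seq)
            and hamming(known, index_seq) <= max_mismatches
        ]
        key = hits[0] if len(hits) == 1 else "undetermined"
        bins[key].append(main_seq)
    return dict(bins)


# Toy data: two pooled samples, each "coded" with its own index (made up).
index_to_sample = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
reads = [
    ("ACGTAC", "GATTACA"),
    ("TGCATG", "CCCGGG"),
    ("ACGTAA", "TTTTAAA"),  # one mismatch relative to sample_A's index
    ("GGGGGG", "ACACAC"),   # matches neither index
]
binned = demultiplex(reads, index_to_sample, max_mismatches=1)
for sample, seqs in binned.items():
    print(sample, seqs)
```

Allowing a mismatch or two in the index is why the quality of the tag reads matters: a bad base call in a short index can send a read to the wrong bin or to "undetermined".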