Hello all,
I have received paired-end reads for 40 samples. The reads are supposed to be 100bp per end. Instead, 20 of my samples are 101bp per end, while the other 20 are 100bp as expected.
Because of this, we assume that these 20 samples were all run in the same lane and that an extra cycle was accidentally added to the Illumina run. However, we also see a strong negative correlation between read length and quality: the samples with 101bp per end lose about 30% of their reads in the Trimmomatic quality-trimming step.
My question is: Has this happened to anyone else? Does it occur often, and if so, does it often affect quality? We are really quite puzzled by this.
Any ideas / clues appreciated.
Thank you! -Thies Gehrmann
Did you check if the first or last letter is the same for every read?
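One way to run that check is sketched below in Python. It assumes plain 4-line-per-record FASTQ files (optionally gzipped). The function names `read_sequences` and `end_base_summary` are made up for this example and do not come from any library. It tallies read lengths and the first and last base of each read, so a constant extra base at either end would stand out.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(path, limit=100000):
    """Yield up to `limit` sequence lines from a 4-line-per-record FASTQ file."""
    # Assumption: files ending in .gz are gzip-compressed, everything else is plain text.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record (0-based index 1, 5, 9, ...).
        for line in islice(handle, 1, limit * 4, 4):
            yield line.strip()


def end_base_summary(sequences):
    """Count read lengths, first bases, and last bases over an iterable of reads."""
    lengths, first, last = Counter(), Counter(), Counter()
    for seq in sequences:
        if not seq:
            continue  # skip empty reads rather than failing on seq[0]
        lengths[len(seq)] += 1
        first[seq[0]] += 1
        last[seq[-1]] += 1
    return lengths, first, last
```

For example, with a hypothetical file name, `end_base_summary(read_sequences("sample_R1.fastq.gz"))` returns three counters. If nearly all 101bp reads share the same last base (for instance `N`), the extra cycle probably carries no real information and could simply be cropped before quality trimming.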