Hello All,
I have a paired-end, 100-bp mouse RNA-seq dataset with about 15-20 million reads per FASTQ file. According to the FastQC reports, one of the two mates, R2, has a very low duplication level (less than 10%) compared with R1, and this is consistent across all my samples.
In my experience, and from other discussions in this community, duplication levels of 50-60% are normal and expected in RNA-seq.
So, what could be wrong with my R2? Is the problem serious enough that the samples need to be re-sequenced, or can something be done to salvage the data so that I can still run a differential expression analysis?
In case it helps, I went ahead and ran the alignment with STAR, and only about 50% of the reads aligned.
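In case anyone wants to cross-check the duplication numbers outside FastQC, here is a minimal sketch of how it could be done. It is not FastQC's exact algorithm: it simply counts how many reads have a sequence already seen earlier in the file. The function name, the example file names, the 100,000-read cap and the 50-bp truncation are all my own assumptions, loosely modelled on how I understand FastQC subsamples the data, so the percentages will not match the report exactly.

```python
import gzip
from collections import Counter
from itertools import islice


def duplication_rate(fastq_path, max_reads=100_000, trim_to=50):
    """Rough percentage of duplicated reads in a FASTQ file.

    Hypothetical helper, not FastQC's implementation. Only the first
    `max_reads` reads are examined, and each read is truncated to
    `trim_to` bases before comparison (both values are assumptions).
    """
    # Handle both plain and gzip-compressed FASTQ files.
    opener = gzip.open if fastq_path.endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        # In a 4-line-per-record FASTQ, the sequence is the 2nd line
        # of each record: line indices 1, 5, 9, ...
        for seq in islice(handle, 1, max_reads * 4, 4):
            counts[seq.strip()[:trim_to]] += 1

    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Reads beyond the first copy of each distinct sequence are duplicates.
    return 100.0 * (total - len(counts)) / total


# Example usage (file names are placeholders):
# print(duplication_rate("sample_R1.fastq.gz"))
# print(duplication_rate("sample_R2.fastq.gz"))
```

Running this on the R1 and R2 files of the same sample should at least show whether the gap between the two mates is also visible with a plain sequence count.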
I would appreciate any insights anybody has into this issue.
Many thanks!