RNAseq Paired-end fastq - one end has low and other has high duplication

2.9 years ago

PR ▴ 50

Hello All,

I have a 100-bp paired-end mouse RNAseq dataset with about 15-20 million reads per fastq file. According to the FastQC reports, one of the two paired ends, R2, has a very low duplication level (less than 10%) compared to R1, and this is consistent across all my samples.
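In case anyone wants to cross-check the duplication figures outside FastQC, here is a minimal Python sketch. It is only a simplified estimate (the fraction of reads that are exact copies of an earlier read in the same file), not FastQC's own algorithm, so the numbers will not match the report exactly. The function names, the `limit` parameter, and the file names in the usage note are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(path, limit=None):
    """Yield the sequence line of each FASTQ record (plain or gzipped).

    Assumes standard 4-line FASTQ records; `limit` caps the number of
    reads examined (None means read the whole file).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        seqs = islice(handle, 1, None, 4)
        for seq in islice(seqs, limit):
            yield seq.strip()


def duplication_rate(sequences):
    """Return the fraction of reads that duplicate an earlier read."""
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every distinct sequence counts once as "original"; the rest are copies.
    return 1.0 - len(counts) / total
```

For example, `duplication_rate(read_sequences("sample_R1.fastq.gz", limit=100000))` and the same call on `sample_R2.fastq.gz` would give two figures that can be compared directly between the mates.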

From my own experience and from other discussions in this community, I understand that duplication levels of 50-60% are normal and expected in RNAseq.

So, what could be wrong with my R2? Is the problem serious enough that the samples need to be re-sequenced, or can the data be salvaged or corrected so that I can still use it for differential expression analysis?

In case it helps, I went ahead and ran the alignment with STAR, and only about 50% of the reads aligned.

I would appreciate any insights anybody has into this issue.

Many thanks!

RNAseq duplication fastq • 435 views
