Illumina RNA-seq: Overrepresented sequences are mostly in the R2 reads, not R1
6 weeks ago
Umberto ▴ 10

Hi everyone,

I'm testing a universal rRNA depletion protocol followed by Illumina sequencing. I have sequenced my RNA (from an insect species) on an Illumina NextSeq2500, 2x150 bp, about 30 million reads per sample. I don't expect my protocol to completely deplete the rRNA, and when I check the overrepresented sequences in FastQC I see something interesting:

The R1 reads always have about 2-3% overrepresented sequences. The R2 reads have about 7-8%. In both cases these sequences are rRNA, but I don't understand why they would be more abundant in R2 than in R1, since both reads are sequenced from the same fragment (note that fragments are often shorter than the read length in this case, but I don't think that explains it).

Has anybody seen a similar pattern before? Thanks!

illumina rRNA RNAseq RNA • 400 views

why they would be more abundant in the R2 than the R1 since they are sequenced from the same fragment (note that fragments are often smaller than reads size in this case, but I don't think it explains it).

You could try to merge your R1/R2 reads and then scan/trim/QC the merged reads. Having inserts shorter than the read length does not bode well for the quality of the library.

Trying to reason out every FastQC graph is not essential.
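
To get a feel for how many pairs have an insert shorter than the read length, here is a minimal Python sketch. It is a toy check, not what a dedicated read merger does: it ignores base qualities and only handles the case where the insert is no longer than the read. The function names, `min_len`, and `max_mismatch_rate` are made up for illustration. The idea is that when the insert has length L, the first L bases of R1 should equal the reverse complement of the first L bases of R2, and everything after that is adapter.

```python
# Lookup table for complementing DNA bases (uppercase only, an assumption).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def short_insert_length(r1, r2, min_len=20, max_mismatch_rate=0.1):
    """Estimate the insert length for a pair whose insert fits inside the read.

    Tries candidate lengths from the read length downwards and returns the
    first length L where r1[:L] matches the reverse complement of r2[:L]
    within the allowed mismatch rate. Returns None if no candidate matches,
    which is what you expect when the insert is longer than the read.
    """
    longest = min(len(r1), len(r2))
    for length in range(longest, min_len - 1, -1):
        a = r1[:length]
        b = reverse_complement(r2[:length])
        mismatches = sum(x != y for x, y in zip(a, b))
        if mismatches <= max_mismatch_rate * length:
            return length
    return None
```

Running this over a few thousand pairs and looking at the distribution of returned lengths (and the share of `None`) gives a rough idea of how much of each read is adapter rather than insert.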

6 weeks ago
noodle ▴ 590

Sounds like an artifact of whatever QC program you're using. For a better look at these numbers, check an aligned BAM and maybe try Picard's CollectRnaSeqMetrics: https://gatk.broadinstitute.org/hc/en-us/articles/360037057492-CollectRnaSeqMetrics-Picard
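
If you want to check whether the R1/R2 difference survives outside the QC program, a rough Python sketch like the one below counts read prefixes directly. It only approximates FastQC: as far as I understand, FastQC truncates reads longer than 75 bp to their first 50 bp, flags sequences making up more than 0.1% of reads, and only tracks sequences first seen early in the file, whereas this sketch counts every read it is given. The function names, `max_reads`, and the file names in the usage comment are hypothetical, and the FASTQ is assumed to have exactly four lines per record.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(path):
    """Yield the sequence line of each record in a 4-line FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        for line in islice(handle, 1, None, 4):
            yield line.strip()


def overrepresented_fraction(seqs, prefix_len=50, threshold=0.001, max_reads=None):
    """Return (fraction of reads flagged, {prefix: count}) for exact prefixes.

    A prefix is flagged when it accounts for more than `threshold` of the
    reads examined. `max_reads` limits how many reads are counted, which
    keeps memory use reasonable on large files.
    """
    counts = Counter(s[:prefix_len] for s in islice(seqs, max_reads))
    total = sum(counts.values())
    if total == 0:
        return 0.0, {}
    flagged = {p: n for p, n in counts.items() if n / total > threshold}
    return sum(flagged.values()) / total, flagged


# Hypothetical usage, comparing both mates with identical settings:
# frac_r1, _ = overrepresented_fraction(read_sequences("sample_R1.fastq.gz"), max_reads=1_000_000)
# frac_r2, _ = overrepresented_fraction(read_sequences("sample_R2.fastq.gz"), max_reads=1_000_000)
```

If the two fractions come out similar here but differ in the QC report, that points towards the QC program; if R2 is still clearly higher, the difference is in the reads themselves.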
