High amount of intronic/intergenic reads in SMARTer stranded total bulk RNAseq
14 months ago
Mat ▴ 80

Hello,

I have bulk RNA-seq data (SMARTer stranded total RNA with ribo-depletion, 100M paired-end 150 bp reads) from 12 human samples, and the QC stats look like this:

  • High mapping rate against the genome (~90% with HISAT2/STAR)
  • Low mapping rate against the transcriptome with Salmon (3-30%; 7 samples <= 10%); alignment-based quantification with the STAR alignments as input did not increase the mapping rate
  • According to Qualimap, most reads map to intronic regions, followed by intergenic regions, e.g.
    • exonic: 8%
    • intronic: 59%
    • intergenic: 33%
    • overlapping exon: 3%
  • After trimming with fastp, around 65-75% of the reads map uniquely to the genome, and 15-25% are multimapping
  • The average input read length and average mapped length are both ~280 bp according to STAR (STAR reports these as the sum of both mates).

This is consistent across all 12 samples.

Are there explanations other than genomic DNA contamination for such a high fraction of intronic/intergenic reads, and what else could I check?

A similar question was already asked before: High percentage of intronic/intergenic reads in RNA-seq

Thank you very much. MM

RNA-seq DNA SMARTer • 741 views

Sounds like genomic DNA contamination to me. Even if you had captured nascent (unspliced) RNA, you should still see much higher coverage over exons than over introns.
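To make the coverage argument concrete, here is a rough sketch that turns read fractions into a read density relative to uniform coverage. The read fractions are the Qualimap values from the question. The genome shares in `length_fraction` are placeholder assumptions for illustration only, not measured values; compute the real ones from your own annotation. With evenly spread gDNA reads, each region's density would sit near 1.0, while RNA-derived reads should push the exonic density far above the others.

```python
def relative_density(read_fraction, length_fraction):
    """Read density per region relative to uniform coverage (1.0 = uniform)."""
    return {
        region: read_fraction[region] / length_fraction[region]
        for region in read_fraction
    }


# Read fractions reported by Qualimap in the question.
read_fraction = {"exonic": 0.08, "intronic": 0.59, "intergenic": 0.33}

# ASSUMPTION: placeholder shares of the genome covered by each region type.
# Replace these with lengths derived from the GTF used for quantification.
length_fraction = {"exonic": 0.03, "intronic": 0.40, "intergenic": 0.57}

density = relative_density(read_fraction, length_fraction)
exon_vs_intron = density["exonic"] / density["intronic"]

for region, value in density.items():
    print(f"{region}: {value:.2f}x uniform coverage")
print(f"exon/intron density ratio: {exon_vs_intron:.2f}")
```

A small exon/intron density ratio would be in line with the gDNA explanation, but the number is only as good as the region lengths you feed in.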

If you want to check further, take a subset of the reads, align them to the genome to get a BAM file, and look at the alignments in a genome browser.
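One way to "take some of the reads" is sketched below: reservoir sampling over a pair of FASTQ streams, so that mates stay together and the files never need to fit in memory. The function names and the default seed are my own choices, not part of any tool. It assumes standard 4-line FASTQ records and that R1 and R2 list the reads in the same order.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-line tuples from any iterable of lines."""
    lines = iter(lines)
    while True:
        record = tuple(islice(lines, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def sample_pairs(r1_lines, r2_lines, n, seed=0):
    """Reservoir-sample up to n read pairs, keeping each pair of mates together."""
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(read_fastq(r1_lines), read_fastq(r2_lines))
    for i, pair in enumerate(pairs):
        if i < n:
            # Fill the reservoir with the first n pairs.
            reservoir.append(pair)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = pair
    return reservoir
```

To use it, open both files in text mode (for gzipped FASTQ, `gzip.open(path, "rt")`), pass the two handles to `sample_pairs`, and write the first and second element of each sampled pair to two new FASTQ files. Those can then be aligned with the same HISAT2 or STAR setup as before, and the sorted, indexed BAM loaded into a browser such as IGV.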
