Hi,
I am working with a bulk RNA-seq dataset with a very high percentage of intronic counts (40-60% of mapped reads), which I believe come from the pre-mRNA fraction and, to a lesser extent, from intron retention in some transcripts. I am a bit concerned because the exonic counts for some samples are only about 1 million mapped reads. The tools I am using to analyze the data consider only exonic reads when calculating gene expression, so I am wondering how accurate my gene expression estimates would be. Although the sequencing depth was good for my purpose (~15-20 million reads/sample), the low percentage of reads mapped to exonic regions worries me. As I understand it, genes with high expression would still be OK in this case, correct? It is only the genes with low expression levels that are problematic. Any insights would be highly appreciated.
May I ask how the library prep and alignment were performed?
The sample prep method is a novel method currently being developed. I used STAR for alignment.
When you refer to the high % of intronic counts, are you getting the numbers from Log.final.out from STAR? If so, and you're seeing a low % of uniquely mapped reads and a high % of multi-mapping reads, then I would check for the presence of rRNA. Examining the library prep protocol and the samples themselves may provide more insight on this. In terms of accuracy in measuring gene expression, I'm not sure you will be able to determine accuracy even with more reads, considering that RNA-Seq captures expression levels at a specific time point, and these can change depending on various factors (do you have any biological/technical replicates?). I would say genes with more coverage should be okay for downstream analyses, assuming the uniquely mapped reads aren't spread too thin.
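For checking the uniquely mapped vs. multi-mapping percentages mentioned above, here is a minimal Python sketch that reads the "name | value" lines of a STAR Log.final.out. The function names parse_star_log and pct are my own, and the numbers in SAMPLE_LOG are made up purely for illustration (they are not from this dataset); only the field labels follow what STAR writes.

```python
def parse_star_log(text):
    """Return {field name: value string} from the contents of a STAR Log.final.out."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip section headers and blank lines
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def pct(stats, key):
    """Convert a percentage field such as '62.00%' to a float."""
    return float(stats[key].rstrip("%"))


# Illustrative values only (hypothetical, NOT taken from the thread).
SAMPLE_LOG = (
    "                          Number of input reads |\t18000000\n"
    "                                    UNIQUE READS:\n"
    "                        Uniquely mapped reads % |\t62.00%\n"
    "                             MULTI-MAPPING READS:\n"
    "             % of reads mapped to multiple loci |\t30.00%\n"
)

stats = parse_star_log(SAMPLE_LOG)
unique = pct(stats, "Uniquely mapped reads %")
multi = pct(stats, "% of reads mapped to multiple loci")
print(f"uniquely mapped: {unique}%  multi-mapping: {multi}%")
```

On a real run you would pass the file contents instead, e.g. parse_star_log(open("Log.final.out").read()). A low unique percentage together with a high multi-mapping percentage is the pattern that would make me look for rRNA.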
Hi, the QC counts were generated using Qualimap (http://qualimap.bioinfo.cipf.es/). One of the goals of our study is to understand how our novel method compares to a standard RNA-seq sample prep method. The standard method applied to the same samples yields a much lower fraction of intronic reads. Yes, I understand the limitations of RNA-seq. However, I wanted to get some insight into whether sequencing the library more deeply (so that we get more reads mapped to exons) would help in this case.
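As a rough back-of-the-envelope for the deeper-sequencing question, here is a small Python sketch. It assumes the exonic fraction of the library stays constant as depth increases and that the library is complex enough that extra reads are not mostly duplicates; both are assumptions, not something established in this thread. The function names and the 5 million target are hypothetical; the ~1 million exonic out of ~15 million total comes from the numbers quoted above.

```python
def exonic_reads(total_reads, exonic_fraction):
    """Expected exonic reads at a given depth, assuming a constant exonic fraction."""
    if not 0 < exonic_fraction <= 1:
        raise ValueError("exonic_fraction must be in (0, 1]")
    return total_reads * exonic_fraction


def depth_needed(target_exonic_reads, exonic_fraction):
    """Total reads required to reach a target number of exonic reads."""
    if not 0 < exonic_fraction <= 1:
        raise ValueError("exonic_fraction must be in (0, 1]")
    return target_exonic_reads / exonic_fraction


# Worst-case sample from the thread: ~1 M exonic reads out of ~15 M total.
fraction = 1e6 / 15e6

# Doubling the depth to 30 M reads doubles the exonic reads, not the exonic %.
doubled = round(exonic_reads(30e6, fraction))

# Hypothetical target of 5 M exonic reads (pick your own threshold).
needed = round(depth_needed(5e6, fraction))

print(f"exonic fraction: {fraction:.3f}")
print(f"exonic reads at 30 M total: {doubled}")
print(f"total reads needed for 5 M exonic: {needed}")
```

Under these assumptions, deeper sequencing raises the absolute number of exonic reads in proportion to depth, while the exonic percentage itself stays the same.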