Hi,
I am analyzing RNA-seq data from human blood samples. After mapping with STAR, I checked the read distribution using RSeQC's read_distribution.py. Usually, more than 80% of reads map to exons. This time, however, only a few percent mapped to exons, even though the STAR output showed that more than 90% of reads were uniquely mapped. I am wondering whether this result is correct or whether my RSeQC settings were wrong.
The command I used: read_distribution.py -i my.bam -r hg19_Ensembl_gene.bed
The BAM files were output by STAR and sorted with samtools. The BED file was downloaded from https://sourceforge.net/projects/rseqc/files/BED/Human_Homo_sapiens/
The reference genome for mapping was Homo_sapiens.GRCh38.dna.primary_assembly.fa
and the annotation file was Homo_sapiens.GRCh38.104.gtf
One of the outputs from RSeQC is below:
The MultiQC image is below:
Thank you for your help!!
Do you know which library type was used? For example, if this is total RNA (not poly-A enriched), you may simply have many immature transcripts and non-coding transcripts. There also seems to be some genomic DNA contamination ("other_intergenic").
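Separately, it may be worth ruling out an annotation mismatch: the BED file in the command is named hg19, while the reads were mapped to the Ensembl GRCh38 primary assembly. If the BED file uses UCSC-style names such as "chr1" and the BAM uses Ensembl-style names such as "1", few or no reads would overlap the annotated exons. The following is only a rough sketch of how to compare the two sets of chromosome names. The function names and the toy input lines are made up for illustration, and I have not inspected your actual files.

```python
def bed_chroms(bed_lines):
    """Collect chromosome names from the first column of BED lines."""
    chroms = set()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chroms.add(line.split()[0])
    return chroms


def sam_header_chroms(header_lines):
    """Collect reference sequence names (SN:) from @SQ lines of a SAM header."""
    chroms = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                chroms.add(field[3:])
    return chroms


def compare_naming(bam_names, bed_names):
    """Report which chromosome names are shared and which appear on one side only."""
    return {
        "shared": sorted(bam_names & bed_names),
        "bam_only": sorted(bam_names - bed_names),
        "bed_only": sorted(bed_names - bam_names),
    }


if __name__ == "__main__":
    # Toy inputs (assumed, not taken from the real files): an Ensembl-style
    # header and a UCSC-style BED file.
    header = ["@HD\tVN:1.4\tSO:coordinate", "@SQ\tSN:1\tLN:248956422", "@SQ\tSN:MT\tLN:16569"]
    bed = ["chr1\t11868\t14409\tgeneA", "chrM\t576\t647\tgeneB"]
    print(compare_naming(sam_header_chroms(header), bed_chroms(bed)))
```

To try this on real data, the header can be dumped with `samtools view -H my.bam > header.txt` and both files read with `open(...)`. An empty "shared" list would point to a naming mismatch. Even if the names do match, hg19 and GRCh38 coordinates differ, so a BED file built from the same GRCh38 annotation used for mapping would be the safer input.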