Good statistics for RNA-seq alignments using RSEM


2.4 years ago

Thanh • 0

Hi, after calculating expression from my raw read files with rsem-calculate-expression, I retrieved these statistics from the cnt file in the stat folder:

17572420 28769454 0 46341874

27465439 1304015 11918214

60305896 3

I'm quite concerned that the number of unalignable reads (17,572,420) is nearly 2/3 of the number of alignable reads (28,769,454). However, my reference transcriptome is the one provided on the RSEM website, which only includes RefSeq transcripts with the NM prefix.

Does this mean that the unalignable reads may belong to noncoding sequences, miRNA, etc. instead of mature mRNAs? And are these alignment statistics good enough to proceed to differential expression analysis?
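For reference, the three lines quoted above can be turned into rates with a short script. This is only a sketch: the field meanings (line 1 = unalignable, alignable, filtered, total; line 2 = unique, multi-gene, uncertain; line 3 = total alignments, read type) follow my reading of RSEM's cnt file description, and the function names `parse_rsem_cnt` and `summarize` are made up for illustration.

```python
def parse_rsem_cnt(text):
    """Parse the first three non-empty lines of an RSEM .cnt file.

    Field names are assumptions based on RSEM's cnt file description.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    n0, n1, n2, n_tot = map(int, rows[0][:4])
    n_unique, n_multi, n_uncertain = map(int, rows[1][:3])
    n_hits, read_type = map(int, rows[2][:2])
    return {
        "unalignable": n0,
        "alignable": n1,
        "filtered": n2,        # discarded for having too many alignments
        "total": n_tot,
        "unique": n_unique,    # aligned to a single gene
        "multi": n_multi,      # aligned to more than one gene
        "uncertain": n_uncertain,
        "hits": n_hits,        # total number of alignments
        "read_type": read_type,
    }


def summarize(stats):
    """Derive simple rates from the parsed counts."""
    return {
        "aligned_fraction": stats["alignable"] / stats["total"],
        "unaligned_fraction": stats["unalignable"] / stats["total"],
        "unique_fraction_of_aligned": stats["unique"] / stats["alignable"],
        "hits_per_aligned_read": stats["hits"] / stats["alignable"],
    }


# The numbers posted in this thread.
posted = """17572420 28769454 0 46341874
27465439 1304015 11918214
60305896 3
"""

stats = parse_rsem_cnt(posted)
for name, value in summarize(stats).items():
    print(f"{name}: {value:.3f}")
```

With the numbers posted here, this gives an alignment rate of roughly 62%, and about 95% of the aligned reads are assigned to a single gene.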


Is this standard RNA-seq? Did you do poly-A enrichment or ribosomal depletion?

Some of the reasons why you might get low mapping rates:

There is a high proportion of adapter-only reads
The inserts are very short due to RNA degradation
There is a high level of ribosomal RNA contamination
The base quality scores are very low
There was a mixup with the reference genome
There was contamination with genomic DNA

So you should look further into the QC to eliminate the above possibilities.
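As a rough first pass at the adapter, short-insert, and base-quality checks listed above, something like the following could be run on a sample of the FASTQ file. It is a minimal sketch under several assumptions: the adapter prefix `AGATCGGAAGAGC` (the common Illumina adapter start) may not match your library kit, qualities are assumed to be Phred+33 encoded, and `fastq_qc` and the `min_insert` cutoff are hypothetical names and values chosen for illustration. A dedicated QC tool will give a much fuller picture.

```python
def fastq_qc(lines, adapter="AGATCGGAAGAGC", min_insert=20):
    """Scan FASTQ lines and return simple QC fractions.

    Assumes 4-line FASTQ records and Phred+33 quality encoding.
    """
    n_reads = n_adapter = n_short = 0
    qual_sum = n_bases = 0
    # Group the input into consecutive 4-line records.
    for _header, seq, _plus, qual in zip(*[iter(lines)] * 4):
        seq, qual = seq.strip(), qual.strip()
        n_reads += 1
        pos = seq.find(adapter)
        # If the adapter is present, the insert ends where it starts.
        insert_len = pos if pos >= 0 else len(seq)
        if pos >= 0:
            n_adapter += 1
        if insert_len < min_insert:
            n_short += 1
        qual_sum += sum(ord(c) - 33 for c in qual)
        n_bases += len(qual)
    return {
        "reads": n_reads,
        "adapter_fraction": n_adapter / n_reads if n_reads else 0.0,
        "short_insert_fraction": n_short / n_reads if n_reads else 0.0,
        "mean_quality": qual_sum / n_bases if n_bases else 0.0,
    }


# Tiny in-memory example: one clean read and one read that runs into adapter.
example = [
    "@read1", "ACGT" * 10, "+", "I" * 40,
    "@read2", "ACGTA" + "AGATCGGAAGAGC" + "TTTT", "+", "5" * 22,
]
report = fastq_qc(example)
print(report)
```

To use it on real data, pass an open file handle instead of the list, for example `fastq_qc(open("sample.fastq"))`, where the file name is a placeholder. High adapter or short-insert fractions point to the first two causes in the list; a low mean quality points to the fourth.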

2.4 years ago by mark.ziemann ★ 1.9k
