Can someone help me understand the RSeQC output from infer_experiment.py?
I have RNA-seq data from a library constructed with TruSeq Stranded Total RNA (NEB Microbe) from a pure bacterial culture. Following some suggestions found here on this topic, I mapped a subsample of reads against the reference genome with HISAT2 (in unstranded mode). The alignment summary is below:
100000 reads; of these:
100000 (100.00%) were paired; of these:
5111 (5.11%) aligned concordantly 0 times
88631 (88.63%) aligned concordantly exactly 1 time
6258 (6.26%) aligned concordantly >1 times
----
5111 pairs aligned concordantly 0 times; of these:
1127 (22.05%) aligned discordantly 1 time
----
3984 pairs aligned 0 times concordantly or discordantly; of these:
7968 mates make up the pairs; of these:
5886 (73.87%) aligned 0 times
1899 (23.83%) aligned exactly 1 time
183 (2.30%) aligned >1 times
97.06% overall alignment rate
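As a quick sanity check on the summary, the overall alignment rate can be reproduced from the counts. This is a minimal sketch, assuming the rate is computed per mate (two mates per pair) and that the only unaligned mates are the 5886 reported in the last block; the variable names are purely illustrative.

```python
# Figures copied from the HISAT2 summary above.
total_pairs = 100000
unaligned_mates = 5886  # mates reported as "aligned 0 times" in the last block

# Assumption: the overall rate is counted per mate, not per pair.
total_mates = 2 * total_pairs
aligned_mates = total_mates - unaligned_mates
overall_rate = 100.0 * aligned_mates / total_mates

print(f"{overall_rate:.2f}% overall alignment rate")
```

Under that assumption the result agrees with the 97.06% that HISAT2 reports.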
Here are the statistics for the resulting BAM file:
Total records: 224985
QC failed: 0
Optical/PCR duplicate: 0
Non primary hits 24985
Unmapped reads: 5886
mapq < mapq_cut (non-unique): 12699
mapq >= mapq_cut (unique): 181415
Read-1: 90671
Read-2: 90744
Reads map to '+': 90693
Reads map to '-': 90722
Non-splice reads: 180391
Splice reads: 1024
Reads mapped in proper pairs: 177262
Proper-paired reads map to different chrom:0
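The BAM statistics can be checked for internal consistency with a little arithmetic. The sketch below only re-adds the numbers listed above; the grouping into categories is my reading of the report, not something taken from the tool's documentation.

```python
# Counts copied from the BAM statistics above.
total_records = 224985
non_primary = 24985
unmapped = 5886
non_unique = 12699   # mapq < mapq_cut
unique = 181415      # mapq >= mapq_cut

# Assumption: these four categories partition the total record count.
category_sum = non_primary + unmapped + non_unique + unique
print(category_sum == total_records)

# The unique reads are then split three different ways.
print(90671 + 90744 == unique)    # Read-1 + Read-2
print(90693 + 90722 == unique)    # '+' strand + '-' strand
print(180391 + 1024 == unique)    # non-splice + splice
```

All four checks hold for these numbers, so the strand and read-number breakdowns describe the uniquely mapped reads only.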
Then I ran the infer_experiment.py tool from RSeQC and got this result:
This is PairEnd Data
Fraction of reads failed to determine: 0.6712
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0732
Fraction of reads explained by "1+-,1-+,2++,2--": 0.2557
I don't understand why I don't see a higher fraction of reads from the first strand, as I would expect from a TruSeq stranded RNA library. Thanks in advance for any suggestions.
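One way to read these numbers is to set aside the undetermined reads and look at how the remaining reads split between the two patterns. This is only a rough sketch of that arithmetic, assuming the undetermined fraction carries no strand information; it does not explain why that fraction is so large. The variable names are illustrative.

```python
# Fractions copied from the infer_experiment.py output above.
undetermined = 0.6712
pattern_a = 0.0732   # "1++,1--,2+-,2-+"
pattern_b = 0.2557   # "1+-,1-+,2++,2--"

# Renormalise over the reads whose strand could be determined.
determined = pattern_a + pattern_b
share_a = pattern_a / determined
share_b = pattern_b / determined

print(f"determined fraction: {determined:.4f}")
print(f'share of "1++,1--,2+-,2-+": {share_a:.3f}')
print(f'share of "1+-,1-+,2++,2--": {share_b:.3f}')
```

Among the reads that could be assigned, roughly 78% follow the "1+-,1-+,2++,2--" pattern, which is the one the RSeQC documentation associates with reverse-stranded libraries. So the strand bias is there, but only about a third of the sampled reads contribute to it.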
Hi, did you solve the problem? I have similar results.