hisat2 mm10 low transcriptome mapping rate and low uniquely mapping rate
4.4 years ago
qianxu0517 • 0

I have rRNA-depleted RNA-seq data from mouse. I want to map the reads to the mm10 transcriptome using hisat2. My script is as follows:

    hisat2 -x /data/Hisat2Index/In \
      --no-spliced-alignment \
      --maxins 600 \
      --mp 1,0 \
      -1 UM_1_val_1.fq -2 /UM_2_val_2.fq \
      -S UM_mp_1_0_hisat2.sam \
      --summary-file UM_hisat2_mp_1_0_summary.txt

My mapping output is as follows:

    19754290 reads; of these:
      19754290 (100.00%) were paired; of these:
        13774555 (69.73%) aligned concordantly 0 times
        2766396 (14.00%) aligned concordantly exactly 1 time
        3213339 (16.27%) aligned concordantly >1 times
        ----
        13774555 pairs aligned concordantly 0 times; of these:
          570850 (4.14%) aligned discordantly 1 time
        ----
        13203705 pairs aligned 0 times concordantly or discordantly; of these:
          26407410 mates make up the pairs; of these:
            22991167 (87.06%) aligned 0 times
            1623547 (6.15%) aligned exactly 1 time
            1792696 (6.79%) aligned >1 times
    41.81% overall alignment rate
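For reading the summary: the overall alignment rate counts mates, not pairs. Each of the 19754290 pairs contributes two mates, and only the mates on the final "aligned 0 times" line count as unaligned. A minimal sketch that recomputes the rate from a summary file, assuming the default paired-end summary layout shown above (not the `--new-summary` layout); the function name `overall_rate` is made up for illustration:

```shell
# Hypothetical helper: recompute the overall alignment rate from a
# HISAT2 paired-end summary file in the default (old-style) layout.
overall_rate() {
  awk '
    # First line: total number of read pairs.
    /reads; of these:/ { pairs = $1 }
    # Only the per-mate line ends in "aligned 0 times"; the pair-level
    # lines read "aligned concordantly 0 times" or continue after "times".
    /aligned 0 times[[:space:]]*$/ { unaligned = $1 }
    END {
      if (pairs > 0) printf "%.2f\n", 100 * (1 - unaligned / (2 * pairs))
    }
  ' "$1"
}

# Usage with the summary file written by the command above:
# overall_rate UM_hisat2_mp_1_0_summary.txt
```

With the numbers above this is 100 * (1 - 22991167 / 39508580), which rounds to the reported 41.81.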

I checked that all the adapters are removed. The FastQC result is good, and BLAST results show there is no contamination. I also tried two different reference transcriptomes, mm10 from GENCODE and Ensembl, and the mapping rate is similar.

Can anyone give any suggestions? Thanks very much!

rna-seq alignment SNP • 1.7k views
4.4 years ago
Arindam Ghosh ▴ 530

Have you checked the --rna-strandness option?

I have not tried that, but I think I figured out the reason. Looking at the FastQC report, I found a severe GC bias, which indicates a lot of duplicates in my FASTQ files. That is probably the reason for the low mapping rate.
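The duplication level can also be checked directly instead of being inferred from the GC plot. A rough sketch that counts reads whose sequence has already been seen, assuming an uncompressed FASTQ with exactly four lines per record; the function name `dup_fraction` is hypothetical, and exact-sequence matching is only an approximation of what FastQC reports:

```shell
# Hypothetical helper: percentage of reads whose exact sequence
# already occurred earlier in the same FASTQ file.
dup_fraction() {
  awk '
    # Line 2 of every 4-line record is the sequence.
    NR % 4 == 2 {
      total++
      if (seen[$0]++) dup++    # count every repeat after the first copy
    }
    END { if (total > 0) printf "%.2f\n", 100 * dup / total }
  ' "$1"
}

# Usage with one of the input files from the question:
# dup_fraction UM_1_val_1.fq
```

This keeps every distinct sequence in memory, so for a file with millions of reads it may be more practical to run it on a subset, for example the first million records taken with `head -n 4000000`.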

Can you put up the FastQC GC content plot?

https://ibb.co/D5d2VHQ

Please go to this link: https://ibb.co/D5d2VHQ

4.4 years ago
qianxu0517 • 0

Here is the figure for the GC content.
