high aligned_not_unique number for HTSeq
8.5 years ago
bxia ▴ 180

Hi,

I have a problem with TruSeq RNA-seq data from Illumina. After mapping with HISAT2 and counting with HTSeq, HTSeq reports about 7 M to 11 M aligned_not_unique reads, which is about 35-45% of the total reads. Is that normal?

Thanks

RNA-Seq • 1.7k views

What were the metrics in the HISAT2 alignment summary for uniquely mapped, unmapped, and multi-mapped reads?

Notice that a single read which is aligned 30 times in the genome will count 30 times for HTSeq and as such inflate that number.


About 70% aligned once and 19% aligned more than once. I have 16 M singly aligned reads according to HISAT2, but in the end only 14 M are left by HTSeq. Even after adding up no_feature and ambiguous, about 1.5 M reads are still missing...
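
For this kind of bookkeeping, a small script can add up every bucket in the htseq-count output, so nothing gets left out of the sum. This is only a sketch: it assumes the tab-separated output where the special counters carry a `__` prefix (`__no_feature`, `__ambiguous`, `__too_low_aQual`, `__not_aligned`, `__alignment_not_unique`). The helper name `summarize_htseq` and the toy numbers are made up for illustration and are not from this dataset.

```python
def summarize_htseq(lines):
    """Split htseq-count output into assigned counts and special counters.

    Assumes one "name<TAB>count" pair per line and a "__" prefix on the
    special counters (an assumption about the HTSeq version in use).
    """
    assigned = 0
    special = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, value = line.split("\t")
        if name.startswith("__"):
            special[name[2:]] = int(value)
        else:
            assigned += int(value)
    return assigned, special


# Toy input with invented numbers, only to show the arithmetic.
toy_output = [
    "geneA\t10",
    "geneB\t5",
    "__no_feature\t3",
    "__ambiguous\t2",
    "__too_low_aQual\t1",
    "__not_aligned\t0",
    "__alignment_not_unique\t4",
]

assigned, special = summarize_htseq(toy_output)
total = assigned + sum(special.values())
print(assigned, special, total)
```

If the total still falls short after adding no_feature and ambiguous, the too_low_aQual bucket is one more place where reads may have landed, so it is worth including in the sum.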


19% aligned more than once doesn't sound unreasonable. You shouldn't read this 11 M as the number of sequenced reads, but as the number of mapped reads.


If a read aligns 30 times htseq-count will count it 0 times. Perhaps you're thinking of bedtools.


It will count it 30 times as aligned_not_unique if I'm not mistaken. I recognize that it was rather unclear in my previous comment.
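
To make the distinction concrete, here is a rough sketch that compares the number of distinct multi-mapped reads with the number of alignment records they produce. It assumes single-end SAM text in which the aligner sets the `NH:i:` tag on every alignment (HISAT2 does this), and that htseq-count tallies one `__alignment_not_unique` per such record, as described above. The function `multimapper_tally` and the helper `sam` are hypothetical names for illustration, not part of HTSeq.

```python
def multimapper_tally(sam_lines):
    """Return (distinct multi-mapped reads, alignment records with NH > 1).

    Assumes single-end data; for paired-end data both mates share a name.
    """
    names = set()
    records = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4:
            continue  # skip unmapped reads
        nh = 1  # treat a missing NH tag as a unique alignment
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        if nh > 1:
            records += 1
            names.add(fields[0])
    return len(names), records


def sam(name, pos, nh):
    """Build a minimal toy SAM record (invented data)."""
    return "\t".join([name, "0", "chr1", str(pos), "1", "4M",
                      "*", "0", "0", "ACGT", "IIII", f"NH:i:{nh}"])


toy = [
    "@HD\tVN:1.6",
    sam("readA", 100, 3),
    sam("readA", 500, 3),
    sam("readA", 900, 3),
    sam("readB", 2000, 1),
]

print(multimapper_tally(toy))  # → (1, 3)
```

In the toy input a single read with three alignments contributes three records, which is why the aligned_not_unique figure can look much larger than the multi-mapped percentage in the aligner summary.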


Ah, that could be.


That's a bit high; have a look at the alignments in IGV. Also, we're assuming you're not trying to quantify transcripts or anything like that, since that would cause this sort of effect.
