8.5 years ago
bxia
Hi,
I have a problem with TruSeq RNA-seq data from Illumina. After mapping with HISAT2 and counting with HTSeq, HTSeq reports about 7M to 11M aligned_not_unique reads, which is about 35-45% of the total reads. Is that normal?
Thanks
What were the metrics in the HISAT2 alignment summary for uniquely mapped, unmapped, and multi-mapped reads?
Note that a single read that aligns to 30 places in the genome will count 30 times for HTSeq and so inflate that number.
About 70% aligned exactly once and 19% aligned more than once. I have 16M uniquely aligned reads according to HISAT2, but in the end only 14M are left by HTSeq. Even after adding up no_feature and ambiguous, about 1.5M reads are still missing...
19% aligned more than once doesn't sound unreasonable. You shouldn't treat this 11M as a number of sequenced reads, but as a number of mapped reads.
If a read aligns 30 times htseq-count will count it 0 times. Perhaps you're thinking of bedtools.
It will count it 30 times as aligned_not_unique if I'm not mistaken. I recognize that it was rather unclear in my previous comment.
Ah, that could be.
That's a bit high; have a look at the alignments in IGV. Also, we're assuming you're not trying to quantify transcripts or anything like that, since that'd cause this sort of effect.