Hi, I was running my RNA-seq analysis. I finished the hisat2 alignment and got the .bam files. The alignment rate seems to be OK with an average >90%.
However, when I ran featureCounts on these BAM files, the assigned rate seemed low, averaging about 55%. I am wondering whether this is normal or whether I need to change my analysis parameters.
Below are the hisat2 and featureCounts output for the same sample.
hisat2 output:
29441125 reads; of these:
  29441125 (100.00%) were paired; of these:
    3201971 (10.88%) aligned concordantly 0 times
    23447466 (79.64%) aligned concordantly exactly 1 time
    2791688 (9.48%) aligned concordantly >1 times
    ----
    3201971 pairs aligned concordantly 0 times; of these:
      648706 (20.26%) aligned discordantly 1 time
    ----
    2553265 pairs aligned 0 times concordantly or discordantly; of these:
      5106530 mates make up the pairs; of these:
        2960349 (57.97%) aligned 0 times
        1638099 (32.08%) aligned exactly 1 time
        508082 (9.95%) aligned >1 times
94.97% overall alignment rate
And the featureCounts output:
Assigned 21949831
Unassigned_Unmapped 1138346
Unassigned_Read_Type 0
Unassigned_Singleton 891940
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 9894324
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 2531103
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 0
And these are the parameters that I used for featureCounts:
featureCounts -a "$annotation_file" -o "$output_file" -p --countReadPairs -B -O -T 8 "$bamfile"
Thanks.
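To see where the unassigned fragments go, it can help to turn the featureCounts summary into percentages. The snippet below is a minimal sketch that uses only the non-zero counts pasted above; the function names `parse_summary` and `fractions` are made up for illustration and are not part of featureCounts.

```python
# Non-zero categories copied from the featureCounts summary in this post.
SUMMARY = """\
Assigned 21949831
Unassigned_Unmapped 1138346
Unassigned_Singleton 891940
Unassigned_MultiMapping 9894324
Unassigned_NoFeatures 2531103
"""


def parse_summary(text):
    """Return {category: count} from whitespace-separated summary lines."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any header row whose second field is not a number.
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        counts[fields[0]] = int(fields[1])
    return counts


def fractions(counts):
    """Return each category's share of the summed counts."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    counts = parse_summary(SUMMARY)
    shares = fractions(counts)
    # Print categories from largest to smallest share.
    for category in sorted(shares, key=shares.get, reverse=True):
        print(f"{category:<28}{counts[category]:>12,d}{shares[category]:>8.1%}")
```

For this sample, about 60% of the counted entries are assigned and about 27% fall under Unassigned_MultiMapping, which is by far the largest unassigned category. featureCounts leaves multi-mapping reads uncounted unless `-M` is given, so that category may be worth looking at before anything else.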
Thanks for answering. I used the GTF file from NCBI, which should be a high-quality reference annotation. I did notice that the Sequence Duplication Levels were pretty high when I ran FastQC, and I am wondering if that could be the reason.
Probably not. With a counting experiment like RNA-seq, you expect some duplication (e.g. multiple transcripts from the same gene).
Are you using a matching genome sequence and annotation? You generally should not mix and match genome and annotation sources.
Have you checked your alignments in a genome viewer to see whether the reads pile up under exons?
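One quick way to test the genome/annotation match is to compare the sequence names in the GTF with those in the BAM header, which can be dumped with `samtools view -H`. The following is a rough sketch under that assumption. The helper names and the toy inputs are hypothetical and are not taken from the files in this thread, whose organism and genome build are not stated.

```python
def gtf_seqnames(gtf_text):
    """Collect sequence names from column 1 of GTF text, skipping comments."""
    names = set()
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_text):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def compare(gtf_names, bam_names):
    """Report names shared by both sources and names unique to each."""
    return {
        "shared": sorted(gtf_names & bam_names),
        "gtf_only": sorted(gtf_names - bam_names),
        "bam_only": sorted(bam_names - gtf_names),
    }


if __name__ == "__main__":
    # Toy inputs for illustration only: an accession-style name in the GTF
    # versus a "chr"-style name in the alignment header.
    toy_gtf = 'NC_000001.11\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "toy_gene";\n'
    toy_header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n"
    print(compare(gtf_seqnames(toy_gtf), sam_header_seqnames(toy_header)))
```

An empty "shared" list, or a long "gtf_only" list, would suggest that the annotation and the genome index come from different sources or use different naming schemes.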