featureCounts output summary assigned value higher than uniquely mapped reads from HISAT2
1
0
Entering edit mode
11 days ago
Prawesh • 0

Hello all,

Data: Paired end, RNASeq data.

I had an issue with the featureCounts output Assigned reads are greater than the HISAT mapped on aligned concordantly exactly 1 time

From HISAT:
aligned concordantly exactly 1 time is 48335140 

From featureCounts summary:
Assigned: 64074047

Assigned value is 1.32 times greater than HISAT mapping results. It's weird that Assigned value is higher than mapped reads. how do I solve this issue?

I used bam file from hisat2 to extract only uniquely concordant mapped reads code to address the issue but still I get 1.3x greater than hisat. code below:

# Run HISAT2
.......

# extract header from bam and save to sam file
…..

#extract uniquely concordant reads
samtools view sample-sorted.bam | \
awk 'BEGIN{FS="\t";OFS="\t"}{if ($NF=="NH:i:1" && $(NF-2)=="YT:Z:CP"){print $0}}' > \
sample-for-subread.sam

# merge header and sam file above
….

# sam to bam
….

#sort bam for featureCounts
samtools sort -o sample-for-subread-with-header-sorted.bam \
sample-for-subread-with-header.bam

#run featureCounts
  featureCounts  -p -B -d 20 -D 500000 -C --ignoreDup --primary  -Q 1 -t CDS -a reference/canFam3.gtf  -o sample_one_overlap.txt sample-for-subread-with-header-sorted.bam

Output:

featureCounts output:

Assigned    64074047
Unassigned_Unmapped 0
Unassigned_Read_Type    0
Unassigned_Singleton    0
Unassigned_MappingQuality   0
Unassigned_Chimera  0
Unassigned_FragmentLength   0
Unassigned_Duplicate    0
Unassigned_MultiMapping 0
Unassigned_Secondary    0
Unassigned_NonSplit 0
Unassigned_NoFeatures   32309500
Unassigned_Overlapping_Length   0
Unassigned_Ambiguity    286733

HISAT2 Output:

61464086 reads; of these:
  61464086 (100.00%) were paired; of these:
    11848735 (19.28%) aligned concordantly 0 times
    **48335140 (78.64%) aligned concordantly exactly 1 time**
    1280211 (2.08%) aligned concordantly >1 times

Please help me to solve this.

RNA-seq featureCounts HISAT • 502 views
ADD COMMENT
0
Entering edit mode

It's featureCounts, not featureCount. I've edited your post and fixed all mentions of featureCounts.

ADD REPLY
0
Entering edit mode

What is the output of samtools flagstat?

ADD REPLY
1
Entering edit mode
6 days ago
Prawesh • 0

I figured out:

Since featureCounts counts fragments and not reads, we have pair-end data that means Assigned value from the output will be divided by 2. i.e 64074047/2 which is lesser than the HISAT alignment.

ADD COMMENT
1
Entering edit mode

You always need to add the following option when you are using -p to count paired-end reads.

--countReadPairs    If specified, fragments (or templates) will be counted
                      instead of reads. This option is only applicable for
                      paired-end reads. For single-end data, it is ignored.
ADD REPLY
0
Entering edit mode

Please accept your own answer to mark the post as solved.

ADD REPLY

Login before adding your answer.

Traffic: 1149 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6