Low "assigned alignments" in output and high "Unassigned_NoFeatures" in summary
1
0
Entering edit mode
5 months ago
2406691063 • 0

I used featureCounts to get exon counts. And my codes are following:

'''./featureCounts -t exon --countReadPairs  -M -O -f -s 0 -p -T 4 -F GTF -a /line/WJ/project/NY_Transcriptome2Genome/featurecount/Subread_to_DEXSeq/NY_for_dexseq.gtf -o /line/WJ/project/NY_Transcriptome2Genome/featurecount/DT_16vs0.txt /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_16_01.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_16_02.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_16_03.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_16_06.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_16_08.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_0_04.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_0_05.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_0_06.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_0_07.sorted.bam /line/WJ/project/NY_Transcriptome2Genome/04_samtools_SNPcalling/sort/DT_0_08.sorted.bam

''' But the output in terminals showed that the successfully assigned alignment scores were only around 40%.

> //========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 10 BAM files                                     ||
||                                                                            ||
||                           DT_16_01.sorted.bam                              ||
||                           DT_16_02.sorted.bam                              ||
||                           DT_16_03.sorted.bam                              ||
||                           DT_16_06.sorted.bam                              ||
||                           DT_16_08.sorted.bam                              ||
||                           DT_0_04.sorted.bam                               ||
||                           DT_0_05.sorted.bam                               ||
||                           DT_0_06.sorted.bam                               ||
||                           DT_0_07.sorted.bam                               ||
||                           DT_0_08.sorted.bam                               ||
||                                                                            ||
||             Output file : DT_16vs0.txt                                     ||
||                 Summary : DT_16vs0.txt.summary                             ||
||              Paired-end : yes                                              ||
||        Count read pairs : yes                                              ||
||              Annotation : NY_for_dexseq.gtf (GTF)                          ||
||      Dir for temp files : /line/WJ/project/NY_Transcriptome2Genome/fea ... ||
||                                                                            ||
||                 Threads : 4                                                ||
||                   Level : feature level                                    ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : counted                                          ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file NY_for_dexseq.gtf ...                                 ||
||    Features : 215200                                                       ||
||    Meta-features : 20538                                                   ||
||    Chromosomes/contigs : 335                                               ||
||                                                                            ||
|| Process BAM file DT_16_01.sorted.bam...                                    ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 31522099                                             ||
||    Successfully assigned alignments : 11715686 (37.2%)                     ||
||    Running time : 0.83 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_16_02.sorted.bam...                                    ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 29776568                                             ||
||    Successfully assigned alignments : 12311903 (41.3%)                     ||
||    Running time : 0.76 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_16_03.sorted.bam...                                    ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 28298923                                             ||
||    Successfully assigned alignments : 12146956 (42.9%)                     ||
||    Running time : 0.64 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_16_06.sorted.bam...                                    ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 31025669                                             ||
||    Successfully assigned alignments : 9508592 (30.6%)                      ||
||    Running time : 0.97 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_16_08.sorted.bam...                                    ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 28444412                                             ||
||    Successfully assigned alignments : 9160038 (32.2%)                      ||
||    Running time : 0.93 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_0_04.sorted.bam...                                     ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 27512167                                             ||
||    Successfully assigned alignments : 12262121 (44.6%)                     ||
||    Running time : 0.63 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_0_05.sorted.bam...                                     ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 38093207                                             ||
||    Successfully assigned alignments : 14982222 (39.3%)                     ||
||    Running time : 0.81 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_0_06.sorted.bam...                                     ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 86329285                                             ||
||    Successfully assigned alignments : 26742040 (31.0%)                     ||
||    Running time : 2.83 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_0_07.sorted.bam...                                     ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 58327939                                             ||
||    Successfully assigned alignments : 21884010 (37.5%)                     ||
||    Running time : 1.60 minutes                                             ||
||                                                                            ||
|| Process BAM file DT_0_08.sorted.bam...                                     ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 20276198                                             ||
||    Successfully assigned alignments : 6942758 (34.2%)                      ||
||    Running time : 0.46 minutes                                             ||
||                                                                            ||
|| Write the final count table.                                               ||
|| Write the read assignment summary.                                         ||
||                                                                            ||
|| Summary of counting results can be found in file "/line/WJ/project/NY_Tra  ||
|| nscriptome2Genome/featurecount/DT_16vs0.txt.summary"                       ||
||                                                                            ||
\\============================================================================//

I have also found very high "Unassigned_NoFeatures" in summary.

Assigned    11715686    12311903    12146956    9508592 9160038 12262121    14982222    26742040    21884010    6942758
Unassigned_Unmapped 1434907 1376340 1462155 1588518 1277436 1343727 1724046 4257775 2630628 934777
Unassigned_Read_Type    0   0   0   0   0   0   0   0   0   0
Unassigned_Singleton    0   0   0   0   0   0   0   0   0   0
Unassigned_MappingQuality   0   0   0   0   0   0   0   0   0   0
Unassigned_Chimera  0   0   0   0   0   0   0   0   0   0
Unassigned_FragmentLength   0   0   0   0   0   0   0   0   0   0
Unassigned_Duplicate    0   0   0   0   0   0   0   0   0   0
Unassigned_MultiMapping 0   0   0   0   0   0   0   0   0   0
Unassigned_Secondary    0   0   0   0   0   0   0   0   0   0
Unassigned_NonSplit 0   0   0   0   0   0   0   0   0   0
Unassigned_NoFeatures   18371506    16088325    14689812    19928559    18006938    13906319    21386939    55329470    33813301    12398663
Unassigned_Overlapping_Length   0   0   0   0   0   0   0   0   0   0
Unassigned_Ambiguity    0   0   0   0   0   0   0   0   0   0

I have no ideas what to do. Any help in this will really appreciated.

featureCounts • 579 views
ADD COMMENT
1
Entering edit mode

What percentage were you expecting and why?

I ask because unassigned no features means the mapped read simply doesn't overlap your GTF file, which might be expected. For example, I have ribo-depleted total RNA, and it turns out we capture a fair bit of nascent RNAs. So when I do counts using a regular GTF, I get percentages comparable to yours. If I included counts over intronic regions, that percentage would increase because now the nascent pre-mRNA reads get counted. There's also non-gene genomic regions that have RNA signal that I don't count with a typical GTF.

ADD REPLY
0
Entering edit mode

Could there be information missing from your GTF file, such as specific chromosomes?

ADD REPLY
0
Entering edit mode
5 months ago

Have you verified that your genome file and gtt have matching chromosome names? That's the most common reason nothing at all would match.

ADD COMMENT
0
Entering edit mode

That does not seem to be the case here. There are ~30-40% assignments.

ADD REPLY
0
Entering edit mode

Ah, couldn't see that on my phone. Still worth investigating.

ADD REPLY

Login before adding your answer.

Traffic: 2160 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6