Hello,
I am analyzing an SE (50) RNAseq dataset (Illumina) for the first time. I used STAR for read alignment. I generated genome indices by STAR using GRCh38_r85.all.fa and Homo_sapiens.GRCh38.85.gtf downloaded from Ensembl (e.g. wget ftp://ftp.ensembl.org/pub/release-85/gtf/homo_sapiens/Homo_sapiens.GRCh38.85.gtf.gz).
$ head -6 Homo_sapiens.GRCh38.85.gtf
#!genome-build GRCh38.p7
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession NCBI:GCA_000001405.22
#!genebuild-last-updated 2016-06
1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; havana_gene "OTTHUMG00000000961"; havana_gene_version "2"; etc
The alignment went smoothly and the result looked OK.
Started job on | Sep 20 09:49:47
Started mapping on | Sep 20 09:50:27
Finished on | Sep 20 10:09:30
Mapping speed, Million of reads per hour | 88.83
Number of input reads | 28202438
Average input read length | 51
UNIQUE READS:
Uniquely mapped reads number | 23282219
Uniquely mapped reads % | 82.55%
Average mapped length | 50.75
Number of splices: Total | 1764086
Number of splices: Annotated (sjdb) | 1747703
Number of splices: GT/AG | 1745945
Number of splices: GC/AG | 13045 etc
Then, I wanted to quantify the reads per gene by "featureCounts"
subread-1.5.0-p3-Linux-x86_64/bin/featureCounts -a Homo_sapiens.GRCh38.85.gtf -g gene_id -o counts.txt Aligned.sortedByCoord.out.bam
I did not get any error message but the output file contains only 0.
$ head -10 counts.txt
Geneid Chr Start End Strand Length Aligned.sortedByCoord.out.bam
ENSG00000223972 1;1;1;1 11869;12613;12975;13221 12227;12721;13052;14409 +;+;+;+ 1735 0
ENSG00000227232 1;1;1;1;1;1;1;1;1;1;1 14404;15005;15796;16607;16858;17233;17606;17915;18268;24738;29534 14501;15038;15947;16765;17055;17368;17742;18061;18366;24891;29570 -;-;-;-;-;-;-;-;-;-;- 1351 0
ENSG00000278267 1 17369 17436 - 68 0
ENSG00000243485 1;1;1 29554;30267;30976 30039;30667;31109 +;+;+ 1021 0
ENSG00000237613 1;1;1 34554;35245;35721 35174;35481;36081 -;-;- 1219 0
ENSG00000268020 1 52473 53312 + 840 0
ENSG00000240361 1 62948 63887 + 940 0
ENSG00000186092 1 69091 70008 + 918 0
$ head -10 counts.txt.summary
Status Aligned.sortedByCoord.out.bam
Assigned 0
Unassigned_Ambiguity 0
Unassigned_MultiMapping 0
Unassigned_NoFeatures 0
Unassigned_Unmapped 0
Unassigned_MappingQuality 0
Unassigned_FragmentLength 0
Unassigned_Chimera 0
Unassigned_Secondary
I am not sure what caused this "hiccup". Could the GTF file be an issue? It worked for STAR.
Your advice and help are highly appreciated.
Thanks, Anita
Thank you for your help. Anita