Hi everyone,
I ran this featureCounts command:
featureCounts -T 4 -p -a gencode.v43.basic.annotation.gtf -o featurecount.txt *.bam
It produced the following output, with no alignments assigned:
Process BAM file UI_E2_sorted.bam.bam... ||
|| Paired-end reads are included. ||
|| The reads are assigned on the single-end mode. ||
|| Total alignments : 49308686 ||
|| Successfully assigned alignments : 0 (0.0%) ||
|| Running time : 0.26 minutes
I get the same result for all of my BAM files.
I am using gencode.v43.basic.annotation.gtf as the annotation file, but featureCounts reports zero assigned alignments. Please help with this. I am not sure whether it is the right GTF annotation file; I got it from NCBI.
I am using the human reference genome hg38 to analyse my RNA-seq data.
This is what the first alignment records of my BAM file look like, using this command:
samtools_0.1.18 view RA1_E2_sorted.bam.bam | head
NB551648:44:HK5KLBGXG:3:23507:16037:10429 419 NC_000001.11 13158 1 76M = 183744 150203 GAAGGGGATGCACTGTTGGGGAGGCAGCTGTAACTCAAAGCCTTAGCCTCTGTTCCCACGAAGGCAGGGCCATCAG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/EEEE AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:3:23507:16037:10429 419 NC_000001.11 13158 1 76M = 13225 143 GAAGGGGATGCACTGTTGGGGAGGCAGCTGTAACTCAAAGCCTTAGCCTCTGTTCCCACGAAGGCAGGGCCATCAG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/EEEE AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:3:23507:16037:10429 339 NC_000001.11 13225 1 76M = 13158 -143 GGCCATCAGGCACCAAAGGGATTCTGCCAGCATAGTGCTCCTGGACCAGTGATACACCCGGCACCCTGTCCTGGAC EEEEEEEEAEEEAEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:3:12601:18920:17906 355 NC_000001.11 14360 1 2S74M = 14414 132 GTCATCCTGCACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEE AS:i:-2 ZS:i:-2 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:74 YS:i:0 YT:Z:CP NH:i:2
NB551648:44:HK5KLBGXG:1:23104:13420:13760 355 NC_000001.11 14362 1 2S74M = 14403 119 CGTCCTGCACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCC AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:-2 ZS:i:-2 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:74 YS:i:0 YT:Z:CP NH:i:2
NB551648:44:HK5KLBGXG:3:22601:17672:18350 355 NC_000001.11 14366 1 1S75M = 14517 228 AGCACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCCGTTTT AAAAAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:-1 ZS:i:-1 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:75 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:3:22601:17672:18350 355 NC_000001.11 14366 1 1S75M = 185038 150290 AGCACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCCGTTTT AAAAAEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:-1 ZS:i:-1 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:75 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:4:23409:8298:7644 355 NC_000001.11 14367 0 76M = 184930 150180 CACAGCTAGAGATCCTTTATTAAAAGCACACTGTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCCATTTTCT AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:-5 ZS:i:-5 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:69G6 YS:i:-10 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:2:13208:13922:5858 355 NC_000001.11 14399 1 76M = 14437 114 GTTGGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACAGTGGCGCAGG A//AAEEEEEEAA/A/AEE<EEEEE6AE6//E/AA//E/EEAEEEEE/EEE/6E6E<AA<AEAE6A////EE/AAE AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YS:i:0 YT:Z:CP NH:i:5
NB551648:44:HK5KLBGXG:3:22508:3274:10922 99 NC_000001.11 14402 1 76M = 14492 166 GGTTTCTGCTCAGTTCTTTATTGATTGGTGTGCCGTTTTCTCTGGAAGCCTCTTAAGAACACAGTGGCGCAGGCTG AAAAAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE AS:i:0 ZS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YS:i:0 YT:Z:CP NH:i:5
And this is what my GTF file looks like:
##description: evidence-based annotation of the human genome (GRCh38), version 43 (Ensembl 109)
##provider: GENCODE
##contact: gencode-help@ebi.ac.uk
##format: gtf
##date: 2022-11-29
chr1 HAVANA gene 11869 14409 . + . gene_id "ENSG00000290825.1"; gene_type "lncRNA"; gene_name "DDX11L2"; level 2; tag "overlaps_pseudogene";
chr1 HAVANA transcript 11869 14409 . + . gene_id "ENSG00000290825.1"; transcript_id "ENST00000456328.2"; gene_type "lncRNA"; gene_name "DDX11L2"; transcript_type "lncRNA"; transcript_name "DDX11L2-202"; level 2; transcript_support_level "1"; tag "basic"; tag "Ensembl_canonical"; havana_transcript "OTTHUMT00000362751.1";
chr1 HAVANA exon 11869 12227 . + . gene_id "ENSG00000290825.1"; transcript_id "ENST00000456328.2"; gene_type "lncRNA"; gene_name "DDX11L2"; transcript_type "lncRNA"; transcript_name "DDX11L2-202"; exon_number 1; exon_id "ENSE00002234944.1"; level 2; transcript_support_level "1"; tag "basic"; tag "Ensembl_canonical"; havana_transcript "OTTHUMT00000362751.1";
chr1 HAVANA exon 12613 12721 . + . gene_id "ENSG00000290825.1"; transcript_id "ENST00000456328.2"; gene_type "lncRNA"; gene_name "DDX11L2"; transcript_type "lncRNA"; transcript_name "DDX11L2-202"; exon_number 2; exon_id "ENSE00003582793.1"; level 2; transcript_support_level "1"; tag "basic"; tag "Ensembl_canonical"; havana_transcript "OTTHUMT00000362751.1";
chr1 HAVANA exon 13221 14409 . + . gene_id "ENSG00000290825.1"; transcript_id "ENST00000456328.2"; gene_type "lncRNA"; gene_name "DDX11L2"; transcript_type "lncRNA"; transcript_name "DDX11L2-202"; exon_number 3; exon_id "ENSE00002312635.1"; level 2; transcript_support_level "1"; tag "basic"; tag "Ensembl_canonical"; havana_transcript "OTTHUMT00000362751.1";
Hi, thank you so much for your reply. I have been struggling with this issue for quite a long time now.
Do you have any idea which annotation file will work for my BAM files?
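Since you appear to be using NCBI's genome (your BAM records use RefSeq accessions such as NC_000001.11, while the GENCODE GTF uses names such as chr1), you should find the corresponding GTF and GFF files here: https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/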
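One way to confirm this kind of mismatch before rerunning featureCounts is to compare the sequence names in the annotation with those in the BAM header. The following is only a rough sketch: the function names (gtf_seqnames, sam_seqnames, shared_seqnames) and the file name header.sam are made up for illustration, and it assumes a tab-separated GTF and a SAM header saved as plain text.

```shell
#!/bin/sh
# Sketch only: helper names and header.sam are hypothetical, not part of
# featureCounts or samtools.

# List the distinct sequence names in column 1 of a tab-separated GTF/GFF,
# skipping comment lines that start with '#'.
gtf_seqnames() {
    awk -F '\t' '!/^#/ && NF > 1 { print $1 }' "$1" | LC_ALL=C sort -u
}

# List the reference names from the SN: field of @SQ lines in SAM header text.
sam_seqnames() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | LC_ALL=C sort -u
}

# Print the names that occur in both the GTF ($1) and the SAM header ($2).
# Empty output means the two files share no sequence names.
shared_seqnames() {
    gtf_names=$(mktemp)
    sam_names=$(mktemp)
    gtf_seqnames "$1" > "$gtf_names"
    sam_seqnames "$2" > "$sam_names"
    LC_ALL=C comm -12 "$gtf_names" "$sam_names"
    rm -f "$gtf_names" "$sam_names"
}

# Example usage (commented out; requires samtools and the real files):
#   samtools view -H UI_E2_sorted.bam.bam > header.sam
#   shared_seqnames gencode.v43.basic.annotation.gtf header.sam | wc -l
```

If the count of shared names is zero, the annotation and the alignments use different naming schemes, which would be consistent with featureCounts assigning no alignments.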
But it is still giving this error.
Add option
-g gene
to your featureCounts command.
Hi,
I am now using the compatible GTF file, which looks like this: