Rsubread featureCounts returns 0 matching features
I'm working with paired-end reads from Yarrowia lipolytica (CLIB89) and trying to get to differential expression. My reference genome and annotation (GTF) were downloaded from NCBI. The alignment ran, but featureCounts reports 0 successfully assigned alignments, so the counts table has 0 for every feature in every sample.
My code for generating the counts file is as follows.
library(Rsubread)
# list.files() takes a regular expression, not a shell glob
bam_files <- list.files(path = "/path/to/Genewiz-us-ngs-00_fastq", pattern = "subread\\.BAM$", full.names = TRUE)
# the annotation argument is annot.ext
fc <- featureCounts(bam_files, annot.ext = "/path/to/gtf/GCF_000002525.2_ASM252v1_genomic.gtf.gz", isGTFAnnotationFile = TRUE, isPairedEnd = TRUE)
names(fc)
The console output is below. (I included only one of the 24 BAM file reports; they all show 0 successfully assigned alignments.)
        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_| \_\______/_/    \_\_____/
Rsubread 2.2.6
//========================== featureCounts setting ===========================\\
|| ||
|| Input files : 24 BAM files ||
|| o F353-B-glu_R1_001.fastq.gz.subread.BAM ||
|| o F353-B-mal_R1_001.fastq.gz.subread.BAM ||
|| o F353-C-glu_R1_001.fastq.gz.subread.BAM ||
|| o F353-C-mal_R1_001.fastq.gz.subread.BAM ||
|| o F535-A-glu_R1_001.fastq.gz.subread.BAM ||
|| o F535-A-mal_R1_001.fastq.gz.subread.BAM ||
|| o F570-A-glu_R1_001.fastq.gz.subread.BAM ||
|| o F570-A-mal_R1_001.fastq.gz.subread.BAM ||
|| o F570-B-glu_R1_001.fastq.gz.subread.BAM ||
|| o F570-B-mal_R1_001.fastq.gz.subread.BAM ||
|| o F570-C-glu_R1_001.fastq.gz.subread.BAM ||
|| o F570-C-mal_R1_001.fastq.gz.subread.BAM ||
|| o F580-A-glu_R1_001.fastq.gz.subread.BAM ||
|| o F580-A-mal_R1_001.fastq.gz.subread.BAM ||
|| o F580-B-glu_R1_001.fastq.gz.subread.BAM ||
|| o F580-B-mal_R1_001.fastq.gz.subread.BAM ||
|| o F580-C-glu_R1_001.fastq.gz.subread.BAM ||
|| o F580-C-mal_R1_001.fastq.gz.subread.BAM ||
|| o F584-A-glu_R1_001.fastq.gz.subread.BAM ||
|| o F584-A-mal_R1_001.fastq.gz.subread.BAM ||
|| o F584-B-glu_R1_001.fastq.gz.subread.BAM ||
|| o F584-B-mal_R1_001.fastq.gz.subread.BAM ||
|| o F584-C-glu_R1_001.fastq.gz.subread.BAM ||
|| o F584-C-mal_R1_001.fastq.gz.subread.BAM ||
|| ||
|| Annotation : GCF_000002525.2_ASM252v1_genomic.gtf.gz (GTF) ||
|| Dir for temp files : . ||
|| Threads : 1 ||
|| Level : meta-feature level ||
|| Paired-end : yes ||
|| Multimapping reads : counted ||
|| Multi-overlapping reads : not counted ||
|| Min overlapping bases : 1 ||
|| ||
|| Chimeric reads : counted ||
|| Both ends mapped : not required ||
|| ||
\\============================================================================//
//================================= Running ==================================\\
|| ||
|| Load annotation file GCF_000002525.2_ASM252v1_genomic.gtf.gz ... ||
|| Features : 8505 ||
|| Meta-features : 7115 ||
|| Chromosomes/contigs : 7 ||
|| ||
|| Process BAM file F353-B-glu_R1_001.fastq.gz.subread.BAM... ||
|| Paired-end reads are included. ||
|| Total alignments : 19722583 ||
|| Successfully assigned alignments : 0 (0.0%) ||
|| Running time : 0.97 minutes
I'm wondering whether there is a problem with the GTF annotation file from NCBI. Maybe it's the way the reads are labeled or something? I looked for annotations from other sources. Entrez has a set, but it is split into a separate file for each chromosome rather than being a complete annotation for the organism's genome assembly.
Any help you can give me in generating an accurate, complete set of counts would be very much appreciated. Thanks in advance. -David
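One quick way to test a suspicion like this is to compare the sequence names in the BAM header with the names in the first column of the GTF. The sketch below is only an illustration: the helper names (gtf_seqnames, bam_seqnames, compare_seqnames) are made up for this post, bam_seqnames assumes the Rsamtools package is installed, and the names in the demo call are invented, not the real Yarrowia identifiers.

```r
# Sequence names from column 1 of a GTF (plain or gzipped), skipping comments.
gtf_seqnames <- function(gtf_path) {
  con <- gzfile(gtf_path, "rt")  # gzfile() also reads uncompressed text
  on.exit(close(con))
  lines <- readLines(con)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  unique(sub("\t.*$", "", lines))  # keep everything before the first tab
}

# Sequence names from a BAM header; assumes Rsamtools is installed.
bam_seqnames <- function(bam_path) {
  names(Rsamtools::scanBamHeader(bam_path)[[1]]$targets)
}

# Report which names are shared and which appear on only one side.
compare_seqnames <- function(bam_names, gtf_names) {
  list(
    shared   = intersect(bam_names, gtf_names),
    bam_only = setdiff(bam_names, gtf_names),
    gtf_only = setdiff(gtf_names, bam_names)
  )
}

# Invented names, for illustration only.
check <- compare_seqnames(
  bam_names = c("chrA", "chrB"),
  gtf_names = c("NC_EXAMPLE_1", "NC_EXAMPLE_2")
)
length(check$shared)  # 0: no name in common, so no read could match a feature
```

With real data the call would look like compare_seqnames(bam_seqnames(bam_files[1]), gtf_seqnames("/path/to/annotation.gtf.gz")). An empty shared set would point to a naming mismatch rather than a problem with the alignments themselves.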
Thanks to genomax, I found that the chromosome identifiers in my reference genome did not match those in my annotation file, so no alignment could be matched to a feature. I was able to rectify this.
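For anyone hitting the same thing, one way to reconcile mismatched names without editing either file is the chrAliases argument of featureCounts. It takes a two-column, comma-delimited file with no header: annotation names in the first column, read (BAM) names in the second. This is a minimal sketch of that approach and not necessarily the fix used here; the helper names write_chr_aliases and count_with_aliases are hypothetical, and the name pairs are invented placeholders to be replaced with the real identifiers.

```r
# Write a header-less, comma-delimited alias file:
# column 1 = names used in the annotation, column 2 = names used in the BAM files.
write_chr_aliases <- function(annotation_names, read_names, path) {
  stopifnot(length(annotation_names) == length(read_names))
  alias <- data.frame(annotation = annotation_names, reads = read_names)
  write.table(alias, file = path, sep = ",", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(path)
}

# Re-run the counting with the alias file; assumes Rsubread is installed.
count_with_aliases <- function(bam_files, gtf_path, alias_path) {
  Rsubread::featureCounts(bam_files,
                          annot.ext = gtf_path,
                          isGTFAnnotationFile = TRUE,
                          isPairedEnd = TRUE,
                          chrAliases = alias_path)
}

# Invented name pairs, for illustration only.
alias_file <- file.path(tempdir(), "chr_aliases.csv")
write_chr_aliases(
  annotation_names = c("NC_EXAMPLE_1", "NC_EXAMPLE_2"),
  read_names       = c("chrA", "chrB"),
  path             = alias_file
)
readLines(alias_file)
```

The alternative is to rename the sequences in the FASTA (and re-align) or in the GTF so that both use the same identifiers. The alias file has the advantage of leaving the downloaded files and the existing alignments untouched.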