Question

Rsubread featureCounts returns no matches


4.3 years ago

devarts ▴ 40

Rsubread's featureCounts assigns 0 reads to features. I'm working with paired-end reads from Yarrowia lipolytica (CLIB89), with the goal of running a differential expression analysis. My reference genome and annotation (GTF) were downloaded from NCBI. The alignment step ran, but featureCounts reports 0 successfully assigned alignments, so the counts file contains 0 for every feature in every sample.

My code for the attempted generation of a counts file is as follows.

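library(Rsubread)

# Collect the BAM files written by the aligner (pattern is a regular expression).
bam_files <- list.files(path = "/path/to/Genewiz-us-ngs-00_fastq", pattern = "subread\\.BAM$", full.names = TRUE)

# Count fragments (read pairs) per gene against the NCBI GTF annotation.
fc <- featureCounts(bam_files, annot.ext = "/path/to/gtf/GCF_000002525.2_ASM252v1_genomic.gtf.gz", isGTFAnnotationFile = TRUE, isPairedEnd = TRUE)

names(fc)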

The console output is below. I included only one of the 24 BAM file reports; they all give 0 successfully assigned alignments.

    ==========     _____ _    _ ____  _____  ______          _____
    =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
      =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
        ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
          ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
    ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
   Rsubread 2.2.6

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 24 BAM files                                     ||
||                           o F353-B-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F353-B-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F353-C-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F353-C-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F535-A-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F535-A-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-A-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-A-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-B-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-B-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-C-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F570-C-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-A-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-A-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-B-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-B-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-C-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F580-C-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-A-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-A-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-B-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-B-mal_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-C-glu_R1_001.fastq.gz.subread.BAM         ||
||                           o F584-C-mal_R1_001.fastq.gz.subread.BAM         ||
||                                                                            ||
||              Annotation : GCF_000002525.2_ASM252v1_genomic.gtf.gz (GTF)    ||
||      Dir for temp files : .                                                ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||      Multimapping reads : counted                                          ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file GCF_000002525.2_ASM252v1_genomic.gtf.gz ...           ||
||    Features : 8505                                                         ||
||    Meta-features : 7115                                                    ||
||    Chromosomes/contigs : 7                                                 ||
||                                                                            ||
|| Process BAM file F353-B-glu_R1_001.fastq.gz.subread.BAM...                 ||
||    Paired-end reads are included.                                          ||
||    Total alignments : 19722583                                             ||
||    Successfully assigned alignments : 0 (0.0%)                             ||
||    Running time : 0.97 minutes

I'm wondering whether there is a problem with the type of GTF annotation file from NCBI. Maybe it's the way the reads are labeled or something? I looked for annotations from other sources. Entrez has a set, but it's split into a separate file for each chromosome rather than being a complete annotation for the organism's genome assembly.

Any help you all might be able to give me in generating an accurate complete set of counts would be very much appreciated. Thanks in advance. -David

RNA-Seq Feature Counting Rsubread featureCount


Thanks to GenoMax, I found that the chromosome identifiers in my reference genome and in my annotation file were not identical, which caused the problem. I was able to rectify this.

4.3 years ago by devarts ▴ 40

score 1 · Answer 1 · 2020-08-24


4.3 years ago

GenoMax 147k

Are chromosome identifiers identical in your GTF/Reference/Alignment file?
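For anyone who lands here with the same 0% assignment, one way to answer this question is to list the sequence names in the GTF (column 1) and in the reference FASTA headers and compare them. The following is a minimal base-R sketch, not something taken from the thread: the helper names `gtf_seqnames`, `fasta_seqnames` and `compare_seqnames` are hypothetical, and the demo files and identifiers are made up. The names stored in the BAM header could be checked in the same spirit, for example with `Rsamtools::scanBamHeader`, if that package is installed.

```r
# Hypothetical helpers; the demo data below is invented for illustration.

# Sequence names from column 1 of a GTF, skipping '#' comment lines.
# readLines() reads gzip-compressed files transparently.
gtf_seqnames <- function(gtf_path) {
  lines <- readLines(gtf_path)
  lines <- lines[!startsWith(lines, "#") & nzchar(lines)]
  unique(sub("\t.*$", "", lines))
}

# Sequence names from FASTA headers: text after '>' up to the first whitespace.
fasta_seqnames <- function(fasta_path) {
  lines <- readLines(fasta_path)
  headers <- lines[startsWith(lines, ">")]
  unique(sub("^>([^[:space:]]+).*$", "\\1", headers))
}

# Report which names are shared and which appear on only one side.
compare_seqnames <- function(gtf_names, ref_names) {
  list(
    shared   = intersect(gtf_names, ref_names),
    gtf_only = setdiff(gtf_names, ref_names),
    ref_only = setdiff(ref_names, gtf_names)
  )
}

# Tiny demo with made-up names (not the real Y. lipolytica identifiers).
gtf <- tempfile(fileext = ".gtf")
fa  <- tempfile(fileext = ".fa")
writeLines(c("#gtf-version 2.2",
             "seqA.1\tRefSeq\texon\t1\t100\t.\t+\t.\tgene_id \"g1\";"), gtf)
writeLines(c(">chrA some description", "ACGT"), fa)
print(compare_seqnames(gtf_seqnames(gtf), fasta_seqnames(fa)))
```

If `shared` comes back empty, as in this demo, featureCounts has no contig on which a read and a feature can overlap, which would be consistent with every alignment going unassigned.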


Good question. I'm going to take a look at the files.

4.3 years ago by devarts ▴ 40

This was the problem, and I was able to fix it. Thank you!

4.3 years ago by devarts ▴ 40
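The thread does not say how the mismatch was fixed. One possible approach, sketched here under assumptions, is to leave both files untouched and give featureCounts a chromosome alias file through its `chrAliases` argument, which as far as I know expects a two-column, comma-delimited file without a header (annotation name first, name used in the reads second). The mapping below is hypothetical; the identifiers are placeholders, not the real CLIB89 sequence names, and the annotation path in the commented call is a placeholder too.

```r
# Hypothetical mapping: names used in the annotation -> names used in the BAM files.
alias_map <- data.frame(
  annotation = c("seqA.1", "seqB.1"),
  reads      = c("chrA", "chrB"),
  stringsAsFactors = FALSE
)

# Write the mapping as a two-column, comma-delimited file with no header or quotes.
alias_file <- tempfile(fileext = ".csv")
write.table(alias_map, alias_file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Passing the alias file to featureCounts (not run here; needs Rsubread and real files):
# fc <- Rsubread::featureCounts(bam_files,
#                               annot.ext = "/path/to/annotation.gtf.gz",
#                               isGTFAnnotationFile = TRUE,
#                               isPairedEnd = TRUE,
#                               chrAliases = alias_file)
```

The alternative is to rewrite column 1 of the GTF, or the FASTA headers before building the index, so that all files agree. The alias file avoids re-running the alignment, but the mapping itself has to be verified by hand against the assembly report.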


I've moved his comment to an answer. Please accept it to mark the question as solved.

4.3 years ago by Ram 44k