Dear community,
I have been struggling to find the problem for the past few days.
I am working on a differential expression analysis (DEA) of direct RNA nanopore long reads from the MinION platform.
I ran into this wall and I would really appreciate your help!
I sequenced the RNA of two strains, but further down the analysis only one of the two seems to work in the pipeline.
I start with my freshly basecalled "calls.fastq" as well as my indexed reference genome.
library(Rsubread)
setwd("/home/ubuntu/Desktop/Aime/AnalyzeG_AAR_11_09_2023/Rsubread_Analysis")
# Align the basecalled reads against the pre-built index
align(index = "my_index", readfile1 = "callsxx.fastq", output_file = "subread_resultsx.bam", nthreads = 20, indels = 0, maxMismatches = 5)
This results in a mapping rate of about 50-70%, depending on the settings. Unfortunately, the following command assigns 0% of the reads. The exact same pipeline works on another dataset on my computer, and I am struggling to find the reason!
featureCounts(files = "subread_resultsx.bam", annot.ext = "Abaye.gtf.gz", isGTFAnnotationFile = TRUE, GTF.featureType = "gene", GTF.attrType = "gene_id")
fc <- featureCounts(files = "subread_resultsx.bam", annot.ext = "Abaye.gtf.gz", isGTFAnnotationFile = TRUE, GTF.featureType = "gene", GTF.attrType = "gene_id", minOverlap = 0)
It somehow worked with a different GTF file now, but I still have only about 50% alignment.
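Since swapping the GTF file changed the outcome, it may be worth comparing the annotation against the reference. The post does not show what differed between the two GTF files, so the following is only a minimal sketch of a sanity check, assuming the usual suspects: sequence names in the GTF that do not match those in the reference used to build the index, or a GTF without "gene" features or "gene_id" attributes. The toy GTF lines and the reference names (ref_names) are made up for illustration.

```r
# Toy GTF lines standing in for the real annotation (hypothetical content).
gtf_text <- paste(
  "chr1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\";",
  "chr1\tsrc\tCDS\t150\t800\t.\t+\t.\tgene_id \"g1\";",
  "chr2\tsrc\tgene\t50\t400\t.\t-\t.\tgene_id \"g2\";",
  sep = "\n"
)

# Parse the nine tab-separated GTF columns; quote = "" keeps the quoted
# attribute values intact.
gtf <- read.delim(text = gtf_text, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)
colnames(gtf) <- c("seqname", "source", "feature", "start", "end",
                   "score", "strand", "frame", "attribute")

# Hypothetical sequence names, as they might appear in the reference FASTA.
ref_names <- c("NC_000001.1", "NC_000002.1")

# 1) Do any sequence names occur in both the annotation and the reference?
shared <- intersect(unique(gtf$seqname), ref_names)

# 2) Which feature types exist? GTF.featureType = "gene" needs "gene" rows.
feature_types <- table(gtf$feature)

# 3) Does every row carry the attribute named in GTF.attrType?
has_gene_id <- grepl("gene_id", gtf$attribute, fixed = TRUE)

print(shared)         # empty here: the names do not match at all
print(feature_types)
print(all(has_gene_id))
```

For the real data, the same table could be read with read.delim(gzfile("Abaye.gtf.gz"), header = FALSE, comment.char = "#", quote = ""), and ref_names would be the header names of the FASTA file used for the index. If the names differ only by convention, featureCounts has a chrAliases argument for supplying a mapping between the two.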
What does the summary say? (featureCounts generates a summary file that breaks the reads down into different categories; see section 6.2.9 of the Subread user's guide for more: https://bioconductor.org/packages/release/bioc/vignettes/Rsubread/inst/doc/SubreadUsersGuide.pdf )
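In Rsubread the same summary is also returned in the stat element of the featureCounts result (fc$stat in the code above). Here is a minimal sketch of how one might turn it into percentages to see which category absorbs the reads. The data frame below only imitates the shape of fc$stat, and the numbers in it are invented, not taken from this dataset.

```r
# Stand-in for fc$stat: one Status column plus one count column per BAM file.
# All counts are hypothetical.
stat <- data.frame(
  Status = c("Assigned", "Unassigned_Unmapped",
             "Unassigned_NoFeatures", "Unassigned_Ambiguity"),
  subread_resultsx.bam = c(0, 4000, 6000, 0),
  stringsAsFactors = FALSE
)

# Convert the counts of the first (and only) sample into percentages.
counts <- stat[[2]]
pct <- round(100 * counts / sum(counts), 1)

# Sort so the largest category comes first.
summary_tbl <- data.frame(Status = stat$Status, reads = counts, percent = pct)
summary_tbl <- summary_tbl[order(-summary_tbl$reads), ]

print(summary_tbl)
```

With the real object, stat <- fc$stat replaces the toy data frame. A large Unassigned_NoFeatures share would mean the alignments do not overlap any feature in the annotation, which points at the GTF rather than at the alignment step, while Unassigned_Unmapped simply reflects the reads that align() could not map.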
Note for posterity: this applies to Subread v2.0.4.