Hello,
My question is where my "Multi-mapped reads" went after STAR Alignment to Quantification with Rsubread featureCounts?
I am aligning my 151 bp untrimmed paired reads to the Human Primary Assembly from Gencode and then quantifying the reads using Rsubread's featureCounts().
I ran STAR with the following parameters:
STAR --genomeDir /storage/genomeDir --runThreadN 6
--readFilesIn /storage/mock1.AAGCATCCTG-GACCGATACA_R1.fastq /storage/merged/mock1.AAGCATCCTG-GACCGATACA_R2.fastq
--outFileNamePrefix /storage/mock1.AAGCATCCTG-GACCGATACA_
--outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMattributes Standard
--outFilterMatchNminOverLread 0 --outFilterScoreMinOverLread 0 --outFilterMatchNmin 40 --outReadsUnmapped Fastx
I set a cutoff of 40 bp that must align for a read pair to be considered mapped because my reads are untrimmed and the default 66% of the read pair required to map was causing many to be considered "too short" & unmapped. I am not sure of how to appropriately choose the best parameter so any advice of read exploration would be appreciated.
I got 85% uniquely mapped and 13.8% multi-mapped so the results seem good.
I then quantified the pairs using featureCounts in R:
mock2_quant <- featureCounts(files="mock1.AAGCATCCTG-GACCGATACA_Aligned.sortedByCoord.out.bam",isPairedEnd=TRUE, GTF.featureType="gene",GTF.attrType="gene_id",
annot.ext= "/storage/Genome_files/gencode.v41.primary_assembly.annotation.gtf" , isGTFAnnotationFile=TRUE)
In my output, I get the following:
Assigned 20909396
Unassigned_Unmapped 64991
Unassigned_MultiMapping 0
Unassigned_NoFeatures 7240676
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 6649467
This seems fine but why is unassigned_MultiMapping equal to 0 if 13% from the STAR Alignmnent were seen to be Multimappers? Did I use a parameter that is including these with the other categories in the output such as "Aligned"?
I thought I was missing a parameter, thank you!
Hi Marco,
I re-ran featureCounts and added the parameter
countMultiMappingReads = TRUE
which is equivalent to -M in R but I got the exact same results....any thoughts?