Multimapped reads in STAR alignment and subread::featureCounts()
2
0
Entering edit mode
2.1 years ago
Saran ▴ 50

Hello,

My question is where my "Multi-mapped reads" went after STAR Alignment to Quantification with Rsubread featureCounts?

I am aligning my 151 bp untrimmed paired reads to the Human Primary Assembly from Gencode and then quantifying the reads using Rsubread's featureCounts().

I ran STAR with the following parameters:

STAR --genomeDir /storage/genomeDir --runThreadN 6
--readFilesIn /storage/mock1.AAGCATCCTG-GACCGATACA_R1.fastq  /storage/merged/mock1.AAGCATCCTG-GACCGATACA_R2.fastq
--outFileNamePrefix /storage/mock1.AAGCATCCTG-GACCGATACA_
 --outSAMtype BAM SortedByCoordinate --outSAMunmapped Within --outSAMattributes Standard 
--outFilterMatchNminOverLread 0 --outFilterScoreMinOverLread 0 --outFilterMatchNmin  40 --outReadsUnmapped Fastx

I set a cutoff of 40 bp that must align for a read pair to be considered mapped because my reads are untrimmed and the default 66% of the read pair required to map was causing many to be considered "too short" & unmapped. I am not sure of how to appropriately choose the best parameter so any advice of read exploration would be appreciated.

I got 85% uniquely mapped and 13.8% multi-mapped so the results seem good.

I then quantified the pairs using featureCounts in R:

mock2_quant <- featureCounts(files="mock1.AAGCATCCTG-GACCGATACA_Aligned.sortedByCoord.out.bam",isPairedEnd=TRUE, GTF.featureType="gene",GTF.attrType="gene_id",
                       annot.ext= "/storage/Genome_files/gencode.v41.primary_assembly.annotation.gtf" , isGTFAnnotationFile=TRUE)

In my output, I get the following:

Assigned                    20909396
Unassigned_Unmapped         64991
Unassigned_MultiMapping         0
Unassigned_NoFeatures           7240676
Unassigned_Overlapping_Length   0
Unassigned_Ambiguity            6649467

This seems fine but why is unassigned_MultiMapping equal to 0 if 13% from the STAR Alignmnent were seen to be Multimappers? Did I use a parameter that is including these with the other categories in the output such as "Aligned"?

Multimap STAR featureCounts alignment Rsubread • 1.8k views
ADD COMMENT
3
Entering edit mode
2.1 years ago
Marco Pannone ▴ 810

If you do not use argument -M you do not consider multimapping reads in featureCounts. (At least in the command line, never used featureCounts in R tho).

ADD COMMENT
0
Entering edit mode

I thought I was missing a parameter, thank you!

ADD REPLY
0
Entering edit mode

Hi Marco,

I re-ran featureCounts and added the parameter countMultiMappingReads = TRUE which is equivalent to -M in R but I got the exact same results....any thoughts?

ADD REPLY
1
Entering edit mode
2.1 years ago
Saran ▴ 50

I figured out that if you specify countMultiMappingReads = TRUE with Rsubread::featureCounts() then these counts will be added to your 'Aligned' counts. If you set it to False, then they will display in Unassigned_MultiMapping as shown below after I added the parameter: countMultiMappingReads = FALSE.

Assigned               15124819
Unassigned_Unmapped    64991
Unassigned_MultiMapping    10844204
Unassigned_NoFeatures      4538772
Unassigned_Ambiguity       4291744

Now a very large portion of my reads are seen to be multimappers.... on to the next issue.

ADD COMMENT
0
Entering edit mode

UPDATE: The count of fragments mapping to multiple locations is inflated because there is a count for every location that it is mapping; its important not to look at those results as if its a count per read/fragment.

ADD REPLY

Login before adding your answer.

Traffic: 2497 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6