5.6 years ago
luca
Hi everyone, I ran an RNA-seq experiment on mice using 3'-tag sequencing. I mapped the reads to the mouse genome using STAR (on average >70% of reads mapped uniquely) and I wanted to get the raw counts with featureCounts. The command I am using is this:
featureCounts -a Mus_musculus.GRCm38.95.gtf -t exon -g gene_id --primary -T 16 -o counts_w_extraAttributes-primary.txt E12.5/Aligned.sortedByCoord.out.bam E14.5/Aligned.sortedByCoord.out.bam E18.5/Aligned.sortedByCoord.out.bam...
The output from featureCounts looks kind of strange (to me at least) because it reports "Successfully assigned alignments" at 40% on average. That seems quite low to me, so I was wondering whether I am doing something incorrect.
Thanks for your helpful replies, Best Luca
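A quick way to see where the unassigned alignments end up is the .summary file that featureCounts writes next to the output file (here that would be counts_w_extraAttributes-primary.txt.summary). The following is a minimal sketch, assuming the usual tab-separated layout with a Status column followed by one column per BAM file. The function name assignment_breakdown and the numbers in EXAMPLE are made up for illustration and are not taken from this thread.

```python
import csv
import io


def assignment_breakdown(summary_text):
    """Return {sample: {status: percent}} from featureCounts .summary text.

    Percentages are relative to the sum of all status rows for that sample.
    Statuses with a zero count are left out to keep the report short.
    """
    rows = list(csv.reader(io.StringIO(summary_text), delimiter="\t"))
    header = rows[0]
    body = [r for r in rows[1:] if r]  # skip blank lines
    breakdown = {}
    for col, sample in enumerate(header[1:], start=1):
        counts = {row[0]: int(row[col]) for row in body}
        total = sum(counts.values())
        breakdown[sample] = {
            status: round(100.0 * n / total, 1) if total else 0.0
            for status, n in counts.items()
            if n > 0
        }
    return breakdown


# Hypothetical counts for illustration only, not real results.
EXAMPLE = (
    "Status\tE12.5/Aligned.sortedByCoord.out.bam\n"
    "Assigned\t400\n"
    "Unassigned_MultiMapping\t200\n"
    "Unassigned_NoFeatures\t350\n"
    "Unassigned_Ambiguity\t50\n"
)

if __name__ == "__main__":
    for sample, shares in assignment_breakdown(EXAMPLE).items():
        print(sample)
        # Largest category first
        for status, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
            print(f"  {status}: {pct}%")
```

On a real run, read the text of the .summary file from disk and pass it in instead of EXAMPLE. A large Unassigned_NoFeatures share points at annotation or strandedness, while a large Unassigned_MultiMapping share points at multi-mapped reads.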
Have you tried adding the -M option to see how the counts change? It is also important to keep in mind that while STAR may have been able to map a certain % of reads, reads will not be counted unless there is a feature defined for that region. If 3'-tag sequencing captures a particular strand (top/bottom), then you should specify that as well (-s option). By default featureCounts treats data as unstranded (-s 0).
Edit: I am going to edit this post since I have hit my post limit for the day.
If your kit was stranded then definitely use the right -s option (sounds like -s 1 is that option).
Dear genomax, Thanks for your reply. I tried adding the -M option and the % of Successfully assigned alignments increases on average by 15-20%. I have not specified any strand with the -s option but the kit is strand specific. I checked and the best results are with -s 1. Do you think I should include -M and also count the multi-mapping reads?
Thanks genomax! In relation to the multi-mapping reads, is there a "gold standard" procedure (i.e. to include them or exclude them)? Thanks Luca
Multi-mapped reads are generally excluded since you can't be sure of the gene/region they originated from. Some aligners allow you to place them at a random spot out of all the places that they map to.
There are alternative strategies (e.g. mapping instead of alignment in salmon, https://salmon.readthedocs.io/en/latest/) which can be used to deal with them. Since you have 3'-end specific data I am not sure you can use that option.
Thanks! I will follow your suggestion and ignore multi-mapped reads
Luca
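For anyone landing here later, the strand check discussed in this thread (running featureCounts once per -s setting and keeping the one with the highest assignment rate) can be sketched as below. This only builds the command lines. The helper name strand_test_commands and the output names counts_s0.txt, counts_s1.txt and counts_s2.txt are hypothetical; the flags are the ones from the original command plus -s.

```python
import shlex


def strand_test_commands(gtf, bams, threads=16, prefix="counts_s"):
    """Build one featureCounts command per strandedness setting.

    -s 0 = unstranded, -s 1 = stranded, -s 2 = reversely stranded.
    Output file names (prefix + setting + ".txt") are illustrative.
    """
    commands = []
    for strand in (0, 1, 2):
        argv = [
            "featureCounts",
            "-a", gtf,
            "-t", "exon",
            "-g", "gene_id",
            "--primary",
            "-s", str(strand),
            "-T", str(threads),
            "-o", f"{prefix}{strand}.txt",
            *bams,
        ]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    cmds = strand_test_commands(
        "Mus_musculus.GRCm38.95.gtf",
        ["E12.5/Aligned.sortedByCoord.out.bam"],
    )
    for argv in cmds:
        # Print a shell-safe version of each command for inspection
        print(shlex.join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) on a machine where featureCounts is installed. Comparing the "Successfully assigned alignments" line across the three runs shows which setting matches the library.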