16 days ago
rajdeepboral00
Should I include multi-mapping reads in my RNA-seq data analysis? My percentage of successfully assigned alignments is as low as 45.1%. I used this command:
featureCounts -p -T 8 -a /data/sata_data/home/rajdeep/GRCm39/gencode.vM36.chr_patch_hapl_scaff.annotation.gtf -s 2 -O --countReadPairs -o counts_CC2.txt sorted_39aligned_CC1.bam
Is there any way I can increase the number of assigned reads by changing any parameter in featureCounts?
|| Process BAM file sorted_CC2.bam...
|| Strand specific : reversely stranded
|| Paired-end reads are included.
|| Total alignments : 76891146
|| Successfully assigned alignments : 34688363 (45.1%)
Assigned 34688363
Unassigned_Unmapped 1274794
Unassigned_Read_Type 0
Unassigned_Singleton 0
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 12674576
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 28253413
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 0
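To see where the unassigned alignments go, the summary above can be converted to percentages. This is only a minimal Python sketch: the counts are copied from this post, and the function names parse_summary and percentages are made up for illustration, not part of featureCounts.

```python
# Counts copied from the featureCounts summary quoted in this post.
SUMMARY = """\
Assigned 34688363
Unassigned_Unmapped 1274794
Unassigned_Read_Type 0
Unassigned_Singleton 0
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 12674576
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 28253413
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 0
"""


def parse_summary(text):
    """Return {category: count} from whitespace-separated summary lines."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and any header line whose second field is not a number.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        counts[parts[0]] = int(parts[1])
    return counts


def percentages(counts):
    """Return each category as a percentage of the total number of alignments."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    counts = parse_summary(SUMMARY)
    pct = percentages(counts)
    # Print non-zero categories, largest first.
    for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        if n:
            print(f"{name:<28}{n:>12,}{pct[name]:>7.1f}%")
```

With these numbers the categories add up to the 76891146 total alignments reported in the log. Unassigned_NoFeatures is the largest unassigned category at about 36.7%, ahead of Unassigned_MultiMapping at about 16.5%, so counting multi-mapping reads could recover at most the smaller of the two groups.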
You don't gain anything by counting them, since there is no reliable way to tell where they actually map. Still, you have 35M successful assignments, which is good in most situations. See how the downstream analysis goes first. Also, is there any reason you are using the GTF with the haplotypes? That is unusual.
Which GTF file should I use then? I don't know much about this. I found the GTF file on GENCODE. Kindly reply.
How about the one that says "This is the main annotation file for most users"? That is the Basic Annotation at https://www.gencodegenes.org/human/
It looks like this is mouse data, but your point stands. Use: https://www.gencodegenes.org/mouse/
This is the result when I used the basic GTF file that you suggested. The number of assigned reads is still quite low, even lower than when I used the previous GTF file. Which one should I use then?