10 months ago
ju_ra
Hi guys,
I am analysing an mRNA-seq dataset from extracellular vesicles. Mapping with STAR gives decent results:
Number of input reads | 32143093
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 22862852
Uniquely mapped reads % | 71.13%
Average mapped length | 103.01
Number of splices: Total | 269874
Number of splices: Annotated (sjdb) | 125578
Number of splices: GT/AG | 235728
Number of splices: GC/AG | 4253
Number of splices: AT/AC | 243
Number of splices: Non-canonical | 29650
Mismatch rate per base, % | 0.42%
Deletion rate per base | 0.02%
Deletion average length | 1.79
Insertion rate per base | 0.01%
Insertion average length | 1.64
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 4125490
% of reads mapped to multiple loci | 12.83%
Number of reads mapped to too many loci | 575277
% of reads mapped to too many loci | 1.79%
UNMAPPED READS:
Number of reads unmapped: too many mismatches | 0
% of reads unmapped: too many mismatches | 0.00%
Number of reads unmapped: too short | 4016547
% of reads unmapped: too short | 12.50%
Number of reads unmapped: other | 562927
% of reads unmapped: other | 1.75%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
But after featureCounts, most of the reads are not assigned to any feature:
Assigned 1766684
Unassigned_Unmapped 5154751
Unassigned_Read_Type 0
Unassigned_Singleton 0
Unassigned_MappingQuality 0
Unassigned_Chimera 0
Unassigned_FragmentLength 0
Unassigned_Duplicate 0
Unassigned_MultiMapping 22295015
Unassigned_Secondary 0
Unassigned_NonSplit 0
Unassigned_NoFeatures 20977810
Unassigned_Overlapping_Length 0
Unassigned_Ambiguity 118358
I have never run into this issue before. Do you have any idea what a possible explanation could be?
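To put the featureCounts summary in proportion, here is a minimal sketch that turns the non-zero categories posted above into percentages. The function name `summary_fractions` is made up for illustration; the numbers are copied from the post. Note that the categories sum to roughly 50.3 million, more than the 32.1 million input reads, presumably because each alignment record of a multi-mapping read is tallied separately.

```python
# Non-zero categories copied from the featureCounts summary in the post.
summary = {
    "Assigned": 1766684,
    "Unassigned_Unmapped": 5154751,
    "Unassigned_MultiMapping": 22295015,
    "Unassigned_NoFeatures": 20977810,
    "Unassigned_Ambiguity": 118358,
}


def summary_fractions(counts):
    """Return each category as a percentage of all tallied records.

    Hypothetical helper, not part of featureCounts itself.
    """
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total_records = sum(summary.values())
fractions = summary_fractions(summary)

# Print categories from largest to smallest share.
for name, pct in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{name:<28}{pct:6.2f}%")
```

Under these numbers only about 3.5% of the tallied records are assigned, while multi-mapping and no-feature records together make up well over 80%.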
Please show us your STAR and featureCounts commands.
Both STAR and featureCounts were run on the Galaxy server with standard settings.
Do the sequence identifiers in the annotation and reference genome files match? A mismatch is generally a prime cause of issues with counting. Have you examined the alignments to make sure reads are piling up under exons?
Since you are working with extracellular vesicles, this could have unique/odd characteristics compared to plain RNA-seq, assuming something special was done to isolate the structures before making libraries.
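The identifier check suggested above can be sketched roughly as follows. This is only an illustration using the Python standard library: the helper names (`fasta_names`, `gtf_names`, `compare_names`) and the toy data are assumptions, not anything taken from the poster's Galaxy history. It compares the sequence names in a FASTA file with the names in column 1 of a GTF file.

```python
import io


def fasta_names(handle):
    """Collect sequence names (first word after '>') from a FASTA handle."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_names(handle):
    """Collect sequence names (column 1) from a GTF/GFF handle."""
    names = set()
    for line in handle:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_handle, gtf_handle):
    """Report which sequence names are shared and which appear on one side only."""
    genome = fasta_names(fasta_handle)
    annot = gtf_names(gtf_handle)
    return {
        "shared": genome & annot,
        "only_genome": genome - annot,
        "only_annotation": annot - genome,
    }


# Toy example: UCSC-style names in the genome, Ensembl-style in the annotation.
toy_fasta = io.StringIO(">chr1 toy sequence\nACGT\n>chr2\nACGT\n")
toy_gtf = io.StringIO('#!toy header\n1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n')

result = compare_names(toy_fasta, toy_gtf)
print(result)
```

With real data, the two `io.StringIO` objects would be replaced by open file handles for the genome FASTA and the annotation GTF. An empty `shared` set would point to a naming mismatch such as `chr1` versus `1`.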
Yes, they match (I test-ran an old sample with the same datasets and it worked out fine). Visualising the alignments, the reads do not seem to pile up only under annotated regions but also in (see image).
The main question for me is whether this is valid data from the extracellular vesicles or a technical error...
This does not look like RNAseq data.