7.0 years ago
XBria
Hi everyone,
I am working on RNA-seq data. STAR is mapping only to chromosome X, and the data are down-sampled, which is why the rate of uniquely mapped reads is close to 100%. (Paired-end reads: forward 75 bp, reverse 75 bp.)
Can I say that the mapping is improved by using these parameters?
--outFilterMatchNmin 20 --seedSearchStartLmax 30 --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --outFilterMismatchNoverLmax 9
* I tested 3 of the 12 samples, and all three gave exactly the same rate of uniquely mapped reads, 98.24%!
The following is the output without these parameters:
Started job on | Dec 06 09:32:03
Started mapping on | Dec 06 09:32:11
Finished on | Dec 06 09:33:52
Mapping speed, Million of reads per hour | 47.10
Number of input reads | 1321477
Average input read length | 152
UNIQUE READS:
Uniquely mapped reads number | 1281329
Uniquely mapped reads % | 96.96%
Average mapped length | 151.01
Number of splices: Total | 701323
Number of splices: Annotated (sjdb) | 693399
Number of splices: GT/AG | 697684
Number of splices: GC/AG | 1968
Number of splices: AT/AC | 703
Number of splices: Non-canonical | 968
Mismatch rate per base, % | 0.43%
Deletion rate per base | 0.01%
Deletion average length | 1.53
Insertion rate per base | 0.01%
Insertion average length | 1.29
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 16029
% of reads mapped to multiple loci | 1.21%
Number of reads mapped to too many loci | 286
% of reads mapped to too many loci | 0.02%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 1.80%
% of reads unmapped: other | 0.01%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
And this is the output with those parameters:
Started job on | Dec 06 09:45:39
Started mapping on | Dec 06 09:45:43
Finished on | Dec 06 09:48:04
Mapping speed, Million of reads per hour | 33.74
Number of input reads | 1321477
Average input read length | 152
UNIQUE READS:
Uniquely mapped reads number | 1298240
Uniquely mapped reads % | 98.24%
Average mapped length | 150.22
Number of splices: Total | 705133
Number of splices: Annotated (sjdb) | 696785
Number of splices: GT/AG | 701438
Number of splices: GC/AG | 1991
Number of splices: AT/AC | 710
Number of splices: Non-canonical | 994
Mismatch rate per base, % | 0.45%
Deletion rate per base | 0.01%
Deletion average length | 1.51
Insertion rate per base | 0.01%
Insertion average length | 1.29
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 22727
% of reads mapped to multiple loci | 1.72%
Number of reads mapped to too many loci | 406
% of reads mapped to too many loci | 0.03%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 0.00%
% of reads unmapped: other | 0.01%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
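For a side-by-side view of the two reports, here is a minimal sketch that parses the `name | value` lines and prints only the numeric fields that changed. The function names (`parse_star_log`, `as_number`, `compare_runs`) are hypothetical helpers, not part of STAR, and the embedded text is just a few lines copied from the two reports above; in practice you would read each run's `Log.final.out` file instead.

```python
def parse_star_log(text):
    """Parse 'name | value' lines from a STAR Log.final.out-style report."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip section headers such as 'UNIQUE READS:'
        key, value = (part.strip() for part in line.split("|", 1))
        stats[key] = value
    return stats


def as_number(value):
    """Turn '96.96%' or '1281329' into a float; return None for dates etc."""
    try:
        return float(value.rstrip("%"))
    except ValueError:
        return None


def compare_runs(default_text, tuned_text):
    """Return {field: tuned - default} for numeric fields that differ."""
    default, tuned = parse_star_log(default_text), parse_star_log(tuned_text)
    diffs = {}
    for key, raw in default.items():
        x, y = as_number(raw), as_number(tuned.get(key, ""))
        if x is not None and y is not None and x != y:
            diffs[key] = round(y - x, 2)
    return diffs


# A few lines copied from the two reports in this post.
default_log = """\
Number of input reads | 1321477
UNIQUE READS:
Uniquely mapped reads number | 1281329
Uniquely mapped reads % | 96.96%
Average mapped length | 151.01
Number of reads mapped to multiple loci | 16029
% of reads unmapped: too short | 1.80%
"""
tuned_log = """\
Number of input reads | 1321477
UNIQUE READS:
Uniquely mapped reads number | 1298240
Uniquely mapped reads % | 98.24%
Average mapped length | 150.22
Number of reads mapped to multiple loci | 22727
% of reads unmapped: too short | 0.00%
"""

changes = compare_runs(default_log, tuned_log)
for field, delta in changes.items():
    print(f"{field}: {delta:+}")
```

Fields that are identical in both runs (here, the number of input reads) are left out, so the output lists only what the parameter change actually moved.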
Okay, you increased the fraction of mapped reads, but how do you know those alignments are also "correct"? The percentage aligned, although informative, shouldn't be your only criterion for optimization.
I also wouldn't worry about a difference of only about 1.3 percentage points (96.96% vs. 98.24%).
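To see where the extra alignments come from, here is a small arithmetic sketch using only the counts posted in the two reports; the variable names are mine. It suggests that the gain is essentially the reads that were previously discarded as "too short", which the relaxed filters now let through, and that a good share of them land in the multi-mapping categories rather than as unique hits.

```python
# Counts copied from the two STAR reports in the question.
input_reads = 1321477

default_run = {"unique": 1281329, "multi": 16029, "too_many_loci": 286}
tuned_run = {"unique": 1298240, "multi": 22727, "too_many_loci": 406}

# Extra reads per category after relaxing the filters.
gained = {name: tuned_run[name] - default_run[name] for name in default_run}
rescued = sum(gained.values())

# Express the total gain as a percentage of all input reads, to compare
# with the "% of reads unmapped: too short" line (1.80% -> 0.00%).
rescued_pct = round(100 * rescued / input_reads, 2)

# Share of the newly aligned reads that ended up uniquely mapped.
unique_share = round(100 * gained["unique"] / rescued, 1)

print(gained)
print(f"newly aligned reads: {rescued} ({rescued_pct}% of input)")
print(f"of which uniquely mapped: {unique_share}%")
```

The total gain works out to about 1.80% of the input, matching the drop in the "too short" category. Together with the slightly shorter average mapped length (151.01 vs. 150.22) and the slightly higher mismatch rate (0.43% vs. 0.45%) in the second report, this is consistent with shorter, lower-confidence alignments being admitted; it does not by itself show that they are correct.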
Ref: Improving the mapping rate by aligner parameters
XBria: Creating new threads with variations of your earlier questions is not going to help much. At the high end of alignment %, you are splitting hairs. As @Wouter said above, whether those 2 additional % add anything meaningful to your analysis is questionable. You should be moving on to the actual differential expression analysis.
Dear Genomax, could you share a link that clearly explains the trade-offs? I could not find any comprehensive resources on this issue, and I need to learn more about it. Thank you for your understanding; I am a beginner in this field.
What trade-offs are you referring to?
I mean choosing the best among different optimization results (which may mean dropping some of the parameters I set). How can we clearly tell whether we are heading toward an optimum, and based on which criteria?
The aim is to improve mapping through parameter settings. How can I know whether the alignments are correct? So these results do not show a real improvement, is that right?