Hi,
Can anyone please tell me which percentages in the STAR Log.final.out file show the precision and recall of the reads? Is it the number of annotated splice junctions divided by the total number of splice junctions?
What about reads mapped correctly and mapped incorrectly?
Thanks
So, using only the files that STAR produces, how can I tell whether a mapping command with optimized parameters is the best one among the options?
You could:
1) Find or create a test set with a known true location for each read, and perform your parameter tuning on this set. You can then calculate precision, recall, and so forth. However, parameter optimizations will likely be data-dependent, at least to some extent, and there is no guarantee that the parameters optimized on the test set will be the best for your real data.
2) Perform downstream analysis on the several alignments and compare the quality of the results, or try to find out which ones "make more sense" or are "biologically more relevant". If you are not careful, though, this will result in an intricate form of p-hacking.
3) Explore the mappings visually with IGV or another genome browser. You can open several BAM files simultaneously in IGV (better to convert them to bigWig, however) and look for discordant regions to assess the mapping visually.
4) Stop overthinking (see Improving the mapping rate by aligner parameters, STAR outputs interpretation, Parameter optimization STAR) and use the default parameters. In those other threads, you were told that your mapping rate seems just fine and is similar to the mapping rate of good datasets.
Probably (4) is the best suggestion, and it is not the first time you have heard it.
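For suggestion (1), the calculation itself is short once you have the true positions. Below is a minimal sketch in Python, not a definitive method. It assumes you have already extracted two dictionaries yourself: `truth` (read name to true chromosome and position, for every simulated read) and `aligned` (read name to the position the aligner reported, with unmapped reads simply absent). The function name, the dictionary layout and the 5 bp `tolerance` are all hypothetical choices for illustration; nothing here is produced by STAR directly.

```python
def mapping_precision_recall(truth, aligned, tolerance=5):
    """Precision and recall of read placement against a known truth set.

    truth:   read name -> (chrom, pos) for every read in the test set
    aligned: read name -> (chrom, pos) for reads the aligner placed
    tolerance: allowed distance in bp between true and reported position
               (an assumed value; pick one that suits your simulator)
    """
    correct = 0
    for read, (chrom, pos) in aligned.items():
        true_chrom, true_pos = truth[read]
        if chrom == true_chrom and abs(pos - true_pos) <= tolerance:
            correct += 1
    # Precision: of the reads that were mapped, how many are in the right place.
    precision = correct / len(aligned) if aligned else 0.0
    # Recall: of all reads in the test set, how many were mapped correctly.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy data: four simulated reads, three of them mapped, one of those misplaced.
truth = {
    "r1": ("chr1", 100),
    "r2": ("chr1", 500),
    "r3": ("chr2", 40),
    "r4": ("chr3", 900),
}
aligned = {
    "r1": ("chr1", 100),   # exact match
    "r2": ("chr1", 502),   # within tolerance
    "r3": ("chr5", 40),    # wrong chromosome
}

precision, recall = mapping_precision_recall(truth, aligned)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

You would run this once per parameter set on the same test reads and compare the pairs of numbers. For spliced reads you may also want to compare the splice junctions, which this sketch ignores.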
I have to write about the optimization process. That is why I need to perform the optimization, even though the default parameter values seem to be the best.
Thanks
If you had said this up front, in your first question, you would probably have gotten better answers.