Question

How to interpret Log.final.out in STAR

0

6.8 years ago

XBria ▴ 90

Hi,

Can anyone please tell me which percentages in STAR's Log.final.out file show the precision and recall of the mapped reads? Is it the number of annotated splice junctions divided by the total number of splice junctions?

How about reads mapped correctly and mapped incorrectly?

Thanks

RNA-Seq • 3.8k views

ADD COMMENT • link updated 6.8 years ago by h.mon 35k • written 6.8 years ago by XBria ▴ 90

score 2 · Accepted Answer · 2018-03-02

6.8 years ago

h.mon 35k

There is no such information in Log.final.out. To calculate precision and recall, STAR would have to know the true location of each read, map the reads, and check whether each one was mapped correctly. Only with this information could it calculate precision and recall.

STAR (or any other mapper) doesn't know anything about the true position of the reads. STAR tries to estimate the true mapping location of each read, but it doesn't know whether it got it right.
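
As a rough illustration of the calculation described above, here is a minimal Python sketch. It assumes a simulated test set where the true origin of every read is known. The function name, the dictionary layout and the `tolerance` parameter are hypothetical, the toy reads are made up for the example, and the definitions used (precision = correct / mapped, recall = correct / all reads) are one common convention rather than anything STAR reports.

```python
def precision_recall(truth, mapped, tolerance=0):
    """Compare aligner placements against known read origins.

    truth:  dict of read name -> (chromosome, position) for every simulated read.
    mapped: dict of read name -> (chromosome, position) for reads the aligner
            placed; unmapped reads are simply absent.
    tolerance: allowed distance in bases between true and reported position.
    """
    correct = 0
    for read, (chrom, pos) in mapped.items():
        true_location = truth.get(read)
        if true_location is None:
            continue  # read not in the truth set, so it cannot count as correct
        true_chrom, true_pos = true_location
        if chrom == true_chrom and abs(pos - true_pos) <= tolerance:
            correct += 1
    # Precision: fraction of mapped reads that were placed correctly.
    precision = correct / len(mapped) if mapped else 0.0
    # Recall: fraction of all reads that were placed correctly.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Made-up toy data: four simulated reads, three of them mapped, one mapped wrongly.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 500),
         "r3": ("chr2", 40), "r4": ("chr2", 900)}
mapped = {"r1": ("chr1", 100), "r2": ("chr1", 750), "r3": ("chr2", 40)}

precision, recall = precision_recall(truth, mapped)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In the toy data two of the three mapped reads are correct and four reads exist in total, so precision is 2/3 and recall is 2/4. Without a truth table like `truth`, neither number can be computed, which is why the log file cannot contain them.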

ADD COMMENT • link 6.8 years ago by h.mon 35k

0

So, how can I tell, using only the STAR output files, whether a mapping command with optimized parameters is the best among the options?

ADD REPLY • link 6.8 years ago by XBria ▴ 90

1

You could:

1) Find or create a test set with a known true location for each read, and perform your parameter tuning on this set; you can then calculate precision, recall, and so forth. However, parameter optimizations will likely be data-dependent, at least to some extent, and there is no guarantee that the parameters optimized on the test set will be the best for your real data.

2) Perform downstream analysis on the several alignments and compare the quality of the results, or try to find out which ones "make more sense" or are "biologically more relevant". If you are not careful, though, this will result in an intricate form of p-hacking.

3) Explore the mappings visually with IGV or another genome browser. You can open several BAM files simultaneously in IGV (better to convert them to bigWig first, however) and look for discordant regions to visually assess the mapping.

4) Stop overthinking (Improving the mapping rate by aligner parameters, STAR outputs interpretation, Parameter optimization STAR) and use the default parameters. In these other threads, you have been told that your mapping rate seems just fine and is similar to the mapping rate of good datasets.

Probably (4) is the best suggestion, and it is not the first time you have heard it.
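
If you still want to line up the summary statistics of several runs, the Log.final.out files can be tabulated with a few lines of Python. This is only a sketch: it assumes each statistic is written on one line as `name | value`, and the function names and the paths in the usage note are hypothetical. It compares mapping rates and similar summaries only; it says nothing about whether the reads were mapped correctly.

```python
def parse_star_log(text):
    """Turn the text of a Log.final.out file into a dict of name -> value strings.

    Assumes each statistic sits on one line as 'name | value'. Lines without
    a '|' (section headers, blank lines) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue
        name, value = line.split("|", 1)  # split on the first '|' only
        stats[name.strip()] = value.strip()
    return stats


def as_percent(value):
    """Convert a string such as '90.00%' to the float 90.0."""
    return float(value.rstrip("%"))


def compare_runs(paths, names):
    """Collect the chosen statistics from several runs.

    paths: dict of run label -> path to that run's Log.final.out.
    names: statistic names to pull out of each file.
    Returns a dict of run label -> {statistic name: value or None}.
    """
    table = {}
    for label, path in paths.items():
        with open(path) as handle:
            stats = parse_star_log(handle.read())
        table[label] = {name: stats.get(name) for name in names}
    return table
```

A call such as `compare_runs({"default": "default/Log.final.out", "tuned": "tuned/Log.final.out"}, ["Uniquely mapped reads %"])` (both paths hypothetical) would return the chosen statistic for each run side by side. A higher uniquely mapped percentage in one run does not by itself mean its alignments are more accurate.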

ADD REPLY • link 6.8 years ago by h.mon 35k

0

I have to write about the optimization process. That is why I need to perform the optimization, even though the default parameter values seem to be the best.

Thanks

ADD REPLY • link 6.8 years ago by XBria ▴ 90

0

If you had said this in your first question, you probably would have gotten better answers.

ADD REPLY • link 6.8 years ago by swbarnes2 14k