Question

Comparison of log files: STAR vs Hisat

0

7.7 years ago

XBria ▴ 90

Hello,

I want to build a table comparing the STAR and Hisat outputs. How can I put them in one common table when they report different features? Example of STAR output:

                             Started job on |   Nov 29 12:19:49
                         Started mapping on |   Nov 29 12:19:53
                                Finished on |   Nov 29 12:22:38
   Mapping speed, Million of reads per hour |   28.83

                      Number of input reads |   1321477
                  Average input read length |   152
                                UNIQUE READS:
               Uniquely mapped reads number |   1281331
                    Uniquely mapped reads % |   96.96%
                      Average mapped length |   151.01
                   Number of splices: Total |   701317
        Number of splices: Annotated (sjdb) |   693405
                   Number of splices: GT/AG |   697677
                   Number of splices: GC/AG |   1967
                   Number of splices: AT/AC |   703
           Number of splices: Non-canonical |   970
                  Mismatch rate per base, % |   0.43%
                     Deletion rate per base |   0.01%
                    Deletion average length |   1.53
                    Insertion rate per base |   0.01%
                   Insertion average length |   1.29
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |   16028
         % of reads mapped to multiple loci |   1.21%
    Number of reads mapped to too many loci |   285
         % of reads mapped to too many loci |   0.02%
                              UNMAPPED READS:
   % of reads unmapped: too many mismatches |   0.00%
             % of reads unmapped: too short |   1.80%
                 % of reads unmapped: other |   0.01%
                              CHIMERIC READS:
                   Number of chimeric reads |   0
                        % of chimeric reads |   0.00%

Hisat output:

1321477 reads; of these:
  1321477 (100.00%) were paired; of these:
    108522 (8.21%) aligned concordantly 0 times
    1042850 (78.92%) aligned concordantly exactly 1 time
    170105 (12.87%) aligned concordantly >1 times
    ----
    108522 pairs aligned concordantly 0 times; of these:
      4952 (4.56%) aligned discordantly 1 time
    ----
    103570 pairs aligned 0 times concordantly or discordantly; of these:
      207140 mates make up the pairs; of these:
        99460 (48.02%) aligned 0 times
        82638 (39.89%) aligned exactly 1 time
        25042 (12.09%) aligned >1 times
96.24% overall alignment rate

Can anyone please help me?
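One possible way to get both logs into a common table is to parse each into a dictionary and then pick the metrics that have a rough counterpart in the other tool. The sketch below is only an illustration: the function names (`parse_star`, `parse_hisat`, `common_table`) and the choice of rows are assumptions, the log strings are excerpts of the output posted above, and the pairing treats Hisat's "aligned concordantly exactly 1 time" pairs as the nearest equivalent of STAR's uniquely mapped reads, which is not an exact match.

```python
import re

# Excerpts of the two logs posted in the question.
STAR_LOG = """
                      Number of input reads |   1321477
                                UNIQUE READS:
               Uniquely mapped reads number |   1281331
                    Uniquely mapped reads % |   96.96%
    Number of reads mapped to multiple loci |   16028
         % of reads mapped to multiple loci |   1.21%
"""

HISAT_LOG = """
1321477 reads; of these:
  1321477 (100.00%) were paired; of these:
    1042850 (78.92%) aligned concordantly exactly 1 time
    170105 (12.87%) aligned concordantly >1 times
96.24% overall alignment rate
"""


def parse_star(text):
    """Return {metric name: value string} for every 'key | value' line."""
    stats = {}
    for line in text.splitlines():
        if "|" in line:
            key, value = line.split("|", 1)
            stats[key.strip()] = value.strip()
    return stats


def parse_hisat(text):
    """Extract a few headline numbers from a paired-end summary."""
    patterns = {
        "total": r"(\d+) reads; of these",
        "unique": r"(\d+) \([\d.]+%\) aligned concordantly exactly 1 time",
        "multi": r"(\d+) \([\d.]+%\) aligned concordantly >1 times",
        "overall_pct": r"([\d.]+)% overall alignment rate",
    }
    return {name: re.search(p, text).group(1) for name, p in patterns.items()}


def pct(count, total):
    """Format count/total as a percentage with two decimals."""
    return f"{100 * int(count) / int(total):.2f}%"


def common_table(star, hisat):
    """Rows of (metric, STAR value, Hisat value); the pairing is approximate."""
    total = hisat["total"]
    return [
        ("Input reads/pairs", star["Number of input reads"], total),
        ("Uniquely aligned", star["Uniquely mapped reads number"], hisat["unique"]),
        ("Uniquely aligned %", star["Uniquely mapped reads %"], pct(hisat["unique"], total)),
        ("Multi-mapped", star["Number of reads mapped to multiple loci"], hisat["multi"]),
        ("Multi-mapped %", star["% of reads mapped to multiple loci"], pct(hisat["multi"], total)),
    ]


rows = common_table(parse_star(STAR_LOG), parse_hisat(HISAT_LOG))
print(f"{'Metric':<20}{'STAR':>10}{'Hisat':>10}")
for metric, star_value, hisat_value in rows:
    print(f"{metric:<20}{star_value:>10}{hisat_value:>10}")
```

Metrics that exist in only one log (splice counts, indel rates, discordant pairs) have no counterpart and would need their own rows with an empty cell for the other tool.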

RNA-Seq • 4.3k views

ADD COMMENT • link 7.7 years ago by XBria ▴ 90

1

At face value the STAR results look so much better that I don't think I'd bother making a table of things. If the STAR results are correct then that's the only thing that matters (this would be what every published comparison I've seen has indicated).

ADD REPLY • link 7.7 years ago by Devon Ryan 105k

0

Could you please explain how you can tell that the STAR results look much better?

ADD REPLY • link 7.7 years ago by XBria ▴ 90

0

STAR has a 97% unique alignment rate, hisat2 is showing closer to 80% for that.
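For reference, both figures can be recomputed from the counts in the posted logs. This is a minimal sketch; the variable names are illustrative, and it assumes that Hisat's "aligned concordantly exactly 1 time" line is the closest counterpart to STAR's uniquely mapped reads (it leaves out pairs that aligned discordantly or as single mates).

```python
# Counts copied from the two logs in the question.
star_input, star_unique = 1321477, 1281331
hisat_pairs, hisat_unique_pairs = 1321477, 1042850

star_unique_rate = 100 * star_unique / star_input
hisat_unique_rate = 100 * hisat_unique_pairs / hisat_pairs

print(f"STAR unique:  {star_unique_rate:.2f}%")
print(f"Hisat unique: {hisat_unique_rate:.2f}%")
```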

ADD REPLY • link 7.7 years ago by Devon Ryan 105k

0

What the article says is different from what you mention.

"Because we prepared the data for this protocol by aligning all reads in the initial data sets to the whole genome and then extracting only those reads that aligned to chromosome X and their mates, we expect a mapping rate close to 100% for the reads in our reduced data set."

I think the overall alignment rate is 96.96% for STAR and 96.24% for Hisat. I hope I am right.

ADD REPLY • link 7.7 years ago by XBria ▴ 90

1

The article doesn't mention anything remotely related to what you're talking about. As I wrote, STAR has a unique alignment rate of ~97% and hisat2 has a unique alignment rate of ~80%. hisat2 does not have a unique alignment rate of 96% (unsurprisingly, STAR essentially always outperforms hisat2 in comparisons).
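The 96.24% in the hisat2 summary is an overall rate, not a unique one. A short sketch of how it appears to follow from the posted counts: it is the share of mates that aligned at all, including multi-mappers. The STAR figure computed here (unique plus multiple loci) is an assumed rough counterpart for illustration, not a number STAR prints itself.

```python
# Counts copied from the two logs in the question.
pairs = 1321477
hisat_mates_unaligned = 99460  # "aligned 0 times" among leftover mates

# Every pair contributes two mates; only the listed mates failed to align.
hisat_overall = 100 * (1 - hisat_mates_unaligned / (2 * pairs))

# Assumed STAR counterpart: uniquely mapped plus reads mapped to multiple loci.
star_unique, star_multi = 1281331, 16028
star_mapped = 100 * (star_unique + star_multi) / pairs

print(f"Hisat overall: {hisat_overall:.2f}%")
print(f"STAR unique + multi: {star_mapped:.2f}%")
```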

ADD REPLY • link 7.7 years ago by Devon Ryan 105k

0

Are you sure they were aligned to the same exact reference in comparable ways? Hisat is counting 10x as many reads aligning to multiple loci.
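The rough factor of ten can be checked from the posted counts. This sketch compares Hisat's pairs that "aligned concordantly >1 times" with STAR's reads "mapped to multiple loci"; treating those two lines as equivalent is an assumption.

```python
# Multi-mapping counts copied from the two logs in the question.
hisat_multi_pairs = 170105  # aligned concordantly >1 times
star_multi_reads = 16028    # mapped to multiple loci

ratio = hisat_multi_pairs / star_multi_reads
print(f"Hisat reports about {ratio:.1f}x as many multi-mappers as STAR")
```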

ADD REPLY • link 7.7 years ago by swbarnes2 15k