I ran the RNA-seq aligner HISAT2 on 75 bp paired-end reads (gzipped FASTQ files) like this:
hisat2 \
-q \
--phred33 \
--n-ceil L,0,0.15 \
--pen-cansplice 0 \
--pen-noncansplice 12 \
--pen-canintronlen G,-8,1 \
--pen-noncanintronlen G,-8,1 \
--min-intronlen 20 \
--max-intronlen 500000 \
--known-splicesite-infile Homo_sapiens.GRCh38.splicesites.tsv \
--novel-splicesite-outfile out_HISAT/38.89/pass1/ERR188083/splicesites.novel.tsv \
--rna-strandness FR \
--mp 6,2 \
--sp 2,1 \
--np 1 \
--rdg 5,3 \
--rfg 5,3 \
--score-min L,0.0,-0.2 \
-k 5 \
--fr \
--summary-file out_HISAT/38.89/pass1/ERR188083/summary.txt \
--new-summary \
-p 8 \
--mm \
--seed 0 \
--remove-chrname \
-x Homo_sapiens.GRCh38 \
-1 ../../../data/geuv/fastq/ERR188083_1.fastq.gz \
-2 ../../../data/geuv/fastq/ERR188083_2.fastq.gz \
-S out_HISAT/38.89/pass1/ERR188083/ERR188083.sam
But I get a very poor alignment:
HISAT2 summary stats:
Total pairs: 26025190
Aligned concordantly or discordantly 0 time: 24148025 (92.79%)
Aligned concordantly 1 time: 1178218 (4.53%)
Aligned concordantly >1 times: 686294 (2.64%)
Aligned discordantly 1 time: 12653 (0.05%)
Total unpaired reads: 48296050
Aligned 0 time: 47600213 (98.56%)
Aligned 1 time: 505745 (1.05%)
Aligned >1 times: 190092 (0.39%)
Overall alignment rate: 8.55%
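As a sanity check, the overall rate can be reproduced from the counts above. This is a minimal sketch, assuming the rate is counted per mate: each aligned pair contributes two mates, and mates from unaligned pairs that aligned on their own contribute one each. The variable names are only illustrative.

```shell
# Counts copied from the HISAT2 summary above.
total_pairs=26025190
conc_once=1178218
conc_multi=686294
disc_once=12653
unpaired_once=505745
unpaired_multi=190092

# Aligned pairs count twice (two mates each); singly aligned mates count once.
aligned_mates=$(( 2 * (conc_once + conc_multi + disc_once) + unpaired_once + unpaired_multi ))
total_mates=$(( 2 * total_pairs ))

# Shell arithmetic is integer-only, so let awk do the percentage.
rate=$(awk -v a="$aligned_mates" -v t="$total_mates" 'BEGIN { printf "%.2f", 100 * a / t }')
echo "Overall alignment rate: ${rate}%"
```

The result agrees with the reported figure, so the summary is internally consistent and the low rate is not a reporting artifact.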
I was expecting it to be better than STAR, but it seems that's not the case. What is currently considered the best RNAseq spliced aligner? The 2013 review by Engström is a bit dated now. Based on that review I would choose STAR. Is that still the consensus?
If you are going to perform differential expression analysis, as WouterDeCoster suggested, Salmon or Kallisto will be helpful if a reference transcriptome is available. You can also keep using HISAT2 with all default settings, although adding --dta (--downstream-transcriptome-assembly) may be helpful.
As per the manual, HISAT2 provides options for transcript assemblers (e.g., StringTie and Cufflinks) to work better with the alignment from HISAT2 (see options such as --dta and --dta-cufflinks).
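A run with default scoring plus --dta might look like the sketch below. It reuses the index and FASTQ paths from the question; the output directory name is hypothetical, and the DRY_RUN switch is only a convenience of this sketch, not a HISAT2 feature. By default the script prints the command instead of running it.

```shell
# Index and read paths are taken from the question; out_dir is a hypothetical name.
index=Homo_sapiens.GRCh38
r1=../../../data/geuv/fastq/ERR188083_1.fastq.gz
r2=../../../data/geuv/fastq/ERR188083_2.fastq.gz
out_dir=out_HISAT/defaults/ERR188083

# No scoring or penalty overrides; --dta tailors the alignments for
# transcript assemblers such as StringTie.
set -- hisat2 --dta -p 8 \
    -x "$index" -1 "$r1" -2 "$r2" \
    --summary-file "$out_dir/summary.txt" --new-summary \
    -S "$out_dir/ERR188083.sam"
cmd="$*"

if [ "${DRY_RUN:-1}" = "0" ]; then
    # Actually run HISAT2 (requires hisat2 on PATH and the index files).
    mkdir -p "$out_dir"
    "$@"
else
    # Default: only show the command that would be run.
    echo "$cmd"
fi
```

Comparing the summary file from such a run with the one in the question would show whether the explicit options are responsible for the low alignment rate.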
There are many reviews comparing STAR with HISAT2, the latest being https://www.nature.com/articles/s41467-017-00050-4. Although it covers all kinds of downstream analysis workflows possible with RNA-seq data, it gives a very nice comparison of HISAT2 and STAR based on the kind of analysis you would like to perform after alignment.
Thanks for pointing me to that very recent review, @prasundutta87! I'll have a look. I think I'll stick with STAR, because it gives me a vastly superior alignment rate compared to HISAT2, i.e. well above 90%.