It's my first time using HISAT2, and the options have me confused. How can I run HISAT2 so that more reads map by relaxing the parameters? Specifically, which options increase the number of mismatches, etc. that are allowed, and how do I use them?
left reads: min. length=75, max. length=10506, 15483 kept reads (0 discarded)
In case someone is searching this topic and coming up with nothing: you can increase the number of reads reported as aligned by lowering the minimum alignment score from the default (L,0,-0.2) to e.g. (L,0,-0.6) with --score-min L,0,-0.6. The function is linear in read length (here 0 + -0.6 * read length), so a more negative slope tolerates more mismatch and gap penalties per read. Reads that are close enough to the reference to pass the new threshold will then be reported as mapped.
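To make the effect of --score-min concrete, here is a minimal sketch that evaluates the linear score function for a given read length and estimates how many full-penalty mismatches fit under the threshold. It assumes a maximum mismatch penalty of 6 (the --mp default as I understand it from the HISAT2 manual) and ignores gaps, soft-clipping and quality-dependent penalties, so treat the numbers as a rough guide only. The index and file names in the command are hypothetical placeholders.

```python
import math
import shlex

# Assumption: maximum mismatch penalty of 6 (believed to be the HISAT2 --mp default MX).
MAX_MISMATCH_PENALTY = 6


def min_score(read_len, spec):
    """Evaluate a linear --score-min spec such as 'L,0,-0.6' for one read length."""
    kind, intercept, slope = spec.split(",")
    if kind != "L":
        raise ValueError("this sketch only handles linear (L) functions")
    return float(intercept) + float(slope) * read_len


def max_mismatches(read_len, spec, penalty=MAX_MISMATCH_PENALTY):
    """Rough count of full-penalty mismatches allowed (no gaps, no soft-clipping)."""
    return int(abs(min_score(read_len, spec)) // penalty)


for spec in ("L,0,-0.2", "L,0,-0.6"):
    print(spec, min_score(100, spec), max_mismatches(100, spec))

# Hypothetical index and file names; only the --score-min value comes from the thread.
hisat2_cmd = [
    "hisat2",
    "-x", "genome_index",
    "-U", "reads.fastq",
    "--score-min", "L,0,-0.6",
    "-S", "aligned.sam",
]
print(shlex.join(hisat2_cmd))
```

For a 100 bp read the threshold moves from about -20 to about -60, which is why noticeably more divergent reads pass. A looser threshold also lets through more spurious alignments, so it is worth checking mapping quality afterwards.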
From your other thread (Tophat2 Error gzip pipe broken? zipped file empty?), I see a pattern that is unusual for an NGS dataset, which usually consists of millions of short reads: the read lengths range from 75 to 10506, and only 15483 reads were kept.
So this clearly is not raw Illumina, Ion Torrent or 454 data. Are you still aligning the same dataset? What is the source of this data? Are these assembled transcripts, or PacBio reads?
Oops, sorry I didn't clarify this before. This is Oxford Nanopore Sequencing data.
Go with minimap2, which seems to be the current go-to tool for Nanopore data. It is optionally splice-aware.
... that's quite crucial to mention and may explain some results from other aligners :-)
I agree with ATpoint, use minimap2.
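For anyone landing here, a spliced minimap2 run could look roughly like the sketch below. It only builds the command line; the file names are hypothetical, and it assumes the reads are cDNA or direct RNA being aligned to a genome (a guess based on TopHat2 and HISAT2 having been tried first). The -ax splice preset, and -uf -k14 for direct RNA, follow the minimap2 README as I remember it, so check them against your installed version.

```python
import shlex


def build_minimap2_cmd(reference, reads, out_sam, direct_rna=False):
    """Build a spliced-alignment minimap2 command as an argument list."""
    cmd = ["minimap2", "-a", "-x", "splice"]  # -a: SAM output, -x splice: spliced preset
    if direct_rna:
        # Suggested for Nanopore direct RNA: forward-strand transcripts, smaller k-mer.
        cmd += ["-uf", "-k14"]
    cmd += ["-o", out_sam, reference, reads]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_minimap2_cmd("genome.fa", "nanopore_reads.fastq", "aligned.sam")
print(shlex.join(cmd))

# To actually run it (requires minimap2 on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```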
As a general rule of thumb for the majority of the tools, if you don't know which parameters to set: stick to the defaults and trust the judgement of the developers.
Hi,
Yes, I understand that. However, I am using the defaults and am not getting what I expected. I just switched from TopHat2 to HISAT2, and I know that HISAT2 is much more sensitive. Does HISAT2 have options similar to TopHat2's --read-mismatches, --read-edit-dist, and --read-gap-length?
Thanks!