Hey Everyone
I am mapping reads to the mouse genome using HISAT2 with a graph index (built from known SNPs and InDels). The problem is that although HISAT2 finds alignments over the SNPs in the graph, it finds no alignments over the InDels.
I am using one chromosome, for which the index contains around 288K SNPs and 36K InDels (counted using hisat2-inspect). I know there are reads in my test data that align over InDels. I tried versions 2.0.4 and 2.0.5.
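For anyone wanting to reproduce the SNP/InDel counts, here is a minimal Python sketch. It assumes the variant listing from hisat2-inspect is tab-separated in the HISAT2 SNP-file layout (ID, type, chromosome, position, value), with the type column being single, deletion or insertion; check this against your own output first. The example records are made up for illustration.

```python
from collections import Counter


def count_variant_types(lines):
    """Count records per variant type in HISAT2 SNP-format text.

    Assumed layout (tab-separated): id, type, chromosome, position, value.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            # Skip blank or malformed lines rather than guessing.
            continue
        counts[fields[1]] += 1
    return counts


# Made-up records for illustration only (not real mouse variants).
example = [
    "var1\tsingle\tchr19\t3000123\tT\n",
    "var2\tdeletion\tchr19\t3000456\t2\n",
    "var3\tinsertion\tchr19\t3000789\tAC\n",
    "var4\tsingle\tchr19\t3001000\tG\n",
]

counts = count_variant_types(example)
n_snps = counts["single"]
n_indels = counts["deletion"] + counts["insertion"]
print("SNPs:", n_snps, "InDels:", n_indels)
```

To use it on a real index, redirect the hisat2-inspect listing to a file (the file name is up to you) and pass the open file handle to count_variant_types instead of the example list.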
I posted this issue on GitHub and also on the HISAT2 mailing list, but haven't heard from the authors so far. Has anyone come across this issue, and can it be solved by changing one of the mapping parameters?
Thanks, Vivek
Related post: Stranger Things: unexpected limitations of popular tools
How long are these indels? And I'm a bit confused about what you are trying to do. Are you processing synthetic RNA-seq data with artificial indels? Also, I don't see anything in the HISAT2 manual about hisat2-inspect reporting indels, nor anything on that page about indels except for a link to Samtools.
The InDels are 1 to 20 bp long, mostly at the shorter end of this range. I am simply trying to test whether a graph index built from known SNPs and InDels will give me better alignments for my data. It's a real RNA-seq data set from a mouse strain with known SNPs and InDels. I initially mapped using another approach, then extracted the reads from one chromosome to test mapping with HISAT2.
Yes, the HISAT2 manual doesn't say much about InDels, but it does point out that alignment should work with short InDels in the index. When hisat2-inspect is run with the --snp option, it reports SNPs as well as InDels from the index. This study pointed out that HISAT2 isn't the best aligner when it comes to InDel detection. However, since I already have the InDels in the graph index, I expect it to align the reads over them.
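One way to check whether any alignments actually span an InDel is to look for I or D operations in the CIGAR strings of the SAM output. Below is a minimal Python sketch that assumes plain-text SAM input. The function names and example records are made up for illustration, and N operations (splice junctions) are deliberately not counted as InDels.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_ops(cigar):
    """Return (length, op) pairs for insertions and deletions only."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op in "ID"]


def count_indel_alignments(sam_lines):
    """Return (mapped alignments, mapped alignments with an I or D)."""
    total = 0
    with_indel = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag, cigar = int(fields[1]), fields[5]
        if flag & 4 or cigar == "*":
            continue  # unmapped read
        total += 1
        if indel_ops(cigar):
            with_indel += 1
    return total, with_indel


# Made-up SAM records for illustration only.
example = [
    "@SQ\tSN:chr19\tLN:61431566",
    "r1\t0\tchr19\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr19\t200\t60\t20M2D30M\t*\t0\t0\t*\t*",
    "r3\t16\tchr19\t300\t60\t10M1000N25M3I15M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

total, with_indel = count_indel_alignments(example)
print(total, "mapped alignments,", with_indel, "with an InDel")
```

If the second number stays at zero on real data while reads are known to cover InDel sites, that supports the observation that the InDels in the index are not being used.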