Hi,
I am trying to create a HISAT2 index with annotation for the Rattus norvegicus (rat) genome, which I downloaded from Ensembl release 94.
I am currently using 220GB of memory with 16 cores, which I assumed would be adequate. However, I cannot create the HISAT2 index; the build prints the following:
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:16
Time to read SNPs and splice sites: 00:00:04
is not reverse-deterministic, so reverse-determinize...
Ran out of memory; automatically trying more memory-economical parameters.
is not reverse-deterministic, so reverse-determinize...
and eventually fails with
Could not find approrpiate bmax/dcv settings for building this index.
Switching to a packed string representation.
Total time for call to driver() for forward index: 08:45:56
The HISAT2 website does have a rat index, but it does not include the annotation.
I am using the command
hisat2-build -p 16 --exon ${EXON} --ss ${SPLICE} ${FASTA_File} ${BASE_NAME}
to create the index.
Any ideas are greatly appreciated.
Thanks
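For readers reproducing this: the files passed to --exon and --ss are usually derived from the GTF annotation with the two helper scripts that ship with HISAT2. The following is a minimal sketch, not the poster's actual script. Every file name, the thread count, and the function name hisat2_index_plan are placeholder assumptions. It only prints the commands so they can be reviewed before anything is run.

```shell
# Sketch only: all file names below are hypothetical placeholders.
GTF="${GTF:-annotation.gtf}"      # Ensembl GTF (uncompressed)
FASTA="${FASTA:-genome.fa}"       # genome FASTA (uncompressed)
BASE="${BASE:-genome_tran}"       # basename for the index files
THREADS="${THREADS:-16}"

# Print the three steps: splice sites, exons, then the index build.
hisat2_index_plan() {
  cat <<EOF
hisat2_extract_splice_sites.py ${GTF} > ${BASE}.ss
hisat2_extract_exons.py ${GTF} > ${BASE}.exon
hisat2-build -p ${THREADS} --exon ${BASE}.exon --ss ${BASE}.ss ${FASTA} ${BASE}
EOF
}

hisat2_index_plan
```

Piping the output into a shell (hisat2_index_plan | sh) would run the steps. Only the last step is expected to need a lot of memory, because building with --exon and --ss constructs a graph index rather than a plain one.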
Do you run on a cluster, and if so, what is the exact command, including the header lines for the scheduler? Did you request the entire memory of the node you are on?
No, the node has 256GB of memory and I only asked for 220GB of RAM. I never ask for the full amount, since the node needs some memory to work with. On previous occasions I have only asked for 200GB.
In previous attempts at the same index I asked for 300GB of RAM on a node with 512GB of memory, which didn't work either.
If you share the links to the necessary files, I can try to build it on a 3TB node if that helps you.
Yes, that might help me. Thank you very much for the help.
I am using the files hosted by Ensembl and HISAT2 version 2.1.0 to build the index. The following is the SLURM script I use to build it. I will post it; the memory, partition, and QOS may change depending on the cluster and the scheduler being used.
If the normal hisat2-build does not work, you can also try building the large index using the commented-out part.
Thanks again for the help.
I just started it and will come back once finished.
Thanks, I appreciate it. If it completes successfully, I would like to know how much memory it used.
It finished without issues on a 1.5TB node. Used about 500GB at max. I am compressing and uploading it now to a cloud, and will share the download link once finished:
There it is: https://uni-muenster.sciebo.de/s/ztztgCWvQujnhjq
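On the question of how much memory a build used: one way to measure it is to wrap the build in GNU time (/usr/bin/time -v) and read the "Maximum resident set size" line, or to query SLURM accounting with sacct after the job ends. This is a sketch of that approach, not what was done in this thread. The helper name peak_gib and the log file name are made up for illustration.

```shell
# Read GNU `time -v` output on stdin and print the peak resident set size
# in GiB. GNU time reports this value in kbytes.
peak_gib() {
  awk -F': ' '/Maximum resident set size/ { printf "%.1f\n", $2 / 1024 / 1024 }'
}

# Hypothetical usage (not executed here):
#   /usr/bin/time -v hisat2-build -p 16 --exon "$EXON" --ss "$SPLICE" \
#       "$FASTA_File" "$BASE_NAME" 2> build.log
#   peak_gib < build.log
#
# On a SLURM cluster with accounting enabled, the same figure is available
# after the job finishes (replace <jobid> with the real job ID):
#   sacct -j <jobid> --format=JobID,MaxRSS,Elapsed,State
```

Knowing the peak makes it easier to choose a realistic memory request for the next attempt instead of guessing.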
Thank you very much. I really appreciate you going out of your way to help me. I was able to download the index from the link.
Again thank you very much.
You're very welcome :)
found a solution to generate the index using more memory
Cheers!
Hi,
I am having the same issue with HISAT2 Indexing using annotation for Rattus norvegicus. I currently don't have access to a cluster with sufficient memory and I am stuck with my transcriptome analyses. I have seen that @ATpoint made these indexes available but the link is dead.
Would @ATpoint or any of you be able to share these indexes again?
Thank you in advance,
I do not have them anymore. Why don't you use a tool such as salmon to quantify directly against the transcriptome? It barely requires any memory.
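A minimal sketch of what that could look like, assuming paired-end reads and a transcript FASTA (for example the cDNA file from Ensembl). All file names, the thread count, and the function name salmon_plan are hypothetical placeholders. As written, it only prints the two commands for review.

```shell
# Sketch only: all file names below are hypothetical placeholders.
TRANSCRIPTS="${TRANSCRIPTS:-transcripts.fa}"   # transcript (cDNA) FASTA
INDEX="${INDEX:-salmon_index}"
R1="${R1:-sample_R1.fastq.gz}"
R2="${R2:-sample_R2.fastq.gz}"
OUT="${OUT:-sample_quant}"
THREADS="${THREADS:-8}"

# Print the two steps: build the transcriptome index once, then quantify.
# `-l A` asks salmon to infer the library type automatically.
salmon_plan() {
  cat <<EOF
salmon index -t ${TRANSCRIPTS} -i ${INDEX} -p ${THREADS}
salmon quant -i ${INDEX} -l A -1 ${R1} -2 ${R2} -p ${THREADS} -o ${OUT}
EOF
}

salmon_plan
```

The index is built from transcript sequences only, not from the genome plus an annotation graph, which is why it is small. The quantification step would be repeated once per sample.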