Indexing requires 9 GB of RAM for chromosome X and 160 GB for the whole human genome, so it might not be possible to do on a desktop computer (according to this protocol: doi:10.1038/nprot.2016.095).
Does anyone have already-built HISAT2 index files for the human genome (GRCh37)? Could you upload them and send them to me, please? I would really appreciate it, because I am getting desperate trying to build the index files on my desktop computer. It has already been running constantly for more than 4 days and still has not finished.
Hi Atonidas,
If an answer was helpful, you should upvote it; if it resolved your question, you should mark it as accepted.
Cheers,
Wouter
I tried to index the latest release of the human genome from Ensembl, GRCh38.p10, but met the same fate: too little computational power. So I decided to go with the pre-indexed genomes. However, I found that the available index is for Ensembl release 84, while the current release is 90. Release 84 dates back to Feb 2016 and we are almost into 2018, so I think a more recent one would be more appropriate. Is there any chance that it will be updated soon?
Additionally, inside the compressed file downloaded from the FTP source mentioned above, I found a *.sh file. Do I need to execute it, or is it only for reference?
You do realize that human genome builds don't have anything to do with Ensembl releases? Ensembl contains many other things besides the human genome, and that is what warrants regular updates. You can safely use the pre-built GRCh38 indexes that are available.
BTW: if you ran into difficulties building the indexes, you may also have problems doing alignments on the same hardware.
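For anyone unsure how a pre-built index is actually used, here is a minimal sketch of a paired-end alignment call. It only assembles the command line, using the standard hisat2 options -p, -x, -1, -2 and -S. The index basename and the read and output file names are made-up placeholders, so substitute whatever your unpacked index and FASTQ files are really called.

```python
import shlex


def hisat2_command(index_prefix, reads_1, reads_2, out_sam, threads=4):
    """Build a hisat2 argument list for paired-end reads.

    index_prefix is the index basename, i.e. the path shared by the
    index files without the trailing ".1.ht2", ".2.ht2", ... part.
    """
    return [
        "hisat2",
        "-p", str(threads),  # number of alignment threads
        "-x", index_prefix,  # basename of the pre-built index
        "-1", reads_1,       # first mates
        "-2", reads_2,       # second mates
        "-S", out_sam,       # SAM output file
    ]


if __name__ == "__main__":
    # All paths below are hypothetical examples.
    cmd = hisat2_command(
        "grch38/genome",
        "sample_1.fastq.gz",
        "sample_2.fastq.gz",
        "sample.sam",
    )
    print(shlex.join(cmd))
    # To really run it: import subprocess; subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to check the paths before starting a long alignment job.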
Does alignment require 200 GB as well?
No, but you will need at least 8 GB. If you have more, even better.
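If you are not sure what your machine has, a quick check along these lines may help. This is only a sketch: it assumes a Linux-like system where os.sysconf exposes SC_PAGE_SIZE and SC_PHYS_PAGES, and it uses the 8 GB (alignment) and 160 GB (whole-genome indexing) figures quoted in this thread as rough thresholds, not exact requirements. The function names are my own.

```python
import os

GIB = 1024 ** 3  # bytes per GiB


def total_ram_bytes():
    """Total physical memory in bytes (assumes Linux-style sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_ram(total_bytes, required_gb):
    """True if total_bytes covers at least required_gb GiB."""
    return total_bytes >= required_gb * GIB


if __name__ == "__main__":
    try:
        total = total_ram_bytes()
    except (AttributeError, ValueError, OSError):
        # sysconf or these names are not available on this platform.
        print("Could not determine total RAM on this system.")
    else:
        print(f"Total RAM: {total / GIB:.1f} GiB")
        print("Enough for alignment (>= 8 GB):", has_enough_ram(total, 8))
        print("Enough for whole-genome indexing (>= 160 GB):",
              has_enough_ram(total, 160))
```

Note that this reports installed memory, not what is free at the moment, so leave some headroom for the operating system and other programs.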