Hi, I am new to STAR, and I am trying to align sequences. First, I moved the reference genome into my new project directory; this reference genome is a single .fa file created by concatenating the .fa files of each chromosome. I then created a genome directory, called genomeDir, and pointed STAR to the reference genome file; my path started from the root of the project directory, although I also tried giving the direct path to the reference genome, which I don't believe makes a difference. After making the genomeDir directory, I got an "unable to access and write to file" error, which I learned could be solved by creating a STAR file within the genome directory. The command that I used to actually create the index was:
STAR --runMode genomeGenerate --genomeDir genomeDir --genomeFastaFiles hg19.fa --runThreadN 4
This process kept getting killed without a clear error message, so I tried two things: running make from the STAR source directory and running the commands from that same folder, and adding --genomeSAsparseD 2 to the command. Neither of these worked. Do you know if this is simply a RAM issue, or why else this might be happening?
Thank you so much for your help!
This does not quite answer your STAR problem, but here is my approach to the same need.
Although it is quite a bulky tool, I would rely on bcbio-nextgen to manage my aligner indexes and reference genome sequences.
If you do not have much time, you can find premade indexes here: http://labshare.cshl.edu/shares/gingeraslab/www-data/dobin/STAR/STARgenomes/
With that said, like @WouterDeCoster mentioned, I find hisat2 quite interesting and have been using it recently. It has premade indexes (https://ccb.jhu.edu/software/hisat2/index.shtml) and can incorporate SNP information during alignment. In the output BAM file, if a read covers a specific SNP, the SNP's RS number is also used to annotate that read, which is useful in some cases. HISAT2 is part of "the new Tuxedo" suite, which I find interesting as well: https://www.nature.com/articles/nprot.2016.095
Please use tags appropriately, so that experts can easily find your question. In this case star would have been very logical, so I have added it to your question.
Note that, if I'm not mistaken, you are talking about creating the index, and not yet about alignment. Therefore I have adapted your title to better reflect what this question is about.
Finally, you suggest it might be a RAM issue, which I agree with, but you should then tell us how much RAM your system has.
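To put a number on the RAM question, here is a minimal sketch that compares installed memory with a rough estimate of what indexing needs. It assumes the rule of thumb from the STAR documentation of roughly 10 bytes of RAM per genome base (about 32 GB for human), uses the FASTA file size as a stand-in for the number of bases, and reads /proc/meminfo, so it is Linux only. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Rough RAM estimate for STAR genome generation.
# Assumption: about 10 bytes of RAM per genome base (rule of thumb, not exact).

# Print the estimated RAM in whole GiB for a genome of the given size in bytes.
estimate_star_ram_gib() {
    local genome_bytes=$1
    local bytes_per_base=${2:-10}   # rule-of-thumb multiplier, adjustable
    echo $(( genome_bytes * bytes_per_base / 1073741824 ))
}

# Print total physical RAM in whole GiB (Linux only; reads /proc/meminfo).
total_ram_gib() {
    awk '/^MemTotal:/ { print int($2 / 1048576) }' /proc/meminfo
}

fasta="hg19.fa"   # file name taken from the question
if [ -f "$fasta" ] && [ -r /proc/meminfo ]; then
    # File size slightly overestimates the base count (headers, newlines).
    need=$(estimate_star_ram_gib "$(wc -c < "$fasta")")
    have=$(total_ram_gib)
    echo "Estimated need: ~${need} GiB; installed: ${have} GiB"
fi
```

If the estimate is well above the installed RAM, a process that is killed without an error message is consistent with the kernel's out-of-memory killer stepping in.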
Sorry, I actually tried adding the STAR tag, but it wasn't showing up automatically, so I just left it out.
Also, you are right, I meant to say indexing. Sorry about that as well. My ultimate goal is alignment.
I have 16 GB of RAM. I actually just edited the last command to explicitly say "parameter 2" instead of just 2. And now the process has been running for a really long time. Would you know why?
Thank you in advance for your help.
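For a 16 GB machine, a lower-memory variant of the indexing command might look like the sketch below. It only prints the command (a dry run) so it can be inspected first. The flags --genomeSAsparseD and --limitGenomeGenerateRAM are real STAR options, but the values used here (sparsity 2, a 15000000000-byte RAM cap) are illustrative assumptions, not tested recommendations, and the helper function name is made up. A sparser suffix array lowers memory use at the cost of slower mapping.

```shell
#!/usr/bin/env bash
# Print a lower-memory STAR genomeGenerate command line (dry run only).
# Defaults are illustrative assumptions for a 16 GB machine.
star_index_cmd() {
    local genome_dir=$1
    local fasta=$2
    local threads=${3:-4}
    local sparsity=${4:-2}              # value for --genomeSAsparseD
    local ram_bytes=${5:-15000000000}   # value for --limitGenomeGenerateRAM, in bytes
    printf 'STAR --runMode genomeGenerate --genomeDir %s --genomeFastaFiles %s --runThreadN %s --genomeSAsparseD %s --limitGenomeGenerateRAM %s\n' \
        "$genome_dir" "$fasta" "$threads" "$sparsity" "$ram_bytes"
}

# Show the command for the paths used in the question.
star_index_cmd genomeDir hg19.fa

# To run it for real (paths must not contain spaces):
#   mkdir -p genomeDir && star_index_cmd genomeDir hg19.fa | bash
```

If the process is still killed, raising the sparsity (for example passing 3 as the fourth argument) is the next thing to try; otherwise a premade index or a machine with more RAM avoids the problem altogether.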