Hi again,
I want to use STAR for my RNA-seq analysis, but I'm stuck at the first hurdle: generating the reference genome index.
I want to use the newest rat build, rn6, but I keep getting errors from genomeGenerate. Here is my command:
--runMode genomeGenerate --genomeDir /path/to/directory --genomeFastaFiles ~/path/to/directory/rn6_chr1.fa rn6_chr2.fa rn6_chr3.fa rn6_chr4.fa rn6_chr5.fa rn6_chr6.fa rn6_chr7.fa rn6_chr8.fa rn6_chr9.fa rn6_chr10.fa rn6_chr11.fa rn6_chr12.fa rn6_chr13.fa rn6_chr14.fa rn6_chr15.fa rn6_chr16.fa rn6_chr17.fa rn6_chr18.fa rn6_chr19.fa rn6_chr20.fa rn6_chrMT.fa rn6_chrX.fa rn6_chrY.fa --sjdbGTFfile ~/path/to/directory/rn6.gtf --sjdbOverhang 49 --runThreadN 12 --outFileNamePrefix /path/to/directory/rn6
And here is my error:
EXITING because of INPUT ERROR: could not open genomeFastaFile: path/to/directory/rn6_chr1.fa
Here are the points I've already checked after reading forum posts:
- I'm using separate chromosome fasta files because I read that using toplevel.dna files is not good, and there isn't a primary.dna file for rn6 yet. I tried the toplevel fa file with no success.
- I've checked with chmod that every directory holding my input files, as well as my output directories, is fully readable, writable and executable.
- my genomeDir is completely empty and is situated on a RAID with tons of free space.
- my fasta files and gtf file were downloaded from Ensembl and both look fine.
- I'm running this on a Mac Pro with a 12-core processor and 64 GB of RAM. I have played with the thread settings, which had no effect.
- my reads are 50 bp long and paired-end, hence the sjdbOverhang setting of 49.
I'm completely stuck and lost, guys. The manual isn't helping and I've exhausted all the STAR Google Group and Biostars posts relating to this. Can anyone help?
Hi guys,
Thanks so much for the help. I'll try playing around with the file path later to see if that works, and I'll change the sjdbOverhang setting as suggested. The reason I have the separate chromosome files is that I started off with the toplevel.dna.fa file from Ensembl and genomeGenerate wasn't working. I read that we shouldn't use toplevel fasta files because they contain all the haplotype data and so on, which can cause issues. Since there is no primary.dna.fasta file available on Ensembl, I went for the separate chromosome files instead. However, I'd appreciate your opinion on the matter.
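On the file path point: looking at my command again, only the first fasta carries the directory prefix, so the other names would be resolved relative to wherever I launch STAR. I haven't confirmed that this is the cause of the error, but here is a rough sketch of what I plan to try. The helper name build_fasta_list and the FASTA_DIR variable are my own placeholders, not anything from STAR, and the directory is the same stand-in path as in my original command.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): print one full path per chromosome
# fasta found in directory $1, using the rn6_chr<N>.fa names from my command.
# Returns non-zero as soon as a file is missing or unreadable.
build_fasta_list() {
    local dir="$1" chr f
    for chr in $(seq 1 20) MT X Y; do
        f="$dir/rn6_chr${chr}.fa"
        if [ ! -r "$f" ]; then
            echo "cannot read: $f" >&2
            return 1
        fi
        printf '%s\n' "$f"
    done
}

# Intended use, left commented out because the paths are placeholders.
# $HOME is used instead of ~ because ~ is not expanded inside quotes.
# $fastas is deliberately unquoted so that each path becomes its own argument;
# this assumes the paths contain no spaces.
#
# FASTA_DIR="$HOME/path/to/directory"
# fastas=$(build_fasta_list "$FASTA_DIR") || exit 1
# STAR --runMode genomeGenerate --genomeDir /path/to/directory \
#      --genomeFastaFiles $fastas \
#      --sjdbGTFfile "$FASTA_DIR/rn6.gtf" --runThreadN 12
```

The idea is that every fasta gets the same prefix and is checked for readability before STAR ever sees it, so a bad path shows up with its full name rather than as a generic input error. I left sjdbOverhang out of the sketch since I'm changing that separately.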
Were you able to resolve this issue? I am having the same problem!