I am trying to build a genome index with STAR-2.5.2b. With a GTF annotation file I get an error at the "processing annotations GTF" step, but with the corresponding GFF3 file it works. Why? I know I could simply use the GFF3 file, but I don't want to introduce another file into my RNA-seq workflow.
Here is what you need to reproduce this:
Reference genome : ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_mouse/release_M16/GRCm38.p5.genome.fa.gz
STAR : STAR-2.5.2b
I subsampled the reference genome to keep only the chromosomes annotated in the GTF file, so that reads cannot map outside the annotation (see "Looking for a thorough annotation for non-primary assembly units in GRCm38"). I named the result GRCm38.p5.genome_subsampled.fa.
I run the job on a cluster. For both strategies (GTF and GFF3) I set h_vmem to 64G and mem to 16G (both specify the maximum amount of memory required), which is enough. I also use 8 threads.
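For readers who want to reproduce the subsampling step, here is a minimal sketch of one way to do it. It is not necessarily how the file above was produced. It assumes the sequence name is the first whitespace-delimited word of each FASTA header and the first column of the annotation file. All file names are hypothetical, and small stand-in inputs are generated so the example runs on its own.

```shell
# Sketch only: all file names are hypothetical, and tiny stand-in inputs
# are generated so the example runs without the real GENCODE files.
workdir=$(mktemp -d)

cat > "$workdir/genome.fa" <<'EOF'
>chr1 1
ACGTACGT
ACGT
>chrUn_example
TTTTTTTT
>chr2 2
GGCCGGCC
EOF

# Two annotation records plus a header comment, tab-separated like a GTF.
{
  printf '##description: stand-in annotation\n'
  printf 'chr1\tHAVANA\texon\t1\t4\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf 'chr2\tHAVANA\texon\t1\t4\t.\t+\t.\tgene_id "G2"; transcript_id "T2";\n'
} > "$workdir/annotation.gtf"

# First file (annotation): remember every sequence name in column 1.
# Second file (FASTA): print a record only if its header name was seen.
awk '
  NR == FNR {
    if ($0 !~ /^#/) keep[$1] = 1
    next
  }
  /^>/ { printing = (substr($1, 2) in keep) }
  printing
' "$workdir/annotation.gtf" "$workdir/genome.fa" > "$workdir/genome_subsampled.fa"

# List the sequences that survived the filter.
grep '^>' "$workdir/genome_subsampled.fa"
```

With real data, the two stand-in files would be replaced by the decompressed genome FASTA and the annotation file.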
Here are my commands :
GTF strategy
$star --runThreadN 8 --runMode genomeGenerate --genomeDir /home/hbastien/work/MGRS/star_index --genomeFastaFiles /home/hbastien/save/MGRS/GRCm38.p5.genome_subsampled.fa --sjdbGTFfile /home/hbastien/save/MGRS/gencode.vM16.chr_patch_hapl_scaff.annotation.gtf --sjdbGTFtagExonParentTranscript Parent --sjdbOverhang 75;
GFF3 strategy
$star --runThreadN 8 --runMode genomeGenerate --genomeDir /home/hbastien/work/MGRS/star_index --genomeFastaFiles /home/hbastien/save/MGRS/GRCm38.p5.genome_subsampled.fa --sjdbGTFfile /home/hbastien/save/MGRS/gencode.vM16.chr_patch_hapl_scaff.annotation.gff3 --sjdbGTFtagExonParentTranscript Parent --sjdbOverhang 75;
With the GTF file, the job fails after around 30 minutes.
My error output file contains:
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check
/var/spool/sge/node002/job_scripts/7117238: line 17: 57352 Abandon
$star --runThreadN 8 --runMode genomeGenerate --genomeDir /home/hbastien/work/MGRS/star_index --genomeFastaFiles /home/hbastien/save/MGRS/GRCm38.p5.genome_subsampled.fa --sjdbGTFfile /home/hbastien/save/MGRS/gencode.vM16.chr_patch_hapl_scaff.annotation.gtf --sjdbGTFtagExonParentTranscript Parent --sjdbOverhang 75
Your job has been killed.
This may happen if one of the followings hold :
you exceeded one of the queue/job limits (run time, memory, etc)
you (or admin) killed the job using qdel
something bad happened.
Now, just in case something bad happened, here are the debug information about your job : total 0
And my standard output file contains:
Feb 21 13:45:20 ..... started STAR run
Feb 21 13:45:20 ... starting to generate Genome files
Feb 21 13:46:34 ... starting to sort Suffix Array. This may take a long time...
Feb 21 13:46:51 ... sorting Suffix Array chunks and saving them to disk...
Feb 21 14:09:59 ... loading chunks from disk, packing SA...
Feb 21 14:11:18 ... finished generating suffix array
Feb 21 14:11:18 ... generating Suffix Array index
Feb 21 14:14:43 ... completed Suffix Array index
Feb 21 14:14:43 ..... processing annotations GTF
With the GFF3 file, by contrast, the job finishes successfully in around 45 minutes.
My error output file is empty.
And my standard output file contains:
Feb 21 15:34:20 ..... started STAR run
Feb 21 15:34:20 ... starting to generate Genome files
Feb 21 15:35:39 ... starting to sort Suffix Array. This may take a long time...
Feb 21 15:36:04 ... sorting Suffix Array chunks and saving them to disk...
Feb 21 16:05:14 ... loading chunks from disk, packing SA...
Feb 21 16:06:33 ... finished generating suffix array
Feb 21 16:06:33 ... generating Suffix Array index
Feb 21 16:11:03 ... completed Suffix Array index
Feb 21 16:11:03 ..... processing annotations GTF
Feb 21 16:11:27 ..... inserting junctions into the genome indices
Feb 21 16:14:54 ... writing Genome to disk ...
Feb 21 16:14:56 ... writing Suffix Array to disk ...
Feb 21 16:15:13 ... writing SAindex to disk
Feb 21 16:15:14 ..... finished successfully
Epilog : job finished at mer. févr. 21 16:15:14 CET 2018
I tried increasing the memory after seeing the 'std::out_of_range' error, but that did not help.
The two Log.out files are too large to post here, but I can share them if needed.
Any hints would be appreciated!
Thanks a lot
Well played, it works now. I didn't think this option could interfere. If you want to add this as an answer, I'll mark it as accepted. Thank you.
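The comment that resolved this is not reproduced here. Assuming the option in question is --sjdbGTFtagExonParentTranscript, the STAR manual gives its default as transcript_id (suitable for GTF files) and recommends Parent for GFF3 files. A quick check of which attribute key the exon records actually carry can be sketched as follows. The helper name count_exons_with_tag and the sample records are hypothetical stand-ins, not taken from the GENCODE files.

```shell
# Sketch only: stand-in records are written to a temporary directory so the
# example runs on its own; the helper name is hypothetical.
workdir=$(mktemp -d)

# One exon record in GTF style (key "value"; pairs).
printf 'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' \
  > "$workdir/sample.gtf"

# The same exon in GFF3 style (key=value pairs, parent given by Parent=).
printf 'chr1\tHAVANA\texon\t100\t200\t.\t+\t.\tID=exon:T1:1;Parent=T1;gene_id=G1\n' \
  > "$workdir/sample.gff3"

# Count exon records whose attribute column (field 9) contains the given key.
# $1 = annotation file, $2 = attribute key to look for.
count_exons_with_tag() {
  awk -F '\t' -v tag="$2" '
    $3 == "exon" && $9 ~ ("(^|[; ])" tag "[ =]") { n++ }
    END { print n + 0 }
  ' "$1"
}

echo "GTF  exons with Parent:        $(count_exons_with_tag "$workdir/sample.gtf" Parent)"
echo "GTF  exons with transcript_id: $(count_exons_with_tag "$workdir/sample.gtf" transcript_id)"
echo "GFF3 exons with Parent:        $(count_exons_with_tag "$workdir/sample.gff3" Parent)"
```

If the exon records in the real GTF carry no Parent key, dropping --sjdbGTFtagExonParentTranscript Parent from the GTF command, so that STAR falls back to its default, would be the change to try. The flag would stay in place for the GFF3 command.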