6.2 years ago
alvarocentron91
Hello, I'm having some problems indexing my genome with STAR. I used the following command:
STAR --runThreadN 5 --runMode genomeGenerate --genomeDir STAR_index --genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa --sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf --sjdbOverhang 88 --limitGenomeGenerateRAM 80000000000
I have tried 10, 20, and 40 cores and 40, 50, 60, 70, and 80 GB of RAM, and I always get the same problem: the log reaches "finished successfully", but the program never exits.
Here is the end of the Log.out:
STAR version=STAR_2.6.0c
STAR compilation time,server,dir=vie sep 7 11:05:55 CEST 2018 nodo00.ladon.local:/apps/ebuild/.local/easybuild/build/STAR/2.6.0c/foss-2018a/STAR-2.6.0c/source
-------------------------------
##### Final effective command line:
/apps/ebuild/.local/easybuild/software/STAR/2.6.0c-foss-2018a/bin/STAR --runMode genomeGenerate --runThreadN 5 --genomeDir STAR_index --genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa --limitGenomeGenerateRAM 80000000000 --sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf --sjdbOverhang 88
##### Final parameters after user input--------------------------------:
versionSTAR 20201
versionGenome 20101 20200
parametersFiles -
sysShell -
runMode genomeGenerate
runThreadN 5
runDirPerm User_RWX
runRNGseed 777
genomeDir STAR_index
genomeLoad NoSharedMemory
genomeFastaFiles p3_t237631_Ust_maydi_v2GB.scaf.fa
genomeChainFiles -
genomeSAindexNbases 14
genomeChrBinNbits 18
genomeSAsparseD 1
genomeSuffixLengthMax 18446744073709551615
genomeFileSizes 0
genomeConsensusFile -
readFilesType Fastx
readFilesIn Read1 Read2
readFilesPrefix -
readFilesCommand -
readMatesLengthsIn NotEqual
readMapNumber 18446744073709551615
readNameSeparator /
inputBAMfile -
bamRemoveDuplicatesType -
bamRemoveDuplicatesMate2basesN 0
limitGenomeGenerateRAM 80000000000
limitIObufferSize 150000000
limitOutSAMoneReadBytes 100000
limitOutSJcollapsed 1000000
limitOutSJoneRead 1000
limitBAMsortRAM 0
limitSjdbInsertNsj 1000000
outFileNamePrefix ./
outTmpDir -
outTmpKeep None
outStd Log
outReadsUnmapped None
outQSconversionAdd 0
outMultimapperOrder Old_2.4
outSAMtype SAM
outSAMmode Full
outSAMstrandField None
outSAMattributes Standard
outSAMunmapped None
outSAMorder Paired
outSAMprimaryFlag OneBestScore
outSAMreadID Standard
outSAMmapqUnique 255
outSAMflagOR 0
outSAMflagAND 65535
outSAMattrRGline -
outSAMheaderHD -
outSAMheaderPG -
outSAMheaderCommentFile -
outBAMcompression 1
outBAMsortingThreadN 0
outBAMsortingBinsN 50
outSAMfilter None
outSAMmultNmax 18446744073709551615
outSAMattrIHstart 1
outSAMtlen 1
outSJfilterReads All
outSJfilterCountUniqueMin 3 1 1 1
outSJfilterCountTotalMin 3 1 1 1
outSJfilterOverhangMin 30 12 12 12
outSJfilterDistToOtherSJmin 10 0 5 10
outSJfilterIntronMaxVsReadN 50000 100000 200000
outWigType None
outWigStrand Stranded
outWigReferencesPrefix -
outWigNorm RPM
outFilterType Normal
outFilterMultimapNmax 10
outFilterMultimapScoreRange 1
outFilterScoreMin 0
outFilterScoreMinOverLread 0.66
outFilterMatchNmin 0
outFilterMatchNminOverLread 0.66
outFilterMismatchNmax 10
outFilterMismatchNoverLmax 0.3
outFilterMismatchNoverReadLmax 1
outFilterIntronMotifs None
outFilterIntronStrands RemoveInconsistentStrands
clip5pNbases 0
clip3pNbases 0
clip3pAfterAdapterNbases 0
clip3pAdapterSeq -
clip3pAdapterMMp 0.1
winBinNbits 16
winAnchorDistNbins 9
winFlankNbins 4
winAnchorMultimapNmax 50
winReadCoverageRelativeMin 0.5
winReadCoverageBasesMin 0
scoreGap 0
scoreGapNoncan -8
scoreGapGCAG -4
scoreGapATAC -8
scoreStitchSJshift 1
scoreGenomicLengthLog2scale -0.25
scoreDelBase -2
scoreDelOpen -2
scoreInsOpen -2
scoreInsBase -2
seedSearchLmax 0
seedSearchStartLmax 50
seedSearchStartLmaxOverLread 1
seedPerReadNmax 1000
seedPerWindowNmax 50
seedNoneLociPerWindow 10
seedMultimapNmax 10000
seedSplitMin 12
alignIntronMin 21
alignIntronMax 0
alignMatesGapMax 0
alignTranscriptsPerReadNmax 10000
alignSJoverhangMin 5
alignSJDBoverhangMin 3
alignSJstitchMismatchNmax 0 -1 0 0
alignSplicedMateMapLmin 0
alignSplicedMateMapLminOverLmate 0.66
alignWindowsPerReadNmax 10000
alignTranscriptsPerWindowNmax 100
alignEndsType Local
alignSoftClipAtReferenceEnds Yes
alignEndsProtrude 0 ConcordantPair
alignInsertionFlush None
peOverlapNbasesMin 0
peOverlapMMp 0.1
chimSegmentMin 0
chimScoreMin 0
chimScoreDropMax 20
chimScoreSeparation 10
chimScoreJunctionNonGTAG -1
chimMainSegmentMultNmax 10
chimJunctionOverhangMin 20
chimOutType Junctions
chimFilter banGenomicN
chimSegmentReadGapMax 0
chimMultimapNmax 0
chimMultimapScoreRange 1
chimNonchimScoreDropMin 20
sjdbFileChrStartEnd -
sjdbGTFfile p3_t237631_Ust_maydi_v2GB.gtf
sjdbGTFchrPrefix -
sjdbGTFfeatureExon exon
sjdbGTFtagExonParentTranscript transcript_id
sjdbGTFtagExonParentGene gene_id
sjdbOverhang 88
sjdbScore 2
sjdbInsertSave Basic
varVCFfile -
waspOutputMode None
quantMode -
quantTranscriptomeBAMcompression 1
quantTranscriptomeBan IndelSoftclipSingleend
twopass1readsN 18446744073709551615
twopassMode None
----------------------------------------
EXITING because of fatal ERROR: could noSep 19 12:48:03 ... starting to generate Genome files
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 0 "Um_chr01" chrStart: 0
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 1 "Um_chr02" chrStart: 2621440
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 2 "Um_chr03" chrStart: 4718592
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 3 "Um_chr04" chrStart: 6553600
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 4 "Um_chr05" chrStart: 7602176
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 5 "Um_chr06" chrStart: 9175040
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 6 "Um_chr07" chrStart: 10223616
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 7 "Um_chr08" chrStart: 11272192
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 8 "Um_chr09" chrStart: 12320768
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 9 "Um_chr10" chrStart: 13107200
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 10 "Um_chr11" chrStart: 13893632
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 11 "Um_chr12" chrStart: 14680064
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 12 "Um_chr13" chrStart: 15466496
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 13 "Um_chr14" chrStart: 16252928
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 14 "Um_chr15" chrStart: 17039360
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 15 "Um_chr16" chrStart: 17825792
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 16 "Um_chr17" chrStart: 18612224
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 17 "Um_chr18" chrStart: 19398656
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 18 "Um_chr19" chrStart: 20185088
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 19 "Um_chr20" chrStart: 20971520
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 20 "Um_chr21" chrStart: 21495808
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 21 "Um_chr22" chrStart: 22020096
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 22 "Um_chr23" chrStart: 22544384
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 23 "Um_scaf_contig_1.256" chrStart: 23068672
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 24 "Um_scaf_contig_1.264" chrStart: 23330816
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 25 "Um_scaf_contig_1.265" chrStart: 23592960
p3_t237631_Ust_maydi_v2GB.scaf.fa : chr # 26 "Um_scaf_contig_1.271" chrStart: 23855104
Number of SA indices: 39283312
Sep 19 12:48:03 ... starting to sort Suffix Array. This may take a long time...
Number of chunks: 5; chunks size limit: 78566624 bytes
Sep 19 12:48:04 ... sorting Suffix Array chunks and saving them to disk...
Writing 2698344 bytes into STAR_index/SA_4 ; empty space on disk = 16819015385088 bytes ... done
Writing 77908240 bytes into STAR_index/SA_3 ; empty space on disk = 16818816155648 bytes ...Writing 77601568 bytes into STAR_index/SA_0 ; empty space on disk = 16818884313088 bytes ...WritinWriting 77984464 bytes into STAR_index/SA_2 ; empty space on disk = 16818749046784 bytWritinWriting 78073880 bytes into STAR_index/SA_1 ; empty space on disk = 16818749046784 byt done
done
done
done
Sep 19 12:48:53 ... loading chunks from disk, packing SA...
Sep 19 12:48:56 ... finished generating suffix array
Sep 19 12:48:56 ... generating Suffix Array index
Sep 19 12:49:03 ... completed Suffix Array index
Sep 19 12:49:03 ..... processing annotations GTF
Processing pGe.sjdbGTFfile=p3_t237631_Ust_maydi_v2GB.gtf, found:
6786 transcripts
9745 exons (non-collapsed)
2951 collapsed junctions
Sep 19 12:49:03 ..... finished GTF processing
Sep 19 12:49:03 Loaded database junctions from the GTF file: p3_t237631_Ust_maydi_v2GB.gtf: 2951 total junctions
WARNING: long repeat for junction # 1350 : Um_chr06 1022586 1023311; left shift = 255; right shift = 2
Sep 19 12:49:03 Finished preparing junctions
Sep 19 12:49:03 ..... inserting junctions into the genome indices
Sep 19 12:49:04 Finished SA search: number of new junctions=2951, old junctions=0
Sep 19 12:49:05 Finished sorting SA indicesL nInd=1038370
Sep 19 12:49:05 Finished inserting junction indices
Sep 19 12:49:10 Finished SAi
Sep 19 12:49:10 ..... finished inserting junctions into genome
Sep 19 12:49:10 ... writing Genome to disk ...
Writing 24639575 bytes into STAR_index/Genome ; empty space on disk = 16818882215936 bytes ... done
SA size in bytes: 166326942
Sep 19 12:49:11 ... writing Suffix Array to disk ...
Writing 166326942 bytes into STAR_index/SA ; empty space on disk = 16819014336512 bytes ... done
Sep 19 12:49:13 ... writing SAindex to disk
Writing 8 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Writing 120 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Writing 1565873491 bytes into STAR_index/SAindex ; empty space on disk = 16820313522176 bytes ... done
Sep 19 12:49:36 ..... finished successfully
DONE: Genome generation, EXITING
I canceled the run and I have the files generated but I don't know if I can trust them :/
Thank you
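One rough sanity check, as a sketch only: Log.out reports how many bytes were written to each index file, so the sizes on disk can be compared against those numbers. The helper name `check_size` is made up for this example, and only `Genome` and `SA` are checked because `SAindex` is written in three parts. Matching sizes show that the writes completed; they do not prove the index is otherwise valid.

```shell
# Compare a file's size in bytes with an expected value; returns 0 on match.
# check_size is a hypothetical helper, not part of STAR.
check_size() {
    file=$1
    expected=$2
    actual=$(wc -c < "$file" | tr -d ' ')
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file ($actual bytes)"
    else
        echo "MISMATCH $file (expected $expected, found $actual)"
        return 1
    fi
}

# Byte counts copied from the "Writing N bytes into ..." lines of Log.out.
index_dir=STAR_index
if [ -d "$index_dir" ]; then
    check_size "$index_dir/Genome" 24639575
    check_size "$index_dir/SA"     166326942
fi
```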
Can you separate the stdout and stderr output into two files instead of writing them to log.out? Do something like:
There is some indication of something going wrong but the output is not clearly captured.
I'm not sure how to do that. However, I re-ran the program with the RAM increased to 90 GB, and now the message:
There is only one warning message:
But the problem is still the same: the program doesn't end even when the last 2 lines of the log are:
It has been stuck there for almost 30 min and I don't know whether I should stop it. (All the files are generated.)
Try running the command this way:
Then show us what the log.error and log.output files have in them. Are you running the command directly on the command line or are you using a job scheduler of some sort?
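The redirection asked for here can be sketched as follows. This is a minimal illustration: `run_job` is a made-up stand-in that writes one line to each stream so the example runs anywhere, and in practice it would be replaced by the STAR `genomeGenerate` command from the original post. The file names `log.output` and `log.error` are the ones mentioned above.

```shell
# Hypothetical stand-in for the real command: one line per stream.
# Replace the call below with the STAR genomeGenerate command.
run_job() {
    echo "progress message"        # written to stdout
    echo "error message" >&2       # written to stderr
}

# '>' captures stdout and '2>' captures stderr, each in its own file.
run_job > log.output 2> log.error

# Inspect both files afterwards.
cat log.output
cat log.error
```

Note that this captures what the program prints to the terminal; STAR's own Log.out is written separately by STAR itself.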
No log.error file is generated, and the "Log files" subsection of the STAR manual gives no indication of how to generate one; it only describes Log.out and Log.progress.out (which has not been generated).
I'm running the command directly in a Slurm job allocation.
The thing is that one week ago I didn't have this problem when I generated the index for the maize genome. I may try generating the index with HISAT2 to find out whether the problem comes from my data.
Is it possible that even 80 GB of RAM is not enough? Depending on your genome, STAR indexes can be very large and their creation might require even more.
Based on file names it appears to be a fungal genome so more than likely 80G should be enough.
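On the genome size point: the log shows genomeSAindexNbases left at its default of 14 and an SAindex of about 1.5 GB for a genome of roughly 24 Mb. For small genomes the STAR manual recommends scaling `--genomeSAindexNbases` down to min(14, log2(GenomeLength)/2 - 1). A rough sketch of that arithmetic follows, assuming the size of the Genome file reported in Log.out (24639575 bytes) is a close enough stand-in for the genome length. Whether this setting has anything to do with the hang is not established in this thread.

```shell
# Assumption: the Genome file size from Log.out approximates the genome length.
genome_length=24639575

# min(14, log2(GenomeLength)/2 - 1), truncated to an integer.
# awk's log() is the natural logarithm, so divide by log(2) to get log2.
sa_index_nbases=$(awk -v len="$genome_length" 'BEGIN {
    n = int(log(len) / log(2) / 2 - 1)
    if (n > 14) n = 14
    print n
}')

echo "suggested --genomeSAindexNbases $sa_index_nbases"
```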