Hi,
I downloaded the repeat genome and GTF (RepeatMasker) files from the UCSC Table Browser. I want to build a repeat genome index so that I can remove reads that may be spurious artifacts from rRNA (and other) repetitive elements. However, the job always fails because it exceeds the memory limit, even after I raised the memory from 30 GB to 120 GB.
The repeat genome file is 2.1 GB and the GTF file is 552 MB.
Nov 18 17:58:19 ..... started STAR run
Nov 18 17:58:19 ... starting to generate Genome files
slurmstepd: Job 11091167 exceeded memory limit (123675052 > 122880000), being killed
slurmstepd: Exceeded job memory limit
slurmstepd: * JOB 11091167 CANCELLED AT 2019-11-18T13:20:19 * on node311
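To put the two numbers in the slurmstepd message into familiar units, here is a small sketch. It assumes the values are reported in KiB, which is how this Slurm message is usually read; the thread itself does not state the unit.

```shell
# Values copied from the slurmstepd message above (assumed to be in KiB).
used_kib=123675052
limit_kib=122880000

# Convert to MiB with integer arithmetic.
limit_mib=$(( limit_kib / 1024 ))
over_mib=$(( (used_kib - limit_kib) / 1024 ))

echo "limit: ${limit_mib} MiB, exceeded by about ${over_mib} MiB"
```

Under that assumption the limit is 120000 MiB, and the job was killed once it went a few hundred MiB over. That is the point at which Slurm noticed the overrun, not necessarily the amount of memory STAR would have needed to finish.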
Script:
/home/ychen10/STAR-2.7.3a/bin/Linux_x86_64/STAR \
--runThreadN 4 \
--runMode genomeGenerate \
--genomeDir index \
--genomeFastaFiles repeatSeq.fa \
--sjdbGTFfile repeatSeq.gtf \
--sjdbOverhang 99 \
--genomeChrBinNbits 16 \
--genomeSAindexNbases 10 \
--genomeSAsparseD 4
I am not sure whether the problem is caused by the repeat genome or by the memory. Thanks.
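One thing that may be worth checking, although nobody in this thread confirmed it as the cause: the STAR manual suggests that for a genome with a very large number of reference sequences, --genomeChrBinNbits be scaled as min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))). A RepeatMasker-derived FASTA can contain a great many short sequences, so 16 may be far too large. The sketch below computes that value from a FASTA file. The function name is made up for illustration, and rounding down is my own choice since the manual does not say how to round.

```shell
# Suggest --genomeChrBinNbits as min(18, log2(max(total_len / n_seqs, read_len))).
# Hypothetical helper, not part of STAR. Usage: suggest_chr_bin_nbits <fasta> <read_len>
suggest_chr_bin_nbits() {
    fasta=$1
    read_len=$2
    awk -v rl="$read_len" '
        /^>/ { n++; next }          # header line: one more reference sequence
        { total += length($0) }     # sequence line: add its bases to the total
        END {
            if (n == 0) { print "no sequences found" > "/dev/stderr"; exit 1 }
            avg = total / n
            if (avg < rl) avg = rl  # max(average sequence length, read length)
            bits = int(log(avg) / log(2))   # log2, rounded down (assumption)
            if (bits > 18) bits = 18
            print bits
        }' "$fasta"
}
```

For the files in this post it could be called as `suggest_chr_bin_nbits repeatSeq.fa 100`, where 100 is the read length implied by --sjdbOverhang 99.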
Thanks. I tried --limitGenomeGenerateRAM. It produced the same error.
Comments are for answers; please use the reply button (yeah, it's a bit strange, but it makes finding things much easier!).
The same error from Slurm? If so, something is going wrong, because STAR shouldn't be using more than the limit you specified. Can you try giving the job, say, 50 GB of memory but limiting STAR to 40 GB?
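A minimal sketch of that test, assuming a Slurm batch script and that STAR is on the PATH. --limitGenomeGenerateRAM takes a value in bytes; the #SBATCH lines and the variable and function names are illustrative, and the remaining options are copied from the original command.

```shell
#!/bin/bash
#SBATCH --mem=50G
#SBATCH --cpus-per-task=4

# 40 GB expressed in bytes, leaving headroom under the 50 GB job limit.
limit_bytes=$(( 40 * 1000 * 1000 * 1000 ))

build_index() {
    STAR \
        --runThreadN 4 \
        --runMode genomeGenerate \
        --genomeDir index \
        --genomeFastaFiles repeatSeq.fa \
        --sjdbGTFfile repeatSeq.gtf \
        --sjdbOverhang 99 \
        --genomeChrBinNbits 16 \
        --genomeSAindexNbases 10 \
        --genomeSAsparseD 4 \
        --limitGenomeGenerateRAM "$limit_bytes"
}

echo "limitGenomeGenerateRAM = ${limit_bytes} bytes"

# Run only when STAR and the input FASTA are actually available.
if command -v STAR >/dev/null 2>&1 && [ -f repeatSeq.fa ]; then
    build_index
fi
```

If Slurm still kills the job with these settings, STAR is going past the limit it was given, which would be worth including in a bug report.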
Thanks. I tried limiting STAR to 40 GB. The error is the same. I think the problem may be caused by the input files.
repeatSeq.fa
repeatSeq.gtf head -5 repeatSeq.gtf
This is beyond me, I'm sorry. I suggest posting an issue on the STAR GitHub page. The maintainer is excellent at troubleshooting weird cases.
Yes. Thanks very much
Why not align to the normal genome and then remove the reads that map to RepeatMasker regions after the alignment?
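One way to do that is with bedtools, sketched below under the assumption that the alignments are in a BAM file and the RepeatMasker annotation is available as a BED (or GTF) file on the same genome build. The function name and the file names in the usage line are placeholders.

```shell
# Keep only alignments that do not overlap any RepeatMasker interval.
# Hypothetical helper. Usage: filter_repeats <in.bam> <repeats.bed> <out.bam>
filter_repeats() {
    bam_in=$1
    repeats=$2
    bam_out=$3
    # -v reports entries of -a that have no overlap with -b;
    # with BAM input, bedtools writes BAM output by default.
    bedtools intersect -v -a "$bam_in" -b "$repeats" > "$bam_out"
}
```

Usage would look like `filter_repeats Aligned.sortedByCoord.out.bam rmsk.bed no_repeats.bam`. For spliced RNA-seq reads, adding `-split` may be appropriate so that only the aligned blocks, not the intron span, count as overlapping a repeat.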
Thanks. I will try it.