2.5 years ago
pavelasquezv
Hi all,
The STAR program does not work after the mapping. I am making a loop to map multiple fastq files:
`cat lista | while read l; do STAR \
--genomeDir index \
--runThreadN 30 \
--readFilesIn ${l}_1*PE.fastq ${l}_2*PE.fastq \
--outFileNamePrefix STAR/${l}_ \
--quantMode GeneCounts \
--outSAMtype BAM SortedByCoordinate; done`
The program runs up to the mapping step, but then it never finishes and prints no further message:
`Jul 04 15:15:09 ..... started STAR run
Jul 04 15:15:09 ..... loading genome
Jul 04 15:15:40 ..... started mapping`
Please, help me!
Many thanks
Can you confirm the program finished (with or without error)? Mapping reads can take a while.
Many thanks for your reply. The program finished without any error. When I try to index the genome, a `Killed` message appears:
Nope. So the program fails within a couple of minutes. How much RAM are you allocating for this job? Do not use 30 threads unless you have 80-100G of RAM to assign to this job.
As long as you have 30-40 GB I would suggest using 8 cores.
Using STAR to sort the output is not efficient; it adds to the RAM overhead. It would be best to sort your BAM afterwards using
samtools.
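For reference, the post-alignment sort suggested above might look like the following. This is a sketch: the `sample` prefix and thread/memory values are placeholders, and it assumes STAR was run with `--outSAMtype BAM Unsorted` so that it writes `sample_Aligned.out.bam`.

```shell
# Sort the unsorted BAM that STAR produced, then index it.
# -@ 8 uses 8 threads; -m 1G caps memory per sorting thread.
samtools sort -@ 8 -m 1G \
    -o STAR/sample_Aligned.sortedByCoord.bam \
    STAR/sample_Aligned.out.bam
samtools index STAR/sample_Aligned.sortedByCoord.bam
```

Keeping sorting out of STAR keeps the aligner's RAM footprint down, and `samtools sort` lets you bound memory per thread explicitly.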
I tried with 4, 6, 8, 10, 12, 30, 50 cores, but it still gives the same error. Do you have any other suggestions?
Please! Many thanks for your reply and tips!
My suggestion still stands. How much memory do you have available? If you don't have enough RAM and the job keeps getting killed because of that, the number of cores will not matter. For the human genome (for size reference, if you are using something else), STAR needs 30-40G of free RAM.
You pre-created the index using
--runMode genomeGenerate
correct? If not, you need to do that first, before you align. That step also needs the same amount of RAM as noted above.
Yes, I think you are right, because I ran the same code on my PC, which has 32G of RAM, and it worked perfectly. Yes, I am working with --runMode genomeGenerate. I am working on the University server. Please, can you tell me how I can find out how much memory was allocated to me?
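For reference, the one-time index build discussed above might be sketched as follows. The file names and `--sjdbOverhang` value (read length minus 1) are placeholders, and this step needs the same 30-40G of RAM as the alignment for a human-sized genome.

```shell
# Build the STAR genome index once, before any alignment runs.
STAR --runMode genomeGenerate \
     --genomeDir index \
     --genomeFastaFiles genome.fa \
     --sjdbGTFfile annotation.gtf \
     --sjdbOverhang 99 \
     --runThreadN 8
```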
I already checked: I have only 3.2 GB. I think that's really the problem. I'll talk to the administrator. Thank you all very much for your collaboration.
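For anyone with the same question, one way to inspect the limits on the node you are logged into is shown below; the exact scheduler-side limits vary by site, so this only reflects what the shell can see.

```shell
# Per-process limits imposed by the shell/scheduler:
ulimit -v    # max virtual memory in KB ("unlimited" if unset)
ulimit -m    # max resident set size in KB
# Total/used/free RAM on the current node, in GB:
free -g
```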
are you working on a cluster ? if yes, do you _use_ it ?
Yes Pierre, I am working on a cluster at the University. But I do not have permission to increase my memory :(
I am sorry. I am working on a cluster at the University, but I do not have administrator permissions to increase my memory.
but do you __use__ the cluster, i.e. via qsub or sbatch?
I am using qsub, Pierre. But now it is working; I think the problem was the insufficient memory. Many thanks, my friend!
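As a sketch of requesting enough memory at submission time, assuming an SGE-style qsub: the resource name (`h_vmem`), the parallel environment (`smp`), and the script name `star_align.sh` are site-specific assumptions, so check with your cluster documentation.

```shell
# Request 8 slots at 5G each (40G total for the job), then run the
# alignment script; many SGE setups enforce h_vmem per slot.
qsub -pe smp 8 -l h_vmem=5G star_align.sh
```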
use a workflow manager like snakemake or nextflow.