Question

STAR solo segmentation fault after 'started Solo counting'

2.2 years ago

bp22 ▴ 80

Dear all,

I have some 10x v3 single-cell RNA-seq FASTQ files that I am trying to map to the hg38 human genome with the STAR aligner to generate read counts. However, I am getting the following error and hope that some of you can help:

zsh: segmentation fault  STAR --outSAMattributes All --outSAMtype BAM Unsorted --quantMode GeneCounts

I am using STAR version 2.7.10a and the STAR run fails after the following steps:

Nov 28 11:17:04 ..... started STAR run
Nov 28 11:17:06 ..... loading genome
Nov 28 11:20:17 ..... processing annotations GTF
Nov 28 11:20:35 ..... inserting junctions into the genome indices
Nov 28 11:21:30 ..... started mapping
Nov 28 12:19:41 ..... finished mapping
Nov 28 12:19:44 ..... started Solo counting
zsh: segmentation fault  STAR --outSAMattributes All --outSAMtype BAM Unsorted --quantMode GeneCounts

According to the log file, the run fails right after the following lines:

...
Nov 28 12:19:44 ..... started Solo counting
Nov 28 12:19:44 ... Starting Solo post-map for Gene
Nov 28 12:19:44 ... Finished allocating arrays for Solo 1.25473 GiB

This is the command I am running:

STAR --outSAMattributes All \
     --outSAMtype BAM Unsorted \
     --quantMode GeneCounts \
     --readFilesCommand gunzip -c \
     --runThreadN 7 \
     --sjdbGTFfile $GTFFILE \
     --outReadsUnmapped Fastx \
     --outMultimapperOrder Random \
     --genomeDir $GENOMEDIR \
     --readFilesIn ${INPUTDIR}/R2.fastq.gz ${INPUTDIR}/R1.fastq.gz \
     --outFileNamePrefix $OUTPREFIX \
     --soloType CB_UMI_Simple \
     --soloCBwhitelist $WHITELIST \
     --soloUMIlen 12 \
     --soloCBlen 16 \
     --soloUMIstart 17 \
     --soloCBstart 1 \
     --soloBarcodeReadLength 0 \
     --soloUMIfiltering MultiGeneUMI_CR \
     --soloUMIdedup 1MM_CR \
     --clipAdapterType CellRanger4 \
     --outFilterScoreMin 30 \
     --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts \
     --soloCellFilter EmptyDrops_CR

Any help is appreciated.

Thank you

chromium STARsolo 10X STAR • 1.7k views

updated 23 months ago by friguiahlem8 ▴ 30 • written 2.2 years ago by bp22 ▴ 80

How much memory is available?

2.2 years ago by ATpoint 86k

Hi ATpoint,

Thanks for your response. I am running it on a system with 64 GB of memory and 2 TB of storage. I am allocating 7 of the 8 available cores and have ~1 TB of free disk space. I have run this alignment on the same system before, on a different set of 10x scRNA-seq data, and it completed successfully, as the log below shows:

...
Jul 15 15:03:14 ..... started Solo counting
Jul 15 15:03:14 ... Starting Solo post-map for Gene
Jul 15 15:03:14 ... Finished allocating arrays for Solo 2.07388 GiB
Jul 15 15:06:47 ... Finished reading reads from Solo files nCB=2853747, nReadPerCBmax=298352, yesWLmatch=0
Jul 15 15:09:26 ... Finished collapsing UMIs
Jul 15 15:09:26 ... Solo: writing raw matrix
Solo output directory directory created: RPE1_BRCA1_KO_TALA_RESSolo.out/Gene//raw/
Jul 15 15:09:41 ... Solo: cell filtering
cellFiltering: simple: nUMImax=81876; nUMImin=8188; nCellsSimple=5449
Jul 15 15:09:41 ... starting emptyDrops_CR filtering
Jul 15 15:09:42 ... finished ambient cells counting
Jul 15 15:09:42 ... finished SGT
Jul 15 15:09:42 ... finished ambient profile
Jul 15 15:09:42 ... candidate cells: minUMI=500; number of candidate cells=5234
Jul 15 15:09:42 ... finished observed logProb
Jul 15 15:09:48 ... finished simulations
Jul 15 15:09:49 ... finished emptyDrops_CR filtering: number of additional non-ambient cells=3799
Jul 15 15:09:57 ..... finished Solo counting
ALL DONE!

For the failing run I have also tried aligning just a subset of the scRNA-seq data, and it still failed at the same point. So I am a bit puzzled by this segmentation fault; it might not be related to the memory available for the run.

Best, BP

2.2 years ago by bp22 ▴ 80
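
For anyone wanting to reproduce this kind of partial run, one simple approach is to keep only the first N reads of each FASTQ file. The following is a minimal sketch, assuming gzip-compressed FASTQ with four lines per record and both files listing the reads in the same order. The function name `subsample_fastq` and the output file names are made up for illustration and do not come from the original post.

```shell
# Keep the first N records of a gzipped FASTQ file (4 lines per record).
# Taking the same N from the cDNA file and the barcode file keeps the pairs
# in sync, as long as both files list the reads in the same order.
subsample_fastq() {
  in_fq="$1"    # input .fastq.gz
  out_fq="$2"   # output .fastq.gz
  n_reads="$3"  # number of records to keep
  gzip -dc "$in_fq" | head -n "$((n_reads * 4))" | gzip > "$out_fq"
}
```

Usage would look like `subsample_fastq R1.fastq.gz R1.sub.fastq.gz 1000000`, followed by the same call for R2. Note that the first reads of a file are not a random sample, so this is only meant for quickly checking whether a crash still occurs.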

Hello, I'm new to this field and I'm trying to compare results generated by Cell Ranger with those generated by STARsolo. My question is: why did you pass the annotation file (GTF) in your script to generate the count matrix? In my case I used the GTF and FASTA files only when indexing the reference genome, and then pointed --genomeDir at the indexed genome. Please correct me if I'm wrong.

23 months ago by friguiahlem8 ▴ 30
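
As far as I know, both setups are valid: STAR accepts `--sjdbGTFfile` either when the index is generated or at the mapping step. The "processing annotations GTF" and "inserting junctions into the genome indices" lines in the log from the question appear to come from the second route. The sketch below only prints the two variants side by side for comparison. The function names and all paths passed to them are placeholders, not taken from the thread, and the Solo options are cut down to a minimum for 10x v3.

```shell
# Dry-run helpers: each function only PRINTS a STAR command for inspection.
# All paths passed in are placeholders chosen by the caller.

# Option 1: bake the annotations into the index once, at generation time.
print_index_cmd() {  # $1 = index dir, $2 = genome FASTA, $3 = GTF
  printf 'STAR --runMode genomeGenerate --genomeDir %s --genomeFastaFiles %s --sjdbGTFfile %s\n' \
    "$1" "$2" "$3"
}

# Mapping against such an annotated index: no --sjdbGTFfile at this step.
print_map_cmd() {  # $1 = index dir, $2 = cDNA reads, $3 = barcode reads, $4 = whitelist
  printf 'STAR --genomeDir %s --readFilesIn %s %s --readFilesCommand gunzip -c --soloType CB_UMI_Simple --soloCBwhitelist %s --soloUMIlen 12\n' \
    "$1" "$2" "$3" "$4"
}

# Option 2 (as in the question): add the GTF at mapping time instead; STAR
# then inserts the junctions on the fly on every run.
print_map_cmd_with_gtf() {  # same arguments as print_map_cmd, plus $5 = GTF
  printf '%s --sjdbGTFfile %s\n' "$(print_map_cmd "$1" "$2" "$3" "$4")" "$5"
}
```

Either way, Solo gene counting needs the annotations from one of the two places. Putting them in the index avoids repeating the junction-insertion step for every sample.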

score 2 · Accepted Answer · 2022-11-30

2.2 years ago

bp22 ▴ 80

Dear all,

It seems the issue was caused by the macOS update to macOS Ventura. Reinstalling the STAR aligner solved it, and the run is now successful on the same system.

Thank you.

Best, BP

2.2 years ago by bp22 ▴ 80
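
For readers who hit something similar after an OS upgrade, a quick sanity check of the installed binary may help before reinstalling. This is only a sketch: the function name `check_binary` and its messages are made up for illustration, and passing this check does not rule out a crash later in the run (the segfault above only appeared at the Solo counting stage).

```shell
# Check that a tool is on PATH and can at least report its version.
# Defaults to STAR, which prints its version when called with --version.
check_binary() {
  bin="${1:-STAR}"
  bin_path="$(command -v "$bin" 2>/dev/null)" || {
    echo "$bin: not found on PATH" >&2
    return 1
  }
  echo "path: $bin_path"
  # If even --version fails, the problem is likely the binary, not the data.
  if version="$("$bin_path" --version 2>/dev/null)"; then
    echo "version: $version"
  else
    echo "$bin: failed to run --version (consider reinstalling)" >&2
    return 2
  fi
}
```

Running `check_binary STAR` before and after a reinstall also shows whether the path or the version changed.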

+1 Thanks for following up!

2.2 years ago by ATpoint 86k