Question

Chemistry Detection Error in CellRanger for Single-Cell RNA-Seq Data


10 months ago

97sun3 ▴ 10

I have downloaded public single-cell RNA sequencing data with the following GSM IDs: GSM5718030, GSM5718026, GSM5718021, GSM5718019, and GSM5718013. The data was sequenced on an Illumina HiSeq 2500, and the libraries were prepared with the 10x Genomics Chromium platform according to the recommended procedures.

I am attempting to process the data using CellRanger, which should be compatible since the data was processed with the Chromium platform. Here is the code snippet I'm using for mapping:

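# run cellranger count (maximum CPUs 8; maximum RAM 24GB)
# --localmem is given in GB; without it the 24GB limit in the comment above is not enforced
cellranger count \
  --id="${srr_id}" \
  --transcriptome="${ref}" \
  --fastqs="${fastq}" \
  --sample="${srr_id}" \
  --create-bam=true \
  --chemistry=auto \
  --localcores=8 \
  --localmem=24 \
  --output-dir="${srr_id}_results"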

However, I encountered the following error during chemistry detection:

[error] Pipestance failed. Error log at: SRR17134429_results/SC_RNA_COUNTER_CS/SC_MULTI_CORE/MULTI_CHEMISTRY_DETECTOR/DETECT_COUNT_CHEMISTRY/fork0/chnk0-u43bcd1dc45/_errors

Log message: An extremely low rate of correct barcodes was observed for all the candidate chemistry choices for the input: Sample SRR17134429 in "/data/project/bio/shivashankar/sunho/old_data/scPBMC_breast/FASTQ_files2". Please check your input data.
- 0.1% for chemistry SC3Pv4
- 0.1% for chemistry SC3Pv3
- 0.1% for chemistry SC3Pv3HT
- 0.0% for chemistry SC5P-PE-v3
- 0.0% for chemistry SC5P-PE
- 0.0% for chemistry SC3Pv2
- 0.0% for chemistry ARC-v1
- 0.0% for chemistry SC3Pv3LT

It seems that Cell Ranger is failing to detect the correct chemistry for the input data: the rate of correct barcodes is close to zero for every candidate chemistry.

Does anyone know why this might be happening? Is there something specific about the data or the code that I might need to adjust to resolve this issue?

cellranger single-cell RNA • 1.3k views

updated 10 months ago by GenoMax 152k • written 10 months ago by 97sun3 ▴ 10

score 3 · Accepted Answer · 2024-08-30

"which should be compatible since the data was processed with the Chromium platform."

This does not appear to be a 10x dataset. See example description for: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM5718030

CD8+ T cells were stained with monoclonal antibodies for CD8, PD-1, CD39, CD103, CD69, CD137, and CCR7 as described above and then single-cell index sorted using a BD ARIA III FACS system into BD Genomics Precise WTA 96 well plates for whole transcriptome analysis.
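A description like the one quoted above can also be pulled from the command line before committing to a pipeline. The following is a minimal sketch, not an official tool: the helper names `fetch_geo_sample` and `geo_protocol_fields` are made up for illustration, and the field list is just the subset of SOFT-format sample fields that usually mentions the library preparation method. It assumes `curl` is installed and that the GEO text view of the accession is reachable.

```shell
# Hypothetical helper: download the plain-text (SOFT) record for one GEO accession.
fetch_geo_sample() {
  curl -s "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=$1&targ=self&form=text&view=quick"
}

# Hypothetical helper: keep only the sample fields that tend to describe how the
# library was made and sequenced. Reads SOFT text on stdin.
geo_protocol_fields() {
  grep -E '^!Sample_(extract_protocol|description|data_processing|library_strategy|instrument_model)'
}

# Usage (needs network access):
#   fetch_geo_sample GSM5718030 | geo_protocol_fields
```

If the output does not mention 10x or Chromium anywhere, Cell Ranger's chemistry detection is unlikely to succeed, whatever flags are used.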

This data is from the BD Precise platform, which no longer appears to be listed on the BD website. If you can work out the read structure, you may be able to reanalyze the data from the two reads.
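A first step toward working out the read structure is to look at the read lengths in each FASTQ file. Below is a rough sketch under some assumptions: the function name `read_length_hist` is invented here, the files are standard four-line FASTQ (optionally gzipped), and only the first N reads are sampled. For comparison, a 10x 3' v3 library has a 28 bp Read 1 (16 bp cell barcode plus 12 bp UMI) and v2 has a 26 bp Read 1; I do not know the BD Precise layout, so this only tells you whether the reads look 10x-like.

```shell
# Hypothetical helper: tabulate read lengths for the first N reads of a FASTQ file.
# Prints "length<TAB>count", sorted by length. N defaults to 10000.
read_length_hist() {
  fq="$1"
  n="${2:-10000}"
  # Decompress only if the file name ends in .gz
  if [ "${fq##*.}" = "gz" ]; then gzip -cd "$fq"; else cat "$fq"; fi |
    head -n "$((n * 4))" |
    awk 'NR % 4 == 2 { c[length($0)]++ } END { for (l in c) print l "\t" c[l] }' |
    sort -n
}

# Usage (file names below are placeholders for your own R1/R2 files):
#   read_length_hist sample_R1.fastq.gz
#   read_length_hist sample_R2.fastq.gz
```

If both reads turn out to be long and of similar length, the barcode and UMI are probably not laid out the way any of the Cell Ranger chemistries expect, which would match the near-zero barcode rates in the error.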

The only reference to it that I can find is a press release, which contains the following line and suggests that an actual reanalysis may be difficult:

The sequencing data is processed and analyzed by a proprietary analysis pipeline specific for RNA quantification for each cell analyzed.

You may have to start from the raw counts provided with the sample (note that these are for hg18, which is relatively old):

Supplementary_files_format_and_content: Both raw counts and imputed counts were provided here with the meta-data table for each cell