Transcriptome Improvement Using Cell Ranger Count vs. ARC for RNA-Seq Data from Frozen Nuclei in Multimodal Analysis
3 months ago
NaomiK • 0

Hi everyone,

I'm working on a multimodal analysis of single cells from frozen nuclei, capturing both ATAC and RNA data from the same nuclei. While processing my data, I noticed a significant improvement in the transcriptome when using the Cell Ranger Count pipeline (v7.1.0), which handles only RNA-seq, compared to Cell Ranger ARC (v2.0.2), which handles both ATAC and RNA.

Here are the anonymized metrics I gathered:

| Metric | Sample1 (ARC Count) | Sample1 (Count) | Sample2 (ARC Count) | Sample2 (Count) |
|---|---|---|---|---|
| Estimated Number of Cells | 1,053 | 464 | 562 | 564 |
| Mean Raw Reads per Cell | 31,233.53 | 70,881 | 57,313.97 | 57,111 |
| Fraction of Transcriptomic Reads in Cells | 19.3% | 38.3% | 22.6% | 43.8% |
| Median UMI Counts per Cell | 144 | 1,056 | 365 | 1,063 |
| Median Genes per Cell | 117 | 894 | 275 | 818 |
| Total Genes Detected | 21,462 | 27,316 | 18,662 | 26,112 |
| Reads Mapped to Genome | 63.4% | 63.4% | 45.1% | 45.1% |
| Reads Mapped Confidently to Genome | 49.7% | 49.7% | 28.4% | 28.4% |
| Reads Mapped Confidently to Intergenic Regions | 16.7% | 16.7% | 6.6% | 6.6% |
| Reads Mapped Confidently to Intronic Regions | 25.0% | 25.0% | 15.7% | 15.7% |
| Reads Mapped Confidently to Exonic Regions | 8.0% | 8.0% | 6.0% | 6.0% |
| Reads Mapped Confidently to Transcriptome | 21.4% | 21.4% | 16.4% | 16.4% |
| Reads Mapped Antisense to Gene | 11.3% | 11.3% | 5.1% | 5.1% |

The key metrics that stood out to me were:

  • The Fraction of Transcriptomic Reads in Cells increased from 19.3% (ARC) to 38.3% (Count) for Sample1 and from 22.6% (ARC) to 43.8% (Count) for Sample2.
  • The Median UMI Counts per Cell also saw a huge jump when using Cell Ranger Count: from 144 to 1,056 for Sample1 and from 365 to 1,063 for Sample2.
  • A significant increase in Median Genes per Cell and Total Genes Detected was also observed in both datasets when using Count over ARC.
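
To put the per-cell changes side by side, here is a minimal sketch that computes the Count-to-ARC ratio for the metrics that differ between the two pipelines. The numbers are copied from the metrics posted in this thread; the dictionary layout and the `fold_change` helper are only illustrative names, not part of any 10x tool.

```python
# Values copied from the posted metrics: (ARC, Count) per sample.
metrics = {
    "Estimated Number of Cells": {"Sample1": (1053, 464), "Sample2": (562, 564)},
    "Fraction of Transcriptomic Reads in Cells (%)": {"Sample1": (19.3, 38.3), "Sample2": (22.6, 43.8)},
    "Median UMI Counts per Cell": {"Sample1": (144, 1056), "Sample2": (365, 1063)},
    "Median Genes per Cell": {"Sample1": (117, 894), "Sample2": (275, 818)},
    "Total Genes Detected": {"Sample1": (21462, 27316), "Sample2": (18662, 26112)},
}


def fold_change(arc_value, count_value):
    """Ratio of the Cell Ranger Count value to the Cell Ranger ARC value."""
    return count_value / arc_value


for metric, samples in metrics.items():
    for sample, (arc_value, count_value) in samples.items():
        ratio = fold_change(arc_value, count_value)
        print(f"{metric} | {sample}: {ratio:.2f}x (Count / ARC)")
```

Note that the read-level mapping percentages are identical between pipelines, so only the cell-level metrics are included here.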

Has anyone else encountered a similar issue when running multimodal analyses on frozen nuclei? If so, what steps did you take moving forward? Did you modify any processing or analysis steps to resolve this? I’m curious to know how others have approached this.

Thanks in advance!

multiomics scATAC cellranger scRNA 10X • 1.1k views

10x recommends that when analyzing multi-modal data you should use cellranger-arc.

From: https://kb.10xgenomics.com/hc/en-us/articles/360059656912-Can-I-analyze-only-the-Gene-Expression-data-from-my-single-cell-multiome-experiment

Note: This type of analysis is not officially supported. We do not recommend or advise the analysis of single modalities from the multiome assay. The main benefits of the multiome assay i.e. joint cell-calling and feature linkages will not be available when analysis is carried out in this way.

3 months ago

The key metrics that stood out to me were:

  • The Fraction of Transcriptomic Reads in Cells increased from 19.3% (ARC) to 38.3% (Count) for Sample1 and from 22.6% (ARC) to 43.8% (Count) for Sample2.
  • The Median UMI Counts per Cell also saw a huge jump when using Cell Ranger Count: from 144 to 1,056 for Sample1 and from 365 to 1,063 for Sample2.
  • A significant increase in Median Genes per Cell and Total Genes Detected was also observed in both datasets when using Count over ARC.

Some of these changes are presumably due to differences in cell calling between the two pipelines. You have many more (likely erroneous) cell calls for Sample1 with cellranger-arc. The Sample2 discrepancies are more confusing. Providing the commands may be illuminating. Are you using the ARC-v1 chemistry in your cellranger count commands?

Genomax pointed out the relevant portion of the docs, which lists a key point: joint cell calling. cellranger-arc uses both the ATAC and gene expression data to inform and call cells. However, it's pretty generous with what it calls a cell. And in the multiome assay, you have two data modalities that can each fail, so you tend to end up with more droplets/cells lost during QC/filtering than with typical snRNA-seq.

I'd be interested in hearing 10X's response to this as well, though.


I ran the following command using the ARC-v1 chemistry in the Cell Ranger Count pipeline:

cellranger count --id=run_sample2 --transcriptome=/path/to/transcriptome --fastqs=/path/to/fastqs --sample=sample2 --localcores=32 --localmem=64 --chemistry=ARC-v1 --create-bam true --include-introns true

Thanks again for your helpful feedback!


I'd kick this to 10X for sure. cellranger being closed source makes it difficult to determine what may be going on in the backend. If you find out anything else, please report back here!

My guess is that this still all stems from the set of cells called with only GEX being different from the set called with both GEX and ATAC, given that the actual mapping statistics are identical. In one assay, the joint cell calling tends to accept worse metrics than you'd probably be okay with in isolation, as long as the other assay performed well. I expect that if you compared the barcodes of the "cells" identified in each, you would not have as much overlap as you might think.
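
The barcode comparison suggested here could be sketched roughly as follows. This is only an illustration under a few assumptions: that each run's called cells are listed one per line in `outs/filtered_feature_bc_matrix/barcodes.tsv.gz`, and that both pipelines report the same GEX barcode sequences with the same `-1` suffix, which is worth confirming on your own outputs. The function names `read_barcodes` and `compare_barcodes` are hypothetical helpers, not part of Cell Ranger.

```python
import gzip
import sys
from pathlib import Path


def read_barcodes(path):
    """Read one barcode per line from a plain or gzipped barcodes.tsv file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return {line.strip() for line in handle if line.strip()}


def compare_barcodes(arc_barcodes, count_barcodes):
    """Summarize the overlap between two sets of called-cell barcodes."""
    shared = arc_barcodes & count_barcodes
    union = arc_barcodes | count_barcodes
    return {
        "arc_only": len(arc_barcodes - count_barcodes),
        "count_only": len(count_barcodes - arc_barcodes),
        "shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders for your own run directories):
    #   python compare_barcodes.py <arc_barcodes.tsv.gz> <count_barcodes.tsv.gz>
    arc = read_barcodes(sys.argv[1])
    count = read_barcodes(sys.argv[2])
    for key, value in compare_barcodes(arc, count).items():
        print(f"{key}: {value}")
```

If the overlap is low, restricting both matrices to the shared barcodes before recomputing the per-cell medians would show how much of the difference comes from cell calling alone.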
