Hi everyone,
I'm working with 10x single-cell multiome tumor data (scRNA-seq + scATAC-seq), and my goal is to develop a method to identify CNV patterns using SNP profiles, with a focus on somatic mutations and potential haplotype assignment.
What I have done so far:
I used cellSNP-lite with a variant list filtered for MAF > 0.05 (~7M SNPs).
It outputs AD and DP matrices, which I load into an AnnData object. Each feature is a SNP in CHR_POS_ALT_REF format.
I filter outliers, bin SNPs, and then plan to explore CNV patterns.
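For concreteness, the loading step currently looks roughly like the sketch below. It assumes the usual cellSNP-lite output file names (cellSNP.tag.AD.mtx, cellSNP.tag.DP.mtx, cellSNP.samples.tsv, cellSNP.base.vcf.gz); exact names/compression depend on the options you ran with, and the CHR_POS_ALT_REF IDs are just my own naming convention:

```python
import gzip

import anndata as ad
import pandas as pd
import scipy.io as sio


def load_cellsnp(outdir):
    """Load cellSNP-lite AD/DP matrices into a single AnnData (cells x SNPs)."""
    # AD/DP are MatrixMarket files with SNPs as rows and cells as columns,
    # so transpose to the usual cells-as-obs orientation.
    AD = sio.mmread(f"{outdir}/cellSNP.tag.AD.mtx").tocsr().T
    DP = sio.mmread(f"{outdir}/cellSNP.tag.DP.mtx").tocsr().T

    barcodes = pd.read_csv(f"{outdir}/cellSNP.samples.tsv", header=None)[0].tolist()

    # Build CHR_POS_ALT_REF feature names from the base VCF.
    snp_ids = []
    with gzip.open(f"{outdir}/cellSNP.base.vcf.gz", "rt") as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            chrom, pos, _, ref, alt = line.rstrip("\n").split("\t")[:5]
            snp_ids.append(f"{chrom}_{pos}_{alt}_{ref}")

    adata = ad.AnnData(X=DP, layers={"AD": AD, "DP": DP})
    adata.obs_names = barcodes
    adata.var_names = snp_ids
    return adata


adata_gex = load_cellsnp("cellsnp_gex_out")  # hypothetical output directory
```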
With this setup:
- I detect ~1M SNPs for GEX (RNA)
- ~2.5M SNPs for ATAC -> both manageable and interpretable.
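The binning I mentioned above is essentially aggregating AD/DP into fixed genomic windows per cell, roughly like this (only a sketch; the 10 Mb bin size and the per-bin BAF-like ratio are my current choices, not anything standard, and it assumes chromosome names without underscores):

```python
import anndata as ad
import numpy as np
import pandas as pd
import scipy.sparse as sp


def bin_snps(adata, bin_size=10_000_000):
    """Aggregate per-SNP AD/DP counts into fixed-size genomic bins per cell."""
    # var_names are CHR_POS_ALT_REF; assumes chromosome names contain no underscores.
    var = pd.DataFrame(
        [v.split("_") for v in adata.var_names],
        columns=["chrom", "pos", "alt", "ref"],
    )
    bin_id = var["chrom"] + ":" + (var["pos"].astype(int) // bin_size).astype(str)
    bins = pd.Categorical(bin_id)

    # SNP -> bin assignment matrix (n_snps x n_bins), so summing per bin is a matmul.
    assign = sp.csr_matrix(
        (np.ones(adata.n_vars), (np.arange(adata.n_vars), bins.codes)),
        shape=(adata.n_vars, len(bins.categories)),
    )

    AD_bin = adata.layers["AD"] @ assign  # cells x bins, summed ALT counts
    DP_bin = adata.layers["DP"] @ assign  # cells x bins, summed total depth

    binned = ad.AnnData(X=DP_bin, layers={"AD": AD_bin, "DP": DP_bin})
    binned.obs_names = adata.obs_names
    binned.var_names = list(bins.categories)

    # Per-cell, per-bin "BAF-like" fraction; bins with zero depth stay NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        binned.layers["BAF"] = np.asarray(AD_bin.todense()) / np.asarray(DP_bin.todense())
    return binned
```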
Problem with larger variant database:
When I use a larger variant database (MAF > 0.0005, ~36.6M SNPs):
- ~6M SNPs for GEX
- ~13M SNPs for ATAC
That’s a lot of data. My concern: since this is tumor data with high heterogeneity and a rich mutation landscape, relying only on known variants might cause me to miss relevant somatic variants.
Ideas I’m considering:
- Running cellSNP-lite in de novo mode (without a variant list), but the runtime increases dramatically, and I don’t know how large the resulting data would get.
An alternative is:
- Split BAMs by cell barcode
- Run bcftools mpileup in parallel
- Build AD/DP matrices from the VCFs
- Create an AnnData object from these counts.
But again, this would probably also result in an immense amount of data.
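To make the matrix-building step concrete, here is roughly what I have in mind (only a sketch, not benchmarked; it assumes one VCF per barcode, named by barcode, produced with FORMAT/AD and FORMAT/DP annotations, e.g. via bcftools mpileup -a FORMAT/AD,FORMAT/DP piped into bcftools call):

```python
from pathlib import Path

import anndata as ad
import pysam
import scipy.sparse as sp


def vcfs_to_anndata(vcf_dir):
    """Aggregate per-cell VCFs (one per barcode) into a cells x SNPs AnnData.

    Assumes each VCF carries FORMAT/AD and FORMAT/DP for a single sample.
    """
    vcf_paths = sorted(Path(vcf_dir).glob("*.vcf.gz"))
    barcodes = [p.name.replace(".vcf.gz", "") for p in vcf_paths]

    snp_index = {}  # CHR_POS_ALT_REF -> column index
    rows, cols, ad_vals, dp_vals = [], [], [], []

    for i, path in enumerate(vcf_paths):
        with pysam.VariantFile(str(path)) as vf:
            for rec in vf:
                if rec.alts is None or "AD" not in rec.format or "DP" not in rec.format:
                    continue
                key = f"{rec.chrom}_{rec.pos}_{rec.alts[0]}_{rec.ref}"
                j = snp_index.setdefault(key, len(snp_index))
                sample = next(iter(rec.samples.values()))  # single-sample VCF per cell
                allele_depths, depth = sample["AD"], sample["DP"]
                if depth is None or allele_depths is None or len(allele_depths) < 2:
                    continue
                rows.append(i)
                cols.append(j)
                ad_vals.append(allele_depths[1])  # ALT allele depth
                dp_vals.append(depth)

    shape = (len(vcf_paths), len(snp_index))
    AD = sp.coo_matrix((ad_vals, (rows, cols)), shape=shape).tocsr()
    DP = sp.coo_matrix((dp_vals, (rows, cols)), shape=shape).tocsr()

    adata = ad.AnnData(X=DP, layers={"AD": AD, "DP": DP})
    adata.obs_names = barcodes
    adata.var_names = list(snp_index)
    return adata
```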
My questions:
- Is the de novo approach to SNP calling worth the huge extra computational cost for CNV pattern detection in tumor datasets?
- Would filtering BAMs (based on flags) and splitting them by cell barcode be a reasonable and scalable pipeline?
- And importantly, should I just stick to a known variant list and accept the tradeoff in sensitivity?
Any insights or experiences would be greatly appreciated!