Best approach for SNP calling in tumor single-cell multiome data (scRNA-seq + scATAC-seq) for CNV analysis
15 days ago

Hi everyone,

I'm working with 10x single-cell multiome tumor data (scRNA-seq + scATAC-seq), and my goal is to develop a method to identify CNV patterns using SNP profiles, with a focus on somatic mutations and potential haplotype assignment.

What I have done so far:

  • I used cellSNP-lite with a variant list filtered for MAF > 0.05 (~7M SNPs).

    • It outputs AD and DP matrices, which I load into an AnnData object. Each feature is a SNP in CHR_POS_ALT_REF format.

    • I filter outliers, bin SNPs, and then plan to explore CNV patterns.
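
The binning step can be sketched as follows. This is a minimal illustration, not the actual pipeline: the helper names (parse_snp_id, bin_counts, allele_fraction) and the bin_size parameter are hypothetical, and plain Python lists stand in for the sparse AD/DP matrices so that the example has no dependencies. It assumes feature names follow the CHR_POS_ALT_REF pattern described above and that positions are 1-based.

```python
from collections import defaultdict


def parse_snp_id(snp_id):
    """Split 'CHR_POS_ALT_REF' into (chrom, pos).

    rsplit is used so contig names that contain underscores stay intact.
    """
    chrom, pos, _alt, _ref = snp_id.rsplit("_", 3)
    return chrom, int(pos)


def bin_counts(snp_ids, ad_rows, dp_rows, bin_size=1_000_000):
    """Sum AD and DP over all SNPs that fall in the same fixed-width bin.

    ad_rows[i] and dp_rows[i] hold the per-cell counts for snp_ids[i].
    Returns two dicts keyed by (chrom, bin_index) with per-cell sums.
    """
    n_cells = len(ad_rows[0]) if ad_rows else 0
    ad_bins = defaultdict(lambda: [0] * n_cells)
    dp_bins = defaultdict(lambda: [0] * n_cells)
    for snp_id, ad, dp in zip(snp_ids, ad_rows, dp_rows):
        chrom, pos = parse_snp_id(snp_id)
        key = (chrom, (pos - 1) // bin_size)  # 1-based position -> 0-based bin
        for j in range(n_cells):
            ad_bins[key][j] += ad[j]
            dp_bins[key][j] += dp[j]
    return dict(ad_bins), dict(dp_bins)


def allele_fraction(ad_bins, dp_bins):
    """AD / DP per bin and cell; None where a cell has no coverage in the bin."""
    return {
        key: [a / d if d else None for a, d in zip(ad_bins[key], dp_bins[key])]
        for key in ad_bins
    }
```

One caveat with this sketch: summing raw ALT counts across unphased SNPs mixes both haplotypes, so allelic imbalance within a bin can average out. Phasing the SNPs first, or aggregating a per-SNP imbalance measure instead, may be needed before the binned values are informative for CNV detection.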

With this setup:

  • I detect ~1M SNPs for GEX (RNA)
  • ~2.5M SNPs for ATAC

Both are manageable and interpretable.

Problem with larger variant database:

When I use a larger variant database (MAF > 0.0005, ~36.6M SNPs):

  • ~6M SNPs for GEX
  • ~13M SNPs for ATAC

That’s a lot of data. My concern is that tumor data is highly heterogeneous and has a rich mutation landscape, so relying only on known variants might cause me to miss relevant somatic SNPs.

Ideas I’m considering:

  • Running cellSNP-lite in de novo mode (without a variant list), but the runtime increases dramatically, and I don’t know how large the resulting data would get.

An alternative is:

  • Split BAMs by cell barcode
  • Run bcftools mpileup in parallel
  • Build AD-DP matrices from VCFs
  • Create an AnnData object from these counts.

But again, this would probably also result in immense data size.
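
The matrix-building step of this alternative could look roughly like the sketch below. It is an illustration under several assumptions: each per-cell VCF is single-sample and was produced with the FORMAT/AD annotation requested from bcftools mpileup; only biallelic single-base records are kept; DP is taken as REF + ALT depth to mirror the AD/DP convention used above; and the function names (parse_vcf_record, build_counts) are hypothetical. Counts are stored sparsely, keyed by (snp_id, barcode), so sites with no coverage take no memory.

```python
def parse_vcf_record(line):
    """Return (snp_id, alt_depth, ref_plus_alt_depth) for one VCF line, or None.

    Records are skipped when they lack FORMAT/AD, are not single-base
    substitutions, or carry only the symbolic '<*>' allele that mpileup
    emits at sites without an observed ALT.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None
    chrom, pos, ref, alts = fields[0], fields[1], fields[3], fields[4].split(",")
    alt = alts[0]
    if alt in ("<*>", ".") or len(ref) != 1 or len(alt) != 1:
        return None
    fmt = dict(zip(fields[8].split(":"), fields[9].split(":")))
    ad_field = fmt.get("AD")
    if ad_field is None or "." in ad_field:
        return None
    depths = [int(x) for x in ad_field.split(",")]
    if len(depths) < 2:
        return None
    snp_id = f"{chrom}_{pos}_{alt}_{ref}"  # CHR_POS_ALT_REF, as in the post
    return snp_id, depths[1], depths[0] + depths[1]


def build_counts(vcf_lines_by_cell):
    """Collect sparse AD and DP counts from per-cell VCF lines.

    vcf_lines_by_cell maps a cell barcode to an iterable of VCF text lines.
    Returns two dicts keyed by (snp_id, barcode); zero entries are omitted.
    """
    ad, dp = {}, {}
    for barcode, lines in vcf_lines_by_cell.items():
        for line in lines:
            if line.startswith("#"):
                continue  # header lines
            record = parse_vcf_record(line)
            if record is None:
                continue
            snp_id, alt_depth, depth = record
            if depth == 0:
                continue
            dp[(snp_id, barcode)] = depth
            if alt_depth:
                ad[(snp_id, barcode)] = alt_depth
    return ad, dp
```

In practice the lines for each barcode would come from an open file handle per cell, and the two dicts could be converted to sparse matrices before building the AnnData object.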

My questions:

  1. Is a de novo approach to SNP calling worth the large extra computational cost for CNV pattern detection in tumor datasets?
  2. Would filtering BAMs (based on flags) and splitting by cell be a reasonable and scalable pipeline?
  3. And importantly, should I just stick to a known variant list and accept the tradeoff in sensitivity?

Any insights or experiences would be greatly appreciated!

variant SNP single-cell • 348 views