Question

How does 10x Genomics 3' single cell data have reads aligning to exons that aren't terminal?

21 months ago

kenneditodd ▴ 50

I have 10x single-cell data from a 3' expression kit. From what I understand, the poly(dT)VN primer captures the 3' end of the transcript. We sequenced 150 bp. I am now visualizing my BAM file in IGV, filtered to include only reads used for UMI counting. What I don't understand is this: if I'm only sequencing 150 bp, wouldn't I expect to see reads aligning only to maybe the last two exons of a transcript? However, this is not what I'm seeing. Can someone explain how this is possible given the library prep steps? Is it something to do with the fragmentation?

[IGV snapshot of reads aligning to the Clu gene]

single-nucleus rna-seq igv single-cell • 2.5k views

updated 13 months ago by Ram 44k • written 21 months ago by kenneditodd ▴ 50

Can you tell us how you made the BAM file? Normal cellranger pipeline?

21 months ago by GenoMax 148k

I used the possorted_genome_bam.bam file output by the cellranger-7.1.0 pipeline. Introns are included. I then used samtools to filter the BAM file and keep only reads with the xf:i:25 tag, which are the reads that contribute to UMI counting. Also, this is with the mm39 reference, and there are three Clu isoforms, but they differ only in the coordinates of the first exon.
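For anyone who wants to reproduce the filter described above, here is a minimal sketch that works on SAM text rather than on the BAM directly. It assumes the records are tab-separated SAM lines with optional tags starting in column 12, and it matches the exact value `xf:i:25` as in the post. The function names `has_xf_value` and `filter_sam_lines` are hypothetical, not part of samtools or Cell Ranger.

```python
from typing import Iterable, Iterator


def has_xf_value(sam_line: str, value: int = 25) -> bool:
    """Return True if a SAM alignment line carries the tag xf:i:<value>."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 SAM columns are mandatory; optional tags follow them.
    return f"xf:i:{value}" in fields[11:]


def filter_sam_lines(lines: Iterable[str], value: int = 25) -> Iterator[str]:
    """Yield header lines unchanged and only alignments with xf:i:<value>."""
    for line in lines:
        # Header lines start with '@' and are kept so the output stays valid SAM.
        if line.startswith("@") or has_xf_value(line, value):
            yield line
```

One way to use it would be to stream the output of `samtools view -h possorted_genome_bam.bam` through `filter_sam_lines` and write the surviving lines back out. Note that this is an exact match on 25, so reads with any other combination of xf bits are dropped.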

21 months ago by kenneditodd ▴ 50

score 6 · Answer 1 · 2023-03-10

21 months ago

Rob 6.9k

This happens regularly with the 10x Chromium chemistries. I can't tell whether this is the only annotated transcript (isoform) at this locus. If it is not, then alternative isoforms with different terminal exons may explain some or all of the pileup that you see.

However, even in the absence of that, there are other well-documented causes for what you observe. Specifically, spliced (mature) RNA is not perfectly isolated from unspliced (nascent) RNA, so both types of molecules are tagged and sequenced. This is highlighted in the tech-note from 10x from last year, and this preprint among other places. When unspliced RNAs are sequenced, priming can occur not only at the polyA tail but also at internal polyA and near-polyA motifs. This can lead to intronic reads and, when the polyA motif is near the boundary of the preceding exon, to exonic reads for non-terminal exons or exons far from the polyA tail. In fact, we talked about this in some detail in our recent preprint; see specifically section 2.3.
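To get a feel for where internal priming could happen at a locus, one rough approach is to scan the genomic sequence of the gene (introns included) for A-rich windows. The sketch below is only an illustration: the function name `a_rich_windows`, the 20 nt window, and the threshold of 15 A's are arbitrary assumptions, not values taken from 10x or from the preprints mentioned above.

```python
from typing import List


def a_rich_windows(seq: str, window: int = 20, min_a: int = 15) -> List[int]:
    """Return 0-based start positions of windows with at least min_a A's.

    The sequence is assumed to be given in transcript (sense) orientation,
    so that A-rich stretches are what an oligo(dT) primer could anneal to.
    """
    seq = seq.upper()
    hits: List[int] = []
    if len(seq) < window:
        return hits

    # Count A's in the first window, then slide one base at a time.
    count = seq[:window].count("A")
    if count >= min_a:
        hits.append(0)
    for start in range(1, len(seq) - window + 1):
        count -= seq[start - 1] == "A"           # base leaving the window
        count += seq[start + window - 1] == "A"  # base entering the window
        if count >= min_a:
            hits.append(start)
    return hits
```

Comparing the reported positions with where the read pileups sit in IGV (roughly a fragment length upstream of an A-rich stretch) can suggest whether internal priming is a plausible explanation for a given pileup.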

21 months ago by Rob 6.9k

Adding to this, remember that unlike regular DNA amplification by PCR (high temperature, specific primers), the entire RT reaction is done at low temperature with enrichment primers rather than specific primers. PolyT (which binds polyA) is just a motif, as Rob says; at the low temperature used for RT, it will bind plenty of regions with high A content. It is really just an enrichment (or a depletion of non-polyA elements) rather than a specific amplification. The goal is to get rid of the > 95% of RNA that is not relevant for the assay, such as ribosomal and structural elements.

21 months ago by ATpoint 86k

What are structural elements? Like hairpins? So A-depleted in theory?

21 months ago by benformatics 4.1k

Thank you so much. This is very helpful.

21 months ago by kenneditodd ▴ 50