Question

weird distribution of intronic reads in single-nuclei data

0

Entering edit mode

4.2 years ago

Tim Padvitski • 0

I observe weird distribution of intronic reads in my single-nuclei data where specific parts of introns and intergenic regions have unexpectedly high coverage. First I though I did something wrong, but I checked other single cell and single nuclei RNAseq experiments (including publicly available) and apparently it's very common issue. On the picture below I've plotted side by side coverage tracks for my snRNAseq experiment, another snRNAseq, single cell RNAseq and bulk RNAseq experiments. It turns out that other single-cell/single-nuclei datasets reproduce the same picture that I see in my data. This plot shows Wt1 gene but many other genes behave similarly.

read coverage over Wt1 in snRNAseq, scRNAseq and bulk RNAseq data

https://drive.google.com/file/d/1ulq9yAGS7MAJOl_-Wx_qo_tHE6S0_0gn/view?usp=sharing

Details:

I mapped reads with STARsolo and then visualised pseudo-bulk read coverage in IGV browser.
I double-checked genome and gene annotation versions.
To rule out that STAR does something wrong with single cell data I've inspected also coverage in bam files produced with cell-ranger and it was virtually identical with STAR bams.

To me it seems like single-cell/single-nuclei RNAseq protocols introduce a bias in distribution of intronic reads that was perhaps not well documented previously. Has anyone described this before? How would you explain these results?

scRNA snRNAseq intrones coverage • 1.3k views

ADD COMMENT • link 4.2 years ago by Tim Padvitski • 0

score 2 · Answer 1 · 2020-08-28

2

Entering edit mode

4.2 years ago

rpolicastro 13k

We know from work done with RNA-velocity that mis-priming events within introns are pretty common in regular scRNA-seq (10-30% of reads on average), and correspond to nascent transcripts. Thus I would expect it to be even more common in single-nuclei data, since by sequencing nuclear RNA you are naturally enriching for nascent transcripts. Here's a 10X Genomics page that has a little blurb on the phenomenon.

ADD COMMENT • link 4.2 years ago by rpolicastro 13k

0

Entering edit mode

Thanks! this seems like a plausible explanation.

UPD: I checked my data and found that spikes in the distribution of intronic reads indeed correspond to polyA/polyT signal!!

ADD REPLY • link 4.2 years ago by Tim Padvitski • 0