Gene expression at 0 in single cell experiments
7 weeks ago
Gerard ▴ 10

Hello.

I previously had a problem where I could not draw a violin plot or a feature plot for some genes in my Seurat objects because of NA values.

By checking my raw matrices, I finally identified the source of the problem, but I cannot understand it.

## I generated my Seurat object:
data <- Read10X("mypath/filtered_feature_bc_matrix")
object <- CreateSeuratObject(counts = data, min.cells = 3, min.features = 200)

## Then, I checked the presence of my gene :
"IFNA5" %in% rownames(object)

## Which returned
FALSE

## Then I redid everything, but this time without the min.cells and min.features arguments, and it returned

TRUE
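The difference between the two results is consistent with how `min.cells` works: `CreateSeuratObject` keeps only features detected in at least that many cells, so a gene with no counts anywhere is dropped with `min.cells = 3` and kept when the argument is omitted. Below is a minimal sketch of that rule on a toy matrix. The counts, the `GENE_X` name and the barcode names are made up for illustration, and the filter is mimicked with the Matrix package rather than by calling Seurat.

```r
library(Matrix)

# Toy genes x cells count matrix (hypothetical values, not real data).
counts <- Matrix(
  c(5, 0, 2, 7,   # IFIT1: detected in 3 cells
    0, 0, 0, 0,   # IFNA5: never detected
    1, 0, 0, 0),  # GENE_X: detected in 1 cell
  nrow = 3, byrow = TRUE, sparse = TRUE,
  dimnames = list(c("IFIT1", "IFNA5", "GENE_X"),
                  paste0("cell", 1:4))
)

# Number of cells in which each gene has at least one count.
cells_per_gene <- Matrix::rowSums(counts > 0)

# Same rule as min.cells = 3: keep genes detected in >= 3 cells.
kept <- rownames(counts)[cells_per_gene >= 3]

"IFNA5" %in% kept              # FALSE: filtered out
"IFNA5" %in% rownames(counts)  # TRUE: present when nothing is filtered
```

So a gene that is missing only when `min.cells` is set is most likely a gene with zero (or nearly zero) counts in the input matrix, not a gene absent from the reference.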

I then pre-processed my object using the standard pipeline, and computed a FeaturePlot of this gene.

The result is clear :

All cells have the same value (0) of “IFNA5”

Which explains everything: the expression is 0 in EVERY cell, and the same holds for all of my interferon genes. I cannot explain this. These are PBMCs, so I should see at least some interferon somewhere, especially since I am working with stimulated samples in which interferon production should be skyrocketing. In fact, interferon-induced genes such as IFITs and ISGs are very highly expressed, but the interferons themselves are at 0. I doubt this is biological.

Does anyone have an explanation? Why would my interferon genes be at 0 everywhere? Could it come from the sequencing itself? I have the same problem with some other genes.

Thank you in advance

seurat single-cell • 547 views

Do you have some QC metrics to show us? For example the number of cells, the number of counts and features, the percentage of mitochondrial genes, etc.


These are the raw matrices, so I have 2 million features, going down to ~7000 after the following filtering: nCount_RNA > 500 & nCount_RNA < 20000 & percent.mt < 12 & percent.ribo < 30

What worries me is that in any case the expression of my genes is 0 everywhere, even without any filtering or processing, so the problem seems to be in the original data.
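One way to confirm this on the unfiltered matrix is to sum the raw counts per gene across all barcodes. The sketch below uses a toy matrix with made-up values; with real data, `counts` would be the matrix returned by `Read10X`, and the gene list is only an example.

```r
library(Matrix)

# Toy stand-in for the unfiltered genes x barcodes matrix (made-up values).
counts <- Matrix(
  c(0, 0, 0, 0, 0,
    0, 0, 0, 0, 0,
    3, 0, 9, 4, 0,
    0, 2, 6, 0, 1),
  nrow = 4, byrow = TRUE, sparse = TRUE,
  dimnames = list(c("IFNA5", "IFNB1", "IFIT1", "ISG15"),
                  paste0("barcode", 1:5))
)

genes_of_interest <- c("IFNA5", "IFNB1", "IFIT1", "ISG15")
sub <- counts[genes_of_interest, , drop = FALSE]

# Total count and number of barcodes with at least one count, per gene.
summary_df <- data.frame(
  gene         = genes_of_interest,
  total_counts = Matrix::rowSums(sub),
  n_barcodes   = Matrix::rowSums(sub > 0)
)
print(summary_df)

# Genes with zero counts in every barcode.
zero_genes <- as.character(summary_df$gene[summary_df$total_counts == 0])
```

If the genes of interest come out with a total of 0 here, no downstream filtering or normalization is responsible, and the counting step itself is the place to look.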


Your features in this context are genes, not cells.

Going from 2M cells to 7000 with these QC parameters sounds fishy. Can you make a violin plot of each QC metric?
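Alongside the violin plots, a per-threshold tally can show which filter removes most barcodes. This is a minimal sketch using the thresholds quoted in the thread on a made-up metadata table. With a Seurat object the table would be `object@meta.data`, assuming the `percent.mt` and `percent.ribo` columns have already been computed.

```r
# Hypothetical per-barcode QC metadata (made-up values).
meta <- data.frame(
  nCount_RNA   = c(120, 800, 5000, 25000, 3000, 40),
  percent.mt   = c(2, 5, 30, 4, 8, 1),
  percent.ribo = c(10, 20, 15, 25, 45, 5),
  row.names    = paste0("barcode", 1:6)
)

# One logical column per threshold quoted in the thread.
pass <- data.frame(
  min_counts = meta$nCount_RNA > 500,
  max_counts = meta$nCount_RNA < 20000,
  mt         = meta$percent.mt < 12,
  ribo       = meta$percent.ribo < 30
)

# Barcodes passing each filter on its own.
colSums(pass)

# Barcodes passing all filters together.
n_pass_all <- sum(rowSums(pass) == ncol(pass))
n_pass_all
```

On an unfiltered 10x matrix, most barcodes are typically empty droplets, so a very large drop at the minimum-count threshold would not be surprising in itself.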

Quoting this :

my genes is at 0 everywhere

Do you mean all your genes of interest (interferon-related), or 100% of the genes in the genes x cells matrix?


My genes of interest are at 0, and so are some others. Aside from these, the data is very clean overall and downstream analyses work very well.

Here are the plots:

[Two images of QC plots, not preserved]


Ask the sequencing platform to provide the GTF/GFF file they used to count your features. Check whether your genes of interest are in this file and how they are annotated. If you cannot find them, the problem is in the annotation file. If you can find them, check in your BAM files whether any reads map to the positions given for those genes in the annotation file. Investigate.
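The annotation lookup can be sketched in base R as below. The tiny GTF written here is hypothetical (made-up coordinates, IDs and source column) so that the example runs on its own; with the real file, `gtf_path` would point to the GTF obtained from the platform. The helper `find_gene` is an illustrative name, not part of any package.

```r
# Write a tiny GTF-like file (hypothetical records) so the sketch is runnable.
gtf_path <- tempfile(fileext = ".gtf")
writeLines(c(
  "#!toy annotation",
  'chr9\tsrc\tgene\t100\t900\t.\t-\t.\tgene_id "G1"; gene_name "IFNA5"; gene_biotype "protein_coding";',
  'chr1\tsrc\tgene\t500\t1500\t.\t+\t.\tgene_id "G2"; gene_name "IFIT1"; gene_biotype "protein_coding";'
), gtf_path)

# Read the 9 tab-separated GTF columns, skipping '#' header lines.
# quote = "" is needed because the attribute column contains double quotes.
gtf <- read.delim(gtf_path, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)

# Return the gene-level records whose attributes mention a given gene_name.
find_gene <- function(gtf, gene) {
  pattern <- paste0('gene_name "', gene, '"')
  hits <- gtf[gtf$V3 == "gene" & grepl(pattern, gtf$V9, fixed = TRUE), ]
  hits[, c("V1", "V4", "V5", "V7", "V9")]
}

find_gene(gtf, "IFNA5")        # one record: the gene is annotated
nrow(find_gene(gtf, "IFNW1"))  # 0: absent from this toy annotation
```

The chromosome, start and end returned for a gene are the coordinates to inspect in the BAM file. The attribute column is also worth reading, since it shows how the gene is annotated (for example its biotype).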


What was your method for mapping reads? What reference did you use?


I didn't do it myself; the sequencing platform provided the matrices directly after running Cell Ranger.
