In Seurat, How Does nCount_RNA Differ from nFeature_RNA?
5.1 years ago

I'm reading the Seurat user guide (https://satijalab.org/seurat/v3.1/pbmc3k_tutorial.html), and for QC they mention using:

"The number of unique genes detected in each cell" and "The total number of molecules detected within a cell"

They then refer to them as nCount_RNA and nFeature_RNA, but I'm not sure which is which. So my question is:

1.) What is nCount_RNA and what is nFeature_RNA?
2.) Later in the pipeline, when the data are normalized, the guide says the method "normalizes the feature expression measurements for each cell by the total expression." Can anybody explain that?

seurat single-cell • 63k views

How is nCount_RNA different from library size? Thanks!

5.1 years ago

nFeature_RNA is the number of genes detected in each cell. nCount_RNA is the total number of molecules detected within a cell. A low nFeature_RNA for a cell indicates that it may be dead or dying, or that the barcode is an empty droplet. A high nCount_RNA and/or nFeature_RNA indicates that the "cell" may in fact be a doublet (or multiplet). Combined with the percentage of mitochondrial reads, removing outliers on these metrics removes most doublets, dead cells, and empty droplets, which is why filtering is a common pre-processing step.
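As a rough illustration of the two definitions above, here is a minimal sketch in plain Python (not Seurat's own code). The toy count matrix, the gene and cell names, and the helper `qc_metrics` are all made up for the example. It assumes that nCount_RNA is the per-cell sum of counts and nFeature_RNA is the per-cell number of genes with a non-zero count.

```python
# Toy genes x cells count matrix (rows = genes, columns = cells).
# All names and values are hypothetical.
counts = {
    "GeneA": [5, 0, 1],
    "GeneB": [0, 0, 3],
    "GeneC": [2, 1, 0],
    "GeneD": [0, 0, 7],
}
cells = ["cell1", "cell2", "cell3"]


def qc_metrics(counts, cells):
    """Return per-cell total counts and per-cell number of detected genes."""
    n_count = {}
    n_feature = {}
    for j, cell in enumerate(cells):
        column = [row[j] for row in counts.values()]
        # Analogue of nCount_RNA: total molecules counted in this cell.
        n_count[cell] = sum(column)
        # Analogue of nFeature_RNA: genes with at least one count in this cell.
        n_feature[cell] = sum(1 for x in column if x > 0)
    return n_count, n_feature


n_count, n_feature = qc_metrics(counts, cells)
print(n_count)    # {'cell1': 7, 'cell2': 1, 'cell3': 11}
print(n_feature)  # {'cell1': 2, 'cell2': 1, 'cell3': 3}
```

In this toy data, cell2 would look like a low-quality barcode (one gene, one molecule), while cell3 has the highest values on both metrics.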

The NormalizeData step is basically just ensuring that expression values across cells are on a comparable scale. By default, it divides the counts for each gene by the total counts in the cell, multiplies that value by the scale.factor (10,000 by default), and then natural-log-transforms the result using log1p, i.e. log(1 + x).
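The arithmetic described above can be sketched in a few lines of plain Python. This is an illustration only, not Seurat's implementation: the function name `log_normalize` and the toy counts are hypothetical, `scale_factor` stands in for scale.factor, and the use of `log1p` (log(1 + x), which keeps zeros at zero) reflects my understanding of the default LogNormalize method and should be treated as an assumption.

```python
import math


def log_normalize(cell_counts, scale_factor=10_000):
    """Divide by the cell total, multiply by scale_factor, then take log(1 + x).

    Assumes the cell has at least one count (total > 0).
    """
    total = sum(cell_counts)
    return [math.log1p(x / total * scale_factor) for x in cell_counts]


# Two hypothetical cells with the same relative expression
# but a 10-fold difference in sequencing depth.
cell_a = [5, 0, 2, 3]      # 10 molecules in total
cell_b = [50, 0, 20, 30]   # 100 molecules in total

norm_a = log_normalize(cell_a)
norm_b = log_normalize(cell_b)

# After normalization the two cells are directly comparable.
for a, b in zip(norm_a, norm_b):
    print(round(a, 4), round(b, 4))
```

The point of the example is that the depth difference between the two cells disappears after normalization, which is what "normalizes the feature expression measurements for each cell by the total expression" means in practice.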


Hey Jared, appreciate the easy-to-understand response! Quick question, do you know how exactly Seurat determines the number of molecules within a cell?


You know, I hunted around a bit and couldn't find exactly where nCount_RNA is defined. Presumably, the counts are read in by Read10X (from the .mtx file) or ReadAlevin and then summarized for each cell into nCount_RNA during CreateSeuratObject.


Hi, Jared,

The definition of nCount_RNA (nCount_RNA is the total number of molecules detected within a cell) is pretty clear. However, I am still confused by the definition of nFeature_RNA (nFeature_RNA is the number of unique genes detected in each cell.)

What do you mean by "unique genes"? What are the detected genes unique relative to? For example, say 1000 genes are detected in a cell in total and 400 of them are detected "uniquely". Does that mean these 400 genes are detected only in this cell and not in any other cell, so that they are unique to this cell relative to all other cells? Am I correct?

Thanks.


Sorry, that phrasing was indeed a bit confusing - it's just the number of genes detected in each cell. They are not "unique" to that cell. I will edit my answer to clarify.


What is meant by "not unique"? They should be unique to a cell, right?


What he was clarifying is that the nFeature_RNA column reports the total number of genes in each cell that have at least one UMI count. As originally stated, it could be misread as reporting genes that have detectable UMI counts in only a single cell.
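To make the distinction concrete, here is a small plain-Python sketch with made-up data (the matrix and all names are hypothetical, and this is not Seurat code). It contrasts "genes detected in a cell", which is what nFeature_RNA counts, with "genes detected in that cell and no other", which is the misreading discussed above.

```python
# Toy genes x cells count matrix; values are hypothetical.
counts = {
    "GeneA": [5, 0, 1],
    "GeneB": [0, 0, 3],
    "GeneC": [2, 1, 0],
    "GeneD": [0, 0, 7],
}
cells = ["cell1", "cell2", "cell3"]

# Genes with at least one count in each cell (what nFeature_RNA counts).
detected = {
    cell: {gene for gene, row in counts.items() if row[j] > 0}
    for j, cell in enumerate(cells)
}

# Genes detected in this cell and in no other cell (NOT what nFeature_RNA counts).
exclusive = {
    cell: {g for g in genes if sum(1 for x in counts[g] if x > 0) == 1}
    for cell, genes in detected.items()
}

for cell in cells:
    print(cell, len(detected[cell]), sorted(detected[cell]), sorted(exclusive[cell]))
```

Here cell1 has two detected genes (GeneA and GeneC), yet neither is exclusive to it, since GeneA also appears in cell3 and GeneC in cell2.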


Yes, nFeature_RNA is a cell-specific metric, but the genes detected in each cell (and thus reported by nFeature_RNA) are not found only in that cell.

3.1 years ago

Hi,

I stumbled upon this question and have a follow-up. I am re-analysing a single-cell RNA-seq dataset with two samples (with and without treatment) and have downloaded the preprocessed data from GEO as two .csv files. The authors state that these files contain matrices that have been through QC, log-normalized, and scaled.

After creating a Seurat object for each dataset, I checked nFeature_RNA and nCount_RNA and found that nFeature_RNA is around twice as large as nCount_RNA. I can't explain this. As I understand it, nCount_RNA is the number of UMIs, and I can't find anything online that says otherwise. If nCount_RNA is the UMI count, and there are only half as many UMIs as genes detected, how were the genes detected? I believe you can't have two RNA molecules from different genes detected by the same UMI.

I attach a plot of nCount_RNA against nFeature_RNA and hope someone with a kind heart can clarify this for me. If it helps, these cells should be endothelial cells.

Thank you in advance. /Maibritt

[Image: plot of nCount_RNA against nFeature_RNA]


Please ask a new question. Piggy-backing is discouraged. You can feel free to link to this post as reference.

edit: I see you've already done so here. You may want to delete this post since it's not actually an answer.
