Highest variable features in single cell data
8 months ago

I used the 'FindVariableFeatures' function from the Seurat package to identify variable features, but some of the genes appearing in the results are only expressed in 10-15 cells, and these cells are not even in a single cluster. What should I do in this situation?

single-cell • 647 views

"and these cells are not even in a single cluster."

What does this mean? If you run clustering on all cells, then every cell is assigned to exactly one cluster. Check whether the strange cells are outliers in a QC metric. Alternatively, this may simply be a cell type that is not abundant or is poorly captured by the single-cell technology.


I apologize for the confusion. What I meant to say is that, for example, the Trbv17 gene appears among the variable genes. However, when I plot the feature plot, this gene is expressed in only a very small number of cells, and these cells are scattered across random clusters on the UMAP plot.

Compared to the total number of cells, the number of cells expressing Trbv17 is very small. I don't understand how this gene can be detected as highly variable when the majority of cells don't express it.


You need to be clear about how Seurat defines highly variable genes here. Highly variable genes are genes with very high expression in some cells and low or no expression in the others. In your case, Trbv17 is therefore rightly picked as a variable gene: as your feature plot shows, it is expressed in only a few cells, which is exactly the pattern that produces high variability. There is nothing wrong with it.
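To make that point concrete, here is a minimal sketch in plain Python. It does not reproduce what FindVariableFeatures does internally; it only uses a simple variance-to-mean ratio as a stand-in for "variability", and the counts are made-up toy numbers, not data from this thread.

```python
from statistics import fmean, pvariance


def dispersion(counts):
    """Variance-to-mean ratio of one gene across cells (a simplified variability score)."""
    mean = fmean(counts)
    return pvariance(counts) / mean if mean > 0 else 0.0


# Hypothetical toy counts for 1000 cells (illustrative assumption only).
sparse_gene = [8] * 10 + [0] * 990   # detected in only 10 cells
broad_gene = [2] * 500 + [3] * 500   # detected in every cell, at similar levels

sparse_disp = dispersion(sparse_gene)
broad_disp = dispersion(broad_gene)

print(f"sparse gene: mean={fmean(sparse_gene):.3f}, dispersion={sparse_disp:.2f}")
print(f"broad gene:  mean={fmean(broad_gene):.3f}, dispersion={broad_disp:.2f}")
```

The sparse gene has a much lower mean but a far higher variance relative to that mean, so a score of this kind ranks it above the broadly expressed gene even though almost no cells express it.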


Thank you.

But can we say that this is biologically informative? If this gene had been enriched specifically within a single cluster, I would say it had some meaning. However, since the cells expressing it are scattered randomly across the UMAP plot, I conclude that it is unlikely to be significant and that it doesn't contribute much to the clustering either.

Can we impose a filter that requires a gene to be expressed in at least 20 cells in order to be selected as a variable gene?
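The filter being asked about could look roughly like the sketch below, written in plain Python rather than R so that it stays self-contained. The function name `filter_by_min_cells`, the gene names other than Trbv17, and all counts are hypothetical. In Seurat itself, a comparable effect can probably be had earlier in the workflow through the `min.cells` argument of `CreateSeuratObject`, which drops features detected in too few cells before variable features are selected.

```python
def filter_by_min_cells(counts_by_gene, variable_genes, min_cells=20):
    """Keep only the variable genes detected (count > 0) in at least `min_cells` cells."""
    kept = []
    for gene in variable_genes:
        n_detected = sum(1 for count in counts_by_gene[gene] if count > 0)
        if n_detected >= min_cells:
            kept.append(gene)
    return kept


# Hypothetical toy counts for 1000 cells (illustrative assumption only).
counts_by_gene = {
    "Trbv17": [5] * 12 + [0] * 988,   # detected in 12 cells
    "GeneA": [3] * 200 + [0] * 800,   # detected in 200 cells
    "GeneB": [1] * 20 + [0] * 980,    # detected in exactly 20 cells
}
variable_genes = ["Trbv17", "GeneA", "GeneB"]

kept = filter_by_min_cells(counts_by_gene, variable_genes, min_cells=20)
print(kept)
```

With a threshold of 20 cells, Trbv17 (12 cells) is dropped while the other two genes are kept. The threshold is a judgment call: a cutoff that is too strict can also remove markers of genuinely rare cell types.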
