scRNA-seq analysis using TPM counts
2.7 years ago
Gene_MMP8 ▴ 240

I have found several posts on Biostars and in other online communities where the general advice is that TPM values can be used instead of raw counts in Seurat. I have a single-cell dataset with TPM values only, and there is no way to get the raw count data. To make sure the values really were TPM, I checked with colSums() and each column summed to 10^6. So I started the analysis by creating the Seurat object as follows (I didn't perform any log transformation):

ctrl <- CreateSeuratObject(counts = TPM_matrix, min.cells = 3, min.features = 200)
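For readers who want to reproduce the column-sum check described above, here is a minimal base R sketch. The toy matrix and the names `toy_tpm` and `is_tpm_like` are hypothetical and only stand in for the real TPM matrix; the tolerance is an arbitrary assumption.

```r
# Toy TPM-like matrix: genes in rows, cells in columns (hypothetical data).
set.seed(1)
raw <- matrix(rexp(5 * 3), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("cell", 1:3)))
# Scale every column (cell) so that it sums to one million.
toy_tpm <- sweep(raw, 2, colSums(raw), "/") * 1e6

# TRUE if every column sums to ~1e6, within a relative tolerance.
is_tpm_like <- function(mat, target = 1e6, tol = 1e-6) {
  all(abs(colSums(mat) / target - 1) < tol)
}

print(colSums(toy_tpm))
print(is_tpm_like(toy_tpm))
```

Note that the check has to be run on the full matrix: once genes are filtered out (for example via min.cells), the column sums of what remains will drop below 10^6.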

Next, I wanted to check the top variable features, which should reflect the main signal in the data, using the following command:

ctrl <- FindVariableFeatures(ctrl, selection.method = "vst", nfeatures = 2000, verbose = FALSE)

When I check the top 30 features, I get a lot of noncoding RNA genes (SNORD47, MIR1244-1, MIR1244-2, MIR1244-3, etc.). None of these are among the top biomarker genes implicated in the disease of interest. I am not sure whether this is because I am missing a step in between.

RNA-seq TPM singlecell Seurat • 1.1k views
2.7 years ago
Soheil ▴ 110

Feature selection with the "vst" method requires raw count data. See this thread for further details: https://github.com/satijalab/seurat/issues/2641
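As far as I know, FindVariableFeatures also accepts selection.method = "mean.var.plot" and "dispersion", which work from normalized data rather than raw counts, so they may be a better fit for TPM input. The base R sketch below only illustrates the underlying idea (ranking genes by variance-to-mean ratio and keeping a log-transformed matrix for downstream steps). It is a simplified illustration under stated assumptions, not Seurat's implementation, and the simulated matrix and the helper `top_dispersion_genes` are hypothetical.

```r
# Hypothetical TPM matrix: 200 genes x 50 cells, each cell scaled to 1e6.
set.seed(42)
raw <- matrix(rgamma(200 * 50, shape = 0.5, rate = 0.01), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("cell", 1:50)))
toy_tpm <- sweep(raw, 2, colSums(raw), "/") * 1e6

# TPM is already depth-normalized, so only a log transform is applied here.
log_tpm <- log1p(toy_tpm)

# Rank genes by dispersion (variance / mean) computed on the TPM scale
# and return the names of the n most dispersed genes.
top_dispersion_genes <- function(tpm, n = 20) {
  gene_mean <- rowMeans(tpm)
  gene_var <- apply(tpm, 1, var)
  dispersion <- ifelse(gene_mean > 0, gene_var / gene_mean, 0)
  names(dispersion) <- rownames(tpm)
  ranked <- sort(dispersion, decreasing = TRUE)
  names(ranked)[seq_len(min(n, length(ranked)))]
}

top_genes <- top_dispersion_genes(toy_tpm, n = 20)
print(head(top_genes))

# Log-transformed expression restricted to the selected genes.
selected <- log_tpm[top_genes, ]
print(dim(selected))
```

Unlike Seurat's own methods, this sketch does not bin genes by mean expression before comparing dispersions, so highly expressed genes will tend to dominate the ranking.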
