Hi,
I downloaded some datasets from the GEO database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE155960).
The gene names are Ensembl IDs, and I need to convert them to gene symbols in order to compute the QC metrics in the Seurat pipeline.
I'm using this code to convert the Ensembl IDs to gene symbols:
library(Seurat)
library(patchwork)
library(dplyr)
library(biomaRt)
library(org.Hs.eg.db)
library(ggplot2)
library(Matrix)
countsData <- read.csv(file = "~/GSE155960_RAW/CD45N-L1.csv", header = TRUE, row.names = 1)
ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
bm <- getBM(attributes=c("ensembl_gene_id", "hgnc_symbol"), values=rownames(countsData), mart=ensembl)
hgnc.symbols <- bm$hgnc_symbol[match(rownames(countsData), bm$ensembl_gene_id)]
countsData <- as.matrix(countsData)
rownames(countsData) <- hgnc.symbols
CD45N_L1 <- CreateSeuratObject(counts = t(countsData), project = "H_Fat", min.cells = 3, min.features = 200)
CD45N_L1[["percent.mt"]] <- PercentageFeatureSet(CD45N_L1, pattern = "^MT-")
However, when I ran VlnPlot to look at percent.mt, every value was 0. The samples are human.
In the bm table I can see MT- genes matched to their Ensembl IDs, but after creating the Seurat object I don't see any MT- genes. I'm not sure why I'm losing the MT- genes during the conversion. I hope somebody has a solution for this.
Thanks.