How to add Ensembl IDs after pseudobulk analysis with DESeq2
7 months ago
Sara ▴ 30

Hi all,

I used DESeq2 to run a pseudobulk analysis on my Seurat object, and I am having trouble converting gene names to Ensembl IDs. My row names are a mix: some are already Ensembl IDs (ENSG...) and some are gene symbols. I want to add Ensembl IDs and chromosome names for every gene. Here is the relevant part of my DESeq2 code:

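library(DESeq2)
library(org.Hs.eg.db)  # also loads AnnotationDbi, which provides mapIds()

dds <- DESeqDataSetFromMatrix(countData = counts_bcell,
                              colData = colData,
                              design = ~ Age + Sex + condition)

# Filter: keep genes with at least 10 counts summed over all samples
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep, ]

# Use "Control" as the reference level
colData(dds)$condition <- relevel(colData(dds)$condition, ref = "Control")

# Run DESeq2 with a likelihood ratio test against the model without condition
dds <- DESeq(dds, test = "LRT", reduced = ~ Age + Sex)

# Check the coefficients available for the comparison
resultsNames(dds)

# Generate the results object
res <- results(dds, name = "condition_Patient_vs_Control")

# Map row names to Ensembl IDs, treating every row name as a gene symbol
mapped <- data.frame(GeneName = rownames(res),
                     ensemblID = mapIds(org.Hs.eg.db, keys = rownames(res),
                                        keytype = "SYMBOL", column = "ENSEMBL"))

res$ensembl_gene_id <- mapped$ensemblID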

Looking at mapped (below), the rows whose names are already Ensembl IDs get no ensemblID, and neither do some gene symbols (for example LINC01409):

> mapped
                       GeneName       ensemblID
ENSG00000238009 ENSG00000238009            <NA>
ENSG00000241860 ENSG00000241860            <NA>
ENSG00000290385 ENSG00000290385            <NA>
ENSG00000291215 ENSG00000291215            <NA>
ENSG00000229905 ENSG00000229905            <NA>
LINC01409             LINC01409            <NA>
ENSG00000290784 ENSG00000290784            <NA>
FAM87B                   FAM87B ENSG00000177757
LINC00115             LINC00115            <NA>

Any suggestions, or is there a better way to add the Ensembl ID, chromosome name, and biotype?

I appreciate your help. Many thanks!

Seurat Pseudobulk single-cell DESeq2 scRNA • 788 views
7 months ago

Go back to your original counts matrix or input data and assign consistent IDs during its generation.


I used Seurat, and in the Seurat object the gene names are a mix: some are gene symbols and some are ENSG IDs. I then did the pseudobulk analysis. How can I convert them, or add Ensembl IDs as an extra column in Seurat?


Why is your GeneName column in mapped a mix of Ensembl IDs and gene names? What Jared means is that during preprocessing you should already have made sure that only one consistent identifier type (Ensembl IDs) is present, not this wild mix. From a consistent identifier it is easy to convert, e.g. by loading a GTF file that contains both ID and name and then doing a left join with it.
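To illustrate the left join idea, here is a minimal base R sketch. It assumes you already have an annotation table with one row per gene; the `annot` data frame below is a toy stand-in (in practice it could be built from the GTF that was used for the reference, for example with rtracklayer::import()), and its chromosome and biotype values are illustrative placeholders only. The column names are my own choice, not something from the original post. Row names that already look like Ensembl IDs are kept as they are, and only the gene symbols are looked up.

```r
# Toy stand-in for an annotation table derived from a GTF file.
# Chromosome and biotype values are illustrative placeholders.
annot <- data.frame(
  gene_id      = c("ENSG00000177757", "ENSG00000238009", "ENSG00000241860"),
  gene_name    = c("FAM87B", NA, NA),
  chromosome   = c("chr1", "chr1", "chr1"),
  gene_biotype = c("lncRNA", "lncRNA", "lncRNA"),
  stringsAsFactors = FALSE
)

# Row names as they might appear in the results (mixed symbols and Ensembl IDs);
# "NOT_IN_ANNOT" is a made-up name to show what happens without a match.
ids <- c("ENSG00000238009", "ENSG00000241860", "FAM87B", "NOT_IN_ANNOT")

# Rows that already are Ensembl IDs keep their ID; symbols are looked up
is_ensg <- grepl("^ENSG[0-9]+$", ids)
gene_id <- ifelse(is_ensg, ids, annot$gene_id[match(ids, annot$gene_name)])

# Left join on the Ensembl ID: unmatched rows get NA in the annotation columns
idx <- match(gene_id, annot$gene_id)
mapped <- data.frame(
  GeneName        = ids,
  ensembl_gene_id = gene_id,
  chromosome      = annot$chromosome[idx],
  gene_biotype    = annot$gene_biotype[idx],
  stringsAsFactors = FALSE
)
```

With real data, `ids` would be rownames(res), and the columns of `mapped` could be attached to res in the same order. Using match() rather than merge() keeps the rows in their original order.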


This is exactly my point. You should not have gotten data to this state unintentionally, so you need to double-check what was done upstream to see where the swaps occurred and rectify it at that point.


It was 10x data, and I processed it using Seurat. Then I got to the pseudobulk step using DESeq2. Does this mean I have to check which parameters were used in Cell Ranger, or do I have to check or change something in my Seurat analysis?
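One place to look, assuming the standard Cell Ranger output layout, is the features.tsv.gz file next to the count matrix: it lists the Ensembl ID in the first column and the gene name in the second, and Seurat's Read10X() uses the second column by default (gene.column = 2) after making the names unique. A possible explanation for the mix, which you should check against your own file, is that genes without a symbol in the reference annotation carry their Ensembl ID in the name column. The sketch below writes a tiny mock file so that it runs on its own; the column names and the `seurat_name` helper column are my own labels.

```r
# Write a tiny mock features.tsv.gz so the sketch is self-contained; a real
# file would sit in the Cell Ranger filtered_feature_bc_matrix/ directory.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "w")
writeLines(c(
  "ENSG00000238009\tENSG00000238009\tGene Expression",
  "ENSG00000177757\tFAM87B\tGene Expression"
), con)
close(con)

# Columns: Ensembl ID, gene name, feature type (no header line in the file)
features <- read.delim(tmp, header = FALSE, stringsAsFactors = FALSE,
                       col.names = c("gene_id", "gene_name", "feature_type"))

# Read10X() makes duplicated names unique, so mirror that before matching
features$seurat_name <- make.unique(features$gene_name)

# Look up the Ensembl ID for each row name of the results
res_ids <- c("ENSG00000238009", "FAM87B")
ensembl_gene_id <- features$gene_id[match(res_ids, features$seurat_name)]
```

If this mapping covers all of your row names, it gives a consistent Ensembl ID column without rerunning Cell Ranger. If some names still fail to match, check whether they were altered when the Seurat object was created (for example underscores replaced by dashes).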
