Question

How to convert mouse gene names to Ensembl IDs?

0

2.9 years ago

AA • 0

I have a count file output from featureCounts. Column one contains gene names that I'd like to convert to Ensembl IDs.

What is the easiest way for me to do this?

RNA-seq geneID ensembl • 4.4k views

ADD COMMENT • link updated 2.8 years ago by Ben Moore ★ 2.4k • written 2.9 years ago by AA • 0

1

Past thread if you want to use R: Conversion of Gene Name to Ensembl ID

ADD REPLY • link 2.9 years ago by GenoMax 148k

0

I've found this thread, but from what I understand the instructions are just to convert a list of gene names?

However, I don't understand how to take my count file, which has multiple columns for different biological replicates, and convert only the first column of gene names to Ensembl IDs.

ADD REPLY • link 2.8 years ago by AA • 0

score 0 · Answer 1 · 2022-02-14

0

2.9 years ago

Ben Moore ★ 2.4k

You can do this using BioMart. There is a help video here and documentation here.

ADD COMMENT • link 2.9 years ago by Ben Moore ★ 2.4k

0

I can't figure out how to put my matrix into BioMart.

I've tried copying the list of genes from my featureCounts count matrix and using it as a filter in BioMart. But the output contains fewer genes than my count file does, and I don't know how to quickly filter my count file using the BioMart output list.

ADD REPLY • link 2.8 years ago by AA • 0

1

Copying and pasting the list of gene symbols to act as a filter should work.

It is important to remember that gene symbols are not stable and there may not always be an Ensembl gene ID associated with a particular gene name. I would suggest searching for the gene symbols (the ones without an Ensembl stable ID mapping) using the Ensembl genome browser and looking to see if there is an Ensembl gene ID associated with that symbol in the current version of Ensembl.
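For the step of applying the BioMart output back to the count file, something along these lines might work. It is only a rough sketch: it assumes the BioMart export is a two-column tab-separated file (gene name, then Ensembl gene ID) and that the count file is tab-separated with the gene name in column one. The function names, gene symbols and IDs below are made-up placeholders, not real data or part of any library.

```python
import csv
import io


def load_symbol_map(handle):
    """Read a two-column TSV (gene symbol, Ensembl gene ID) into a dict.

    Assumption: the first row is a header. If a symbol appears more than
    once, the first ID seen is kept.
    """
    mapping = {}
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 2 or not row[1]:
            continue  # skip rows with no Ensembl ID
        mapping.setdefault(row[0], row[1])
    return mapping


def convert_first_column(handle, mapping):
    """Swap the gene symbol in column one for its Ensembl ID.

    All other columns (the replicate counts) are carried over unchanged.
    Returns (converted_rows, unmapped_symbols); rows without a mapping are
    left out of converted_rows and their symbols are collected instead.
    """
    # Drop comment lines starting with '#', which featureCounts may write.
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(lines, delimiter="\t")
    converted = [next(reader)]  # keep the header row as-is
    unmapped = []
    for row in reader:
        if not row:
            continue
        ensembl_id = mapping.get(row[0])
        if ensembl_id is None:
            unmapped.append(row[0])
        else:
            converted.append([ensembl_id] + row[1:])
    return converted, unmapped


# Toy stand-ins for the two files; symbols and IDs are placeholders.
biomart_tsv = (
    "Gene name\tGene stable ID\n"
    "GeneA\tENSMUSG_EXAMPLE_1\n"
    "GeneB\tENSMUSG_EXAMPLE_2\n"
)
counts_tsv = (
    "# comment line\n"
    "Geneid\trep1\trep2\n"
    "GeneA\t100\t120\n"
    "GeneB\t80\t95\n"
    "GeneC\t3\t5\n"
)

mapping = load_symbol_map(io.StringIO(biomart_tsv))
converted, unmapped = convert_first_column(io.StringIO(counts_tsv), mapping)

for row in converted:
    print("\t".join(row))
print("No Ensembl ID found for:", unmapped)
```

With real files you would pass `open("counts.txt", newline="")` and the BioMart export in place of the `io.StringIO` objects (both file names are hypothetical). The `unmapped` list is the set of symbols worth looking up by hand in the Ensembl genome browser.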

ADD REPLY • link 2.8 years ago by Ben Moore ★ 2.4k

0

Is there a good way to filter in Excel? The Ensembl ID list has 43,000+ genes and my counts file has 48,000+ genes. Excel just freezes if I try to highlight duplicates using conditional formatting and sort by color.

I also tried Advanced Filtering in Excel (to filter one list against another), but after that I still somehow end up with more cells than the Ensembl list contains.

ADD REPLY • link 2.8 years ago by AA • 0

0

Excel is spreadsheet software that isn't really designed for doing analyses of this scale. There is some advice online about comparing lists: https://stackoverflow.com/questions/11165182/difference-between-two-lists-using-bash
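If you would rather not use bash, the same kind of list comparison could be sketched in Python with sets. This is just an illustration: `compare_lists` is a hypothetical helper and the gene symbols are placeholders. It also reports symbols that occur more than once in the BioMart list, which is one thing that can make row counts disagree.

```python
from collections import Counter


def compare_lists(count_symbols, biomart_symbols):
    """Summarise how two lists of gene symbols differ.

    Returns a dict with symbols found only in the count file, symbols
    found only in the BioMart output, and symbols that appear more than
    once in the BioMart output.
    """
    count_set = set(count_symbols)
    biomart_set = set(biomart_symbols)
    repeats = Counter(biomart_symbols)
    return {
        "only_in_counts": sorted(count_set - biomart_set),
        "only_in_biomart": sorted(biomart_set - count_set),
        "repeated_in_biomart": sorted(s for s, n in repeats.items() if n > 1),
    }


# Placeholder symbols standing in for column one of each file.
report = compare_lists(
    ["GeneA", "GeneB", "GeneC"],
    ["GeneA", "GeneB", "GeneB"],
)
for label, symbols in report.items():
    print(label, symbols)
```

Set lookups stay fast at tens of thousands of entries, so lists of the size mentioned in this thread should not be a problem.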

ADD REPLY • link 2.8 years ago by Ben Moore ★ 2.4k