I have a count file output from featureCounts. Column one contains gene names that I'd like to convert to Ensembl IDs.
What is the easiest way for me to do this?
I can't figure out how to put my matrix into BioMart.
I've tried copying the list of genes from my featureCounts count matrix and using it as a filter in BioMart. But it returns fewer genes than are in my count file, and I don't know how to quickly filter my count file using the BioMart output list.
Copying and pasting the list of gene symbols to act as a filter should work.
It is important to remember that gene symbols are not stable and there may not always be an Ensembl gene ID associated with a particular gene name. I would suggest searching for the gene symbols (the ones without an Ensembl stable ID mapping) using the Ensembl genome browser and looking to see if there is an Ensembl gene ID associated with that symbol in the current version of Ensembl.
Is there a good way to filter in Excel? The Ensembl ID list has 43,000+ genes and my counts file has 48,000+ genes. Excel just freezes if I try to highlight duplicates using conditional formatting and sort by color.
I also tried Advanced Filtering in Excel (to filter one list by another), but afterwards I somehow still end up with more cells than the Ensembl list contains.
Excel is spreadsheet software that isn't really designed for doing analyses of this scale. There is some advice online about comparing lists: https://stackoverflow.com/questions/11165182/difference-between-two-lists-using-bash
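If Python is an option, comparing the two lists takes only a few lines. This is a minimal sketch, not a definitive recipe: the function name `compare_gene_lists` and the example symbols are made up for illustration, and in practice the two lists would be read from the count file and the BioMart export.

```python
def compare_gene_lists(count_genes, mapped_genes):
    """Split count_genes into symbols that BioMart returned and symbols it did not.

    The order of count_genes is preserved in both output lists.
    """
    mapped = set(mapped_genes)  # set lookup stays fast for tens of thousands of genes
    shared = [g for g in count_genes if g in mapped]
    only_in_counts = [g for g in count_genes if g not in mapped]
    return shared, only_in_counts


# Hypothetical example data: "MADEUP1" stands in for a symbol with no Ensembl ID.
count_genes = ["TP53", "BRCA1", "MADEUP1", "GAPDH"]
mapped_genes = ["GAPDH", "TP53", "BRCA1"]

shared, missing = compare_gene_lists(count_genes, mapped_genes)
print(len(shared), "symbols mapped;", len(missing), "without an Ensembl ID:", missing)
```

The `missing` list is the set of symbols worth looking up by hand in the Ensembl genome browser.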
Past thread if you want to use R: Conversion of Gene Name to Ensembl ID
I've found this thread, but from what I understand the instructions are just to convert a list of gene names?
What I don't understand is how to apply this to my count file, which has multiple columns for the different biological replicates, so that only the first column of gene names is converted to Ensembl IDs.
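One way to do this outside R is to treat the BioMart export as a lookup table and rewrite only column one of the count file, leaving the replicate columns as they are. The sketch below uses only the Python standard library and makes several assumptions: the BioMart export is a tab-separated file with the Ensembl ID in column one and the gene symbol in column two, the count file is tab-separated with a `#` comment line followed by a header row, and the function names and toy data are hypothetical. Symbols with no Ensembl ID, or with more than one, are reported rather than guessed at.

```python
import csv
import io


def load_symbol_map(mapping_file):
    """Read a two-column TSV (Ensembl ID, gene symbol) into {symbol: [IDs]}."""
    symbol_to_ids = {}
    reader = csv.reader(mapping_file, delimiter="\t")
    next(reader)  # skip the header row of the export
    for row in reader:
        if len(row) < 2 or not row[1]:
            continue  # ignore rows without a gene symbol
        ensembl_id, symbol = row[0], row[1]
        symbol_to_ids.setdefault(symbol, []).append(ensembl_id)
    return symbol_to_ids


def convert_first_column(counts_file, symbol_to_ids):
    """Swap gene symbols in column one for Ensembl IDs; other columns are untouched.

    Returns (converted_rows, skipped_symbols). A row is skipped when its symbol
    maps to zero or to several Ensembl IDs.
    """
    converted, skipped = [], []
    header_seen = False
    for row in csv.reader(counts_file, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the leading comment line
        if not header_seen:
            converted.append(row)  # keep the header row as it is
            header_seen = True
            continue
        ids = symbol_to_ids.get(row[0], [])
        if len(ids) == 1:
            converted.append([ids[0]] + row[1:])
        else:
            skipped.append(row[0])
    return converted, skipped


# Toy data standing in for the real files (assumed layout, see above).
mapping_text = (
    "Gene stable ID\tGene name\n"
    "ENSG00000141510\tTP53\n"
    "ENSG00000012048\tBRCA1\n"
)
counts_text = (
    "# toy comment line\n"
    "Geneid\trep1\trep2\n"
    "TP53\t10\t12\n"
    "MADEUP1\t3\t4\n"
    "BRCA1\t7\t9\n"
)

symbol_map = load_symbol_map(io.StringIO(mapping_text))
converted, skipped = convert_first_column(io.StringIO(counts_text), symbol_map)

for row in converted:
    print("\t".join(row))
print("skipped symbols:", skipped)
```

With real files, the two `io.StringIO(...)` objects would be replaced by `open(path, newline="")` handles, and `converted` could be written back out with `csv.writer(..., delimiter="\t")`. The skipped symbols are the ones to check by hand, as suggested above.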