Question

converting ensembl transcript ID to symbol

3.7 years ago

Maka ▴ 20

Hi everyone, I am pretty new to bioinformatics and I am struggling to convert the Ensembl transcript IDs from my salmon quant.sf into gene symbols. The problem is that when I run the script below, I get more rows than I queried, so I can't align the annotations with my results table. I hope that was clear enough. If you could give me a hint, it would be highly appreciated :)

library(biomaRt)

# Connect to the Ensembl BioMart and select the mouse gene dataset
listMarts()
ensembl <- useMart("ENSEMBL_MART_ENSEMBL")
listDatasets(ensembl)
ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)

listAttributes(ensembl)

filterType <- "ensembl_transcript_id_version"
filterValues <- rownames(res4)  # these are my results from DESeq2

attributeNames <- c("ensembl_transcript_id_version", "entrezgene_id", "mgi_symbol")

annot <- getBM(attributes = attributeNames,
               filters = filterType,
               values = filterValues,
               mart = ensembl)

length(rownames(res4))

[1] 58400

annot

58655

I know that it is normal to get several transcripts per gene, but how can I then align the annotations with my results? Thanks in advance :)

conversion rnaseq biomart • 1.2k views

updated 3.7 years ago by manaswwm ▴ 570 • written 3.7 years ago by Maka ▴ 20
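
One common reason for extra rows, assuming it applies here, is that getBM() returns one row per combination of the requested attributes, so a transcript whose gene maps to more than one entrezgene_id (or symbol) shows up several times. A minimal sketch of re-aligning the annotation to the results with match() could look like this. The IDs and symbols are made up, and res_ids, annot_unique, and aligned are hypothetical names; res_ids stands in for rownames(res4) and the toy annot for the getBM() output.

```r
# Toy stand-in for rownames(res4); these IDs are invented for illustration.
res_ids <- c("T1.1", "T2.1", "T3.1", "T4.1")

# Toy stand-in for the getBM() result: T1.1 appears twice (two Entrez IDs)
# and T4.1 is missing entirely (e.g. not found in the queried release).
annot <- data.frame(
  ensembl_transcript_id_version = c("T3.1", "T1.1", "T1.1", "T2.1"),
  entrezgene_id = c(30, 10, 11, 20),
  mgi_symbol = c("GeneC", "GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Keep only the first row for each transcript ID.
annot_unique <- annot[!duplicated(annot$ensembl_transcript_id_version), ]

# For every result ID, find its row in the de-duplicated annotation
# (NA where the transcript has no annotation).
idx <- match(res_ids, annot_unique$ensembl_transcript_id_version)

# Reorder the annotation so that row i corresponds to res_ids[i].
aligned <- annot_unique[idx, ]
rownames(aligned) <- NULL
aligned$ensembl_transcript_id_version <- res_ids  # keep IDs of unannotated rows

print(aligned)
```

With real data, aligned would have exactly one row per row of the results and could be attached with cbind(). Keeping only the first row per transcript discards the alternative Entrez IDs, which is a simplification and may not suit every analysis; dropping entrezgene_id from the query is another option if only the symbol is needed.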

This is a frequently asked question. One relevant past thread: Question: Convert Ensembl Transcript Ids Ensmust To Gene Symbol In R

3.7 years ago by GenoMax 154k

Thank you for your fast reply. The problem is the number of rows that I obtain. I already tried to follow the example in the thread you linked, but it did not solve my problem :(

3.7 years ago by Maka ▴ 20

Can you share an example of the duplicate transcript entries, so we can better understand the problem?

3.7 years ago by manaswwm ▴ 570
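
To pull out such duplicate entries, something along these lines might help. It is only a sketch on a made-up data frame: the toy annot stands in for the getBM() result, the IDs and symbols are invented, and dup_ids and dup_rows are hypothetical names.

```r
# Toy stand-in for the getBM() result; IDs and symbols are invented.
annot <- data.frame(
  ensembl_transcript_id_version = c("T1.1", "T2.1", "T2.1", "T3.1"),
  entrezgene_id = c(10, 20, 21, 30),
  mgi_symbol = c("GeneA", "GeneB", "GeneB", "GeneC"),
  stringsAsFactors = FALSE
)

ids <- annot$ensembl_transcript_id_version

# Transcript IDs that occur more than once in the annotation table.
dup_ids <- unique(ids[duplicated(ids)])

# All rows belonging to those IDs, so the differing columns can be compared.
dup_rows <- annot[ids %in% dup_ids, ]

print(dup_rows)
```

Looking at dup_rows for the real table should show which attribute differs between the repeated rows of a transcript.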