2.7 years ago
Maka
Hi everyone, I am pretty new to bioinformatics and I am struggling to convert the Ensembl transcript IDs from my salmon quant.sf into gene symbols. The problem is that when I run the script below, I get more rows back than I have in my query, so I can't align the annotations with my results table. I hope that is clear enough. A hint would be highly appreciated :)
library(biomaRt)
listMarts()
ensembl <- useMart("ENSEMBL_MART_ENSEMBL")
listDatasets(ensembl)
ensembl <- useDataset("mmusculus_gene_ensembl", mart = ensembl)
listAttributes(ensembl)
filterType <- "ensembl_transcript_id_version"
filterValues <- rownames(res4) # these are my results from DESeq2
attributeNames <- c('ensembl_transcript_id_version', 'entrezgene_id', 'mgi_symbol')
annot <- getBM(attributes = attributeNames,
               filters = filterType,
               values = filterValues,
               mart = ensembl) # was "ensemble", which is not the object defined above
length(rownames(res4))
[1] 58400
nrow(annot)
[1] 58655
I know that it is normal to get several transcripts per gene, but how can I then align the annotations with my results? Thanks in advance :)
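One likely cause, not confirmed in this thread, is that some transcripts map to more than one value of an attribute (for example two Entrez IDs), so getBM() returns one row per combination. Below is a minimal base-R sketch of one way to align such a table: collapse it to one row per transcript, then reorder it with match(). The IDs, symbols, and the names query_ids, collapse_unique, annot_one, and aligned are made up for illustration; query_ids stands in for rownames(res4) and the toy annot for the getBM() result.

```r
# Toy stand-ins (hypothetical values, not real annotations).
query_ids <- c("ENSMUST00000000001.4",
               "ENSMUST00000000002.7",
               "ENSMUST00000000003.1")
annot <- data.frame(
  ensembl_transcript_id_version = c("ENSMUST00000000001.4",
                                    "ENSMUST00000000001.4",
                                    "ENSMUST00000000002.7"),
  entrezgene_id = c(111L, 222L, 333L),
  mgi_symbol = c("GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Join the distinct values of a column into a single string.
collapse_unique <- function(x) paste(unique(x), collapse = ";")

# One row per transcript; multiple Entrez IDs become "id1;id2".
annot_one <- aggregate(
  annot[c("entrezgene_id", "mgi_symbol")],
  by = annot["ensembl_transcript_id_version"],
  FUN = collapse_unique
)

# Reorder to the query order; transcripts with no annotation become NA rows.
idx <- match(query_ids, annot_one$ensembl_transcript_id_version)
aligned <- annot_one[idx, ]
rownames(aligned) <- query_ids

print(aligned)
```

After this step, aligned has exactly one row per query ID in the same order, so it could be bound to the results table column-wise. Collapsing with ";" is only one choice; keeping the first match per transcript would be another.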
This is a frequently asked question. One relevant past thread: Question: Convert Ensembl Transcript Ids Ensmust To Gene Symbol In R
Thank you for your fast reply. The problem is the number of rows that I obtain. I already tried to follow the example in the link you posted, but it did not solve my problem :(
Can you share an example of the duplicate transcript entries, so we can better understand the problem?
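To pull out such an example, something along these lines might help. It is a sketch in base R on a toy table; the IDs and values are invented, and annot here only mimics the shape of the getBM() result from the question.

```r
# Toy annotation table (hypothetical values) with one transcript repeated.
annot <- data.frame(
  ensembl_transcript_id_version = c("ENSMUST00000000001.4",
                                    "ENSMUST00000000001.4",
                                    "ENSMUST00000000002.7"),
  entrezgene_id = c(111L, 222L, 333L),
  mgi_symbol = c("GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

ids <- annot$ensembl_transcript_id_version

# Transcript IDs that occur in more than one row.
dup_ids <- unique(ids[duplicated(ids)])

# All rows belonging to those IDs, to see which attribute differs.
dup_rows <- annot[ids %in% dup_ids, ]

# Number of rows returned per transcript.
rows_per_id <- table(ids)

print(dup_rows)
print(rows_per_id[rows_per_id > 1])
```

Posting the first few rows of dup_rows would show which column is responsible for the extra rows.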