6 months ago
ashkan
I am trying to convert transcript IDs (one column in my csv file) to gene symbols using biomaRt, for all rows (I do not mind having the same gene symbol multiple times), using the following few lines:
library(biomaRt)
ensembl <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")
data <- read.csv2('abundance.tsv', sep='\t')
transcript_ids <- as.character(data$target_id)
result <- getBM(
  attributes = c("ensembl_transcript_id", "ensembl_gene_id", "hgnc_symbol"),
  filters = "ensembl_transcript_id",
  values = transcript_ids,
  mart = ensembl
)
The last part, with getBM, returned:
[1] ensembl_transcript_id ensembl_gene_id hgnc_symbol
<0 rows> (or 0-length row.names)
To check what the problem is, I started with:
transcript_ids <- as.character(data$target_id)
print(transcript_ids)
The results are transcript IDs. I then checked the next step to make sure that the connection to BioMart is successful. To test it, I did the following:
test_result <- getBM(
  attributes = c("ensembl_transcript_id", "ensembl_gene_id", "hgnc_symbol"),
  filters = "ensembl_transcript_id",
  values = "ENST00000489730",
  mart = ensembl
)
and here are the results:
  ensembl_transcript_id ensembl_gene_id hgnc_symbol
1       ENST00000489730 ENSG00000069812        HES2
So it was fine. Then I checked retrieving and merging the data using the following:
result <- getBM(
  attributes = c("ensembl_transcript_id", "ensembl_gene_id", "hgnc_symbol"),
  filters = "ensembl_transcript_id",
  values = transcript_ids,
  mart = ensembl
)
and I got this:
[1] ensembl_transcript_id ensembl_gene_id hgnc_symbol
<0 rows> (or 0-length row.names)
Do you know how to fix the problem?
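One thing that may be worth ruling out (an assumption, since the actual IDs are not shown in the post): quantification output often carries transcript IDs with a version suffix such as `ENST00000489730.1`, while the `ensembl_transcript_id` filter matches the unversioned form. A minimal base R sketch of checking for and removing such a suffix follows; the example IDs (including their version numbers) are made up, and `strip_version` is a hypothetical helper name.

```r
# Made-up example IDs; the real ones would come from data$target_id.
example_ids <- c("ENST00000489730.1", "ENST00000000001.12", "ENST00000489730")

# TRUE where an ID ends in a dot followed by digits (a version suffix).
has_version <- grepl("\\.[0-9]+$", example_ids)

# Hypothetical helper: drop a trailing ".<digits>" if present,
# leaving unversioned IDs unchanged.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

clean_ids <- strip_version(example_ids)

print(has_version)
print(clean_ids)
```

If the suffix turns out to be present, passing the stripped IDs as `values` to `getBM` would be the next thing to try; keeping the versions and filtering on `ensembl_transcript_id_version` instead is another option.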
Honestly, why the overhead with biomart? Get a GTF file that matches your annotations and just do a left join with your transcript IDs; it's quite trivial actually. If you post a few example lines of transcripts and which annotations were used, I can suggest code if you want. Do you even need transcript-level annotations? Or do you want gene-level counts? Consider using the tximport package if so.
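The left join mentioned here could look roughly like this in base R. This is only a sketch under assumptions: `tx2gene` stands in for a transcript-to-gene table that would in practice be extracted from the matching GTF, its column names are invented for illustration, and `ENST_PLACEHOLDER_1` and the `tpm` values are made-up data.

```r
# Stand-in for the quantification table; only the ID column matters here.
abundance <- data.frame(
  target_id = c("ENST00000489730", "ENST_PLACEHOLDER_1", "ENST00000489730"),
  tpm = c(1.5, 0.0, 2.0),
  stringsAsFactors = FALSE
)

# Stand-in for a transcript-to-gene map built from a GTF (assumed columns).
tx2gene <- data.frame(
  transcript_id = "ENST00000489730",
  gene_id = "ENSG00000069812",
  gene_symbol = "HES2",
  stringsAsFactors = FALSE
)

# Left join: keep every row of abundance; unmatched transcripts get NA.
# Note that merge() may not preserve the original row order.
annotated <- merge(abundance, tx2gene,
                   by.x = "target_id", by.y = "transcript_id",
                   all.x = TRUE)

print(annotated)
```

Because `all.x = TRUE` keeps unmatched rows with `NA` annotations, counting the `NA` values in `gene_symbol` gives a quick indication of whether the IDs in the two tables are in the same format.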