4.8 years ago
marongiu.luigi
Hello,
this is probably an old question, but I could not find a viable solution in the posts here or elsewhere.
I have a list of NCBI protein accessions (not all RefSeq) that I would like to convert into GO terms. The organisms are viruses, not humans.
I tried with biomart:
library(biomaRt)
df = read.table("dataFile.tsv", header = TRUE)
> head(df)
Accession
1 BAA78224
2 ARM10145
3 ARM63896
4 AVQ94044
5 ARM06518
6 ARM06542
id = df$Accession
database = useMart("ensembl")
go = getBM(attributes = c("refseq", "GO"),
filters = 'entrezgene_id',
values = id,
mart = database)
but I got:
> database
Object of class 'Mart':
Using the ENSEMBL_MART_ENSEMBL BioMart database
No dataset selected.
> go = getBM(attributes = c("ncbi", "GO"),
+ filters = 'entrezgene_id',
+ values = id,
+ mart = database)
Error in martCheck(mart) :
No dataset selected, please select a dataset first. You can see the available datasets by using the listDatasets function see ?listDatasets for more information. Then you should create the Mart object by using the useMart function. See ?useMart for more information
Since the terms are from viruses, what dataset should I select? Or is there another way to convert the terms?
The error is clear: you need to choose a dataset within the selected mart. For example, the "ensembl" mart has 202 datasets.
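As a rough sketch of how one might look through those datasets before picking one: the helper below is hypothetical (`find_datasets` is not a biomaRt function) and simply filters a table shaped like the one `listDatasets()` returns. The small `example_sets` table is an illustrative stand-in so the helper can be tried offline; the commented lines show the live calls, with `hsapiens_gene_ensembl` used only as an example dataset name.

```r
# Hypothetical helper (not part of biomaRt): search a dataset listing,
# such as the data frame returned by biomaRt::listDatasets(), for a keyword
# in either the dataset name or its description.
find_datasets <- function(datasets, pattern) {
  hits <- grepl(pattern, datasets$description, ignore.case = TRUE) |
    grepl(pattern, datasets$dataset, ignore.case = TRUE)
  datasets[hits, , drop = FALSE]
}

# Illustrative stand-in for a listDatasets() result (not the real listing).
example_sets <- data.frame(
  dataset = c("hsapiens_gene_ensembl", "mmusculus_gene_ensembl"),
  description = c("Human genes", "Mouse genes"),
  stringsAsFactors = FALSE
)

human_hits <- find_datasets(example_sets, "human")
virus_hits <- find_datasets(example_sets, "virus")
print(human_hits)
print(nrow(virus_hits))

# With a network connection, the real listing would come from biomaRt:
#   library(biomaRt)
#   all_sets <- listDatasets(useMart("ensembl"))
#   find_datasets(all_sets, "virus")
#   # A dataset must then be named when creating the Mart object:
#   database <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
```

If the keyword search comes back empty for the organism of interest, the mart has no dataset for it and `getBM` cannot be pointed at one.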
Thank you. Since there are no viruses among these 202 entries, I guess I can't do the conversion...
I see very few viruses in the Gene Ontology browser, so there may be very few viruses with publicly available GO annotation. If your virus is in this list, then you are in luck; otherwise, there is likely no information available.
No, this means you need to find the right mart that has virus datasets (I don't know if one exists).
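One way to go looking for such a mart, sketched under assumptions: `scan_marts` below is a hypothetical helper that asks each BioMart host which marts it serves, using `biomaRt::listMarts(host = ...)` by default. The host URLs in the commented usage are only examples, and nothing here implies that a mart with virus datasets actually exists. A stand-in lister is passed in the demo so the code runs without a network connection.

```r
# Hypothetical helper: ask each host which marts it serves and stack the results.
# `lister` defaults to biomaRt::listMarts; a stand-in can be supplied for offline use.
scan_marts <- function(hosts, lister = biomaRt::listMarts) {
  found <- lapply(hosts, function(h) {
    # Hosts that fail to respond are skipped rather than stopping the scan.
    marts <- tryCatch(lister(host = h), error = function(e) NULL)
    if (is.null(marts) || nrow(marts) == 0) return(NULL)
    data.frame(host = h,
               biomart = as.character(marts$biomart),
               version = as.character(marts$version),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, found)
}

# Offline demo with a fake lister that mimics the shape of a listMarts() result.
fake_lister <- function(host) {
  data.frame(biomart = "demo_mart", version = "Demo 1", stringsAsFactors = FALSE)
}
demo <- scan_marts(c("https://a.example", "https://b.example"), lister = fake_lister)
print(demo)

# Live usage (requires biomaRt and a network connection; example hosts only):
#   hosts <- c("https://www.ensembl.org", "https://plants.ensembl.org",
#              "https://fungi.ensembl.org", "https://protists.ensembl.org")
#   scan_marts(hosts)
```

Each mart returned this way can then be inspected with `listDatasets()` to see which organisms it covers.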