Hello all,
I have a list of GO identifiers and terms (from a yeast database), and I need to keep only the GO terms associated with my list of differentially expressed genes.
This is how I obtain the GO terms list:
library(org.Sc.sgd.db)
gsets = AnnotationDbi::as.list(org.Sc.sgdGO2ALLORFS)
Then I subset it to keep only the genes present in my list of genes of interest, like this:
gsets = sapply(gsets, function(gset) intersect(rownames(AverageDataSet), unique(as.character(gset))))
This is the result with the yeast GO terms list:
head(sapply(gsets, head))
$`GO:0000001`
      IMP       IGI       IMP       IPI       IMP       IMP
"YAL048C" "YAL048C" "YDL006W" "YDL029W" "YDL029W" "YDL239C"

$`GO:0000002`
        IMP         IMP         IMP         IEA         IGI         IDA
     "IMI1"   "YAL015C"   "YBR163W"   "YBR163W"   "YBR192W" "YCR028C-A"

$`GO:0000003`
      NAS       IGI       IGI       IMP       IGI       IMP
  "MATA1" "YAL018C" "YAL020C" "YAL020C" "YAL026C" "YAL029C"

$`GO:0000011`
      IMP       IPI       IEP       IEA       IMP       IGI
"YBR097W" "YCL063W" "YCL063W" "YCL063W" "YCL063W" "YCL063W"
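To make the subsetting step easier to reproduce, here is a minimal sketch on toy data. The objects `gsets_toy` and `genes_of_interest` are hypothetical stand-ins for `gsets` and `rownames(AverageDataSet)`, and the use of `lapply` plus the final filter is only one possible way to write it, not necessarily what my real script should do.

```r
# Toy stand-in for gsets: GO IDs mapping to ORFs, named by evidence code
gsets_toy <- list(
  "GO:0000001" = c(IMP = "YAL048C", IGI = "YAL048C", IMP = "YDL006W"),
  "GO:0000002" = c(IMP = "YAL015C", IEA = "YBR163W"),
  "GO:0000011" = c(IMP = "YBR097W")
)

# Toy stand-in for rownames(AverageDataSet)
genes_of_interest <- c("YAL048C", "YBR163W")

# lapply always returns a list, whereas sapply may simplify its result
filtered <- lapply(gsets_toy, function(gset) {
  intersect(genes_of_interest, unique(as.character(gset)))
})

# Drop GO terms that are left without any gene of interest
filtered <- filtered[lengths(filtered) > 0]

print(names(filtered))
```

In this toy case only the first two GO identifiers survive, because the third one contains no gene of interest.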
Then I try to look up the GO term records for these identifiers in GOTERM. The part that is giving me a problem is this:
library(GO.db)
AnnotationDbi::as.list(GOTERM[names(gsets)])
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) : value for "GO:0000208" not found
But when I subset the gsets list, the identifier is actually there:
gsets["GO:0000208"]
$`GO:0000208`
      IMP
"YJL128C"
I have done a little research and apparently this GO identifier is now obsolete. If so, I wouldn't mind dropping it, but why is it in the database anyway?
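One thing I could try in the meantime is to skip identifiers that GOTERM does not contain. The sketch below assumes that the mismatch comes from org.Sc.sgd.db and GO.db not covering exactly the same set of GO identifiers, which I have not confirmed. The vector `go_ids` is a hypothetical stand-in for `names(gsets)`.

```r
library(GO.db)          # provides the GOTERM map
library(AnnotationDbi)  # provides keys() and mget() for annotation maps

# Hypothetical stand-in for names(gsets): one ID from the output above
# and the ID reported in the error message
go_ids <- c("GO:0000001", "GO:0000208")

# Option 1: keep only the identifiers that GOTERM knows about
known <- go_ids %in% keys(GOTERM)
terms <- AnnotationDbi::as.list(GOTERM[go_ids[known]])

# Option 2: look up every identifier and get NA for the missing ones
terms_or_na <- mget(go_ids, GOTERM, ifnotfound = NA)

# Identifiers that were skipped in option 1 (possibly empty)
print(go_ids[!known])
```

Option 1 silently drops the unknown identifiers, while option 2 keeps one entry per identifier so the missing ones stay visible.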
This is probably some basic issue that I fail to identify. Actually, I don't even understand why I get this error. Any help is really appreciated.
Best, Jose.