Hi guys,
Does anybody know how to parse the complete DrugBank database (in XML format) into R? I need the drug mechanism of action, so I can't use the other downloadable files from DrugBank.
Thanks
I successfully used the xmlEventParse() function in R (https://www.rdocumentation.org/packages/XML/versions/3.98-1.9/topics/xmlEventParse) to extract selected fields from the DrugBank database. (I first tried loading the full 600+ MB database into memory; when that did not work, I switched to this SAX parsing method.)
I've included a subset of my code to give you a feel for what this looks like:
library(XML)
library(xml2)
# Path to the downloaded DrugBank XML file (placeholder; set this to your own path)
filename <- "path/to/drugbank.xml"
# Accumulator for the extracted drug names
drug.name <- character(0)
# Define function to extract necessary data from each drug (= each main node)
getDrug <- function(x, ...) {
  # convert the current drug node to an xml2 document for easy reference
  current_drug <- read_xml(toString.XMLNode(x))
  # extract properties related to drug; <<- appends to the vector defined above
  # (a plain <- would only set a local variable that is discarded on return)
  drug.name <<- c(drug.name, xml_text(xml_find_first(current_drug, './name')))
}
# Use event-driven SAX parser to process the XML without requiring the full tree structure to be loaded into memory
# Call the function defined above for every <drug> branch
xmlEventParse(file = filename, handlers = NULL, trim = FALSE, branches = list(drug = getDrug))
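Since the question was about the mechanism of action, here is a rough sketch of how the same branch-handler idea could collect that field too. It uses only the XML package and runs on a tiny made-up sample file so it can be tried without the real download. The element names `name` and `mechanism-of-action` as direct children of `<drug>` are my assumption about the DrugBank layout, and `child_text` and `results` are just helper names I made up for illustration, so check them against your copy of the file.

```r
library(XML)

# Tiny made-up stand-in for the DrugBank file so the sketch runs on its own;
# the real file has many more fields per <drug>.
sample_xml <- '<drugbank>
<drug><name>Drug A</name><mechanism-of-action>Inhibits enzyme X.</mechanism-of-action></drug>
<drug><name>Drug B</name></drug>
</drugbank>'
filename <- tempfile(fileext = ".xml")
writeLines(sample_xml, filename)

# Hypothetical helper: text of a direct child element, or NA if it is absent
child_text <- function(node, tag) {
  child <- node[[tag]]
  if (is.null(child)) NA_character_ else xmlValue(child)
}

# Environments are modified in place, so the handler can append rows here
results <- new.env()
results$rows <- list()

# Branch handler: called once per <drug> subtree
getDrug <- function(x, ...) {
  results$rows[[length(results$rows) + 1]] <- data.frame(
    name = child_text(x, "name"),
    mechanism_of_action = child_text(x, "mechanism-of-action"),
    stringsAsFactors = FALSE
  )
}

invisible(xmlEventParse(file = filename, handlers = NULL,
                        branches = list(drug = getDrug)))

# One row per drug; missing fields show up as NA
drugs <- do.call(rbind, results$rows)
print(drugs)
```

To run it on the real database, point `filename` at your downloaded XML file instead of the temporary sample. Collecting one small data frame per drug and binding them at the end keeps memory use low compared with building the full tree.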
Hope this helps.
I know this is an old post, but for anyone who might have the same question: there is a new package called dbparser that parses the DrugBank database into several R datasets: https://github.com/Dainanahan/dbparser
Hello, I am trying this but I am not getting any results, and it throws an end-of-line error. Can someone help me process this? I would be grateful. I tried the script mentioned above, but with no luck. Please help. Thank you.