You need to inspect the page source and navigate the HTML to get to the table:
library(rvest)
dat <- read_html('https://mpmp.huji.ac.il/maps/VATPASE.html')
dat %>% html_element("body") %>% html_elements("div") %>% html_element("table.table-bordered") %>% head(1) %>% html_table()
[[1]]
# A tibble: 13 × 6
PFID `PFID Old` Annotation `Formal Annotation` EC Transcript
<chr> <chr> <chr> <chr> <chr> <lgl>
1 PF3D7_0721900 PF07_0090a V-type ATPase V0 subunit e, putative V-type ATPase V0 subunit e, putative "" NA
2 PF3D7_1354400 MAL13P1.271 V-type proton ATPase 21 kDa proteolipid subunit, putative V-type proton ATPase 21 kDa proteolipid subunit, putative "3.6.3.14" NA
3 PF3D7_0519200 PFE0965c V-type proton ATPase c 16 kDa proteolipid subunit V-type proton ATPase 16 kDa proteolipid subunit "3.6.3.14" NA
4 PF3D7_1311900 PF13_0065 V-type proton ATPase catalytic subunit A V-type proton ATPase catalytic subunit A "3.6.3.14" NA
5 PF3D7_0806800 PF08_0113 v-type proton atpase subunit a, putative v-type proton atpase subunit a, putative "3.6.3.14" NA
6 PF3D7_0406100 PFD0305c V-type proton ATPase subunit B V-type proton ATPase subunit B "3.6.3.14" NA
7 PF3D7_1464700 PF14_0615 V-type proton ATPase subunit c ATP synthase (C/AC39) subunit, putative "3.6.3.14" NA
8 PF3D7_0106100 PFA0300c V-type proton ATPase subunit C, putative V-type proton ATPase subunit C, putative "3.6.3.14" NA
9 PF3D7_1341900 PF13_0227 V-type proton ATPase subunit D, putative V-type proton ATPase subunit D, putative "3.6.3.14" NA
10 PF3D7_0934500 PFI1670c V-type proton ATPase subunit E, putative V-type proton ATPase subunit E, putative "3.6.3.14" NA
11 PF3D7_1140100 PF11_0412 V-type proton ATPase subunit F, putative V-type proton ATPase subunit F, putative "3.6.3.14" NA
12 PF3D7_1323200 PF13_0130 V-type proton ATPase subunit G, putative V-type proton ATPase subunit G, putative "3.6.3.6" NA
13 PF3D7_1306600 PF13_0034 V-type proton ATPase subunit H, putative V-type proton ATPase subunit H, putative "3.6.3.14" NA
I'm using head(1) since I can't find a unique class identifier in a parent element. Digging deeper into the page structure might turn one up and remove the need to guess the index.
EDIT
This is the line you want for (I'm making an educated guess) all URLs on that website:
read_html("https://mpmp.huji.ac.il/maps/aminosugmetpath.html") %>% html_elements("table.table-bordered.table-hover") %>% html_table()
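If you want to rerun this over several pathway pages, something like the following might work. It is only a sketch: the function names `scrape_pathway_tables` and `scrape_pathways` are made up for illustration, and it assumes every page uses the same `table.table-bordered.table-hover` classes, which I have not checked beyond the two URLs above.

```r
library(rvest)

# Hypothetical helper: return every table on an already-parsed page that
# matches the CSS selector (assumed to be the same on all pathway pages).
scrape_pathway_tables <- function(page, css = "table.table-bordered.table-hover") {
  page %>% html_elements(css) %>% html_table()
}

# Hypothetical wrapper: read each URL in turn. A page that fails to load
# yields NULL with a warning instead of aborting the whole run.
scrape_pathways <- function(urls, pause = 1) {
  results <- lapply(urls, function(u) {
    Sys.sleep(pause)  # be polite to the server between requests
    tryCatch(
      scrape_pathway_tables(read_html(u)),
      error = function(e) {
        warning("Failed to scrape ", u, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  setNames(results, urls)
}

urls <- c(
  "https://mpmp.huji.ac.il/maps/VATPASE.html",
  "https://mpmp.huji.ac.il/maps/aminosugmetpath.html"
)
# Uncomment to actually hit the website:
# tables <- scrape_pathways(urls)
```

The result is a list named by URL, where each entry is itself a list of tibbles (one per matching table), so `tables[[1]][[1]]` would be the first table on the first page. Check for `NULL` entries after a run to see which pages failed or changed layout.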
Why not contact the data owners instead of scraping web pages?
I tried that, but I didn't get any reply. Plus, the database keeps updating, so I thought I could write a scraper that can be run every six months to update the pathways.