Date last run: 14Mar2020

A question on RStudio Community asked how to retrieve a table from a webpage when the table is generated by JavaScript. The problem was that the page source did not contain the table itself, only a reference to the JavaScript code that builds it. Because I was busy with a similar project, I decided to see if I could solve it. A Stack Overflow entry described a possible solution, but it did not work for the questioner or for me. In that entry, camile mentioned Selenium.
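To make the problem concrete, here is a minimal sketch. It assumes a made-up page whose table is only built by a script; the markup and the file name `build-table.js` are hypothetical and not taken from the actual site. Reading the unrendered HTML with xml2 and rvest finds no table at all:

```r
library(xml2)
library(rvest)

# Hypothetical stand-in for the unrendered page: the table would be
# created by the referenced script, so the HTML itself contains none.
raw_page <- '<html><body>
  <div id="results"></div>
  <script src="build-table.js"></script>
</body></html>'

doc <- read_html(raw_page)
tables <- html_table(doc)

# no <table> element exists until a browser has executed the script
length(tables)
#> [1] 0
```

This is why the page has to be rendered by a browser first.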

Therefore I decided to use the R package RSelenium, which drives a real browser so that the JavaScript runs before the page source is read. The following code extracts the table. The only problem is that it does not free the port. After running the code it is necessary to restart RStudio. Restarting the R session or closing the R project (when more sessions are open) is not enough to free the port. In my latest experiments I could no longer create this blog entry until I restarted the computer. I am beginning to see the attraction of Docker for these use cases.


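library(RSelenium)

# start a Selenium server with a Firefox client on a fixed port
rD <- rsDriver(browser = "firefox", port = 4567L, verbose = FALSE)
remDr <- rD[["client"]]

# the search page's base URL goes in the empty string; the search term is appended to it
species <- "saperda+tridentata"
url <- paste("", species, sep = "")
remDr$navigate(url)

# parse the page source after the browser has run the javascript
doc <- xml2::read_html(remDr$getPageSource()[[1]])

# the first table on the page is the one we want
df <- rvest::html_table(doc)[[1]]

# stop the selenium server
remDr$close()
rD[["server"]]$stop()
#> [1] TRUE
rm(rD)
gc()
#>           used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells  799500 42.7    1561221 83.4  1134935 60.7
#> Vcells 1421543 10.9    8388608 64.0  2309339 17.7
# port is still in use (only after RStudio restart available again)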

The table:

EPPOCode  Name                Type    Language    Preferred
SAPETR    Saperda tridentata  animal  Scientific  NA
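
The last step, turning page source into a data frame, can be tried without a browser. The sketch below assumes a simplified version of the rendered page: the HTML markup is hypothetical, and only the cell values are the ones shown in the table above.

```r
library(xml2)
library(rvest)

# Hypothetical rendered page source; the markup is an assumption,
# the cell values are copied from the table in this post.
rendered <- '<html><body><table>
  <tr><th>EPPOCode</th><th>Name</th><th>Type</th><th>Language</th><th>Preferred</th></tr>
  <tr><td>SAPETR</td><td>Saperda tridentata</td><td>animal</td><td>Scientific</td><td></td></tr>
</table></body></html>'

doc <- read_html(rendered)

# html_table() returns one data frame per <table>; take the first
df <- html_table(doc)[[1]]

names(df)
df$EPPOCode
```

Because `html_table()` returns a list, the `[[1]]` picks the first table on the page; a page with several tables would need a different index or a more specific selector.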

