New beta release for CBS OData4 dataportal

Han Oostdijk

2020/05/14

Date last run: 14May2020

Introduction

In the LinkedIn group ‘Centraal Bureau voor de Statistiek; Open Data’ I saw the article ‘New beta release for CBS OData4 dataportal’. The article points to the page ‘CBS Dataportal’ on the CBS website for more information and mentions the new root pointer.

In the past I included two functions for OData4 in the package HOQCutil: get_table_cbs_odata4 and get_table_cbs_odata4_GET. In this blog entry I check whether these two functions still work.

OData3

In the past I made the package odataR for OData3. I rebuilt this package for R 4.0.0 and did not find any errors. The remainder of this document concerns only OData4.

OData4

Some of the functionality was already tested for the previous beta. The results of that test can be found in the PDF file opendata_beta_versie4_dec2018_20181225.pdf. In this document I describe the tests done for the new version.

OData4 CBS documentation

On the CBS Dataportal the following documentation can be found:

Changes made in the HOQCutil package

While checking whether the two package functions get_table_cbs_odata4 and get_table_cbs_odata4_GET still worked, I realized that I should have written unit tests for the various OData features and for my functions. So I decided to do this now. The test functions can be found in the package subfolder testthat.
I also took the opportunity to add the possibility of JSON output, and I renamed the response parameter (it is now called restype). The three possible values for restype and their meaning:

Test results

The root for the tables

As announced in the CBS blog, the root for the CBS tables has changed. The default for the parameter odata_root in the function get_table_cbs_odata4 is therefore now https://beta-odata4.cbs.nl.
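
To illustrate how the new root fits into a request URL, here is a minimal sketch. The helper build_cbs_odata4_url is hypothetical (it is not part of HOQCutil). The path layout root/CBS/table/subtable is an assumption taken from the generated URL reported in the $count test further on in this document.

```r
# Hypothetical helper (not part of HOQCutil): compose the URL of a table
# or subtable, assuming the layout <root>/CBS/<table_id>/<subtable>.
build_cbs_odata4_url <- function(table_id,
                                 subtable = NULL,
                                 odata_root = "https://beta-odata4.cbs.nl") {
  # c() silently drops a NULL subtable, so the table root is also supported
  parts <- c(odata_root, "CBS", table_id, subtable)
  paste(parts, collapse = "/")
}

url <- build_cbs_odata4_url("81589NED", "Observations")
# $count is appended as an extra path segment
count_url <- paste0(url, "/$count")
print(count_url)
```

Keeping the root as an argument with a default means that a later change of the root only requires passing a different odata_root.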

The list ‘Welke OData 4 commando’s zijn beschikbaar?’

The list in the FAQ is not complete. It lacks the following functions, which worked in the previous beta and still work now:

$count

This works, but differently in OData3 than in OData4. In OData3 the result is an integer. In OData4 the result is a character string that is preceded by the Unicode character U+FEFF (the byte order mark). NB: because the data is built up differently in the two versions, it is not surprising that the reported numbers differ.

# OData3
count <- odataR::odataR_get_table(
  table_id = '81589NED',
  query = "$count")
str(count)
#>  int 80244
# OData4
count <- HOQCutil::get_table_cbs_odata4(
  table_id = '81589NED',
  subtable = 'Observations',
  query = "$count",
  verbose = TRUE)
#> generated url  : https://beta-odata4.cbs.nl/CBS/81589NED/Observations/$count
#> unencoded query: $count
str(count)
#>  chr "<U+FEFF>1034682"

# OData4 with the 'raw' httr calls
resp <- httr::GET('https://beta-odata4.cbs.nl/CBS/81589NED/Observations/$count')
count <- httr::content(resp, as = "text", encoding = 'UTF-8')
str(count)
#>  chr "<U+FEFF>1034682"

In the last case I used the ‘raw’ httr function calls (with the same result) to show that the Unicode character in the result is not caused by the get_table_cbs_odata4 function.
I think that the current behaviour of $count is an error.
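
Until this is changed, the OData4 result can be converted to an integer on the client side. The following is only a sketch: the helper parse_odata4_count is hypothetical (it is not part of HOQCutil) and it assumes that the only unwanted content is a leading U+FEFF and possibly surrounding whitespace.

```r
# Hypothetical helper (not part of HOQCutil): remove a leading byte order
# mark (U+FEFF) from the $count result and convert the rest to an integer.
parse_odata4_count <- function(x) {
  digits <- sub("^\ufeff", "", x)   # drop the BOM when present
  as.integer(trimws(digits))        # drop surrounding whitespace and convert
}

# The string as returned in the test above, written with an explicit escape
raw_count <- "\ufeff1034682"
count <- parse_odata4_count(raw_count)
str(count)
```

A string without the byte order mark is converted in the same way, so the helper would keep working if the behaviour of $count is corrected.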

Other functions working but not documented

The following OData3 functions are not documented in the list ‘Welke OData 4 commando’s zijn beschikbaar?’ but are working in version 4:

Functions working but poorly documented

I think it is advisable to tell the reader that the second parameter of the function substring works with zero origin: the first character of a string is character 0. The same goes for the result of the function indexof, but that function is not mentioned at all.
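
Because R itself counts from 1, the difference is easy to overlook. The sketch below mimics the zero-origin behaviour locally. The functions odata_substring and odata_indexof are hypothetical illustrations written for this purpose (they do not call the CBS service), and returning -1 when nothing is found follows the OData specification as I understand it.

```r
# Hypothetical local imitations of the OData functions, for illustration only.

# substring(s, start, len): 'start' is zero-based, 'len' is optional
odata_substring <- function(s, start, len = NULL) {
  first <- start + 1                                  # shift to R's 1-based origin
  last <- if (is.null(len)) nchar(s) else start + len
  substr(s, first, last)
}

# indexof(s, pattern): zero-based position of the first match, -1 if absent
odata_indexof <- function(s, pattern) {
  pos <- regexpr(pattern, s, fixed = TRUE)
  if (pos[1] == -1L) -1L else as.integer(pos[1]) - 1L
}

odata_substring("Observations", 0, 3)   # starts at the first character
odata_substring("Observations", 1)      # everything after the first character
odata_indexof("Observations", "Obs")    # a match at the start has index 0
```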

OData3 functions that are not working in OData4

$orderby

In OData4 $orderby is recognized as a function, and when it is used it requires an argument. However, it has no effect on the order, whatever the additional argument is: you can specify ‘asc’, ‘desc’, ‘??’ or no additional argument at all, and the order of the result stays the same.
Because $orderby is recognized as a function, I think that this behaviour is an error.
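
As long as $orderby has no effect, the rows can be sorted in R after they have been retrieved. The following is a minimal sketch with a small made-up data frame: the column names Id, Measure and Value and all the values are assumptions for illustration only and are not taken from a CBS table.

```r
# Made-up data standing in for a retrieved table (illustration only)
obs <- data.frame(
  Id      = 1:4,
  Measure = c("M2", "M1", "M2", "M1"),
  Value   = c(10.5, 3.2, 7.1, 8.8),
  stringsAsFactors = FALSE
)

# Equivalent of '$orderby=Measure asc,Value asc', done on the client
obs_asc <- obs[order(obs$Measure, obs$Value), ]

# Equivalent of '$orderby=Value desc', done on the client
obs_desc <- obs[order(obs$Value, decreasing = TRUE), ]

print(obs_asc)
print(obs_desc)
```

Sorting on the client requires that all rows are downloaded first, so it is not a full replacement for a working $orderby in combination with other query options.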

Conclusion

Session Info

This document was produced on 14May2020 with the following R environment:

  #> R version 4.0.0 (2020-04-24)
  #> Platform: x86_64-w64-mingw32/x64 (64-bit)
  #> Running under: Windows 10 x64 (build 18363)
  #> 
  #> Matrix products: default
  #> 
  #> locale:
  #> [1] LC_COLLATE=English_United States.1252 
  #> [2] LC_CTYPE=English_United States.1252   
  #> [3] LC_MONETARY=English_United States.1252
  #> [4] LC_NUMERIC=C                          
  #> [5] LC_TIME=English_United States.1252    
  #> 
  #> attached base packages:
  #> [1] stats     graphics  grDevices utils     datasets  methods   base     
  #> 
  #> other attached packages:
  #> [1] HOQCutil_0.1.22
  #> 
  #> loaded via a namespace (and not attached):
  #>  [1] Rcpp_1.0.4.6    digest_0.6.25   R6_2.4.1        jsonlite_1.6.1 
  #>  [5] magrittr_1.5    evaluate_0.14   httr_1.4.1      odataR_0.1.4   
  #>  [9] rlang_0.4.6     stringi_1.4.6   curl_4.3        rmarkdown_2.1  
  #> [13] tools_4.0.0     stringr_1.4.0   glue_1.4.0      purrr_0.3.4    
  #> [17] xfun_0.13       compiler_4.0.0  htmltools_0.4.0 knitr_1.28