Open Street Map in combination with Dutch Kadaster data

Han Oostdijk

2019/02/13

Date last run: 15Sep2019

Introduction

This document shows how one can display WFS data ( see here for some links about the Web Feature Service ) against a background of Open Street Map tiles. In this document we use data of the Kadaster. The description in English indicates: The Netherlands’ Cadastre, Land Registry and Mapping Agency – in short Kadaster – collects and registers administrative and spatial data on property and the rights involved. This also goes for ships, aircraft and telecom networks. Here I use this information to create a map of property in my neighbourhood.

Attach packages

All packages can be installed from The Comprehensive R Archive Network CRAN with exception of the HOQCutil package that can be installed from GitHub.

library('osmdata')
#> Warning: package 'osmdata' was built under R version 3.6.1
library('stringr')
library('dplyr')
#> Warning: package 'dplyr' was built under R version 3.6.1
library('sf')
#> Warning: package 'sf' was built under R version 3.6.1
library('ggplot2')
#> Warning: package 'ggplot2' was built under R version 3.6.1
library('ggspatial')
library('xml2')
#> Warning: package 'xml2' was built under R version 3.6.1
library('purrr')
library('glue')
library('jsonlite')
# devtools::install_github("HanOostdijk/HOQCutil")
library('HOQCutil')

Auxiliary functions

To retrieve the Kadaster data I use the functions GetCapabilities and GetFeature to handle the WFS protocol that is used. See the Auxiliary functions section at the end of the document for the code. The baseurl for the Kadaster data is already filled in but the code is generally applicable by specifying a different url.

I also use the HOQCutil::cap.out function to restrict print output (especially when viewing xml structures). It is not needed to produce the maps.

Available Kadaster data

The data of the Kadaster is available under the name ‘Kadastrale Kaart v3 WFS’ and described on the following metadata page.

FeatureTypes

The metadata page tells in words which FeatureTypes are available:

It appears that the specified FeatureTypes on the metadata page are only the suffixes of the actual FeatureType names: they are preceded by ‘kadastralekaartv3:’ . We can check that by viewing the GetCapabilities page that is mentioned on the metadata page.

It is easier to use the following code (remember that the baseurl defaults to the Kadaster data). FT1 contains the full FeatureType names and FT2 the abbreviations:

GetFeatureTypes <- function (
	  baseurl  = "https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?",
	  short=F) {
	wfs_gc_xml <- GetCapabilities(baseurl = baseurl)
	# xml2::xml_ns_strip(wfs_gc_xml)
	FeatureTypes <- xml2::xml_find_all(wfs_gc_xml,
		".//FeatureTypeList//FeatureType//Name")
	ft = purrr::map_chr(FeatureTypes,
		xml2::xml_text)
	if (short)
		ft <- stringr::str_extract(ft,'(?<=:)(.*)$')
	ft
}
  ( FT1 = GetFeatureTypes() )
#> [1] "kadastralekaartv3:annotatie"       "kadastralekaartv3:bebouwing"      
#> [3] "kadastralekaartv3:kadastralegrens" "kadastralekaartv3:perceel"
  ( FT2 = GetFeatureTypes(short=TRUE) )
#> [1] "annotatie"       "bebouwing"       "kadastralegrens" "perceel"

Which coordinate system to use

We want to retrieve only Kadaster data for my own neighbourhood. That raises the question which coordinate system (crs) should be used. With the following code we see that for all FeatureTypes the DefaultCRS is urn:ogc:def:crs:EPSG::28992.

GetDefaultCRS <- function (
	baseurl  = "https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?") {
	wfs_gc_xml <- GetCapabilities(baseurl = baseurl)
	FT <- GetFeatureTypes(baseurl)
	crs_names <- purrr::map_chr(FT,
		function(ft) {
			search_DefaultCRS <-
				glue(".//FeatureTypeList//FeatureType[contains(Name,'",
					{
						ft
					},
				"')]//DefaultCRS")
			crs <- xml_find_all(wfs_gc_xml, search_DefaultCRS) %>%
				xml2::xml_text()
		})
	set_names(crs_names,nm=FT)
}
( DefaultCRS = GetDefaultCRS() )
#>       kadastralekaartv3:annotatie       kadastralekaartv3:bebouwing 
#>     "urn:ogc:def:crs:EPSG::28992"     "urn:ogc:def:crs:EPSG::28992" 
#> kadastralekaartv3:kadastralegrens         kadastralekaartv3:perceel 
#>     "urn:ogc:def:crs:EPSG::28992"     "urn:ogc:def:crs:EPSG::28992"
( dcrs = stringr::str_extract(DefaultCRS[1],'(?<=::)(.*)$') )
#> [1] "28992"

According to epsg.io 28992 is a Dutch coordinate system. It is easier to specify coordinates in the standard lateral/longitude system known as epsg::4326 or WGS84 (World Geodetic System 1984) and convert those to epsg::28992.

Specifying coordinates for my neighbourhood

Therefore I specify the lower-left and upper-right points of the relevant area in the standard GPS coordinates and convert these to sf (simple features) object bb1 that is a representation of these two points. bb2 is the corresponding sf object in epsg::28992.

  my_area = c (4.8652, 52.3091,  4.8663, 52.3100)
  bb1 = sf::st_sfc(sf::st_point(my_area[1:2]), sf::st_point(my_area[3:4]))
  bb1 = sf::st_sf(geometry=bb1,crs=4326)
  bb2 = sf::st_transform(bb1,crs=as.numeric(dcrs))

Retrieve the Kadaster data

I use bb2 to retrieve the Kadaster data in that area (boundingbox). Because I do a query (only annotations for buildings and not for roads), I have to specify the geometry for the annotatie FeatureType. See the GetFeature section for details.

  x1 = GetFeature(FT1[1],bbObject=bb2,geometry='geom',
  	   query="classificatiecode eq 'Z02'") # annotatie
  x2 = GetFeature(FT1[2],bbObject=bb2)  # bebouwing 
  x3 = GetFeature(FT1[3],bbObject=bb2)  # kadastralegrens 
  x4 = GetFeature(FT1[4],bbObject=bb2)  # perceel 

Because I used the default value for the SRSName function argument (“EPSG:4326”) (and because json=T by default) all coordinates in x1, x2, x3 and x4 are in this coordinate reference system.

Correct for a strange ‘perceel’

I noticed that there is an object (feature) in x4 that is far outside the area I specified. Apparently it intersects this area. To remove this feature I use the sf::st_within function to select a ‘perceel’ only when it is fully in the (bounding box) polygon bb1p that contains the two points in bb1.

  bb1p = sf::st_sf(geometry = sf::st_as_sfc(sf::st_bbox(bb1)),crs=4326)
  x4w  = sf::st_within(x4, bb1p, sparse = FALSE, prepared = TRUE)
#> although coordinates are longitude/latitude, st_within assumes that they are planar
  x4w  = x4[x4w[,1],]

Now every ‘perceel’ in x4w is fully contained in the original area.

Retrieve center point of ‘perceel’

The geometry of x4w describes the polygon for each ‘perceel’, but x4w also contains the field plaatscoordinaten. For each feature this field consist of a json string with the location of a center point (that can be used to display the ‘perceelnummer’):

  pc =x4w$plaatscoordinaten
  print(pc[[1]])
#> [1] "{ \"type\": \"Point\", \"coordinates\": [ 4.86578814, 52.30980987 ] }"

From each of these json strings I extract the coordinates and convert them to a point and the list of points is then converted to an sfc object. The new sf object x4x is created with as contents x4w and the new sfc object as geometry. To be able to use the field perceelnummer_rotatie as angle argument in ggplot2 this field is converted to numeric.

  pcn = purrr::map(pc,function(x) {
  	nbs = jsonlite::fromJSON(x)
  	nbs = purrr::pluck(nbs,'coordinates')
  	sf::st_point(nbs)
   }
  )
  x4x = sf::st_set_geometry(x4w,sf::st_sfc(pcn,crs=4326)) %>%
  	dplyr::mutate(perceelnummer_rotatie=as.numeric(perceelnummer_rotatie))

Create the plot

The function annotation_map_tile ensures that an Open Street Map background will be used. On this background the following sf objects are used to create Figure 1 :

bb  = sf::st_bbox(bb1)
ccp = sf::st_sf(
		label="Data Kadaster and Open Street Map",
		geometry=sf::st_sfc(
			sf::st_point(x = c((bb$xmin + bb$xmax)/2, bb$ymin), dim = "XY")),
		crs = sf::st_crs(bb1),stringsAsFactors =T
	)
h = ggplot() +
  annotation_map_tile(zoom = 20,progress = "none") + 
	geom_sf(data = bb1,
		aes(geometry = geometry),
		alpha = 0, col = "black", size = 1) +
	# geom_sf_text(data = x1, # annotation 
	# 	aes(geometry = geometry, label = tekst),
	# 	alpha = 1, col = "red", size = 2) +
	geom_sf(data = x2, # bebouwing (buildings) 
		aes(geometry = geometry),
		alpha = 0, col = "red", size = 0.2, linetype="solid") +
	# geom_sf(data = x3, # kadastralegrens (boundaries according Kadaster)
	# 	aes(geometry = geometry),
	# 	alpha = 0, col = "blue", size = 0.2) +
  geom_sf(data = x4w, # perceel (specific area according Kadaster )
  	aes(geometry = geometry),
  	alpha = 0, col = "black", size = 0.2) +
	geom_sf_text(data = x4x, # perceel (number of  ) 
		aes(geometry = geometry,label=perceelnummer,angle=perceelnummer_rotatie),
		col = "black", size = 2) +
  geom_sf_label(data = ccp, 
			aes(label = label),
			alpha = 1, col = "blue", size = 2) +
	labs(x = "", y = "")
print(h)

Overlay of OSM map by Kadaster data

Figure 1: Overlay of OSM map by Kadaster data

Auxiliary functions

GetCapabilities

I use the GetCapabilities to find out the names of the FeatureTypes and the DefaultCRS.

GetCapabilities <- function (
	baseurl = "https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?",
	wfs_service = "WFS",
	wfs_version = "2.0.0") {
	wfs_request <- "GetCapabilities"
	wfs_request_gc  <- glue::glue(
		baseurl
		,	"service={wfs_service}"
		,	"&version={wfs_version}"
		,	"&request={wfs_request}"
	)
	wfs_gc_xml <- xml2::read_xml((wfs_request_gc), options = "NOWARNING")
	xml2::xml_ns_strip(wfs_gc_xml)
	wfs_gc_xml
}

GetFeature

I use the GetFeature function always with a selection on the bounding box (bbox). This can be done with a cql_filter or a bbox statement. The disadvantage of the first method is that the geometry must be named in the filter and therefore I prefer the second method. Because cql_filter and bbox are mutually exclusive it is necessary to use the first one when another selection is made. In that case both selections will be included by GetFeature with an and clause and a geometry must be specified. The bbObject is the bounding box of the area I am interested in. The crs of the bbObject must correspond with the major crs of the feature type. The SRSName specifies the crs of the output (when json=F) !
My preference is to get an sf object while using GetFeature. In case of problems (e.g. when using an query that appears later to be not fully correct) I use json=F because that gives more relevant error messages (in xml format). However notice that With json=F the output is not in the crs specified by SRSName!

I found out which geometry to use (for the case with an additional query) by leaving out the additional query, setting count to a low value and setting json=F. See section Find out which geometry to use how this is done.

GetFeature <- function (
	FeatureType,
	bbObject = NULL,
	query    = NULL,
	count    = NULL,
	json     = T,
	geometry = 'geom',
	baseurl  = "https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?",
	wfs_service = "WFS",
	wfs_version = "2.0.0",
	SRSName     ="EPSG:4326",
	geometry_column = 1L,
	verbose     = F
){
	if (is.null(bbObject) )
		bbox_fr   <- ''
	else {
		b         <- glue::glue_collapse(st_bbox(bb2),sep=',')
		bbox_fr   <- URLencode(b, reserved = T)
		bbox_fr   <- glue::glue("&bbox={bbox_fr}")
	}
	if ( (!is.null(query)) && (!is.null(bbObject)) ) {
		bbox_fr   <- ''
		if ( (!is.null(geometry)) && stringr::str_length(geometry)>0)
			geometry <- paste0(geometry,',')
		else
			geometry <- ''
		query     <- glue::glue("BBOX({geometry}{b}) and {query}")
	}
	cql_filter <- ifelse ( is.null(query)  ,
		'' , cql_filter <- URLencode(query, reserved = T) )
	cql_filter_fr <-ifelse( stringr::str_length(cql_filter) == 0  ,
		'' , glue::glue("&cql_filter={cql_filter}") )
	count_fr = ifelse(!is.null(count) ,
		glue::glue("&count={count}") ,
		"" )
	outputFormat <- ifelse (!json,
		'', URLencode("application/json", reserved = T) )
	outputFormat_fr = ifelse( stringr::str_length(outputFormat) == 0 ,
		'', glue::glue("&outputFormat={outputFormat}") )
	SRSName_fr = ifelse( stringr::str_length(outputFormat) == 0 ,
		'', glue::glue("&SRSName={SRSName}") )
	wfs_request="GetFeature"
	wfs_request_gf1 <- glue::glue(
		baseurl
		,"service={wfs_service}"
		,"&version={wfs_version}"
		,"&request={wfs_request}"
		,"&typeNames={FeatureType}"
		,outputFormat_fr
		,SRSName_fr
		,cql_filter_fr
		,count_fr
		,bbox_fr
	)
	if (verbose)
		print(wfs_request_gf1)
	if (json)
		sf::st_read(wfs_request_gf1,quiet=!verbose,
			stringsAsFactors=F,geometry_column = geometry_column)
	else
		xml2::read_xml((wfs_request_gf1),options="NOWARNING")
}

Find out which geometry to use

In GetFeature I indicated how to find the geometry to use. Remember that that it is only necessary to specify the geometry when there is an additional query (besides the bounding box). For each of the FeatureTypes annotatie, bebouwing, kadastralegrens and perceel I request and print one result element in xml format. To just select the third line of the print output I use the HOQCutil::cap.out function. As an example I show the code and output for the first FeatureType (annotatie) :

xml_res = GetFeature(FT1[1],json=F,bbObject=bb2,count=1)
HOQCutil::cap.out( print(xml_res,width=20000),lines=3,se=20000)
#> [1] <wfs:member>\n  <kadastralekaartv3:annotatie gml:id="annotatie.23416
#> 33">\n    <kadastralekaartv3:classificatiecode>Z06</kadastralekaartv3:cl
#> assificatiecode>\n    <kadastralekaartv3:rotatiehoek>-1.501</kadastralek
#> aartv3:rotatiehoek>\n    <kadastralekaartv3:tekst>Runmoolen</kadastralek
#> aartv3:tekst>\n    <kadastralekaartv3:geom>\n      <gml:Point srsName="u
#> rn:ogc:def:crs:EPSG::28992" srsDimension="2">\n        <gml:pos>119429.2
#> 6 480264.588</gml:pos>\n      </gml:Point>\n    </kadastralekaartv3:geom
#> >\n  </kadastralekaartv3:annotatie>\n</wfs:member>

From this I see that the end-tag of the geometry is just before the end-tag of the feature type. Coded for all FeatureTypes:

find_geometries <- function (FT,
	   baseurl= "https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?") {
	purrr::walk(FT,
		function(ft) {
			xml_res = GetFeature(ft,
				json = F,
				baseurl = baseurl,
				count = 1,
				verbose=F)
			xml2::xml_ns_strip(xml_res)
			chld = xml2::xml_find_first(xml_res, glue::glue('//{ft}'))
			tt = tail(purrr::map_chr(xml2::xml_children(chld), xml2::xml_name), 1)
			cat(glue::glue("FeatureType '{ft}': geometry name '{tt}'"),
				'\n')
		})
}
find_geometries(FT1)

This takes rather some time and I replaced the code by the following one

GetGeometries <- function (
	baseurl="https://geodata.nationaalgeoregister.nl/kadastralekaartv3/wfs?") {
	dft = DescribeFeatureType(verbose=F,baseurl=baseurl)
	ct  = xml2::xml_find_all(dft,'//xsd:complexType')
  ft  = xml2::xml_find_all(dft,'xsd:element') %>%
  				xml2::xml_attr(.,'type') %>%
  	     stringr::str_extract(.,"(.*)(?=Type$)")
	geom_names = purrr::map(ct,
		function(ct1) {
			geom = xml2::xml_find_all(ct1,
					  './/xsd:element[@type="gml:GeometryPropertyType"]') %>%
		   	xml2::xml_attr(.,'name')
		}
	)
	set_names(geom_names,ft)
}
(GetGeometries())
#> $`kadastralekaartv3:annotatie`
#> [1] "geom"
#> 
#> $`kadastralekaartv3:bebouwing`
#> [1] "geom"
#> 
#> $`kadastralekaartv3:kadastralegrens`
#> [1] "grenslijn"
#> 
#> $`kadastralekaartv3:perceel`
#> [1] "begrenzingperceel" "plaatscoordinaten"

The perceel FeatureType here has two ‘geometries’ ??

Session Info

sessionInfo()
#> R version 3.6.0 (2019-04-26)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18362)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] HOQCutil_0.1.10 jsonlite_1.6    glue_1.3.1      purrr_0.3.2    
#>  [5] xml2_1.2.2      ggspatial_1.0.3 ggplot2_3.2.1   sf_0.7-7       
#>  [9] dplyr_0.8.3     stringr_1.4.0   osmdata_0.1.1  
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_0.2.5   xfun_0.8           lattice_0.20-38   
#>  [4] colorspace_1.4-1   htmltools_0.3.6    rlang_0.4.0       
#>  [7] e1071_1.7-0        pillar_1.4.2       withr_2.1.2       
#> [10] DBI_1.0.0          sp_1.3-1           plyr_1.8.4        
#> [13] munsell_0.5.0      blogdown_0.15      gtable_0.3.0      
#> [16] rvest_0.3.4        evaluate_0.14      knitr_1.24        
#> [19] prettymapr_0.2.2   curl_4.0           class_7.3-15      
#> [22] highr_0.8          Rcpp_1.0.2         KernSmooth_2.23-15
#> [25] scales_1.0.0       classInt_0.3-3     captioner_2.2.3   
#> [28] fs_1.3.1           png_0.1-7          digest_0.6.20     
#> [31] stringi_1.4.3      rosm_0.2.5         grid_3.6.0        
#> [34] rgdal_1.4-4        tools_3.6.0        magrittr_1.5      
#> [37] lazyeval_0.2.1     tibble_2.1.3       crayon_1.3.4      
#> [40] pkgconfig_2.0.2    lubridate_1.7.4    assertthat_0.2.1  
#> [43] rmarkdown_1.15     httr_1.4.1         R6_2.4.0          
#> [46] units_0.6-2        compiler_3.6.0