diff --git a/10-gis.md b/10-gis.md
index 69f12367c..08de88fd2 100644
--- a/10-gis.md
+++ b/10-gis.md
@@ -27,8 +27,8 @@ A defining feature of [interpreted](https://en.wikipedia.org/wiki/Interpreter_(c
 rather than relying on pointing and clicking on different parts of a screen, you type commands into the console and execute them with the `Enter` key.
 A common and effective workflow when using interactive development environments such as RStudio or VS Code is to type code into source files in a source editor and control interactive execution of the code with a shortcut such as `Ctrl+Enter`.
-CLIs are not unique to R: most early computing environments relied on a command line 'shell' and it was only after the invention and widespread adoption of the computer mouse in the 1990s that graphical user interfaces (GUIs)\index{graphical user interface} became common.
-GRASS GIS the longest-standing continuously-developed open source GIS\index{GIS} software, for example, relied on its command line interface before it gained a GUI [@landa_new_2008].
+Command line interfaces (CLIs) are not unique to R: most early computing environments relied on a command line 'shell', and it was only after the invention and widespread adoption of the computer mouse in the 1990s that graphical user interfaces (GUIs)\index{graphical user interface} became common.
+GRASS GIS, the longest-standing continuously developed open source GIS\index{GIS} software, for example, relied on its CLI before it gained a GUI [@landa_new_2008].
 Most popular GIS software projects are GUI-driven.
 You *can* interact with QGIS\index{QGIS}, SAGA\index{SAGA}, GRASS GIS\index{GRASS GIS} and gvSIG from system terminals and embedded CLIs, but their design encourages most people to interact with them by 'pointing and clicking'.
 An unintended consequence of this is that most GIS users miss out on the advantages of CLI-driven and scriptable approaches.
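To make the contrast concrete, here is a minimal sketch of a scriptable geoprocessing step in R, using the `nc` sample dataset shipped with **sf**; the CRS choice (UTM zone 17N) and buffer distance are purely illustrative:

``` r
library(sf)
# Read a sample dataset shipped with sf
nc = st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
# One reproducible, re-runnable line replaces a sequence of GUI clicks:
nc_buf = st_buffer(st_transform(nc, "EPSG:32617"), dist = 1000)
```

Because these steps are plain text, they can be version-controlled and re-run at will, which is the reproducibility advantage of scriptable approaches referred to above.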
@@ -36,11 +36,11 @@ According to the creator of the popular QGIS software [@sherman_desktop_2008]:
 
 > With the advent of 'modern' GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI.
 
-The 'CLI vs GUI' debate does not have to be adverserial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user's skillset.
+The 'CLI vs GUI' debate does not have to be adversarial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user's skillset.
 GRASS GIS is a good example of GIS software that is primarily based on a CLI but which also has a prominent GUI.
 Likewise, while R is focused on its CLI, IDEs such as RStudio provide a GUI for improving accessibility.
-Software cannot be neatly categorised into CLI-based or GUI-based.
-However, interactive command line interfaces have several important advantages in terms of:
+Software cannot be neatly categorized as CLI-based or GUI-based.
+However, interactive command-line interfaces have several important advantages in terms of:
 
 - Automating repetitive tasks
 - Enabling transparency and reproducibility
@@ -63,11 +63,11 @@ Such bridges to these computational recipes for enhancing R's capabilities for s
 IDEs such as RStudio and VS Code provide code auto-completion and other features to improve the user experience when developing code.\EndKnitrBlock{rmdnote}
 R is a natural choice for people wanting to build bridges between reproducible data analysis workflows and GIS because it *originated* as an interface language.
-A key feature of R (and its predecessor S) is that it provides access to statistical algorithms in other languages (particularly FORTRAN\index{FORTRAN} and C), but from a powerful high level functional language with an intuitive REPL environment, which C and FORTRAN lacked [@chambers_extending_2016].
+A key feature of R (and its predecessor S) is that it provides access to statistical algorithms in other languages (particularly FORTRAN\index{FORTRAN} and C), but from a powerful high-level functional language with an intuitive REPL environment, which C and FORTRAN lacked [@chambers_extending_2016].
 R continues this tradition with interfaces to numerous languages, notably C++\index{C++}.
 Although R was not designed as a command line GIS, its ability to interface with dedicated GISs gives it astonishing geospatial capabilities.
-With GIS bridges, R can replicate more diverse workflows, with the additional reproducibility, scalability and productity benefits of controlling them from a programming environment and a consistent CLI.
+With GIS bridges, R can replicate more diverse workflows, with the additional reproducibility, scalability, and productivity benefits of controlling them from a programming environment and a consistent CLI.
 Furthermore, R outperforms GISs in some areas of geocomputation\index{geocomputation}, including interactive/animated map-making (see Chapter \@ref(adv-map)) and spatial statistical modeling (see Chapter \@ref(spatial-cv)).
 This chapter focuses on 'bridges' to three mature open source GIS products, summarized in Table \@ref(tab:gis-comp):
@@ -82,7 +82,7 @@ There have also been major developments in enabling open source GIS software to
 Table: (\#tab:gis-comp)Comparison between three open-source GIS. Hybrid refers to the support of vector and raster operations.
 
-|GIS |First release |No. functions |Support |
+|GIS |First Release |No. Functions |Support |
 |:---------|:-------------|:-------------|:-------|
 |QGIS |2002 |>1000 |hybrid |
 |SAGA |2004 |>600 |hybrid |
@@ -96,7 +96,7 @@ In addition to the three R-GIS bridges mentioned above, this chapter also provid
 QGIS\index{QGIS} is the most popular open-source GIS (Table \@ref(tab:gis-comp); @graser_processing_2015).
 QGIS provides a unified interface to its native geoalgorithms, GDAL, and --- when they are installed --- other *providers* such as GRASS GIS\index{GRASS GIS} and SAGA\index{SAGA} [@graser_processing_2015].
-Since version 3.14 (released in summer 2020), QGIS ships with the `qgis_process` command line utility for accessing a bounty of functionality for geocomputation.
+Since version 3.14 (released in summer 2020), QGIS ships with the `qgis_process` command-line utility for accessing a bounty of functionality for geocomputation.
 `qgis_process` provides access to 300+ geoalgorithms in the standard QGIS installation and 1,000+ via plugins to external providers such as GRASS GIS and SAGA.
 The **qgisprocess** package\index{qgisprocess (package)} provides access to `qgis_process` from R.
@@ -144,7 +144,7 @@ qgis_enable_plugins(c("grassprovider", "processing_saga_nextgen"), quiet = TRUE)
 ```
 
-Please note that aside from installing SAGA on your system you also need to install the QGIS Python plugin Processing Saga NextGen.
+Please note that aside from installing SAGA on your system, you also need to install the QGIS Python plugin Processing Saga NextGen.
 You can do so from within QGIS with the [Plugin Manager](https://docs.qgis.org/latest/en/docs/training_manual/qgis_plugins/fetching_plugins.html) or programmatically with the help of the Python package [qgis-plugin-manager](https://github.com/3liz/qgis-plugin-manager) (at least on Linux).
 
 `qgis_providers()` lists the name of the software and the corresponding count of available geoalgorithms.
@@ -186,8 +186,8 @@ aggzone_wgs = st_transform(aggregating_zones, "EPSG:4326")
 ```
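As a quick check that the QGIS bridge described above is working, we can run the following sketch (it assumes QGIS and **qgisprocess** are installed on the system):

``` r
library(qgisprocess)
qgis_configure()  # detect the QGIS installation and cache its algorithms
qgis_providers()  # providers and their algorithm counts, e.g., 'native', 'gdal'
```

If the GRASS GIS and SAGA plugins were enabled as shown earlier, they should appear in this list as additional providers.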
-Illustration of two areal units: incongruent (black lines) and aggregating zones (red borders). -

(\#fig:uniondata)Illustration of two areal units: incongruent (black lines) and aggregating zones (red borders).

+Two areal units: incongruent (black lines) and aggregating zones (red borders). +

(\#fig:uniondata)Two areal units: incongruent (black lines) and aggregating zones (red borders).

 The first step is to find an algorithm that can merge two vector objects.
@@ -216,7 +216,7 @@ qgis_search_algorithms("union")
 One of the algorithms on the above list, `"native:union"`, sounds promising.
 The next step is to find out what this algorithm does and how we can use it.
 This is the role of `qgis_show_help()`, which returns a short summary of what the algorithm does, its arguments, and outputs.^[We can also extract some of this information independently with `qgis_get_description()`, `qgis_get_argument_specs()`, and `qgis_get_output_specs()`.]
-This makes it output rather long.
+This makes its output rather long.
 The following command returns a data frame with each row representing an argument required by `"native:union"` and columns with the name, description, type, default value, available values, and acceptable values associated with each:
@@ -249,8 +249,8 @@ This can be very convenient, but we recommend providing the path to your spatial
 This can increase algorithm runtimes.
 The main function of **qgisprocess** is `qgis_run_algorithm()`, which sends inputs to QGIS and returns the outputs.
-It accepts the algorithm name and a set of named arguments shown in the help list, and performs expected calculations.
-In our case, three arguments seem important - `INPUT`, `OVERLAY`, and `OUTPUT`.
+It accepts the algorithm name and a set of named arguments shown in the help list, and it performs the expected calculations.
+In our case, three arguments seem important: `INPUT`, `OVERLAY`, and `OUTPUT`.
 The first one, `INPUT`, is our main vector object `incongr_wgs`, while the second one, `OVERLAY`, is `aggzone_wgs`.
 The last argument, `OUTPUT`, is an output file name, which **qgisprocess** will automatically choose and create in `tempdir()` if none is provided.
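Putting the three arguments together, the call described above can be sketched as follows (a sketch assuming QGIS is available and the `incongr_wgs`/`aggzone_wgs` objects from earlier in the chapter exist):

``` r
# OUTPUT is omitted, so qgisprocess writes the result to a file in tempdir()
union = qgis_run_algorithm("native:union",
  INPUT = incongr_wgs, OVERLAY = aggzone_wgs
)
union_sf = sf::st_as_sf(union)  # convert the algorithm output back to sf
```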
@@ -273,7 +273,7 @@ union_sf = st_as_sf(union)
 ```
 
 Note that the QGIS\index{QGIS} union\index{vector!union} operation merges the two input layers into one layer by using the intersection\index{vector!intersection} and the symmetrical difference of the two input layers (which, by the way, is also the default when doing a union operation in GRASS GIS\index{GRASS GIS} and SAGA\index{SAGA}).
-This is **not** the same as `st_union(incongr_wgs, aggzone_wgs)` (see Exercises)!
+This is **not** the same as `st_union(incongr_wgs, aggzone_wgs)` (see the Exercises)!
 The result, `union_sf`, is a multipolygon with a larger number of features than the two input objects.
 Notice, however, that many of these polygons are small and do not represent real areas, but are rather a result of our two datasets having different levels of detail.
@@ -297,16 +297,16 @@ Importantly, we can use it through **qgisprocess**.
 
 \BeginKnitrBlock{rmdnote}
The GRASS GIS provider in QGIS was called `grass7` until QGIS version 3.34. Thus, if you have an older QGIS version, you must prefix the algorithms with `grass7` instead of `grass`.
 \EndKnitrBlock{rmdnote}
 
-Similarly to the previous step, we should start by looking at this algorithm's help.
+Similar to the previous step, we should start by looking at this algorithm's help.
 
 ``` r
 qgis_show_help("grass:v.clean")
 ```
 
-We have omitted the output here, because the help text is quite long and contains a lot of arguments.^[Also note that these arguments, contrary to the QGIS's ones, are in lower case.]
-This is because `v.clean` is a multi tool -- it can clean different types of geometries and solve different types of topological problems.
-For this example, let's focus on just a few arguments, however, we encourage you to visit [this algorithm's documentation](https://grass.osgeo.org/grass-stable/manuals/v.clean.html) to learn more about `v.clean` capabilities.
+We have omitted the output here because the help text is quite long and contains a lot of arguments.^[Also note that these arguments, unlike the QGIS ones, are in lowercase.]
+This is because `v.clean` is a multi-tool -- it can clean different types of geometries and solve different types of topological problems.
+For this example, let's focus on just a few arguments; however, we encourage you to visit the [algorithm's documentation](https://grass.osgeo.org/grass-stable/manuals/v.clean.html) to learn more about `v.clean`'s capabilities.
 
 ``` r
@@ -341,7 +341,7 @@ clean = qgis_run_algorithm("grass:v.clean",
 clean_sf = st_as_sf(clean)
 ```
 
-The result, the right panel of \@ref(fig:sliver), looks as expected -- sliver polygons are now removed.
+The result, the right panel of Figure \@ref(fig:sliver), looks as expected -- sliver polygons are now removed.
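One way to check the effect of the cleaning step is to compare feature counts and total areas before and after; a quick sketch using the objects created above:

``` r
# Fewer features after cleaning, but (almost) the same total area,
# because removed slivers are merged into neighboring polygons:
c(before = nrow(union_sf), after = nrow(clean_sf))
sum(sf::st_area(union_sf)) - sum(sf::st_area(clean_sf))
```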
 Sliver polygons colored in red (left panel). Cleaned polygons (right panel).
@@ -353,10 +353,10 @@
 Digital elevation models (DEMs)\index{digital elevation model} contain elevation information for each raster cell.
 They are used for many purposes, including satellite navigation, water flow models, surface analysis, and visualization.
 Here, we are interested in deriving new information from a DEM raster that could be used as predictors for statistical learning.
-Various terrain parameters can be helpful, for example, for the prediction of landslides (see Chapter \@ref(spatial-cv))
+Various terrain parameters can be helpful, for example, for the prediction of landslides (see Chapter \@ref(spatial-cv)).
 For this section, we will use `dem.tif` -- a digital elevation model of the Mongón study area (downloaded from the Land Process Distributed Active Archive Center, see also `?dem.tif`).
-It has a resolution of about 30 by 30 meters and uses a projected CRS.
+It has a resolution of about 30 x 30 meters and uses a projected CRS.
 
 ``` r
@@ -398,8 +398,8 @@ Therefore, we only have to specify one argument -- the input `DEM`.
 Of course, when applying this algorithm you should make sure that the parameter values correspond to your study aim.^[The additional arguments of `"sagang:sagawetnessindex"` are well-explained at https://gis.stackexchange.com/a/323454/20955.]
 Before running the SAGA algorithm from within QGIS, we change the default raster output format from `.tif` to SAGA's native raster format `.sdat`.
-Hence, all output rasters we do not specify ourselves will from now on be written to the `.sdat` format.
-Depending on the software versions (SAGA, GDAL) you are using, this might not be necessary but often enough this will save you trouble when trying to read in output rasters created with SAGA.
+Hence, all output rasters that we do not specify ourselves will from now on be written to the `.sdat` format.
+Depending on the software versions (SAGA, GDAL) you are using, this might not be necessary, but often enough this will save you trouble when trying to read in output rasters created with SAGA.
 
 ``` r
@@ -420,10 +420,10 @@ dem_wetness_twi = qgis_as_terra(dem_wetness$TWI)
 options(qgisprocess.tmp_raster_ext = ".tif")
 ```
 
-You can see the TWI map on the left panel of Figure \@ref(fig:qgis-raster-map).
+You can see the TWI map in the left panel of Figure \@ref(fig:qgis-raster-map).
 The topographic wetness index is unitless: its low values represent areas that will not accumulate water, while higher values show areas that will accumulate water at increasing levels.
-Information from digital elevation models can also be categorized, for example, to geomorphons\index{geomorphons} -- the geomorphological phenotypes consisting of 10 classes that represent terrain forms, such as slopes, ridges, or valleys [@jasiewicz_geomorphons_2013].
+Information from digital elevation models can also be categorized, for example, into geomorphons\index{geomorphons} -- geomorphological phenotypes consisting of ten classes that represent terrain forms, such as slopes, ridges, or valleys [@jasiewicz_geomorphons_2013].
 These phenotypes are used in many studies, including landslide susceptibility, ecosystem services, human mobility, and digital soil mapping.
 The original implementation of the geomorphon algorithm was created in GRASS GIS, and we can find it in the **qgisprocess** list as `"grass:r.geomorphon"`:
@@ -436,7 +436,7 @@ qgis_show_help("grass:r.geomorphon") # output not shown
 ```
 
-Calculation of geomorphons requires an input DEM (`elevation`), and can be customized with a set of optional arguments.
+Calculation of geomorphons requires an input DEM (`elevation`) and can be customized with a set of optional arguments.
 These include `search` -- the length for which the line-of-sight is calculated, and ``-m`` -- a flag specifying that the search value will be provided in meters (rather than as a number of cells).
 More information about additional arguments can be found in the original paper and the [GRASS GIS documentation](https://grass.osgeo.org/grass-stable/manuals/r.geomorphon.html).
@@ -448,7 +448,7 @@ dem_geomorph = qgis_run_algorithm("grass:r.geomorphon",
 )
 
-Our output, `dem_geomorph$forms`, contains a raster file with 10 categories -- each one representing a terrain form.
+Our output, `dem_geomorph$forms`, contains a raster file with ten categories -- each representing a terrain form.
 We can read it into R with `qgis_as_terra()`, and then visualize it (Figure \@ref(fig:qgis-raster-map), right panel) or use it in our subsequent calculations.
@@ -466,11 +466,11 @@ The largest TWI values mostly occur in valleys and hollows, while the lowest val
 
 ## SAGA {#saga}
 
-The System for Automated Geoscientific Analyses (SAGA\index{SAGA}; Table \@ref(tab:gis-comp)) provides the possibility to execute SAGA modules via the command line interface\index{command line interface} (`saga_cmd.exe` under Windows and just `saga_cmd` under Linux) (see the [SAGA wiki on modules](https://sourceforge.net/p/saga-gis/wiki/Executing%20Modules%20with%20SAGA%20CMD/)).
+The System for Automated Geoscientific Analyses (SAGA\index{SAGA}; Table \@ref(tab:gis-comp)) allows SAGA modules to be executed via the command-line interface\index{command line interface} (`saga_cmd.exe` under Windows and just `saga_cmd` under Linux) (see the [SAGA wiki on modules](https://sourceforge.net/p/saga-gis/wiki/Executing%20Modules%20with%20SAGA%20CMD/)).
 In addition, there is a Python interface (SAGA Python API\index{API}).
 **Rsagacmd**\index{Rsagacmd (package)} uses the former to run SAGA\index{SAGA} from within R.
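Before any SAGA tool can be used from R, the bridge must be initialized; a minimal sketch (assuming SAGA is installed and discoverable on the system) is:

``` r
library(Rsagacmd)
# saga_gis() locates the saga_cmd executable and generates an R function
# for every SAGA tool; the backends control the classes of returned objects
saga = saga_gis(raster_backend = "terra", vector_backend = "sf")
```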
-We will use **Rsagacmd** in this section to delineate areas with similar values of the normalized difference vegetation index (NDVI) of the Mongón study area in Peru from the 22nd of September 2000 (Figure \@ref(fig:sagasegments), left panel) by using a seeded region growing algorithm from SAGA\index{segmentation}.^[Read Section \@ref(local-operations) on details of how to calculate NDVI from a remote sensing image.]
+We will use **Rsagacmd** in this section to delineate areas with similar values of the normalized difference vegetation index (NDVI) of the Mongón study area in Peru from September 2000 (Figure \@ref(fig:sagasegments), left panel) by using a seeded region growing algorithm from SAGA\index{segmentation}.^[Read Section \@ref(local-operations) for details on how to calculate NDVI from a remote-sensing image.]
 
 ``` r
@@ -518,7 +518,7 @@ Our output is a list of three objects: `variance` -- a raster map of local varia
 The second SAGA tool we use is `seeded_region_growing`.^[You can read more about the tool at https://saga-gis.sourceforge.io/saga_tool_doc/8.3.0/imagery_segmentation_3.html.]
 The `seeded_region_growing` tool requires two inputs: our `seed_grid` calculated in the previous step and the `ndvi` raster object.
-Additionally, we can specify several parameters, such as `normalize` to standardize the input features, `neighbour` (4 or 8-neighborhood), and `method`.
+Additionally, we can specify several parameters, such as `normalize` to standardize the input features, `neighbour` (4- or 8-neighborhood), and `method`.
 The last parameter can be set to either `0` (region growing based on the raster cells' values and their positions) or `1` (based on the values only).
 For a more detailed description of the method, see @bohner_image_2006.
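Assuming a SAGA bridge object `saga` created with `Rsagacmd::saga_gis()`, the region-growing step described above can be sketched as follows; the argument names mirror the SAGA tool's documented parameters and should be checked against the tool help for your SAGA version:

``` r
# Sketch: grow regions from seed_grid over the NDVI values only (method = 1)
ndvi_srg = saga$imagery_segmentation$seeded_region_growing(
  features = ndvi, seeds = seed_grid, method = 1
)
```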
They can also be further aggregated into larger polygons using various techniques, such as clustering (e.g., *k*-means), regionalization (e.g., SKATER) or supervised classification methods. -You can try to do it in Exercises. +You can try to do it in the Exercises. R also has other tools to achieve the goal of creating polygons with similar values (so-called segments). -It includes the **SegOptim** package [@goncalves_segoptim_2019] that allows running several image segmentation algorithms and **supercells** [@nowosad_extended_2022] that implements superpixels\index{segmentation!superpixels} algorithm SLIC to work with geospatial data. +It includes the **SegOptim** package [@goncalves_segoptim_2019] that allows running several image segmentation algorithms and **supercells** package [@nowosad_extended_2022] that implements superpixels\index{segmentation!superpixels} algorithm SLIC to work with geospatial data. ## GRASS GIS {#grass} The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS GIS)\index{GRASS GIS} (Table \@ref(tab:gis-comp); @neteler_open_2008) from 1982 to 1995. Academia continued this work since 1997. -Similar to SAGA\index{SAGA}, GRASS GIS focused on raster processing in the beginning while only later, since GRASS GIS 6.0, adding advanced vector functionality [@bivand_applied_2013]. +Similar to SAGA\index{SAGA}, GRASS GIS focused on raster processing in the beginning while, only later since GRASS GIS 6.0, adding advanced vector functionality [@bivand_applied_2013]. GRASS GIS stores the input data in an internal database. With regard to vector data, GRASS GIS is by default a topological GIS, i.e., it only stores the geometry of adjacent features once. @@ -578,10 +578,10 @@ To quickly use GRASS GIS from within R, we will use the **link2GI** package, how See [GRASS within R](https://grasswiki.osgeo.org/wiki/R_statistics/rgrass#GRASS_within_R) for how to do so. 
 Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS GIS for the first time, but by running through the code line by line and examining the intermediate results, the reasoning behind it should become clearer.
 
-Here, we introduce **rgrass**\index{rgrass (package)} with one of the most interesting problems in GIScience - the traveling salesman problem\index{traveling salesman}.
+Here, we introduce **rgrass**\index{rgrass (package)} with one of the most interesting problems in GIScience: the traveling salesman problem\index{traveling salesman}.
 Suppose a traveling salesman would like to visit 24 customers.
-Additionally, the salesman would like to start and finish his journey at home which makes a total of 25 locations while covering the shortest distance possible.
-There is a single best solution to this problem; however, to check all of the possible solutions it is (mostly) impossible for modern computers [@longley_geographic_2015].
+Additionally, the salesman would like to start and finish the journey at home, which makes a total of 25 locations, while covering the shortest distance possible.
+There is a single best solution to this problem; however, checking all of the possible solutions is (mostly) impossible for modern computers [@longley_geographic_2015].
 In our case, the number of possible solutions corresponds to `(25 - 1)! / 2`, i.e., the factorial of 24 divided by 2 (since we do not differentiate between forward or backward direction).
 Even if one iteration can be done in a nanosecond, this still corresponds to 9,837,145 years.
 Luckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceivable amount of time.
@@ -595,7 +595,7 @@ points = cycle_hire[1:25, ]
 ```
 
 Aside from the cycle hire points data, we need a street network for this area.
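The combinatorial numbers quoted above are easy to verify in R:

``` r
n_tours = factorial(25 - 1) / 2  # ≈ 3.1e23 distinct tours
seconds = n_tours * 1e-9         # at one tour checked per nanosecond
seconds / (60 * 60 * 24 * 365)   # ≈ 9.8 million years
```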
-We can download it with from OpenStreetMap\index{OpenStreetMap} with the help of the **osmdata** \index{osmdata (package)} package (see also Section \@ref(retrieving-data)).
+We can download it from OpenStreetMap\index{OpenStreetMap} with the help of the **osmdata**\index{osmdata (package)} package (see also Section \@ref(retrieving-data)).
 To do this, we constrain the query of the street network (in OSM language called "highway") to the bounding box\index{bounding box} of `points`, and attach the corresponding data as an `sf`-object\index{sf}.
 `osmdata_sf()` returns a list with several spatial objects (points, lines, polygons, etc.), but here, we only keep the line objects with their related ids.^[As a convenience to the reader, one can attach `london_streets` to the global environment using `data("london_streets", package = "spDataLarge")`.]
@@ -637,7 +637,7 @@ write_VECT(terra::vect(points[, 1]), vname = "points")
 The **rgrass** package expects its inputs and gives its outputs as **terra** objects.
 Therefore, we need to convert our `sf` spatial vectors to **terra**'s `SpatVector`s using the `vect()` function to be able to use `write_VECT()`.^[You can learn more about how to convert between spatial classes in R by reading the [Conversions between different spatial classes in R](https://geocompx.org/post/2021/spatial-classes-conversion/) blog post and the
-(Coercion between object formats)[https://CRAN.R-project.org/package=rgrass/vignettes/coerce.html] vignette]
+[Coercion between object formats](https://CRAN.R-project.org/package=rgrass/vignettes/coerce.html) vignette.]
 
 Now, both datasets exist in the GRASS GIS database.
 To perform our network\index{network} analysis, we need a topologically clean street network\index{topology cleaning}.
@@ -652,7 +652,7 @@ execGRASS(
 )
 ```
 
-\BeginKnitrBlock{rmdnote}
To learn about the possible arguments and flags of the GRASS GIS modules you can you the `help` flag. +\BeginKnitrBlock{rmdnote}
To learn about the possible arguments and flags of the GRASS GIS modules, you can use the `help` flag. For example, try `execGRASS("g.region", flags = "help")`.
 \EndKnitrBlock{rmdnote}
 
 It is likely that a few of our cycling station points will not lie exactly on a street segment.
@@ -711,38 +711,38 @@ To find out which datasets are currently available, run `execGRASS("g.list", typ
 Prior to importing data into R, you might want to perform some (spatial) subsetting\index{vector!subsetting}.
 Use `"v.select"` and `"v.extract"` for vector data.
 `"db.select"` lets you select subsets of the attribute table of a vector layer without returning the corresponding geometry.
-- You can also start R from within a running GRASS GIS\index{GRASS GIS} session [for more information please refer to @bivand_applied_2013].
+- You can also start R from within a running GRASS GIS\index{GRASS GIS} session [for more information, please refer to @bivand_applied_2013].
 - Refer to the excellent [GRASS GIS online help](https://grass.osgeo.org/grass-stable/manuals/) or `execGRASS("g.manual", flags = "i")` for more information on each available GRASS GIS geoalgorithm\index{geoalgorithm}.
 
 ## When to use what?
 
-To recommend a single R-GIS interface is hard since the usage depends on personal preferences, the tasks at hand and your familiarity with different GIS\index{GIS} software packages which in turn probably depends on your domain.
-As mentioned previously, SAGA\index{SAGA} is especially good at the fast processing of large (high-resolution) raster\index{raster} datasets, and frequently used by hydrologists, climatologists and soil scientists [@conrad_system_2015].
+To recommend a single R-GIS interface is hard since the usage depends on personal preferences, the tasks at hand, and your familiarity with different GIS\index{GIS} software packages, which in turn probably depends on your domain.
+As mentioned previously, SAGA\index{SAGA} is especially good at the fast processing of large (high-resolution) raster\index{raster} datasets, and is frequently used by hydrologists, climatologists, and soil scientists [@conrad_system_2015].
 GRASS GIS\index{GRASS GIS}, on the other hand, is the only GIS presented here supporting a topologically based spatial database, which is especially useful for network analyses, but also for simulation studies.
 QGIS\index{QGIS} is much more user-friendly compared to GRASS GIS and SAGA, especially for first-time GIS users, and probably the most popular open-source GIS.
 Therefore, **qgisprocess**\index{qgisprocess (package)} is an appropriate choice for most use cases.
 Its main advantages are:
 
 - Unified access to several GIS, and therefore the provision of >1000 geoalgorithms (Table \@ref(tab:gis-comp)), including duplicated functionality, e.g., you can perform overlay operations using QGIS\index{QGIS}, SAGA\index{SAGA}, or GRASS GIS\index{GRASS GIS} geoalgorithms
-- Automatic data format conversions (SAGA uses `.sdat` grid files and GRASS GIS uses its own database format but QGIS will handle the corresponding conversions)
+- Automatic data format conversions (SAGA uses `.sdat` grid files and GRASS GIS uses its own database format, but QGIS will handle the corresponding conversions)
 - Its automatic passing of geographic R objects to QGIS geoalgorithms\index{geoalgorithm} and back into R
 - Convenience functions to support named arguments and automatic default value retrieval (as inspired by **rgrass**\index{rgrass (package)})
 
 By all means, there are use cases when you certainly should use one of the other R-GIS bridges.
-Though QGIS is the only GIS providing a unified interface to several GIS\index{GIS} software packages, it only provides access to a subset of the corresponding third-party geoalgorithms (for more information please refer to @muenchow_rqgis:_2017).
+Though QGIS is the only GIS providing a unified interface to several GIS\index{GIS} software packages, it only provides access to a subset of the corresponding third-party geoalgorithms (for more information, please refer to @muenchow_rqgis:_2017).
 Therefore, to use the complete set of SAGA and GRASS GIS functions, stick with **Rsagacmd**\index{Rsagacmd (package)} and **rgrass**.
 In addition, if you would like to run simulations with the help of a geodatabase\index{spatial database} [@krug_clearing_2010], use **rgrass** directly since **qgisprocess** always starts a new GRASS GIS session for each call.
 Finally, if you need topologically correct data and/or spatial database management functionality such as multi-user access, we recommend using GRASS GIS.
 
-Please note that there are a number of further GIS software packages that have a scripting interface but for which there is no dedicated R package that accesses these: gvSig, OpenJump, and the Orfeo Toolbox.^[Please note that **link2GI** provides a partial integration with the Orfeo Toolbox\index{Orfeo Toolbox} and that you can also access the Orfeo Toolbox geoalgorithms via **qgisprocess**. Note also that TauDEM\index{TauDEM} can be accessed from with R with package **traudem**.]
+Please note that there are a number of further GIS software packages that have a scripting interface but for which there is no dedicated R package that accesses these: gvSIG, OpenJump, and the Orfeo Toolbox.^[Please note that **link2GI** provides a partial integration with the Orfeo Toolbox\index{Orfeo Toolbox} and that you can also access the Orfeo Toolbox geoalgorithms via **qgisprocess**. Note also that TauDEM\index{TauDEM} can be accessed from R with the **traudem** package.]
 
 ## Bridges to GDAL {#gdal}
 
 As discussed in Chapter \@ref(read-write), GDAL\index{GDAL} is a low-level library that supports many geographic data formats.
-GDAL is so effective that most GIS programs use GDAL\index{GDAL} in the background for importing and exporting geographic data, rather than re-inventing the wheel and using bespoke read-write code.
+GDAL is so effective that most GIS programs use GDAL\index{GDAL} in the background for importing and exporting geographic data, rather than reinventing the wheel and using bespoke read-write code. But GDAL\index{GDAL} offers more than data I/O. It has [geoprocessing tools](https://gdal.org/programs/index.html) for vector and raster data, functionality to create [tiles](https://gdal.org/programs/gdal2tiles.html#gdal2tiles) for serving raster data online, and rapid [rasterization](https://gdal.org/programs/gdal_rasterize.html#gdal-rasterize) of vector data. -Since GDAL is a command line tool, all its commands can be accessed from within R via the `system()` command. +Since GDAL is a command-line tool, all its commands can be accessed from within R via the `system()` command. The code chunk below demonstrates this functionality: `linkGDAL()` searches the computer for a working GDAL\index{GDAL} installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called (usually only needed under Windows). @@ -777,13 +777,13 @@ Other commonly used GDAL tools include: - `gdalinfo`: provides metadata of a raster dataset - `gdal_translate`: converts between different raster file formats - `ogr2ogr`: converts between different vector file formats -- `gdalwarp`: reprojects, transform, and clip raster datasets +- `gdalwarp`: reprojects, transforms, and clips raster datasets - `gdaltransform`: transforms coordinates Visit https://gdal.org/programs/ to see the complete list of GDAL tools and to read their help files. The 'link' to GDAL provided by **link2GI** could be used as a foundation for doing more advanced GDAL work from the R or system CLI. -TauDEM (https://hydrology.usu.edu/taudem/) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command line interface -- the above example shows how to access these libraries from the system command line via R. 
+TauDEM (https://hydrology.usu.edu/taudem/) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command-line interface -- the above example shows how to access these libraries from the system command line via R. This in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages. Before diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that `system()` calls may not be platform-independent (they may fail on some computers). @@ -792,13 +792,13 @@ On the other hand, **sf** and **terra** brings most of the power provided by GDA ## Bridges to spatial databases {#postgis} \index{spatial database} -Spatial database management systems (spatial DBMS) store spatial and non-spatial data in a structured way. +Spatial database management systems (spatial DBMSs) store spatial and non-spatial data in a structured way. They can organize large collections of data into related tables (entities) via unique identifiers (primary and foreign keys) and implicitly via space (think for instance of a spatial join). This is useful because geographic datasets tend to become big and messy quite quickly. Databases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology\index{topological relations} support. The most important open source spatial database\index{spatial database} is PostGIS\index{PostGIS} [@obe_postgis_2015].^[ -SQLite/SpatiaLite are certainly also important but implicitly we have already introduced this approach since GRASS GIS\index{GRASS GIS} is using SQLite in the background (see Section \@ref(grass)). +SQLite/SpatiaLite are certainly also important, but implicitly we have already introduced this approach since GRASS GIS\index{GRASS GIS} is using SQLite in the background (see Section \@ref(grass)). 
] R bridges to spatial DBMSs such as PostGIS\index{PostGIS} are important, allowing access to huge data stores without loading several gigabytes of geographic data into RAM, and likely crashing the R session. The remainder of this section shows how PostGIS can be called from R, based on "Hello real-world" from *PostGIS in Action, Second Edition* [@obe_postgis_2015].^[ @@ -885,7 +885,7 @@ In fact, function names of the **sf** package largely follow the PostGIS\index{P The prefix `st` stands for space/time. ] -The last query will find all Hardee's restaurants (`HDE`) within the 35 km buffer zone (Figure \@ref(fig:postgis)). +The last query will find all Hardee's restaurants (`HDE`) within the 35-km buffer zone (Figure \@ref(fig:postgis)). ``` r @@ -931,7 +931,7 @@ PostgreSQL/PostGIS is a formidable choice as an open-source spatial database. But the same is true for the lightweight SQLite/SpatiaLite database engine and GRASS GIS\index{GRASS GIS} which uses SQLite in the background (see Section \@ref(grass)). If your datasets are too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, it may be worth exploring large-scale geographic querying on distributed computing systems. -Such systems are outside the scope of this book but it worth mentioning that open source software providing this functionality exists. +Such systems are outside the scope of this book, but it is worth mentioning that open source software providing this functionality exists. Prominent projects in this space include [GeoMesa](http://www.geomesa.org/) and [Apache Sedona](https://sedona.apache.org/). The [**apache.sedona**](https://cran.r-project.org/package=apache.sedona) package provides an interface to the latter. 
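To make the pattern concrete, a hedged sketch of issuing a PostGIS query from R follows: the spatial filter runs inside the database, so only the query result, not the whole table, reaches the R session. The connection details and the `highways`/`geom` names are placeholders, not the book's actual database.

``` r
# Placeholder credentials and table names: substitute your own PostGIS setup
library(DBI)
library(sf)

conn = dbConnect(RPostgres::Postgres(), dbname = "mydb", host = "localhost",
                 user = "me", password = "secret")
# ST_DWithin() does the heavy lifting in the database; the 35000 m radius
# assumes 'geom' can be cast to geography (i.e., stored in lon/lat)
qry = "SELECT * FROM highways
       WHERE ST_DWithin(geom::geography,
                        ST_MakePoint(-84.9, 35.05)::geography, 35000)"
hw = st_read(conn, query = qry)
dbDisconnect(conn)
```

`st_read()` accepts a `DBI` connection as its data source, so the result arrives as an `sf` object ready for further analysis.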
@@ -979,7 +979,7 @@ However, keep in mind that the availability of COGs is a big plus while browsing For larger areas of interest, requested images are still relatively difficult to work with: they may use different map projections, may spatially overlap, and their spatial resolution often depends on the spectral band. The **gdalcubes** package\index{gdalcubes (package)} [@appel_gdalcubes_2019] can be used to abstract from individual images and to create and process image collections as four-dimensional data cubes\index{data cube}. -The code below shows a minimal example to create a lower resolution (250m) maximum NDVI composite from the Sentinel-2 images returned by the previous STAC-API search. +The code below shows a minimal example to create a lower resolution (250 m) maximum NDVI composite from the Sentinel-2 images returned by the previous STAC-API search. ``` r @@ -1010,7 +1010,7 @@ For more details, please refer to this [tutorial presented at OpenGeoHub summer The combination of STAC\index{STAC}, COGs\index{COG}, and data cubes\index{data cube} forms a cloud-native workflow to analyze (large) collections of satellite imagery in the cloud\index{cloud computing}. These tools already form a backbone, for example, of the **sits** package\index{sits (package)}, which allows land use and land cover classification of big Earth observation data in R. The package builds EO data cubes from image collections available in cloud services and performs land classification of data cubes using various machine and deep learning algorithms. -For more information about **sits** visit https://e-sensing.github.io/sitsbook/ or read the related article [@rs13132428]. +For more information about **sits**, visit https://e-sensing.github.io/sitsbook/ or read the related article [@rs13132428]. 
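As a hedged sketch of the first step in such a cloud-native workflow, an **rstac** query against a public STAC API might look as follows; the endpoint, collection name, and bounding box are illustrative, not necessarily the exact ones used in this chapter.

``` r
# Illustrative STAC endpoint and collection; adjust to the API you use
library(rstac)

items = stac("https://earth-search.aws.element84.com/v1") |>
  stac_search(collections = "sentinel-2-l2a",
              bbox = c(7.1, 51.8, 7.2, 52.0),  # xmin, ymin, xmax, ymax
              datetime = "2020-08-01/2020-10-31",
              limit = 100) |>
  post_request()
# 'items' describes matching scenes; their assets (e.g., COGs) can then be
# read lazily, for example by gdalcubes, without downloading whole images
```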
### openEO @@ -1021,8 +1021,8 @@ Implementations are available for eight different backends (see https://hub.open Since the functionality and data availability differ among the backends, the **openeo** R package [@lahn_openeo_2021] dynamically loads available processes and collections from the connected backend. Afterwards, users can load image collections, apply and chain processes, submit jobs, and explore and plot results. -The following code will connect to the [openEO platform backend](https://openeo.cloud/), request available datasets, processes, and output formats, define a process graph to compute a maximum NDVI image from Sentinel-2 data, and finally executes the graph after logging in to the backend. -The openEO\index{OpenEO} platform backend includes a free tier and registration is possible from existing institutional or internet platform accounts. +The following code will connect to the [openEO platform backend](https://openeo.cloud/), request available datasets, processes, and output formats, define a process graph to compute a maximum NDVI image from Sentinel-2 data, and finally execute the graph after logging in to the backend. +The openEO\index{OpenEO} platform backend includes a free tier, and registration is possible from existing institutional or internet platform accounts. ``` r @@ -1060,11 +1060,12 @@ compute_result(graph = result, output_file = tempfile(fileext = ".tif")) -E1. Compute global solar irradiation for an area of `system.file("raster/dem.tif", package = "spDataLarge")` for March 21 at 11:00 AM using the `r.sun` GRASS GIS through **qgisprocess**. +E1. Compute global solar irradiation for an area of `system.file("raster/dem.tif", package = "spDataLarge")` for March 21 at 11:00 am using the GRASS GIS `r.sun` module through **qgisprocess**. + E2. Compute catchment area\index{catchment area} and catchment slope of `system.file("raster/dem.tif", package = "spDataLarge")` using **Rsagacmd**. @@ -1076,6 +1077,7 @@ Visualize the results. + E4.
Attach `data(random_points, package = "spDataLarge")` and read `system.file("raster/dem.tif", package = "spDataLarge")` into R. Select a point randomly from `random_points` and find all `dem` pixels that can be seen from this point (hint: viewshed\index{viewshed} can be calculated using GRASS GIS). Visualize your result. @@ -1085,8 +1087,8 @@ Additionally, give `mapview` a try. -E5. Use `gdalinfo` via a system call for a raster\index{raster} file stored on disk of your choice. -What kind of information you can find there? +E5. Use `gdalinfo` via a system call for a raster\index{raster} file stored on a disk of your choice. +What kind of information can you find there? @@ -1100,6 +1102,6 @@ E7. Query all Californian highways from the PostgreSQL/PostGIS\index{PostGIS} da -E8. The `ndvi.tif` raster (`system.file("raster/ndvi.tif", package = "spDataLarge")`) contains NDVI calculated for the Mongón study area based on Landsat data from September 22nd, 2000. +E8. The `ndvi.tif` raster (`system.file("raster/ndvi.tif", package = "spDataLarge")`) contains NDVI calculated for the Mongón study area based on Landsat data from September 22, 2000. Use **rstac**, **gdalcubes**, and **terra** to download Sentinel-2 images for the same area from 2020-08-01 to 2020-10-31, calculate its NDVI, and then compare it with the results from `ndvi.tif`. diff --git a/13-transport.md b/13-transport.md index 125590cef..955f82330 100644 --- a/13-transport.md +++ b/13-transport.md @@ -570,7 +570,7 @@ routes_short_scenario = routes_short |> mutate(bicycle = bicycle + car_driver * uptake, car_driver = car_driver * (1 - uptake)) sum(routes_short_scenario$bicycle) - sum(routes_short$bicycle) -#> [1] 692 +#> [1] 598 ``` Having created a scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated modeled cycling activity will take place. 
@@ -584,11 +584,6 @@ route_network_scenario = overline(routes_short_scenario, attrib = "bicycle") The outputs of the two preceding code chunks are summarized in Figure \@ref(fig:rnetvis) below. - -``` -#> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling. -``` -
Illustration of the percentage of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).

(\#fig:rnetvis)Illustration of the percentage of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).

@@ -648,11 +643,6 @@ ways_centrality = ways_sfn |> mutate(betweenness = tidygraph::centrality_edge_betweenness(lengths)) ``` - -``` -#> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling. -``` -
Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.

(\#fig:wayssln)Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.

diff --git a/conclusion.html b/conclusion.html index 8204aa34c..dd352048d 100644 --- a/conclusion.html +++ b/conclusion.html @@ -132,7 +132,7 @@

A feature of R, and open source software in general, is that there are often multiple ways to achieve the same result. The code chunk below illustrates this by using three functions, covered in Chapters 3 and 5, to combine the 16 regions of New Zealand into a single geometry:

-
+
 library(spData)
 nz_u1 = sf::st_union(nz)
 nz_u2 = aggregate(nz["Population"], list(rep(1, nrow(nz))), sum)
@@ -154,7 +154,7 @@ 

The same applies to all packages showcased in this book, although it can be helpful (when not distracting) to be aware of alternatives and to be able to justify your choice of software.

A common choice, for which there is no simple answer, is between tidyverse and base R for geocomputation. The following code chunk, for example, shows tidyverse and base R ways to extract the Name column from the nz object, as described in Chapter 3:

-
+
 library(dplyr)                          # attach a tidyverse package
 nz_name1 = nz["Name"]                   # base R approach
 nz_name2 = nz |>                        # tidyverse approach
@@ -280,13 +280,13 @@ 

You could simply ask how to do this in one of the places outlined in the previous section. However, it is likely that you will get a better response if you provide a reproducible example of what you have tried so far. The following code creates a map of the world with blue sea and green land, but the land is not filled in:

-
+
 library(sf)
 library(spData)
 plot(st_geometry(world), col = "green")

If you post this code in a forum, it is likely that you will get a more specific and useful response. For example, someone might respond with the following code, which demonstrably solves the problem, as illustrated in Figure 16.1:

-
+
 library(sf)
 library(spData)
 # use the bg argument to fill in the land
diff --git a/eco.html b/eco.html
index 2b6e0f002..5d72c5b71 100644
--- a/eco.html
+++ b/eco.html
@@ -114,7 +114,7 @@ 

Prerequisites

This chapter assumes you have a strong grasp of geographic data analysis and processing, covered in Chapters 2 to 5. The chapter makes use of bridges to GIS software, and spatial cross-validation, covered in Chapters 10 and 12 respectively.

The chapter uses the following packages:

-
+
 library(sf)
 library(terra)
 library(dplyr)
@@ -167,13 +167,13 @@ 

15.2 Data and data preparation

All the data needed for the subsequent analyses is available via the spDataLarge package.

-
+
 data("study_area", "random_points", "comm", package = "spDataLarge")
 dem = rast(system.file("raster/dem.tif", package = "spDataLarge"))
 ndvi = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))

study_area is a polygon representing the outline of the study area, and random_points is an sf object containing the 100 randomly chosen sites. comm is a community matrix in wide data format (Wickham 2014) where the rows represent the visited sites in the field and the columns the observed species.100

-
+
 # sites 35 to 40 and corresponding occurrences of the first five species in the
 # community matrix
 comm[35:40, 1:5]
@@ -201,7 +201,7 @@ 

To compute catchment area and catchment slope, we can make use of the sagang:sagawetnessindex function.101 qgis_show_help() returns all function parameters and default values of a specific geoalgorithm. Here, we present only a selection of the complete output.

-
+
 # if not already done, enable the saga next generation plugin
 qgisprocess::qgis_enable_plugins("processing_saga_nextgen")
 # show help
@@ -241,7 +241,7 @@ 

Remember that we can use a path to a file on disk or a SpatRaster living in R’s global environment to specify the input raster DEM (see Section 10.2). Specifying 1 as the SLOPE_TYPE makes sure that the algorithm will return the catchment slope. The resulting rasters are saved to temporary files with an .sdat extension, which is the native SAGA raster format.

-
+
 # environmental predictors: catchment slope and catchment area
 ep = qgisprocess::qgis_run_algorithm(
   alg = "sagang:sagawetnessindex",
@@ -253,7 +253,7 @@ 

This returns a list named ep containing the paths to the computed output rasters. Let’s read in catchment area as well as catchment slope into a multilayer SpatRaster object (see Section 2.3.4). Additionally, we will add two more raster objects to it, namely dem and ndvi.

-
+
 # read in catchment area and catchment slope
 ep = ep[c("AREA", "SLOPE")] |>
   unlist() |>
@@ -263,13 +263,13 @@ 

ep = c(dem, ndvi, ep) # add dem and ndvi to the multilayer SpatRaster object

Additionally, the catchment area values are highly skewed to the right (hist(ep$carea)). A log10-transformation makes the distribution more normal.

-
+
 ep$carea = log10(ep$carea)

As a convenience to the reader, we have added ep to spDataLarge:

-
+
 ep = rast(system.file("raster/ep.tif", package = "spDataLarge"))

Finally, we can extract the terrain attributes to our field observations (see also Section 6.3).

-
+
 # terra::extract automatically adds an ID column that is unnecessary for our purposes
 ep_rp = terra::extract(ep, random_points, ID = FALSE)
 random_points = cbind(random_points, ep_rp)
@@ -300,7 +300,7 @@

decostand() converts numerical observations into presences and absences with 1 indicating the occurrence of a species and 0 the absence of a species. Ordination techniques such as NMDS require at least one observation per site. Hence, we need to dismiss all sites in which no species were found.

-
+
 # presence-absence matrix
 pa = vegan::decostand(comm, "pa")  # 100 rows (sites), 69 columns (species)
 # keep only sites in which at least one species was found
@@ -310,7 +310,7 @@ 

One way of choosing <code>k</code> is to try <code>k</code> values between 1 and 6 and then using the result which yields the best stress value <span class="citation">(<a href="references.html#ref-mccune_analysis_2002">McCune, Grace, and Urban 2002</a>)</span>.</p>'>102 NMDS is an iterative procedure trying to make the ordinated space more similar to the input matrix in each step. To make sure that the algorithm converges, we set the number of steps to 500 using the try parameter.

-
+
 set.seed(25072018)
 nmds = vegan::metaMDS(comm = pa, k = 4, try = 500)
 nmds$stress
@@ -329,7 +329,7 @@ 

However, we already know that humidity represents the main gradient in the study area (Muenchow, Bräuning, et al. 2013; Muenchow, Schratz, and Brenning 2017). Since humidity is highly correlated with elevation, we rotate the NMDS axes in accordance with elevation (see also ?MDSrotate for more details on rotating NMDS axes). Plotting the result reveals that the first axis is, as intended, clearly associated with altitude (Figure 15.3).

-
+
 elev = dplyr::filter(random_points, id %in% rownames(pa)) |> 
   dplyr::pull(dem)
 # rotating NMDS in accordance with altitude (proxy for humidity)
@@ -358,7 +358,7 @@ 

We refer the reader to James et al. (2013) for a more detailed description of random forests and related techniques.

To introduce decision trees by example, we first construct a response-predictor matrix by joining the rotated NMDS scores to the field observations (random_points). We will also use the resulting data frame for the mlr3 modeling later on.

-
+
 # construct response-predictor matrix
 # id- and response variable
 rp = data.frame(id = as.numeric(rownames(sc)), sc = sc[, 1])
@@ -366,7 +366,7 @@ 

rp = inner_join(random_points, rp, by = "id")

Decision trees split the predictor space into a number of regions. To illustrate this, we apply a decision tree to our data using the scores of the first NMDS axis as the response (sc) and altitude (dem) as the only predictor.

-
+
 tree_mo = tree::tree(sc ~ dem, data = rp)
 plot(tree_mo)
 text(tree_mo, pretty = 0)
@@ -421,7 +421,7 @@

Having already constructed the input variables (rp), we are all set for specifying the mlr3 building blocks (task, learner, and resampling). For specifying a spatial task, we use again the mlr3spatiotempcv package (Schratz et al. 2021 & Section 12.5), and since our response (sc) is numeric, we use a regression task.

-
+
 # create task
 task = mlr3spatiotempcv::as_task_regr_st(
   select(rp, -id, -spri),
@@ -431,7 +431,7 @@ 

Using an sf object as the backend automatically provides the geometry information needed for the spatial partitioning later on. Additionally, we got rid of the columns id and spri since these variables should not be used as predictors in the modeling. Next, we go on to construct a random forest learner from the ranger package (Wright and Ziegler 2017).

-
+
 lrn_rf = lrn("regr.ranger", predict_type = "response")

As opposed to, for example, support vector machines (see Section 12.5.2), random forests often already show good performance when used with the default values of their hyperparameters (which may be one reason for their popularity). Still, tuning often moderately improves model results, and thus is worth the effort (Probst, Wright, and Boulesteix 2018). @@ -444,7 +444,7 @@

Naturally, the lower the min.node.size, the larger the trees and the computing time.

Hyperparameter combinations will be selected randomly but should fall inside specific tuning limits (created with paradox::ps()). mtry should range between 1 and the number of predictors (4), sample.fraction should range between 0.2 and 0.9, and min.node.size should range between 1 and 10 (Probst, Wright, and Boulesteix 2018).

-
+
 # specifying the search space
 search_space = paradox::ps(
   mtry = paradox::p_int(lower = 1, upper = ncol(task$data()) - 1),
@@ -456,7 +456,7 @@ 

Specifically, we will use a five-fold spatial partitioning with only one repetition (rsmp()). In each of these spatial partitions, we run 50 models (trm()) while using randomly selected hyperparameter configurations (tnr()) within predefined limits (search_space) to find the optimal hyperparameter combination (see also Section 12.5.2 and https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-autotuner, Becker et al. 2022). The performance measure is the root mean squared error (RMSE).

-
+
 autotuner_rf = mlr3tuning::auto_tuner(
   learner = lrn_rf,
   resampling = mlr3::rsmp("spcv_coords", folds = 5), # spatial partitioning
@@ -466,11 +466,11 @@ 

tuner = mlr3tuning::tnr("random_search") # specify random search )

Calling the train() method of the AutoTuner object finally runs the hyperparameter tuning, and will find the optimal hyperparameter combination for the specified parameters.

-
+
 # hyperparameter tuning
 set.seed(24092024)
 autotuner_rf$train(task)
-
+
 autotuner_rf$tuning_result
 #>     mtry sample.fraction min.node.size learner_param_vals  x_domain regr.rmse
 #>    <int>           <num>         <int>             <list>    <list>     <num>
@@ -482,7 +482,7 @@ 

The tuned hyperparameters can now be used for the prediction. To do so, we only need to run the predict method of our fitted AutoTuner object.

-
+
 # predicting using the best hyperparameter combination
 autotuner_rf$predict(task)
 #> <PredictionRegr> for 84 observations:
@@ -496,7 +496,7 @@ 

#> 84 0.808 0.807

The predict method will apply the model to all observations used in the modeling. Given a multilayer SpatRaster containing rasters named as the predictors used in the modeling, terra::predict() will also make spatial distribution maps, i.e., predict to new data.

-
+
 pred = terra::predict(ep, model = autotuner_rf, fun = predict)
@@ -505,7 +505,7 @@

In case terra::predict() does not support a model algorithm, you can still make the predictions manually.

-
+
 newdata = as.data.frame(as.matrix(ep))
 colSums(is.na(newdata))  # 0 NAs
 # but assuming there were 0s results in a more generic approach
diff --git a/figures/circle-intersection-1.png b/figures/circle-intersection-1.png
index ead87a33a..0202ba52d 100644
Binary files a/figures/circle-intersection-1.png and b/figures/circle-intersection-1.png differ
diff --git a/figures/cycleways-1.png b/figures/cycleways-1.png
index 809dcde9c..7d7f8ab17 100644
Binary files a/figures/cycleways-1.png and b/figures/cycleways-1.png differ
diff --git a/figures/points-1.png b/figures/points-1.png
index b9123555f..de114c36b 100644
Binary files a/figures/points-1.png and b/figures/points-1.png differ
diff --git a/figures/rnetvis-1.png b/figures/rnetvis-1.png
index d2c0d3e57..c92de30ee 100644
Binary files a/figures/rnetvis-1.png and b/figures/rnetvis-1.png differ
diff --git a/figures/rnetvis-2.png b/figures/rnetvis-2.png
index 3e0ef6096..2ad37685f 100644
Binary files a/figures/rnetvis-2.png and b/figures/rnetvis-2.png differ
diff --git a/figures/routes-1.png b/figures/routes-1.png
index aad5bf1ed..624bba49c 100644
Binary files a/figures/routes-1.png and b/figures/routes-1.png differ
diff --git a/figures/wayssln-1.png b/figures/wayssln-1.png
index 43995e5d4..a2a5a247a 100644
Binary files a/figures/wayssln-1.png and b/figures/wayssln-1.png differ
diff --git a/gis.html b/gis.html
index 3e813db95..c13d59c32 100644
--- a/gis.html
+++ b/gis.html
@@ -130,8 +130,8 @@ 

A defining feature of interpreted languages with an interactive console — technically a read-eval-print loop (REPL) — such as R is the way you interact with them: rather than relying on pointing and clicking on different parts of a screen, you type commands into the console and execute them with the Enter key. A common and effective workflow when using interactive development environments such as RStudio or VS Code is to type code into source files in a source editor and control interactive execution of the code with a shortcut such as Ctrl+Enter.

-

CLIs are not unique to R: most early computing environments relied on a command line ‘shell’ and it was only after the invention and widespread adoption of the computer mouse in the 1990s that graphical user interfaces (GUIs) became common. -GRASS GIS the longest-standing continuously-developed open source GIS software, for example, relied on its command line interface before it gained a GUI (Landa 2008). +

Command line interfaces (CLIs) are not unique to R: most early computing environments relied on a command line ‘shell’ and it was only after the invention and widespread adoption of the computer mouse in the 1990s that graphical user interfaces (GUIs) became common. +GRASS GIS the longest-standing continuously developed open source GIS software, for example, relied on its CLI before it gained a GUI (Landa 2008). Most popular GIS software projects are GUI-driven. You can interact with QGIS, SAGA, GRASS GIS and gvSIG from system terminals and embedded CLIs, but their design encourages most people to interact with them by ‘pointing and clicking’. An unintended consequence of this is that most GIS users miss out on the advantages of CLI-driven and scriptable approaches. @@ -139,11 +139,11 @@

With the advent of ‘modern’ GIS software, most people want to point and click their way through life. That’s good, but there is a tremendous amount of flexibility and power waiting for you with the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI.

-

The ‘CLI vs GUI’ debate does not have to be adverserial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user’s skillset. +

The ‘CLI vs GUI’ debate does not have to be adversarial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user’s skillset. GRASS GIS is a good example of GIS software that is primarily based on a CLI but which also has a prominent GUI. Likewise, while R is focused on its CLI, IDEs such as RStudio provide a GUI for improving accessibility. -Software cannot be neatly categorised into CLI-based or GUI-based. -However, interactive command line interfaces have several important advantages in terms of:

+Software cannot be neatly categorized into CLI or GUI-based. +However, interactive command-line interfaces have several important advantages in terms of:

  • Automating repetitive tasks
  • Enabling transparency and reproducibility
  • @@ -168,10 +168,10 @@

    IDEs such as RStudio and VS Code provide code auto-completion and other features to improve the user experience when developing code.

R is a natural choice for people wanting to build bridges between reproducible data analysis workflows and GIS because it originated as an interface language. -A key feature of R (and its predecessor S) is that it provides access to statistical algorithms in other languages (particularly FORTRAN and C), but from a powerful high level functional language with an intuitive REPL environment, which C and FORTRAN lacked (Chambers 2016). +A key feature of R (and its predecessor S) is that it provides access to statistical algorithms in other languages (particularly FORTRAN and C), but from a powerful high-level functional language with an intuitive REPL environment, which C and FORTRAN lacked (Chambers 2016). R continues this tradition with interfaces to numerous languages, notably C++.

Although R was not designed as a command line GIS, its ability to interface with dedicated GISs gives it astonishing geospatial capabilities. -With GIS bridges, R can replicate more diverse workflows, with the additional reproducibility, scalability and productity benefits of controlling them from a programming environment and a consistent CLI. +With GIS bridges, R can replicate more diverse workflows, with the additional reproducibility, scalability and productivity benefits of controlling them from a programming environment and a consistent CLI. Furthermore, R outperforms GISs in some areas of geocomputation, including interactive/animated map-making (see Chapter 9) and spatial statistical modeling (see Chapter 12).

This chapter focuses on ‘bridges’ to three mature open source GIS products, summarized in Table 10.1:

    @@ -185,8 +185,8 @@

    TABLE 10.1: Comparison between three open-source GIS. Hybrid refers to the support of vector and raster operations. GIS -First release -No. functions +First Release +No. Functions Support @@ -218,7 +218,7 @@

QGIS is the most popular open-source GIS (Table 10.1; Graser and Olaya (2015)). QGIS provides a unified interface to QGIS’s native geoalgorithms, GDAL, and — when they are installed — other providers such as GRASS GIS and SAGA (Graser and Olaya 2015). -Since version 3.14 (released in summer 2020), QGIS ships with the qgis_process command line utility for accessing a bounty of functionality for geocomputation. +Since version 3.14 (released in summer 2020), QGIS ships with the qgis_process command-line utility for accessing a bounty of functionality for geocomputation. qgis_process provides access to 300+ geoalgorithms in the standard QGIS installation and 1,000+ via plugins to external providers such as GRASS GIS and SAGA.

The qgisprocess package provides access to qgis_process from R.
The package requires QGIS — and any other relevant plugins such as GRASS GIS and SAGA, used in this chapter — to be installed and available to the system.

     qgis_enable_plugins(c("grassprovider", "processing_saga_nextgen"), 
                         quiet = TRUE)

    Please note that aside from installing SAGA on your system, you also need to install the QGIS Python plugin Processing Saga NextGen. You can do so from within QGIS with the Plugin Manager or programmatically with the help of the Python package qgis-plugin-manager (at least on Linux).

    qgis_providers() lists the name of the software and the corresponding count of available geoalgorithms.


    aggzone_wgs = st_transform(aggregating_zones, "EPSG:4326")

Two areal units: incongruent (black lines) and aggregating zones (red borders).

FIGURE 10.1: Two areal units: incongruent (black lines) and aggregating zones (red borders).

The first step is to find an algorithm that can merge two vector objects.
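With qgisprocess loaded, one way to do this is its search helper, which matches against algorithm names and descriptions (a sketch; the exact hits depend on your QGIS installation and enabled plugins):

```r
library(qgisprocess)
# Search the algorithm catalog for anything mentioning "union"
qgis_search_algorithms(algorithm = "union")
```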

One of the algorithms on the above list, "native:union", sounds promising.
The next step is to find out what this algorithm does and how we can use it.
This is the role of qgis_show_help(), which returns a short summary of what the algorithm does, its arguments, and outputs.59
This makes its output rather long.
The following command returns a data frame with each row representing an argument required by "native:union" and columns with the name, description, type, default value, available values, and acceptable values associated with each:

     alg = "native:union"
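The command described above is, presumably, qgisprocess's argument-spec helper (the same function is used again later in this chapter):

```r
# Returns one row per argument of the "native:union" algorithm
qgis_get_argument_specs(alg)
```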

    This can be very convenient, but we recommend providing the path to your spatial data on disk when you only read it in to submit it to a qgisprocess algorithm: the first thing qgisprocess does when executing a geoalgorithm is to export the spatial data living in your R session back to disk in a format known to QGIS such as .gpkg or .tif files. This can increase algorithm runtimes.

The main function of qgisprocess is qgis_run_algorithm(), which sends inputs to QGIS and returns the outputs.
It accepts the algorithm name and a set of named arguments shown in the help list, and it performs expected calculations.
In our case, three arguments seem important: INPUT, OVERLAY, and OUTPUT.
The first one, INPUT, is our main vector object incongr_wgs, while the second one, OVERLAY, is aggzone_wgs.
The last argument, OUTPUT, is an output file name, which qgisprocess will automatically choose and create in tempdir() if none is provided.
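Putting this together, a call might look as follows (a sketch using the objects defined above; OUTPUT is omitted, so a temporary file is created automatically):

```r
union = qgis_run_algorithm(
  "native:union",
  INPUT = incongr_wgs,
  OVERLAY = aggzone_wgs
)
```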


     union_sf = st_as_sf(union)

Note that the QGIS union operation merges the two input layers into one layer by using the intersection and the symmetrical difference of the two input layers (which, by the way, is also the default when doing a union operation in GRASS GIS and SAGA).
This is not the same as st_union(incongr_wgs, aggzone_wgs) (see the Exercises)!

The result, union_sf, is a multipolygon with a larger number of features than the two input objects.
Notice, however, that many of these polygons are small and do not represent real areas but are rather a result of our two datasets having a different level of detail.
These artifacts of error are called sliver polygons (see red-colored polygons in the left panel of Figure 10.2).

    The GRASS GIS provider in QGIS was called grass7 until QGIS version 3.34. Thus, if you have an older QGIS version, you must prefix the algorithms with grass7 instead of grass.


    Similar to the previous step, we should start by looking at this algorithm’s help.

     qgis_show_help("grass:v.clean")

We have omitted the output here, because the help text is quite long and contains a lot of arguments.60
This is because v.clean is a multi-tool – it can clean different types of geometries and solve different types of topological problems.
For this example, let's focus on just a few arguments; however, we encourage you to visit this algorithm's documentation to learn more about v.clean's capabilities.

     qgis_get_argument_specs("grass:v.clean") |>
       select(name, description) |>

    tool = "rmarea", threshold = 25000
)
clean_sf = st_as_sf(clean)


    The result, the right panel of Figure 10.2, looks as expected – sliver polygons are now removed.

    Sliver polygons colored in red (left panel). Cleaned polygons (right panel).


Digital elevation models (DEMs) contain elevation information for each raster cell.
They are used for many purposes, including satellite navigation, water flow models, surface analysis, or visualization.
Here, we are interested in deriving new information from a DEM raster that could be used as predictors for statistical learning.
Various terrain parameters can be helpful, for example, for the prediction of landslides (see Chapter 12).

For this section, we will use dem.tif – a digital elevation model of the Mongón study area (downloaded from the Land Process Distributed Active Archive Center, see also ?dem.tif).
It has a resolution of about 30 x 30 meters and uses a projected CRS.

     library(qgisprocess)
     library(terra)
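The dem object used in the following calls can be created from the dem.tif file mentioned above (assuming the copy shipped with spDataLarge):

```r
dem = rast(system.file("raster/dem.tif", package = "spDataLarge"))
```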

Therefore, we only have to specify one argument – the input DEM.
Of course, when applying this algorithm you should make sure that the parameter values correspond to your study aim.63

Before running the SAGA algorithm from within QGIS, we change the default raster output format from .tif to SAGA's native raster format .sdat.
Hence, all output rasters that we do not specify ourselves will from now on be written to the .sdat format.
Depending on the software versions (SAGA, GDAL) you are using, this might not be necessary, but often enough this will save you trouble when trying to read in output rasters created with SAGA.

     options(qgisprocess.tmp_raster_ext = ".sdat")
 dem_wetness = qgis_run_algorithm("sagang:sagawetnessindex",
                                  DEM = dem)

 dem_wetness_twi = qgis_as_terra(dem_wetness$TWI)
 # plot(dem_wetness_twi)
 options(qgisprocess.tmp_raster_ext = ".tif")


    You can see the TWI map in the left panel of Figure 10.3. The topographic wetness index is unitless: its low values represent areas that will not accumulate water, while higher values show areas that will accumulate water at increasing levels.


    Information from digital elevation models can also be categorized, for example, to geomorphons – the geomorphological phenotypes consisting of ten classes that represent terrain forms, such as slopes, ridges, or valleys (Jasiewicz and Stepinski 2013). These phenotypes are used in many studies, including landslide susceptibility, ecosystem services, human mobility, and digital soil mapping.

The original implementation of the geomorphons algorithm was created in GRASS GIS, and we can find it in the qgisprocess list as "grass:r.geomorphon":


 #> [1] "grass:r.geomorphon" "sagang:geomorphons"
 qgis_show_help("grass:r.geomorphon") # output not shown


Calculation of geomorphons requires an input DEM (elevation) and can be customized with a set of optional arguments.
These include search – a length for which the line-of-sight is calculated – and -m – a flag specifying that the search value will be provided in meters (and not as a number of cells).
More information about additional arguments can be found in the original paper and the GRASS GIS documentation.


    elevation = dem, `-m` = TRUE, search = 120
)


    Our output, dem_geomorph$forms, contains a raster file with ten categories – each representing a terrain form. We can read it into R with qgis_as_terra(), and then visualize it (Figure 10.3, right panel) or use it in our subsequent calculations.

     dem_geomorph_terra = qgis_as_terra(dem_geomorph$forms)

    10.3 SAGA


    The System for Automated Geoscientific Analyses (SAGA; Table 10.1) provides the possibility to execute SAGA modules via the command-line interface (saga_cmd.exe under Windows and just saga_cmd under Linux) (see the SAGA wiki on modules). In addition, there is a Python interface (SAGA Python API). Rsagacmd uses the former to run SAGA from within R.


We will use Rsagacmd in this section to delineate areas with similar values of the normalized difference vegetation index (NDVI) of the Mongón study area in Peru from September 2000 (Figure 10.4, left panel), by using a seeded region growing algorithm from SAGA.64

     ndvi = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))

To start using Rsagacmd, we need to run the saga_gis() function.
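A minimal setup might look like this (the backend choices are assumptions; saga_gis() builds a dynamic interface to the SAGA tools found on the system):

```r
library(Rsagacmd)
# Return rasters as terra objects and vectors as sf objects
saga = saga_gis(raster_backend = "terra", vector_backend = "sf")
```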

    Our output is a list of three objects: variance – a raster map of local variance, seed_grid – a raster map with the generated seeds, and seed_points – a spatial vector object with the generated seeds.

The second SAGA tool we use is seeded_region_growing.67
The seeded_region_growing tool requires two inputs: our seed_grid calculated in the previous step and the ndvi raster object.
Additionally, we can specify several parameters, such as normalize to standardize the input features, neighbour (4- or 8-neighborhood), and method.
The last parameter can be set to either 0 or 1 (region growing is based on raster cells’ values and their positions, or just the values).
For a more detailed description of the method, see Böhner, Selige, and Ringeler (2006).

    Here, we will only change method to 1, meaning that our output regions will be created only based on the similarity of their NDVI values.
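A sketch of such a call, assuming the SAGA connection from saga_gis() is stored in saga and the seed-generation output in seeds (both object names are hypothetical; the argument names follow the SAGA tool's parameters):

```r
ndvi_srg = saga$imagery_segmentation$seeded_region_growing(
  seeds = seeds$seed_grid,  # seed raster from the previous step
  features = ndvi,          # the NDVI raster defined earlier
  method = 1                # grow regions based on values only
)
```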


The resulting polygons (segments) represent areas with similar values.
They can also be further aggregated into larger polygons using various techniques, such as clustering (e.g., k-means), regionalization (e.g., SKATER) or supervised classification methods.
You can try to do it in the Exercises.

R also has other tools to achieve the goal of creating polygons with similar values (so-called segments).
These include the SegOptim package (Gonçalves et al. 2019), which allows running several image segmentation algorithms, and the supercells package (Nowosad and Stepinski 2022), which implements the SLIC superpixels algorithm for geospatial data.


The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS GIS) (Table 10.1; Neteler and Mitasova (2008)) from 1982 to 1995.
Academia has continued this work since 1997.
Similar to SAGA, GRASS GIS focused on raster processing in the beginning, only later (since GRASS GIS 6.0) adding advanced vector functionality (Bivand, Pebesma, and Gómez-Rubio 2013).

    GRASS GIS stores the input data in an internal database. With regard to vector data, GRASS GIS is by default a topological GIS, i.e., it only stores the geometry of adjacent features once. SQLite is the default database driver for vector attribute management, and attributes are linked to the geometry, i.e., to the GRASS GIS database, via keys (GRASS GIS vector management).


To quickly use GRASS GIS from within R, we will use the link2GI package; however, one can also set up the GRASS GIS database step-by-step.
See GRASS within R for how to do so.
Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS GIS for the first time, but by running through the code line-by-line and by examining the intermediate results, the reasoning behind it should become clearer.


Here, we introduce rgrass with one of the most interesting problems in GIScience: the traveling salesman problem.
Suppose a traveling salesman would like to visit 24 customers.
Additionally, the salesman would like to start and finish the journey at home, which makes a total of 25 locations while covering the shortest distance possible.
There is a single best solution to this problem; however, checking all of the possible solutions is (mostly) impossible for modern computers (Longley 2015).
In our case, the number of possible solutions corresponds to (25 - 1)! / 2, i.e., the factorial of 24 divided by 2 (since we do not differentiate between forward and backward direction).
Even if one iteration could be done in a nanosecond, this would still correspond to 9,837,145 years.
Luckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceivable amount of time.
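The arithmetic can be checked directly in R (using 365-day years):

```r
n_solutions = factorial(24) / 2            # (25 - 1)! / 2 possible tours
n_solutions * 1e-9 / (60 * 60 * 24 * 365)  # nanoseconds -> years
#> about 9.8 million years
```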

 data("cycle_hire", package = "spData")
 points = cycle_hire[1:25, ]

Aside from the cycle hire points data, we need a street network for this area.
We can download it from OpenStreetMap with the help of the osmdata package (see also Section 8.5).
To do this, we constrain the query of the street network (in OSM language called “highway”) to the bounding box of points, and attach the corresponding data as an sf-object.
osmdata_sf() returns a list with several spatial objects (points, lines, polygons, etc.), but here, we only keep the line objects with their related ids.68


    write_VECT(terra::vect(points[, 1]), vname = "points")

    The rgrass package expects its inputs and gives its outputs as terra objects. Therefore, we need to convert our sf spatial vectors to terra’s SpatVectors using the vect() function to be able to use write_VECT().69

69: See the Coercion between object formats vignette (https://CRAN.R-project.org/package=rgrass/vignettes/coerce.html).

Now, both datasets exist in the GRASS GIS database.
To perform our network analysis, we need a topologically clean street network.
GRASS GIS's "v.clean" takes care of the removal of duplicates, small angles and dangles, among others.

    )
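For reference, a call to "v.clean" via rgrass typically follows this pattern (the tool choice and layer names here are illustrative assumptions, not the exact call truncated above):

```r
execGRASS(
  cmd = "v.clean", flags = "overwrite",
  input = "streets", output = "streets_clean", tool = "break"
)
```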

To learn about the possible arguments and flags of the GRASS GIS modules, you can use the help flag.
For example, try execGRASS("g.region", flags = "help").

It is likely that a few of our cycling station points will not lie exactly on a street segment.

Prior to importing data into R, you might want to perform some (spatial) subsetting.
Use "v.select" and "v.extract" for vector data.
"db.select" lets you select subsets of the attribute table of a vector layer without returning the corresponding geometry.

  • You can also start R from within a running GRASS GIS session (for more information, please refer to Bivand, Pebesma, and Gómez-Rubio 2013).
  • Refer to the excellent GRASS GIS online help or execGRASS("g.manual", flags = "i") for more information on each available GRASS GIS geoalgorithm.

    10.5 When to use what?


To recommend a single R-GIS interface is hard since the usage depends on personal preferences, the tasks at hand, and your familiarity with different GIS software packages, which in turn probably depends on your domain.
As mentioned previously, SAGA is especially good at the fast processing of large (high-resolution) raster datasets and is frequently used by hydrologists, climatologists and soil scientists (Conrad et al. 2015).
GRASS GIS, on the other hand, is the only GIS presented here supporting a topologically based spatial database, which is especially useful for network analyses but also simulation studies.
QGIS is much more user-friendly compared to GRASS GIS and SAGA, especially for first-time GIS users, and probably the most popular open-source GIS.
Therefore, qgisprocess is an appropriate choice for most use cases.
Its main advantages are:

  • Unified access to several GIS, and therefore the provision of >1000 geoalgorithms (Table 10.1), including duplicated functionality, e.g., you can perform overlay operations using QGIS, SAGA or GRASS GIS geoalgorithms
  • Automatic data format conversions (SAGA uses .sdat grid files and GRASS GIS uses its own database format, but QGIS will handle the corresponding conversions)
  • Its automatic passing of geographic R objects to QGIS geoalgorithms and back into R
  • Convenience functions to support named arguments and automatic default value retrieval (as inspired by rgrass)

By all means, there are use cases when you certainly should use one of the other R-GIS bridges.
Though QGIS is the only GIS providing a unified interface to several GIS software packages, it only provides access to a subset of the corresponding third-party geoalgorithms (for more information, please refer to Muenchow, Schratz, and Brenning (2017)).
Therefore, to use the complete set of SAGA and GRASS GIS functions, stick with Rsagacmd and rgrass.
In addition, if you would like to run simulations with the help of a geodatabase (Krug, Roura-Pascual, and Richardson 2010), use rgrass directly since qgisprocess always starts a new GRASS GIS session for each call.
Finally, if you need topologically correct data and/or spatial database management functionality such as multi-user access, we recommend the usage of GRASS GIS.


    Please note that there are a number of further GIS software packages that have a scripting interface but for which there is no dedicated R package that accesses these: gvSig, OpenJump, and the Orfeo Toolbox.70

    10.6 Bridges to GDAL

As discussed in Chapter 8, GDAL is a low-level library that supports many geographic data formats.
GDAL is so effective that most GIS programs use GDAL in the background for importing and exporting geographic data, rather than reinventing the wheel and using bespoke read-write code.
But GDAL offers more than data I/O.
It has geoprocessing tools for vector and raster data, functionality to create tiles for serving raster data online, and rapid rasterization of vector data.
Since GDAL is a command-line tool, all its commands can be accessed from within R via the system() command.

    The code chunk below demonstrates this functionality: linkGDAL() searches the computer for a working GDAL installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called (usually only needed under Windows).
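A sketch of such a call, assuming link2GI is installed and using GDAL's ogrinfo program on a sample file (the file choice is an assumption):

```r
link2GI::linkGDAL()
# Inspect a vector file's metadata: read-only (-ro), summary only (-so),
# all layers (-al)
our_filepath = system.file("shapes/world.gpkg", package = "spData")
system(paste("ogrinfo -ro -so -al", our_filepath))
```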


  • ogr2ogr: converts between different vector file formats
  • gdalwarp: reprojects, transforms, and clips raster datasets
  • gdaltransform: transforms coordinates

Visit https://gdal.org/programs/ to see the complete list of GDAL tools and to read their help files.

The ‘link’ to GDAL provided by link2GI could be used as a foundation for doing more advanced GDAL work from the R or system CLI.
TauDEM (https://hydrology.usu.edu/taudem/) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command-line interface – the above example shows how to access these libraries from the system command line via R.
This in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages.

Before diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that system() calls may not be platform-independent (they may fail on some computers).
On the other hand, sf and terra bring most of the power provided by GDAL, GEOS and PROJ to R via the R/C++ interface provided by Rcpp, which avoids system() calls.71


    10.7 Bridges to spatial databases

Spatial database management systems (spatial DBMSs) store spatial and non-spatial data in a structured way.
They can organize large collections of data into related tables (entities) via unique identifiers (primary and foreign keys) and implicitly via space (think for instance of a spatial join).
This is useful because geographic datasets tend to become big and messy quite quickly.
Databases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology support.

The most important open source spatial database is PostGIS (Obe and Hsu 2015).72
(SQLite/SpatiaLite are certainly also important, but implicitly we have already introduced this approach since GRASS GIS uses SQLite in the background; see Section 10.4.)
R bridges to spatial DBMSs such as PostGIS are important, allowing access to huge data stores without loading several gigabytes of geographic data into RAM, which would likely crash the R session.
The remainder of this section shows how PostGIS can be called from R, based on “Hello real-world” from PostGIS in Action, Second Edition (Obe and Hsu 2015).73
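Such a connection is commonly opened with the DBI and RPostgres packages; every connection detail below is a placeholder to be replaced with your own database's values:

```r
library(DBI)
library(RPostgres)
conn = dbConnect(
  Postgres(),
  host = "localhost", port = 5432,       # placeholder host/port
  dbname = "postgis_db",                 # placeholder database name
  user = "user", password = "password"   # placeholder credentials
)
dbListTables(conn)  # list available tables
dbDisconnect(conn)
```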


You can also find them in the sf package, though here they are written in lowercase characters (st_union(), st_buffer()).
In fact, function names of the sf package largely follow the PostGIS naming conventions.75


    The last query will find all Hardee’s restaurants (HDE) within the 35-km buffer zone (Figure 10.6).

     query = paste(
       "SELECT *",

PostgreSQL/PostGIS is a formidable choice as an open-source spatial database.
But the same is true for the lightweight SQLite/SpatiaLite database engine and GRASS GIS, which uses SQLite in the background (see Section 10.4).

If your datasets are too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, it may be worth exploring large-scale geographic querying on distributed computing systems.
Such systems are outside the scope of this book, but it is worth mentioning that open source software providing this functionality exists.
Prominent projects in this space include GeoMesa and Apache Sedona.
The apache.sedona package provides an interface to the latter.


    However, keep in mind that the availability of COGs is a big plus while browsing through catalogs of data providers.

    For larger areas of interest, requested images are still relatively difficult to work with: they may use different map projections, may spatially overlap, and their spatial resolution often depends on the spectral band. The gdalcubes package (Appel and Pebesma 2019) can be used to abstract from individual images and to create and process image collections as four-dimensional data cubes.


    The code below shows a minimal example to create a lower resolution (250 m) maximum NDVI composite from the Sentinel-2 images returned by the previous STAC-API search.

     library(gdalcubes)
     # Filter images by cloud cover and create an image collection object

The combination of STAC, COGs, and data cubes forms a cloud-native workflow to analyze (large) collections of satellite imagery in the cloud.
These tools already form a backbone, for example, of the sits package, which allows land use and land cover classification of big Earth observation data in R.
The package builds EO data cubes from image collections available in cloud services and performs land classification of data cubes using various machine and deep learning algorithms.
For more information about sits, visit https://e-sensing.github.io/sitsbook/ or read the related article (Simoes, Camara, et al. 2021).


Implementations are available for eight different backends (see https://hub.openeo.org) to which users can connect with R, Python, JavaScript, QGIS, or a web editor and define (and chain) processes on collections.
Since the functionality and data availability differ among the backends, the openeo R package (Lahn 2021) dynamically loads available processes and collections from the connected backend.
Afterwards, users can load image collections, apply and chain processes, submit jobs, and explore and plot results.


The following code will connect to the openEO platform backend, request available datasets, processes, and output formats, define a process graph to compute a maximum NDVI image from Sentinel-2 data, and finally execute the graph after logging in to the backend.
The openEO platform backend includes a free tier, and registration is possible from existing institutional or internet platform accounts.

     library(openeo)
     con = connect(host = "https://openeo.cloud")

    10.9 Exercises


E1. Compute global solar irradiation for an area of system.file("raster/dem.tif", package = "spDataLarge") for March 21 at 11:00 am using the r.sun GRASS GIS module through qgisprocess.

    E2. Compute catchment area and catchment slope of system.file("raster/dem.tif", package = "spDataLarge") using Rsagacmd.

E3. Continue working on the ndvi_segments object created in the SAGA section.
@@ -995,13 +995,13 @@

    For example, plot a hillshade, the digital elevation model, your viewshed output, and the point. Additionally, give mapview a try.

    -

    E5. Use gdalinfo via a system call for a raster file stored on disk of your choice. -What kind of information you can find there?

    +

E5. Use gdalinfo via a system call for a raster file of your choice stored on disk. +What kind of information can you find there?

E6. Use gdalwarp to decrease the resolution of your raster file (for example, if the resolution is 0.5, change it to 1). Note: the -tr and -r flags will be used in this exercise.
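Hedged sketches for E5 and E6, run as system calls from R; input.tif and output.tif are placeholder file names.

```r
# E5: print metadata (driver, size, CRS, extent, bands) of a raster on disk
system("gdalinfo input.tif")   # "input.tif" is a placeholder

# E6: coarsen the resolution; -tr sets the target x/y resolution and
# -r picks the resampling method
system("gdalwarp -tr 1 1 -r bilinear input.tif output.tif")
```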

    E7. Query all Californian highways from the PostgreSQL/PostGIS database living in the QGIS Cloud introduced in this chapter.

    -

    E8. The ndvi.tif raster (system.file("raster/ndvi.tif", package = "spDataLarge")) contains NDVI calculated for the Mongón study area based on Landsat data from September 22nd, 2000. +

    E8. The ndvi.tif raster (system.file("raster/ndvi.tif", package = "spDataLarge")) contains NDVI calculated for the Mongón study area based on Landsat data from September 22, 2000. Use rstac, gdalcubes, and terra to download Sentinel-2 images for the same area from 2020-08-01 to 2020-10-31, calculate its NDVI, and then compare it with the results from ndvi.tif.
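A hedged sketch of the rstac search step for E8 follows; the STAC endpoint, collection ID and signing function are assumptions tied to Microsoft Planetary Computer, and other providers will differ.

```r
# Sketch only: endpoint and collection ID are assumptions
library(rstac)
library(terra)
# derive the WGS84 search bbox from the existing ndvi.tif
ndvi_old = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))
bb = as.vector(ext(project(ndvi_old, "EPSG:4326")))  # xmin, xmax, ymin, ymax
items = stac("https://planetarycomputer.microsoft.com/api/stac/v1") |>
  stac_search(collections = "sentinel-2-l2a",
              bbox = bb[c(1, 3, 2, 4)],              # xmin, ymin, xmax, ymax
              datetime = "2020-08-01/2020-10-31") |>
  get_request() |>
  items_sign(sign_fn = sign_planetary_computer())
# the signed item hrefs can then be passed to gdalcubes to build a data cube
```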

diff --git a/location.html b/location.html index ba20ac059..4441450b3 100644 --- a/location.html +++ b/location.html @@ -114,7 +114,7 @@

    Prerequisites
    • This chapter requires the following packages (tmaptools must also be installed):
    -
    +
     library(sf)
     library(dplyr)
     library(purrr)
    @@ -179,19 +179,19 @@ 

    The German government provides gridded census data at either 1 km or 100 m resolution. The following code chunk downloads, unzips and reads in the 1 km data.

    -
    +
     download.file("https://tinyurl.com/ybtpkwxz", 
                   destfile = "census.zip", mode = "wb")
     unzip("census.zip") # unzip the files
     census_de = readr::read_csv2(list.files(pattern = "Gitter.csv"))

    Please note that census_de is also available from the spDataLarge package:

    -
    +
     data("census_de", package = "spDataLarge")

The census_de object is a data frame containing 13 variables for more than 360,000 grid cells across Germany. For our work, we only need a subset of these: Easting (x) and Northing (y), number of inhabitants (population; pop), average age (mean_age), proportion of women (women) and average household size (hh_size). These variables are selected and renamed from German into English in the code chunk below and summarized in Table 14.1. Further, mutate() is used to convert the values -1 and -9 (meaning “unknown”) to NA.

    -
    +
     # pop = population, hh_size = household size
     input = select(census_de, x = x_mp_1km, y = y_mp_1km, pop = Einwohner,
                           women = Frauen_A, mean_age = Alter_D, hh_size = HHGroesse_D)
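The mutate() step mentioned in the preceding paragraph falls outside this hunk; a sketch consistent with the description (recoding the sentinel values -1 and -9 to NA) is shown below. The object name input_tidy matches later chunks, but the exact implementation is an assumption.

```r
# recode the "unknown" sentinel values -1 and -9 to NA (sketch)
input_tidy = dplyr::mutate(
  input,
  dplyr::across(c(pop, women, mean_age, hh_size),
                \(x) ifelse(x %in% c(-1, -9), NA, x))
)
```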
    @@ -261,9 +261,9 @@ 

After the preprocessing, the data can be converted into a SpatRaster object (see Sections 2.3.4 and 3.3.1) with the help of the rast() function. When its type argument is set to xyz, the x and y columns of the input data frame should correspond to coordinates on a regular grid. All the remaining columns (here: pop, women, mean_age, hh_size) will serve as values of the raster layers (Figure 14.1; see also code/14-location-figures.R in our GitHub repository).

    -
    +
     input_ras = rast(input_tidy, type = "xyz", crs = "EPSG:3035")
    -
    +
     input_ras
     #> class       : SpatRaster 
     #> dimensions  : 868, 642, 4  (nrow, ncol, nlyr)
    @@ -297,7 +297,7 @@ 

    Class 1 in the variable women, for instance, represents areas in which 0 to 40% of the population is female; these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male. Similarly, the classes containing the youngest people and highest proportion of single households are reclassified to have high weights.

    -
    +
     rcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250, 
                        4, 4, 3000, 5, 5, 6000, 6, 6, 8000), 
                      ncol = 3, byrow = TRUE)
    @@ -311,13 +311,13 @@ 

    For instance, the first element corresponds in both cases to the population. Subsequently, the for-loop applies the reclassification matrix to the corresponding raster layer. Finally, the code chunk below ensures the reclass layers have the same name as the layers of input_ras.

    -
    +
     reclass = input_ras
     for (i in seq_len(nlyr(reclass))) {
       reclass[[i]] = classify(x = reclass[[i]], rcl = rcl[[i]], right = NA)
     }
     names(reclass) = names(input_ras)
    -
    +
     reclass # full output not shown
     #> ... 
     #> names       :  pop, women, mean_age, hh_size 
    @@ -331,7 +331,7 @@ 

    We deliberately define metropolitan areas as pixels of 20 km2 inhabited by more than 500,000 people. Pixels at this coarse resolution can rapidly be created using aggregate(), as introduced in Section 5.3.3. The command below uses the argument fact = 20 to reduce the resolution of the result twenty-fold (recall the original raster resolution was 1 km2).

    -
    +
     pop_agg = aggregate(reclass$pop, fact = 20, fun = sum, na.rm = TRUE)
     summary(pop_agg)
     #>       pop         
    @@ -343,14 +343,14 @@ 

 #> Max.   :1204870 
 #> NA's   :447

    The next stage is to keep only cells with more than half a million people.

    -
    +
     pop_agg = pop_agg[pop_agg > 500000, drop = FALSE] 

    Plotting this reveals eight metropolitan regions (Figure 14.2). Each region consists of one or more raster cells. It would be nice if we could join all cells belonging to one region. terra’s patches() command does exactly that. Subsequently, as.polygons() converts the raster object into spatial polygons, and st_as_sf() converts it into an sf object.

    -
    +
 metros = pop_agg |> 
   patches(directions = 8) |>
   as.polygons() |>
   st_as_sf()
    @@ -367,7 +367,7 @@ 

This is exactly what the rev_geocode_OSM() function of the tmaptools package expects. Additionally, setting as.data.frame to TRUE will return a data.frame with several columns referring to the location, including the street name, house number and city. However, here we are only interested in the name of the city.

    -
    +
     metro_names = sf::st_centroid(metros, of_largest_polygon = TRUE) |>
       tmaptools::rev_geocode_OSM(as.data.frame = TRUE) |>
       select(city, town, state)
    @@ -420,7 +420,7 @@ 

    Overall, we are satisfied with the city column serving as metropolitan names (Table 14.2) apart from one exception, namely Velbert which belongs to the greater region of Düsseldorf. Hence, we replace Velbert with Düsseldorf (Figure 14.2). Umlauts like ü might lead to trouble further on, for example when determining the bounding box of a metropolitan area with opq() (see further below), which is why we avoid them.

    -
    +
     metro_names = metro_names$city |> 
       as.character() |>
       {\(x) ifelse(x == "Velbert", "Düsseldorf", x)}() |>
    @@ -448,7 +448,7 @@ 

Before running this code, please note that it will download almost 2 GB of data. To save time and resources, we have put the output, named shops, into spDataLarge. To make it available in your environment, run data("shops", package = "spDataLarge").

    -
    +
     shops = purrr::map(metro_names, function(x) {
       message("Downloading shops of: ", x, "\n")
       # give the server a bit time
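The download loop above is cut off by the hunk boundary; a fuller hedged sketch using the osmdata package is given below. The function names (opq(), add_osm_feature(), osmdata_sf()) are real, but the details of the original loop are assumptions.

```r
# Sketch of the complete loop: query OSM shops for each metropolitan name
shops = purrr::map(metro_names, function(x) {
  message("Downloading shops of: ", x, "\n")
  # give the server a bit of time between requests
  Sys.sleep(sample(seq(5, 10, 0.1), 1))
  query = osmdata::opq(x) |>
    osmdata::add_osm_feature(key = "shop")
  points = osmdata::osmdata_sf(query)
  points$osm_points  # keep the point geometries only
})
```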
    @@ -468,7 +468,7 @@ 

It is highly unlikely that there are no shops in any of our defined metropolitan areas. The following if condition simply checks if there is at least one shop for each region. If not, we recommend trying to download the shops again for the specific region(s).

    -
    +
     # checking if we have downloaded shops for each metropolitan area
     ind = purrr::map_dbl(shops, nrow) == 0
     if (any(ind)) {
    @@ -476,18 +476,18 @@ 

    paste(metro_names[ind], collapse = ", "), "\nPlease fix it!") }

To make sure that each list element (an sf data frame) comes with the same columns, we only keep the osm_id and the shop columns with the help of the map_dfr loop, which additionally combines all shops into one large sf object.

    -
    +
     # select only specific columns
     shops = purrr::map_dfr(shops, select, osm_id, shop)

Note: shops is provided in the spDataLarge package and can be accessed as follows:

    -
    +
     data("shops", package = "spDataLarge")

    The only thing left to do is to convert the spatial point object into a raster (see Section 6.4). The sf object, shops, is converted into a raster having the same parameters (dimensions, resolution, CRS) as the reclass object. Importantly, the length() function is used here to count the number of shops in each cell.

    The result of the subsequent code chunk is therefore an estimate of shop density (shops/km2). st_transform() is used before rasterize() to ensure the CRS of both inputs match.

    -
    +
     shops = sf::st_transform(shops, st_crs(reclass))
     # create poi raster
     poi = rasterize(x = shops, y = reclass, field = "osm_id", fun = "length")
    @@ -495,7 +495,7 @@

Defining class intervals is, to a certain degree, an arbitrary undertaking. One can use equal breaks, quantile breaks, fixed values or others. Here, we choose the Fisher-Jenks natural breaks approach, which minimizes within-class variance; the result provides an input for the reclassification matrix.

    -
    +
     # construct reclassification matrix
     int = classInt::classIntervals(values(poi), n = 4, style = "fisher")
     int = round(int$brks)
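The remainder of this chunk lies outside the hunk; a hedged continuation that turns the rounded Fisher-Jenks breaks into a terra reclassification matrix (classes weighted 0 to 3, mirroring the earlier classify() calls) might look like this. The exact weighting is an assumption.

```r
# build a two-column from/to matrix from the five break values, then append
# the class weights 0 to 3 as the third ("becomes") column
rcl_poi = matrix(c(int[1], rep(int[-c(1, length(int))], each = 2),
                   int[length(int)] + 1),
                 ncol = 2, byrow = TRUE)
rcl_poi = cbind(rcl_poi, 0:3)
# apply it to the shop-density raster, as with the census layers
poi = classify(poi, rcl = rcl_poi, right = NA)
names(poi) = "poi"
```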
    @@ -515,13 +515,13 @@ 

First of all, we have already delineated metropolitan areas, that is, areas where the population density is above average compared to the rest of Germany. Second, though it is advantageous to have many potential customers within a specific catchment area, the sheer number alone might not actually represent the desired target group. For instance, residential tower blocks are areas with a high population density but not necessarily with high purchasing power for expensive cycle components.

    -
    +
     # remove population raster and add poi raster
     reclass = reclass[[names(reclass) != "pop"]] |>
       c(poi)

    In common with other data science projects, data retrieval and ‘tidying’ have consumed much of the overall workload so far. With clean data, the final step — calculating a final score by summing all raster layers — can be accomplished in a single line of code.

    -
    +
     # calculate the total score
     result = sum(reclass)

    For instance, a score greater than 9 might be a suitable threshold indicating raster cells where a bike shop could be placed (Figure 14.3; see also code/14-location-figures.R).

    diff --git a/search.json b/search.json index 3b45bbad3..e78e8563b 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ -[{"path":"index.html","id":"welcome","chapter":"Welcome","heading":"Welcome","text":"online home Geocomputation R, book geographic data analysis, visualization modeling.Note: first edition book published CRC Press R Series.\ncan buy book CRC Press, Amazon, see archived First Edition hosted bookdown.org.Inspired Free Open Source Software Geospatial (FOSS4G) movement, code prose underlying book open, ensuring content reproducible, transparent, accessible.\nHosting source code GitHub allows anyone interact project opening issues contributing new content typo fixes benefit everyone.\nonline version book hosted r.geocompx.org kept --date GitHub Actions.\ncurrent ‘build status’ follows:version book built GH Actions 2024-09-25.book licensed Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.code samples book licensed Creative Commons CC0 1.0 Universal (CC0 1.0).","code":""},{"path":"index.html","id":"how-to-contribute","chapter":"Welcome","heading":"How to contribute?","text":"bookdown makes editing book easy editing wiki, provided GitHub account (sign-github.com).\nlogged-GitHub, click ‘Edit page’ icon right panel book website.\ntake editable version source R Markdown file generated page ’re .raise issue book’s content (e.g., code running) make feature request, check-issue tracker.Maintainers contributors must follow repository’s CODE CONDUCT.","code":""},{"path":"index.html","id":"reproducibility","chapter":"Welcome","heading":"Reproducibility","text":"quickest way reproduce contents book ’re new geographic data R may web browser, thanks Binder.\nClicking link open new window containing RStudio Server web browser, enabling open chapter files running code chunks test code reproducible.see something like image , congratulations, worked!\ncan start exploring Geocomputation R cloud-based environment, noting 
mybinder.org user guidelines):\nFIGURE 0.1: Screenshot reproducible code contained Geocomputation R running RStudio Server browser served Binder\nreproduce code book computer, need recent version R --date packages.\ncan installed using remotes package.installing book’s dependencies, can rebuild book testing educational purposes.\ndownload unzip clone book’s source code.\nopening geocompr.Rproj project RStudio (opening folder another IDE VS Code), able reproduce contents following command:See project’s GitHub repo full details reproducing book.","code":"\ninstall.packages(\"remotes\")\ninstall.packages('geocompkg', repos = c('https://geocompr.r-universe.dev', 'https://cloud.r-project.org'), dependencies = TRUE, force = TRUE)\nbookdown::serve_book(\".\")"},{"path":"index.html","id":"getting-involved","chapter":"Welcome","heading":"Getting involved","text":"find project use interest, can get involved many ways, :Telling people ‘Starring’ geocompr GitHub repositoryCommunicating book online, via #geocompr hashtag Mastodon (see Guestbook geocompx.org) letting us know courses using bookCiting linking-itBuying copyReviewing , Amazon, Goodreads elsewhereAsking questions content making suggestion GitHub, Mastodon DiscordAnswering questions, least responding people asking clarification reproducible examples demonstrate questionHelping people get started open source software reproducible research general, working geographic data R particular (can excellent way consolidate build skills)Supporting community translations\nSpanish version: https://r.geocompx.org/es/\nFrench version: https://r.geocompx.org/fr/\nJapanese version: http://babayoshihiko.ddns.net/geo/\nSpanish version: https://r.geocompx.org/es/French version: https://r.geocompx.org/fr/Japanese version: http://babayoshihiko.ddns.net/geo/details can found github.com/geocompx/geocompr.globe icon used book created Jean-Marc Viglino licensed CC-4.0 International.\nbook website hosted 
Netlify.","code":""},{"path":"foreword-1st-edition.html","id":"foreword-1st-edition","chapter":"Foreword (1st Edition)","heading":"Foreword (1st Edition)","text":"‘spatial’ R always broad, seeking provide integrate tools geography, geoinformatics, geocomputation spatial statistics anyone interested joining : joining asking interesting questions, contributing fruitful research questions, writing improving code.\n, ‘spatial’ R always included open source code, open data reproducibility.‘spatial’ R also sought open interaction many branches applied spatial data analysis, also implement new advances data representation methods analysis expose cross-disciplinary scrutiny.\nbook demonstrates, often alternative workflows similar data similar results, may learn comparisons others create understand workflows.\nincludes learning similar communities around Open Source GIS complementary languages Python, Java .R’s wide range spatial capabilities never evolved without people willing share creating adapting.\nmight include teaching materials, software, research practices (reproducible research, open data), combinations .\nR users also benefitted greatly ‘upstream’ open source geo libraries GDAL, GEOS PROJ.book clear example , curious willing join , can find things need match aptitudes.\nadvances data representation workflow alternatives, ever increasing numbers new users often without applied quantitative command line exposure, book kind really needed.\nDespite effort involved, authors supported pressing forward publication., fresh book ready go; authors tried many tutorials workshops, readers instructors able benefit knowing contents continue tried people like .\nEngage authors wider R-spatial community, see value choice building workflows important, enjoy applying learn things care .Roger BivandBergen, September 2018","code":""},{"path":"foreword-2nd-edition.html","id":"foreword-2nd-edition","chapter":"Foreword (2nd Edition)","heading":"Foreword (2nd Edition)","text":"Writing 
books open source data science software constantly changes uncontrolled ways brave undertaking: feels like running race someone else constantly moves finish line. second edition Geocomputation R timely: catches many recent changes, also embraces new R packages, new topical developments computing landscape. now includes chapter raster-vector interactions, discussing package terra replacing package raster raster (vector) data processing. also keeps tmap package creating high quality maps, completing full rewrite cycle.Besides updating contents book, authors also active helping streamline focus changes software extensively testing , helping improve , writing issues pull requests GitHub, sharing benchmark results, helping improve software documentation.first edition book great success. first book popularize spatial analysis sf package tidyverse. enthusiastic tone reached wide audience, helped people various levels experience solving new problems moving next level. available entirely freely online addition printed volume gave large reach, enabled users try presented methodology datasets. addition , authors encouraged readership reach ways GitHub issues, social media posts, discussions discord channel. led 75 people contributing book’s source code one way , including several providing longer reviews contributing full sections, including Cloud-optimized GeoTIFFs, STAC openEO; sfheaders package; OGC APIs metadata; CycleHire shiny app. Discord, led lively spontaneous discussions threads include topics ranging highly technical “look built”.Beyond , authors initiated companion volume Geocomputation Python, stressing geocomputation happens data science languages, means restricted one . 
Geocomputation rise, part fostering growing geocomputation community, writing books like one indispensable.Edzer PebesmaMünster, Germany, May 2024","code":""},{"path":"preface.html","id":"preface","chapter":"Preface","heading":"Preface","text":"","code":""},{"path":"preface.html","id":"who-this-book-is-for","chapter":"Preface","heading":"Who this book is for","text":"book people want analyze, visualize model geographic data open source software.\nbased R, statistical programming language powerful data processing, visualization geospatial capabilities.\nbook covers wide range topics interest wide range people many different backgrounds, especially:People learned spatial analysis skills using desktop Geographic Information System (GIS), QGIS, ArcGIS, GRASS GIS SAGA, want access powerful (geo)statistical visualization programming language benefits command line approach (Sherman 2008):\n\nadvent ‘modern’ GIS software, people want point click way life. ’s good, tremendous amount flexibility power waiting command line.\nPeople learned spatial analysis skills using desktop Geographic Information System (GIS), QGIS, ArcGIS, GRASS GIS SAGA, want access powerful (geo)statistical visualization programming language benefits command line approach (Sherman 2008):advent ‘modern’ GIS software, people want point click way life. 
’s good, tremendous amount flexibility power waiting command line.Graduate students researchers fields specializing geographic data including Geography, Remote Sensing, Planning, GIS Spatial Data ScienceGraduate students researchers fields specializing geographic data including Geography, Remote Sensing, Planning, GIS Spatial Data ScienceAcademics post-graduate students working geographic data — fields Geology, Regional Science, Biology Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, broadly defined Data Science — require power flexibility R researchAcademics post-graduate students working geographic data — fields Geology, Regional Science, Biology Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, broadly defined Data Science — require power flexibility R researchApplied researchers analysts public, private third-sector organizations need reproducibility, speed flexibility command line language R applications dealing spatial data diverse Urban Transport Planning, Logistics, Geo-marketing (store location analysis) Emergency PlanningApplied researchers analysts public, private third-sector organizations need reproducibility, speed flexibility command line language R applications dealing spatial data diverse Urban Transport Planning, Logistics, Geo-marketing (store location analysis) Emergency PlanningThe book designed intermediate--advanced R users interested geocomputation R beginners prior experience geographic data.\nnew R geographic data, discouraged: provide links materials describe nature spatial data beginner’s perspective Chapter 2 links provided .","code":""},{"path":"preface.html","id":"how-to-read-this-book","chapter":"Preface","heading":"How to read this book","text":"book divided three parts:Part : Foundations, aimed getting --speed geographic data R.Part II: Advanced techniques, including spatial data visualization, bridges GIS software, programming spatial data, statistical learning.Part III: 
Applications real-world problems, including transportation, geomarketing ecological modeling.chapters get harder one part next.\nrecommend reading chapters Part order tackling advanced topics Part II Part III.\nchapters Part II Part III benefit slightly read order, can read independently interested specific topic.\nmajor barrier geographical analysis R steep learning curve.\nchapters Part aim address providing reproducible code simple datasets ease process getting started.important aspect book teaching/learning perspective exercises end chapter.\nCompleting develop skills equip confidence needed tackle range geospatial problems.\nSolutions exercises can found online booklet accompanies Geocomputation R, hosted r.geocompx.org/solutions.\nlearn booklet created, update solutions files _01-ex.Rmd, see blog post Geocomputation R solutions.\nblog posts examples can found geocompx.org.Impatient readers welcome dive straight practical examples, starting Chapter 2.\nHowever, recommend reading wider context Geocomputation R Chapter 1 first.\nnew R, also recommend learning language attempting run code chunks provided chapter (unless ’re reading book understanding concepts).\nFortunately beginners, R supportive community developed wealth resources can help.\nparticularly recommend three tutorials: R Data Science (Grolemund Wickham 2016) Efficient R Programming (Gillespie Lovelace 2016), introduction R (R Core Team 2021).","code":""},{"path":"preface.html","id":"why-r","chapter":"Preface","heading":"Why R?","text":"Although R steep learning curve, command line approach advocated book can quickly pay .\n’ll learn subsequent chapters, R effective tool tackling wide range geographic data challenges.\nexpect , practice, R become program choice geospatial toolbox many applications.\nTyping executing commands command line , many cases, faster pointing--clicking around graphical user interface (GUI) desktop GIS.\napplications Spatial Statistics modeling, R may realistic way get work 
done.outlined Section 1.3, many reasons using R geocomputation:\nR well suited interactive use required many geographic data analysis workflows compared languages.\nR excels rapidly growing fields Data Science (includes data carpentry, statistical learning techniques data visualization) Big Data (via efficient interfaces databases distributed computing systems).\nFurthermore, R enables reproducible workflow: sharing scripts underlying analysis allow others build work.\nensure reproducibility book made source code available github.com/geocompx/geocompr.\nfind script files code/ folder generate figures:\ncode generating figure provided main text book, name script file generated provided caption (see example caption Figure 13.2).languages Python, Java C++ can used geocomputation.\nexcellent resources learning geocomputation without R, discussed Section 1.4.\nNone provide unique combination package ecosystem, statistical capabilities, visualization options offered R community.\nFurthermore, teaching use one language (R) depth, book equip concepts confidence needed geocomputation languages.","code":""},{"path":"preface.html","id":"real-world-impact","chapter":"Preface","heading":"Real-world impact","text":"Geocomputation R equip knowledge skills tackle wide range issues, including scientific, societal environmental implications, manifested geographic data.\ndescribed Section 1.1, geocomputation using computers process geographic data, also real-world impact.\nwider context motivations underlying book covered Chapter 1.","code":""},{"path":"preface.html","id":"acknowledgments","chapter":"Preface","heading":"Acknowledgments","text":"Many thanks everyone contributed directly indirectly via code hosting collaboration site GitHub, including following people contributed direct via pull requests: prosoitos, florisvdh, babayoshihiko, katygregg, tibbles--tribbles, Lvulis, rsbivand, iod-ine, KiranmayiV, cuixueqin, defuneste, zmbc, erstearns, FlorentBedecarratsNM, dcooley, 
darrellcarvalho, marcosci, appelmar, MikeJohnPage, eyesofbambi, krystof236, nickbearman, tylerlittlefield, giocomai, KHwong12, LaurieLBaker, MarHer90, mdsumner, pat-s, sdesabbata, ahmohil, ateucher, annakrystalli, andtheWings, kant, gavinsimpson, Himanshuteli, yutannihilation, howardbaek, jimr1603, jbixon13, olyerickson, yvkschaefer, katiejolly, kwhkim, layik, mpaulacaldas, mtennekes, mvl22, ganes1410, richfitz, VLucet, wdearden, yihui, adambhouston, chihinl, cshancock, e-clin, ec-nebi, gregor-d, jasongrahn, p-kono, pokyah, schuetzingit, tim-salabim, tszberkowitz, vlarmet.\nThanks Marco Sciaini created front cover image first edition Benjamin Nowak created cover image second edition.\nSee code/frontcover.R code/frontcover2.R reproducible code generated visualizations.\nDozens people contributed online, raising commenting issues, providing feedback via social media.\n#geocompr geocompx hashtags live !like thank John Kimmel Lara Spieker CRC Press Taylor & Francis taking ideas early book plan production via four rounds peer review edition.\nreviewers deserve special mention detailed feedback expertise substantially improved book’s structure content.thank Patrick Schratz Alexander Brenning University Jena fruitful discussions contributions Chapters 12 15.\nthank Emmanuel Blondel Food Agriculture Organization United Nations expert contributions section web services;\nMichael Sumner critical contributions many areas book, especially discussion algorithms Chapter 11;\nTim Appelhans, David Cooley Kiranmayi Vadlamudi key contributions visualization chapter (Chapter 9);\nMarius Appel contributions Chapter 10;\nKaty Gregg, proofread every chapter greatly improved readability book.Countless others mentioned contributed myriad ways.\nfinal thank software developers make geocomputation R possible.\nEspecially, Edzer Pebesma (created sf package), Robert Hijmans (created terra) Roger Bivand (laid foundations much R-spatial software) made high performance geographic computing 
possible R.","code":""},{"path":"intro.html","id":"intro","chapter":"1 Introduction","heading":"1 Introduction","text":"book using power computers things geographic data.\nteaches range spatial skills, including: reading, writing manipulating geographic file formats; making static interactive maps; applying geocomputation support evidence-based decision-making related range geographic phenomena, transport systems ecosystems.\ndemonstrating various geographic operations can linked, ‘code chunks’ intersperse prose, book also teaches reproducible, open thus scientific workflows.book just using wealth existing tools geocomputation: ’s also understanding geographic data structures software needed build new tools.\napproach teach throughout, programming techniques covered Chapter 11 particular, can remove constraints creativity imposed software.\nreading book completing exercises, ready tackle real-world problems, communicate work maps code, contribute open source communities developing tools documentation reproducible geocomputation.last decades, free open source software geospatial (FOSS4G) progressed astonishing rate.\nThanks organizations OSGeo, advanced geographic techniques longer preserve expensive hardware software: anyone can now download run high-performance software geocomputation.\nOpen source Geographic Information Systems (GIS), QGIS, made geographic analysis accessible worldwide.\nGIS software products powerful, tend emphasize graphical user interface (GUI) approach command-line interface (CLI) approach advocated book.\n‘GUI focus’ many GIS products unintended consequence disabling many users making work fully reproducible, problem can overcome calling ‘geoalgorithms’ contained GIS software command line, ’ll see Chapter 10.\nsimplistic comparison different approaches illustrated Table 1.1.TABLE 1.1: Differences emphasis software packages (Graphical User Interface (GUI) Geographic Information Systems (GIS) R).R language providing CLI 
geocomputation.\ncommand environments powerful geographic capabilities exist, including Python (covered book Geocomputation Python), Julia, JavaScript.\nHowever, R advantages make good language learning geocomputation many geocomputation tasks, especially statistics, modeling visualization, outlined Section 1.2.book also motivated importance reproducibility scientific research.\naims make reproducible geographic data analysis workflows accessible, demonstrate power open geospatial software available command line.\nR provides ways interface languages (Eddelbuettel Balamuta 2018), enabling numerous spatial software libraries called R, explained Section 1.3 demonstrated Chapter 10.\ngoing details software, however, worth taking step back thinking mean geocomputation.","code":""},{"path":"intro.html","id":"what-is-geocomputation","chapter":"1 Introduction","heading":"1.1 What is geocomputation?","text":"define geocomputation asA field research, software development practical application uses geographic data solve problems, focus reproducibility, flexibility tool development.Geocomputation young term, dating back first conference subject 1996.1\ndistinguished geocomputation (time) commonly used term ‘quantitative geography’ emphasis “creative experimental” applications (Longley et al. 
1998) development new tools methods.\nwords Stan Openshaw, pioneer field advocate (possibly originator) term, “GeoComputation using various different types geodata developing relevant geo-tools within overall context ‘scientific’ approach” (Openshaw Abrahart 2000).\nBuilding early definition, Geocomputation R goes beyond data analysis modeling include development new tools methods work just interesting academically beneficial.approach differs early definitions geocomputation one important way, however: emphasis reproducibility collaboration.\nturn 21st Century, unrealistic expect readers able reproduce code examples, due barriers preventing access necessary hardware, software data.\nFast-forward today things progressed rapidly.\nAnyone access laptop sufficient RAM (least 8 GB recommended) can install run software geocomputation, reproduce contents book.\nFinancial hardware barriers geocomputation existed 1990s early 2000s, high-performance computers expensive people, removed.2\nGeocomputation also accessible publicly accessible datasets widely available ever , see Chapter 8.\nUnlike early works field, work presented book reproducible using code example data supplied alongside book, R packages spData, installation covered Chapter 2.Geocomputation closely related terms including: Geographic Information Science (GIScience); Geomatics; Geoinformatics; Spatial Information Science; Geoinformation Engineering (Longley 2015); Spatial Data Science (SDS).\nterm shares emphasis ‘scientific’ (implying reproducible falsifiable) approach influenced GIS, although origins main fields application differ.\nSDS, example, emphasizes ‘data science’ skills large datasets, Geoinformatics tends focus data structures.\noverlaps terms larger differences use geocomputation rough synonym encapsulating :\nseek use geographic data applied scientific work.\nUnlike early users term, however, seek imply cohesive academic field called ‘Geocomputation’ (‘GeoComputation’ Stan Openshaw called 
it).

Geocomputation is a recent term, but it is influenced by old ideas.
It can be seen as a part of Geography, which has a 2000+ year history (Talbert 2014), and as an extension of GIS (Neteler and Mitasova 2008), which emerged in the 1960s (Coppock and Rhind 1991).
Geography has played an important role in explaining and influencing humanity's relationship with the natural world since long before the invention of the computer.
The famous explorer, early geographer and pioneering polymath Alexander von Humboldt (after whom dozens of species, geographic features, places and even universities are named, such was his influence) illustrates this role:
his travels across South America in the early 1800s and the resulting observations not only laid the foundations for physical geography and ecology, they also paved the way towards policies to protect the natural world (Wulf 2015).
This book aims to contribute to the still-evolving 'Geographic Tradition' (Livingstone 1992) by harnessing the power of modern computers and open source software.

The book's links to older disciplines are reflected in titles that were suggested for it: Geography with R and R for GIS.
Each has advantages:
the former conveys the applied nature of the content, something about maps;
the latter communicates that the book is about using R as a powerful command-line geographic information system, to perform spatial operations on geographic data.
However, the term GIS has connotations that fail to communicate some of R's greatest strengths:
its abilities to seamlessly switch between geographic and non-geographic data processing, modeling and visualization tasks, and to enable reproducibility, go far beyond the capabilities of GIS.
Geocomputation implies working with geographic data in a reproducible, code-driven environment and programming new results, methods and tools, which is what this book is all about.

## 1.2 Why use open source tools for geocomputation?

Early geographers used a variety of tools, including barometers, compasses and sextants, to advance knowledge about the world (Wulf 2015).
With the invention of the marine chronometer in 1761, for example, it became possible to calculate longitude at sea, enabling ships to take more direct routes.
Until the turn of the century there was an acute shortage of data and tools for geographic analysis.
Nowadays, researchers and practitioners have few such limitations and in some cases face the opposite problem: too much data and too many tools.
Most phones now
have a global positioning (GPS) receiver.
Sensors ranging from satellites and semi-autonomous vehicles to citizen scientists incessantly measure every part of the world.
The rate at which data is produced can be overwhelming, with emerging technologies such as autonomous vehicles generating hundreds or even thousands of gigabytes of data daily.
Remote sensing datasets from satellites are too large to analyze on a single computer, as outlined in Chapter 10.
This 'geodata revolution' drives demand for high performance computer hardware and efficient, scalable software to handle it and to extract signal from the noise.
Evolving open source tools can import and process subsets of vast geographic data stores directly, via application programming interfaces (APIs) and via interfaces to databases.

In these rapidly changing hardware, software and data landscapes, it's important to choose tools that are future-proof.
A major advantage of open source software is its rate of development and longevity, with thousands of potential contributors.
Hundreds of people submit bug reports and suggest new features as well as documentation improvements to open source projects every day — a rate of evolution that proprietary solutions simply cannot keep up with.

A linked advantage is interoperability.
While proprietary products tend to be monolithic 'empires' that are difficult to maintain (linked to the previously mentioned advantage), open source software is more like a 'federation' of modular tools that can be combined in different ways.
This has allowed open source data science languages such as R to rapidly incorporate new developments such as interfaces to high performance visualization libraries and file formats, while proprietary solutions struggle to keep up.

Another major advantage is reproducibility.
Being able to replicate findings is vital in scientific research, and open source software removes an important barrier to reproducibility by enabling others to check your findings, or apply your methods in new contexts, using the same tools.
The combination of tools that can be accessed by anyone for free with the ability to share code and data means that the results of your work can be checked and built upon by others, a huge advantage if you want your work to be used and cited.

The biggest advantage of open source software combined with sharing reproducible code with many people, however, is the community.
The community enables you to get support far quicker, and often of higher
quality, than is possible from the centralized, budget-limited support teams associated with proprietary software.
The community can provide feedback and ideas (as discussed in Chapter 16), and can help you develop your own tools and methods.

R is an open source software project, a powerful language and an ever-evolving community of statisticians and developers (Wickham 2019).
R is not the only language enabling reproducible geocomputation with open source software, as outlined in Section 1.4.
Many of the reasons for using R also apply to other open source languages for reproducible data science, such as Python and Julia.
However, R has key advantages, as outlined in Section 1.3.

## 1.3 Why use R for geocomputation?

R is a multi-platform, open source language and environment for statistical computing and graphics (r-project.org/).
With a wide range of packages, R also supports advanced geospatial statistics, modeling and visualization.
Integrated development environments (IDEs) such as RStudio have made R more user-friendly for many, easing map-making with a panel dedicated to interactive visualization.

At its core, R is an object-oriented, functional programming language (Wickham 2019) that was specifically designed as an interactive interface to other software (Chambers 2016).
The latter includes many 'bridges' to a treasure trove of GIS software, 'geolibraries' and functions (see Chapter 10).
It is thus ideal for quickly creating 'geo-tools', without needing to master lower level languages (compared with R) such as C, FORTRAN or Java (see Section 1.4).
This can feel like breaking free from the metaphorical 'glass ceiling' imposed by GUI-based proprietary geographic information systems (see Table 1.1 for a definition of GUI).
Furthermore, R facilitates access to other languages:
the packages Rcpp and reticulate enable access to C++ and Python code, for example.
This means R can be used as a 'bridge' to a wide range of geospatial programs (see Section 1.4).

Another example showing R's flexibility and evolving geographic capabilities is interactive map-making.
As we'll see in Chapter 9, the statement that R has "limited interactive [plotting] facilities" (Bivand, Pebesma, and Gómez-Rubio 2013) is no longer true.
This is demonstrated by the following code
chunk, which creates Figure 1.1 (the functions that generate the plot are covered in Section 9.4).

```r
library(leaflet)
popup = c("Robin", "Jakub", "Jannes")
leaflet() |>
  addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") |>
  addMarkers(lng = c(-3, 23, 11),
             lat = c(52, 53, 49), 
             popup = popup)
```

FIGURE 1.1: The blue markers indicate where the authors are from. The basemap is a tiled image of the Earth at night provided by NASA. Interact with the online version at r.geocompx.org, for example by zooming in and clicking on the pop-ups.

It would have been difficult to produce Figure 1.1 using R (or any other open source language for data science) just a few years ago, let alone as an interactive map.
This illustrates R's flexibility and how, thanks to developments such as knitr and leaflet, it can be used as an interface to other software, a theme that will recur throughout this book.
The use of R code, therefore, enables teaching geocomputation with reference to reproducible examples representing real-world phenomena, rather than just abstract concepts.

The 'R-spatial stack' is easy to install and has comprehensive, well-maintained and highly interoperable packages.
R has 'batteries included', with statistical functions as part of the base installation and hundreds of well-maintained packages implementing many cutting edge methods.
With R, you can dive in and get things working in surprisingly few lines of code, enabling you to focus on the geographic methods and data, rather than debugging and managing package dependencies.
A particular strength of R is the ease with which it allows you to create publication quality interactive maps, thanks to its excellent mapping packages, as outlined in Chapter 9.

## 1.4 Software for geocomputation

R is a powerful language for geocomputation, but there are many other options for geographic data analysis providing thousands of geographic functions.
Awareness of other languages for geocomputation will help you decide when a different tool may be more appropriate for a specific task, and will place R in the wider geospatial ecosystem.
This section briefly introduces the languages C++, Java and Python for geocomputation, in preparation for Chapter 10.

An important feature of R (and Python) is that it is an interpreted language.
This is advantageous because it enables interactive programming in a Read–Eval–Print Loop (REPL):
code entered into the console
is immediately executed and the result printed, rather than waiting for the intermediate stage of compilation.
On the other hand, compiled languages such as C++ and Java tend to run faster once they have been compiled.

C++ provides the basis of many GIS packages such as QGIS, GRASS GIS and SAGA, so it is a sensible starting point.
Well-written C++ is very fast, making it a good choice for performance-critical applications such as processing large geographic datasets, but it is harder to learn than Python or R.
C++ has become more accessible with the Rcpp package, which provides a good 'way in' to C++ programming for R users.
Proficiency with such low-level languages opens the possibility of creating new, high-performance 'geoalgorithms' and a better understanding of how GIS software works (see Chapter 11).
However, it is not necessary to learn C++ to use R for geocomputation.

Python is an important language for geocomputation, especially because many Desktop GIS such as GRASS GIS, SAGA and QGIS provide a Python API (see Chapter 10).
Like R, Python is a popular language for data science.
Both languages are object-oriented and have many areas of overlap, leading to initiatives such as the reticulate package, which facilitates access to Python from R, and the Ursa Labs initiative to support portable libraries to the benefit of the entire open source data science ecosystem.

In practice both R and Python have their strengths.
To some extent which you use is less important than the domain of application and communication of results.
Learning either will provide a head-start in learning the other.
However, there are major advantages of R over Python for geocomputation.
This includes its much better support of the geographic raster data model in the language itself (see Chapter 2) and corresponding visualization possibilities (see Chapters 2 and 9).
Equally important, R has unparalleled support for statistics, including spatial statistics, with hundreds of packages (unmatched by Python) supporting thousands of statistical methods.

The major advantage of Python is that it is a general-purpose programming language.
It is used in many domains, including desktop software, computer games, websites and data science.
Python is often the only shared language between different (geocomputation) communities and can be seen as the 'glue' that holds many GIS programs together.
Many geoalgorithms, including those in QGIS and ArcMap, can be accessed from the Python command line, making it well suited as a starter language for command line GIS.
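The reticulate bridge mentioned above can be illustrated with a minimal sketch. This is not from the book: it assumes the reticulate R package is installed along with a Python environment containing the shapely library (version 2 or later, which exposes `Point` at the top level).

```r
# Sketch: calling Python geospatial code from R via reticulate
# (assumes reticulate + a Python installation with shapely >= 2)
library(reticulate)
shapely = import("shapely")          # import a Python module into R
pt = shapely$Point(-0.1, 51.5)       # create a Python geometry object
pt$buffer(1)$area                    # buffer area, computed in Python
```

The `$` operator is how reticulate exposes Python attributes and methods to R, so the same shapely calls you would write in Python translate almost directly.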
For spatial statistics and predictive modeling, however, R is second to none.
This does not mean you must choose either R or Python: Python supports most common statistical techniques (though R tends to support new developments in spatial statistics earlier) and many concepts learned from Python can be applied to the R world.
Like R, Python also supports geographic data analysis and manipulation, with packages such as shapely, geopandas, rasterio and xarray.

## 1.5 R's spatial ecosystem

There are many ways to handle geographic data in R, with dozens of packages in the area.
In this book we endeavor to teach the state-of-the-art in the field whilst ensuring that the methods are future-proof.
Like many areas of software development, R's spatial ecosystem is rapidly evolving (Figure 1.2).
Because R is open source, these developments can easily build on previous work, by 'standing on the shoulders of giants', as Isaac Newton put it in 1675.
This approach is advantageous because it encourages collaboration and avoids 'reinventing the wheel'.
The package sf (covered in Chapter 2), for example, builds on its predecessor sp.

A surge in development time (and interest) in 'R-spatial' followed the award of a grant by the R Consortium for the development of support for simple features, an open-source standard and model for storing and accessing vector geometries.
This resulted in the sf package (covered in Section 2.2.1).
Multiple places reflect the immense interest in sf.
This is especially true for the R-sig-Geo Archives, a long-standing open access email list containing much R-spatial wisdom accumulated over the years.

FIGURE 1.2: Downloads of selected R packages for working with geographic data from early 2013 to the present. The y axis shows the average number of daily downloads from the popular cloud.r-project.org CRAN mirror with a 91-day rolling window (log scale).
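Download counts like those shown in Figure 1.2 can be retrieved programmatically. The following is a sketch, not the code used for the figure: it assumes the cranlogs package is installed and that internet access to the RStudio CRAN download logs is available.

```r
# Sketch: fetching daily CRAN download counts for spatial packages
# (assumes the cranlogs package is installed and internet access)
library(cranlogs)
dl = cran_downloads(packages = c("sf", "terra"),
                    from = "2023-01-01", to = "2023-01-31")
# one row per package per day, with columns date, count, package
aggregate(count ~ package, data = dl, FUN = mean)
```

Smoothing these counts with a rolling window, as in Figure 1.2, reduces the weekly cycle in download activity.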
It is noteworthy that shifts in the wider R community, as exemplified by the data processing package dplyr (released in 2014), have influenced shifts in R's spatial ecosystem.
Alongside other packages that have a shared style and emphasis on 'tidy data' (including, e.g., ggplot2), dplyr was placed in the tidyverse 'metapackage' in late 2016.
The tidyverse approach, with its focus on long-form data and fast, intuitively named functions, has become immensely popular.
This has led to a demand for 'tidy geographic data', which has been partly met by sf.
An obvious feature of the tidyverse is the tendency for packages to work in harmony.
There is no equivalent 'geoverse', but the modern R-spatial ecosystem has consolidated around sf, as illustrated by the key packages that depend on it, shown in Table 1.2, and terra, both of which are taught in this book.
The stack is highly interoperable with other packages and languages, as outlined in Chapter 10.

TABLE 1.2: The top 5 most downloaded packages that depend on sf, in terms of the average number of downloads per day over the previous month. As of 2023-11-14 there are 526 packages that import sf.

## 1.6 History of R-spatial

There are many benefits of using modern spatial packages such as sf, but there is also value in understanding the history of R's spatial capabilities.
Many functions, use cases and teaching materials are contained in older packages, many of which are still useful, provided you know where to look.
R's spatial capabilities originated in early spatial packages in the S language (Bivand and Gebhardt 2000).
The 1990s saw the development of numerous S scripts and a handful of packages for spatial statistics.
By the year 2000, there were R packages for various spatial methods, including "point pattern analysis, geostatistics, exploratory spatial data analysis and spatial econometrics" (Bivand and Neteler 2000).
Some of these, notably spatial, sgeostat and splancs, are still available on CRAN (B. S. Rowlingson and Diggle 1993; B. Rowlingson and Diggle 2017; Venables and Ripley 2002; Majure and Gebhardt 2016).
Key spatial packages were described in Ripley (2001), which outlined R packages for spatial smoothing and interpolation (Akima and Gebhardt 2016; Jr and Diggle 2016) and point pattern analysis (B.
Rowlingson and Diggle 2017; Baddeley, Rubak, and Turner 2015).
One of these (spatstat) is still actively maintained, more than 20 years after its first release.

A following commentary outlined the future prospects of spatial statistics (Bivand 2001), setting the stage for the development of the popular spdep package (Bivand 2017).
Notably, the commentary mentioned the need for standardization of spatial interfaces, efficient mechanisms for exchanging data with GIS, and the handling of spatial metadata such as coordinate reference systems (CRS).
These aims have largely been achieved.

maptools (Bivand and Lewin-Koh 2017) was another important package of this time, which provided an interface to the shapelib library for reading the Shapefile file format and which fed into sp.
An extended review of spatial packages proposed a class system to support the "data objects offered by GDAL", including the fundamental point, line, polygon and raster types, and interfaces to external libraries (Bivand 2003).
To a large extent, these ideas were realized in the packages rgdal and sp, providing the foundation for the seminal book Applied Spatial Data Analysis with R (ASDAR) (Bivand, Pebesma, and Gómez-Rubio 2013), first published in 2008.
R's spatial capabilities have evolved substantially since then, but they still build on the ideas of these early pioneers.
Interfaces to GDAL and PROJ, for example, still power R's high-performance geographic data I/O and CRS transformation capabilities, as outlined in Chapters 7 and 8, respectively.

rgdal, released in 2003, provided GDAL bindings for R, greatly enhancing its ability to import data from previously unavailable geographic data formats.
The initial release supported only raster drivers, but subsequent enhancements provided support for coordinate reference systems (via the PROJ library), reprojections and the import of vector file formats.
Many of these additional capabilities were developed by Barry Rowlingson and released in the rgdal codebase in 2006, as described in B. Rowlingson et al.
(2003) on the R-help email list.

The sp package, released in 2005, was a significant advancement in R's spatial capabilities.
It introduced classes and generic methods for handling geographic coordinates, including points, lines, polygons and grids, as well as attribute data.
Using the S4 class system, sp stores information such as the bounding box, coordinate reference system (CRS) and attributes in slots within Spatial objects.
This allows efficient operations on geographic data.
The package also provided generic methods like summary() and plot() for working with geographic data.

In the following decade, sp classes rapidly became popular for geographic data in R, and the number of packages that depended on it increased from around 20 in 2008 to over 100 in 2013 (Bivand, Pebesma, and Gómez-Rubio 2013).
By 2019 more than 500 packages imported sp.
Although the number of packages that depend on sp has decreased since the release of sf, it is still used by prominent R packages, including gstat (for spatial and spatiotemporal geostatistics) and geosphere (for spherical trigonometry) (Pebesma and Graeler 2023; Hijmans 2016).

While rgdal and sp solved many spatial issues, rgeos was developed during a Google Summer of Code project in 2010 (Bivand and Rundel 2023) so that geometry operations could be undertaken on sp objects.
Functions such as gIntersection() enabled users to find spatial relationships between geographic objects and to modify their geometries (see Chapter 5 for details on geometric operations with sf).

A limitation of the sp ecosystem was its limited support for raster data.
This was overcome by raster, first released in 2010 (Hijmans 2023b).
raster's class system and functions enabled a range of raster operations, capabilities now implemented in the terra package, which supersedes raster, as outlined in Section 2.3.
An important capability of raster and terra is their ability to work with datasets that are too large to fit into RAM, by supporting off-disk operations.
raster and terra also support map algebra, as described in Section 4.3.2.

In parallel with these developments of class systems and methods came support for R as an interface to dedicated GIS software.
GRASS (Bivand 2000) and the follow-on packages spgrass6, rgrass7 and rgrass were prominent examples in this direction (Bivand 2016a, 2016b, 2023).
Other examples of bridges between R and GIS include bridges to QGIS via qgisprocess (Dunnington et al.
2024), to SAGA via Rsagacmd (Pawley 2023) and RSAGA (Brenning, Bangs, and Becker 2022), and to ArcGIS via RPyGeo (Brenning 2012a, first published in 2008) (see Chapter 10).

Visualization was not a focus initially, with the bulk of R-spatial development concentrated on analysis and geographic operations.
sp provided methods for map-making using both the base and lattice plotting systems, but demand was growing for advanced map-making capabilities.
RgoogleMaps, first released in 2009, allowed the overlay of R spatial data on top of 'basemap' tiles from online services such as Google Maps and OpenStreetMap (Loecher and Ropkins 2015).
It was followed by the ggmap package, which added similar 'basemap' tiles capabilities to ggplot2 (Kahle and Wickham 2013).
Though ggmap facilitated map-making with ggplot2, its utility was limited by the need to fortify spatial objects, which means converting them into long data frames.
While this works well for points, it is computationally inefficient for lines and polygons, since each coordinate (vertex) is converted into a row, leading to huge data frames to represent complex geometries.
Although geographic visualization tended to focus on vector data, raster visualization was supported in raster and received a boost with the release of rasterVis (Lamigueiro 2018).
Since then map-making in R has become a hot topic, with dedicated packages such as tmap, leaflet and mapview all gaining popularity, as highlighted in Chapter 9.

Since 2018, when the First Edition of Geocomputation with R was published, the development of geographic R packages has accelerated.
terra, a successor of the raster package, was first released in 2020 (Hijmans 2023c), bringing several benefits to R users working with raster datasets: it is faster and has a more straightforward user interface than its predecessor, as described in Section 2.3.
In mid-2021, sf started using the S2 spherical geometry engine for geometry operations on unprojected datasets, as described in Section 2.2.9.
Additional ways of representing and working with geographic data in R have been developed since 2018, including the stars and lidR packages (Pebesma 2021; Roussel et al.
2020).
Such developments have been motivated by the emergence of new technologies, standards and software outside of the R environment (Bivand 2021).
Major updates to the PROJ library beginning in 2018 forced the replacement of 'proj-string' representations of coordinate reference systems with 'Well Known Text', as described in Section 2.4 and Chapter 7.

Since the publication of the first version of Geocomputation with R in 2018, several packages for spatial data visualization have been developed and improved.
The rayshader package, for example, enables the development of striking and easy-to-animate 3D visualizations via raytracing and multiple hill-shading methods (Morgan-Wall 2021).
The very popular ggplot2 package gained new spatial capabilities, thanks to work on the ggspatial package, which provides scale bars and north arrows (Dunnington 2021).
gganimate enables smooth and customizable spatial animations (Pedersen and Robinson 2020).

Existing visualization packages have also been improved or rewritten.
Large raster objects are automatically downscaled in tmap, and high-performance interactive maps are now possible thanks to packages including leafgl and mapdeck.
The mapsf package (successor of cartography) was rewritten to reduce dependencies and improve performance (Giraud 2021), and tmap underwent a major update in Version 4, in which most of the internal code was revised.

In late 2021, the planned retirement of rgdal, rgeos and maptools was announced, and in October 2023 they were archived on CRAN.
This retirement at the end of 2023 had a large impact on existing workflows applying these packages, and it also influenced the packages that depend on them.
Modern R packages such as sf and terra, described in Chapter 2, provide a strong and future-proof foundation for geocomputation that we build on in this book.

## 1.7 Exercises

E1. Think about the terms 'GIS', 'GDS' and 'geocomputation' described above. Which (if any) best describes the work you would like to do using geo* methods and software?

E2. Provide three reasons for using a scriptable language such as R for geocomputation instead of using a graphical user interface (GUI) based GIS such as QGIS.

E3. In the year 2000, Stan Openshaw wrote that geocomputation involved "practical work that is beneficial or useful" to others.
Think about a practical problem and possible solutions that could be informed by new evidence derived from the analysis, visualization and modeling of geographic data. With pen and paper (or the computational equivalent), sketch the inputs and possible outputs illustrating how geocomputation could help.

# 2 Geographic data in R

## Prerequisites

This is the first practical chapter of the book, and therefore it comes with some software requirements.
You need access to a computer with a recent version of R installed (R 4.3.2 or a later version).
We recommend not only reading the prose but also running the code in each chapter to build your geocomputational skills.

To keep track of your learning journey, it may be worth starting by creating a new folder on your computer to save your R scripts, outputs and other things related to Geocomputation with R as you go.
You can also download or clone the source code underlying the book to support your learning.
We strongly recommend using R with an integrated development environment (IDE) such as RStudio (quicker to get up and running) or VS Code (which requires additional setup).

If you are new to R, we recommend following introductory R resources such as Hands On Programming with R or An Introduction to R before you dive into the Geocomputation with R code.
These resources cover in detail how to install R, which simply involves downloading the latest version from the Comprehensive R Archive Network (CRAN).
See the note for more information on installing R for geocomputation on Mac or Linux.
Organize your work in projects and give scripts sensible names such as chapter-02.R (or the equivalent RMarkdown or Quarto file names) to document the code as you learn.

Once you have a good set-up, it's time to run some code!
Unless you already have these packages installed, the first thing to do is to install the foundational R packages used in this chapter, with the commands shown below.
The packages needed to reproduce Part 1 of the book can be installed with the following command: remotes::install_github("geocompx/geocompkg").
This command uses the function install_github() from the remotes package to install source code hosted on GitHub, a code hosting, version control and collaboration platform.
The following command will install all dependencies required to reproduce the entire book (warning: this may take several
minutes): remotes::install_github("geocompx/geocompkg", dependencies = TRUE).

```r
install.packages("sf")
install.packages("terra")
install.packages("spData")
install.packages("spDataLarge", repos = "https://nowosad.r-universe.dev")
```

The packages needed to run the code presented in this chapter can be 'loaded' (technically they are attached) with the library() function as follows:

```r
library(sf)    # classes and functions for vector data
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
library(terra) # classes and functions for raster data
```

The output from library(sf) reports which versions of key geographic libraries such as GEOS the package is using, as outlined in Section 2.2.1.
The other packages that were installed contain data that will be used in the book:

```r
library(spData)      # load geographic data
library(spDataLarge) # load larger geographic data
```

## 2.1 Introduction

This chapter will provide brief explanations of the fundamental geographic data models: vector and raster.
We will introduce the theory behind each data model and the disciplines in which they predominate, before demonstrating their implementation in R.

The vector data model represents the world using points, lines and polygons.
These have discrete, well-defined borders, meaning that vector datasets usually have a high level of precision (but not necessarily accuracy, see Section 2.5).
The raster data model divides the surface up into cells of constant size.
Raster datasets are the basis of background images used in web-mapping and have been a vital source of geographic data since the origins of aerial photography and satellite-based remote sensing devices.
Rasters aggregate spatially specific features to a given resolution, meaning that they are consistent over space and scalable (many worldwide raster datasets are available).

Which to use?
The answer likely depends on your domain of application:

- Vector data tends to dominate the social sciences because human settlements tend to have discrete borders
- Raster dominates many environmental sciences partially because of the reliance on remote sensing data

There is much overlap in some fields, and raster and vector datasets can be used together:
ecologists and demographers, for example, commonly use both vector and raster data.
Furthermore,
it is possible to convert between the two forms (see Chapter 6).
Whether your work involves more use of vector or raster datasets, it is worth understanding the underlying data models before using them, as discussed in subsequent chapters.
This book uses the sf and terra packages to work with vector data and raster datasets, respectively.

## 2.2 Vector data

The geographic vector data model is based on points located within a coordinate reference system (CRS).
Points can represent self-standing features (e.g., the location of a bus stop) or they can be linked together to form more complex geometries such as lines and polygons.
Most point geometries contain only two dimensions (much less prominent 3-dimensional geometries contain an additional \(z\) value, typically representing height above sea level).

In this system, for example, London can be represented by the coordinates c(-0.1, 51.5).
This means that its location is -0.1 degrees east and 51.5 degrees north of the origin.
The origin in this case is 0 degrees longitude (the Prime Meridian) and 0 degrees latitude (the Equator) in a geographic ('lon/lat') CRS (Figure 2.1, left panel).
The same point could also be approximated in a projected CRS with 'Easting/Northing' values of c(530000, 180000) in the British National Grid, meaning that London is located 530 km East and 180 km North of the \(origin\) of the CRS.
This can be verified visually: slightly more than 5 'boxes' — square areas bounded by the gray grid lines 100 km in width — separate the point representing London from the origin (Figure 2.1, right panel).

The location of the National Grid's origin, in the sea beyond the South West Peninsular, ensures that most locations in the UK have positive Easting and Northing values.
There is more to CRSs, as described in Section 2.4 and Chapter 7, but, for the purposes of this section, it is sufficient to know that coordinates consist of two numbers representing distance from an origin, usually in \(x\) then \(y\) dimensions.
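The equivalence between the two representations of London described above can be checked with sf itself. This is a sketch (assuming the sf package is installed, which is covered later in the chapter): the lon/lat point is transformed to the British National Grid (EPSG:27700), giving Easting/Northing values close to c(530000, 180000).

```r
# Sketch: the same London point in a geographic and a projected CRS
# (assumes the sf package is installed)
library(sf)
london_lonlat = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")
london_bng = st_transform(london_lonlat, "EPSG:27700")
st_coordinates(london_bng)  # Easting/Northing, roughly 530000, 180000
```

Note that st_transform() changes the coordinate values but not the location they refer to: both objects represent the same point on the Earth's surface.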
FIGURE 2.1: Illustration of vector (point) data in which the location of London (the red X) is represented with reference to an origin (the blue circle). The left plot represents a geographic CRS with an origin at 0° longitude and latitude. The right plot represents a projected CRS with an origin located in the sea west of the South West Peninsula.

The sf package provides classes for geographic vector data and a consistent command line interface to important low level libraries for geocomputation:

- GDAL, for reading, writing and manipulating a wide range of geographic data formats, covered in Chapter 8
- PROJ, a powerful library for coordinate system transformations, which underlies the content covered in Chapter 7
- GEOS, a planar geometry engine for operations such as calculating buffers and centroids on data with a projected CRS, covered in Chapter 5
- S2, a spherical geometry engine written in C++ developed by Google, via the s2 package, covered in Section 2.2.9 and Chapter 7

Information about these interfaces is printed by sf the first time the package is loaded: the message that appears below the library(sf) command at the beginning of this chapter tells us the versions of the linked GEOS, GDAL and PROJ libraries (these vary between computers and over time) and whether or not the S2 interface is turned on.
Nowadays we take this for granted; however, it is the tight integration with different geographic libraries that makes reproducible geocomputation possible in the first place.

A neat feature of sf is that you can change the default geometry engine used on unprojected data: 'switching off' S2 can be done with the command sf::sf_use_s2(FALSE), meaning that the planar geometry engine GEOS will be used by default for all geometry operations, including geometry operations on unprojected data.
As we will see in Section 2.2.9, planar geometry is based on 2 dimensional space.
Planar geometry engines such as GEOS assume 'flat' (projected) coordinates, while spherical geometry engines such as S2 assume unprojected (lon/lat) coordinates.

This section introduces sf classes in preparation for subsequent chapters (Chapters 5 and 8 cover the GEOS and GDAL interfaces, respectively).

### 2.2.1 An introduction to simple features

Simple features is an open standard developed and endorsed by the Open Geospatial Consortium (OGC), a not-for-profit organization whose activities we will revisit in a later chapter (Section 8.2).
Simple features is a hierarchical data model that represents a wide range of geometry types.
Of the 18 geometry types supported by the specification, only 7 are used in the vast majority of geographic
research (see Figure 2.2);
these core geometry types are fully supported by the R package sf (Pebesma 2018).

FIGURE 2.2: Simple feature types fully supported by sf.

sf can represent all common vector geometry types (raster data classes are not supported by sf): points, lines, polygons and their respective 'multi' versions (which group together features of the same type into a single feature).
sf also supports geometry collections, which can contain multiple geometry types in a single object.
sf provides the same functionality (and more) as previously provided in three packages — sp for data classes (Pebesma and Bivand 2023a), rgdal for data read/write via an interface to GDAL and PROJ (Bivand, Keitt, and Rowlingson 2023) and rgeos for spatial operations via an interface to GEOS (Bivand and Rundel 2023).

To re-iterate the message from Chapter 1, geographic R packages have a long history of interfacing with lower level libraries, and sf continues this tradition with a unified interface to recent versions of GEOS for geometry operations, the GDAL library for reading and writing geographic data files, and the PROJ library for representing and transforming projected coordinate reference systems.
Through s2, an R interface to Google's spherical geometry library of the same name, sf also has access to fast and accurate "measurements and operations on non-planar geometries" (Bivand 2021).
Since sf version 1.0.0, launched in June 2021, s2 functionality is used by default on geometries with geographic (longitude/latitude) coordinate systems, a unique feature of sf that differs from spatial libraries that only support the GEOS geometry operations, such as the Python package GeoPandas.
We will discuss s2 in subsequent chapters.

sf's ability to integrate multiple powerful libraries for geocomputation into a single framework is a notable achievement that reduces 'barriers to entry' into the world of reproducible geographic data analysis with high-performance libraries.
sf's functionality is well documented on its website at r-spatial.github.io/sf/, which contains 7 vignettes.
These can also be viewed offline.

As the first vignette explains, simple feature objects in R are stored in a data frame, with geographic data occupying a special column, usually named 'geom' or 'geometry'.
We will use the world dataset provided by spData (Bivand, Nowosad, and Lovelace 2023), loaded at the beginning of this chapter, to show what sf objects are and how they work.
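The core geometry types shown in Figure 2.2 can be created directly from coordinate matrices. The following sketch is not from the book (geometry construction is covered in more detail later in the chapter); it assumes the sf package is installed.

```r
# Sketch: creating sfg objects for three of the core geometry types
# (assumes the sf package is installed)
library(sf)
pt = st_point(c(5, 2))                        # a POINT from a numeric vector
ln = st_linestring(rbind(c(1, 5), c(4, 4)))   # a LINESTRING from a matrix
py = st_polygon(list(rbind(c(1, 5), c(2, 2),  # a POLYGON from a list of
                           c(4, 1), c(1, 5))))#   closed rings
st_geometry_type(py)
```

Note that polygon rings must be 'closed', meaning the first and last coordinate pairs are identical.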
world is an 'sf data frame' containing spatial and attribute columns, the names of which are returned by the function names() (the last column in this example contains the geographic information).

The contents of this geom column give sf objects their spatial powers: world$geom is a 'list column' that contains all the coordinates of the country polygons.
sf objects can be plotted quickly with the function plot().
Although part of R's default installation (base R), plot() is a generic that can be extended by other packages.
sf contains the non-exported (hidden from users most of the time) plot.sf() function that is called behind the scenes by the following command, which creates Figure 2.3.

FIGURE 2.3: A spatial plot of the world using the sf package, with a facet for each attribute.

Note that instead of creating a single map by default for geographic objects, as most GIS programs do, plot()ing sf objects results in a map for each variable in the datasets.
This behavior can be useful for exploring the spatial distribution of different variables, and is discussed in Section 2.2.3.

More broadly, treating geographic objects as regular data frames with spatial powers has many advantages, especially if you are already used to working with data frames.
The commonly used summary() function, for example, provides a useful overview of the variables within the world object.
Although we have only selected one variable for the summary() command, it also outputs a report on the geometry.
This demonstrates the 'sticky' behavior of the geometry columns of sf objects, meaning the geometry is kept unless the user deliberately removes it, as we'll see in Section 3.2.
The result provides a quick summary of both the non-spatial and spatial data contained in world: the mean average life expectancy is 71 years (ranging from less than 51 to more than 83 years, with a median of 73 years) across all countries.

It is also worth taking a deeper look at the basic behavior and contents of this simple feature object, which can usefully be thought of as a 'spatial data frame'.
sf objects are easy to subset: the code below shows how to return an object containing only the first two rows and the first three columns of the world object.
The output shows two major differences compared with a regular data.frame: the inclusion of additional geographic metadata (Geometry type, Dimension, Bounding box and coordinate reference system information), and the presence of a 'geometry column', here named geom.

This may all seem rather complex, especially for a class system that is supposed to be 'simple'!
However, there are good reasons for organizing things this way and using sf to work with vector geographic datasets.
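The 'sticky' geometry behavior described above can be sketched as follows (assuming the sf and spData packages are installed): attribute subsetting keeps the geometry column, which must be dropped explicitly with st_drop_geometry().

```r
# Sketch: geometry is 'sticky' in sf objects
# (assumes the sf and spData packages are installed)
library(sf)
library(spData)
class(world["lifeExp"])         # still an sf object: geometry is retained
class(st_drop_geometry(world))  # geometry removed: a plain (tibble) data frame
```

Dropping the geometry is occasionally useful, for example to speed up non-spatial aggregation on large datasets, but in most workflows the sticky behavior is what you want.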
Before describing each geometry type that the sf package supports, it is worth taking a step back to understand the building blocks of sf objects. Section 2.2.5 shows how simple features objects are data frames, with special geometry columns. These spatial columns are often called geom or geometry: world$geom refers to the spatial element of the world object described above. These geometry columns are 'list columns' of class sfc (see Section 2.2.7). In turn, sfc objects are composed of one or more objects of class sfg: simple feature geometries that we describe in Section 2.2.6. To understand how the spatial components of simple features work, it is vital to understand simple feature geometries. For this reason, we cover each currently supported simple features geometry type in Section 2.2.4 before moving on to describe how these can be represented in R using sf objects, which are based on sfg and sfc objects.

### 2.2.2 Why simple features?

Simple features is a widely supported data model that underlies data structures in many GIS applications, including QGIS and PostGIS. A major advantage of using this data model is that it ensures your work is cross-transferable to other setups, for example importing from and exporting to spatial databases. A more specific question from an R perspective is: why use the sf package? There are many reasons (linked to the advantages of the simple features model):

- Fast reading and writing of data
- Enhanced plotting performance
- sf objects can be treated as data frames in most operations
- sf function names are relatively consistent and intuitive (all begin with st_)
- sf functions can be combined with the |> operator and work well with the tidyverse collection of R packages

sf's support for tidyverse packages is exemplified by read_sf(), a function for importing geographic vector data that is covered in detail in Section 8.3.1. Unlike the function st_read(), which returns attributes stored in a base R data.frame (and which emits verbose messages, as shown in the code chunk below), read_sf() silently returns data as a tidyverse tibble. This is demonstrated below:

```r
world_dfr = st_read(system.file("shapes/world.shp", package = "spData"))
#> Reading layer `world' from data source 
#>   `/usr/local/lib/R/site-library/spData/shapes/world.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 177 features and 10 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -180 ymin: -89.9 xmax: 180 ymax: 83.6
#> Geodetic CRS:  WGS 84
world_tbl = read_sf(system.file("shapes/world.shp", package = "spData"))
class(world_dfr)
#> [1] "sf"         "data.frame"
class(world_tbl)
#> [1] "sf"         "tbl_df"     "tbl"        "data.frame"
```

As described in Chapter 3, which shows how to manipulate sf objects with tidyverse functions, sf is now the go-to package for the analysis of spatial vector data in R. spatstat, a package ecosystem which provides numerous functions for spatial statistics, and terra both have vector geographic data classes, but neither has the same level of uptake as sf for working with vector data. Many popular packages build on sf, as shown by the rise in its popularity in terms of the number of downloads per day, as shown in Section 1.5 of the previous chapter.
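The |> compatibility listed above combines naturally with the attribute and spatial operations covered in Chapters 3 and 4. A short illustrative sketch (assuming sf, spData and dplyr are loaded; the continent chosen is arbitrary):

```r
library(sf)
library(dplyr)
library(spData)
africa_area = world |>
  filter(continent == "Africa") |> # dplyr verbs preserve the sf class
  st_union() |>                    # dissolve country borders into one geometry
  st_area()                        # returns the area with units (m^2)
```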
### 2.2.3 Basic map-making

Basic maps are created in sf with plot(). By default this creates a multi-panel plot, with one sub-plot for each variable of the object, as illustrated in the left-hand panel of Figure 2.4. A legend or 'key' with a continuous color scale is produced if the object to be plotted has a single variable (see the right-hand panel). Colors can also be set with col =, although this will not create a continuous palette or a legend.

```r
plot(world[3:6])
plot(world["pop"])
```

FIGURE 2.4: Plotting with sf, with multiple variables (left) and a single variable (right).

Plots are added as layers to existing images by setting add = TRUE.8 To demonstrate this, and to provide insight into the contents of Chapters 3 and 4 on attribute and spatial data operations, the subsequent code chunk filters countries in Asia and combines them into a single feature:

```r
world_asia = world[world$continent == "Asia", ]
asia = st_union(world_asia)
```

We can now plot the Asian continent over a map of the world. Note that the first plot must only have one facet for add = TRUE to work. If the first plot has a key, reset = FALSE must be used:

```r
plot(world["pop"], reset = FALSE)
plot(asia, add = TRUE, col = "red")
```

FIGURE 2.5: A plot of Asia added as a layer on top of countries worldwide.

There are various ways to modify maps with sf's plot() method. Because sf extends base R plotting methods, plot()'s arguments work with sf objects (see ?graphics::plot and ?par for information on arguments such as main =).9 Figure 2.6 illustrates this flexibility by overlaying circles, whose diameters (set with cex =) represent country populations, on a map of the world. An unprojected version of this figure can be created with the following commands (see the exercises at the end of this chapter and the script 02-contplot.R to reproduce Figure 2.6):

```r
plot(world["continent"], reset = FALSE)
cex = sqrt(world$pop) / 10000
world_cents = st_centroid(world, of_largest = TRUE)
plot(st_geometry(world_cents), add = TRUE, cex = cex)
```

FIGURE 2.6: Country continents (represented by fill color) and 2015 populations (represented by circles, with area proportional to population).

The code above uses the function st_centroid() to convert one geometry type (polygons) to another (points) (see Chapter 5), the aesthetics of which are varied with the cex argument.

sf's plot method also has arguments specific to geographic data. expandBB, for example, can be used to plot an sf object in context: it takes a numeric vector of length four that expands the bounding box of the plot relative to zero in the following order: bottom, left, top, right. This is used to plot India in the context of its giant Asian neighbors, with an emphasis on China to the east, in the following code chunk, which generates Figure 2.7 (see the exercises on adding text to plots):10

```r
india = world[world$name_long == "India", ]
plot(st_geometry(india), expandBB = c(0, 0.2, 0.1, 1), col = "gray", lwd = 3)
plot(st_geometry(world_asia), add = TRUE)
```

FIGURE 2.7: India in context, demonstrating the expandBB argument.

Note the use of lwd to emphasize India in the plotting code. See Section 9.2 for other visualization techniques for representing a range of geometry types, the subject of the next section.

### 2.2.4 Geometry types

Geometries are the basic building blocks of simple features. Simple features in R can take on one of the 18 geometry types supported by the sf package. In this chapter we will focus on the seven most commonly used types: POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON and GEOMETRYCOLLECTION.

Generally, well-known binary (WKB) or well-known text (WKT) are the standard encodings for simple feature geometries. WKB representations are usually hexadecimal strings that are easily readable by computers. This is why GIS and spatial databases use WKB to transfer and store geometry objects. WKT, on the other hand, is a human-readable text markup description of simple features. Both formats are exchangeable, so when we present only one, we will naturally choose the WKT representation.
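Because WKT is plain text, representations like those below can be parsed directly into R. A minimal sketch using sf's st_as_sfc(), which accepts WKT strings (assuming sf is loaded):

```r
library(sf)
# parse WKT strings into a simple feature geometry column (sfc)
geoms = st_as_sfc(c("POINT (5 2)",
                    "LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)"))
st_geometry_type(geoms)  # POINT, LINESTRING
st_as_text(geoms[1])     # round-trip back to WKT
```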
The basis for each geometry type is the point. A point is simply a coordinate in 2D, 3D or 4D space (see vignette("sf1") for more information), for example (Figure 2.8, left panel):

- POINT (5 2)

A linestring is a sequence of points with a straight line connecting the points, for example (Figure 2.8, middle panel):

- LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)

A polygon is a sequence of points that form a closed, non-intersecting ring. Closed means that the first and the last point of a polygon have the same coordinates (Figure 2.8, right panel).11

- Polygon without a hole: POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))

FIGURE 2.8: Illustration of point, linestring and polygon geometries.

So far we have created geometries with only one geometric entity per feature. However, the simple feature standard also allows multiple geometries of a single type to exist within a single feature, using the "multi" version of each geometry type (Figure 2.9):

- Multipoint: MULTIPOINT (5 2, 1 3, 3 4, 3 2)
- Multilinestring: MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))
- Multipolygon: MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5), (0 2, 1 2, 1 3, 0 3, 0 2)))

FIGURE 2.9: Illustration of multi* geometries.

Finally, a geometry collection can contain any combination of geometries of the other types, including (multi)points and linestrings (see Figure 2.10):

- Geometry collection: GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2), LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))

FIGURE 2.10: Illustration of a geometry collection.

### 2.2.5 The sf class

Simple features consist of two main parts: geometries and non-geographic attributes. Figure 2.11 shows how an sf object is created — geometries come from an sfc object, while attributes are taken from a data.frame or tibble.12

FIGURE 2.11: Building blocks of sf objects.

Non-geographic attributes represent the name of the feature or other attributes such as measured values, groups, and other things. To illustrate attributes, we will represent a temperature of 25°C in London on June 21st, 2023. This example contains a geometry (the coordinates) and three attributes with three different classes (place name, temperature and date).13 Objects of class sf represent such data by combining the attributes (data.frame) with the simple feature geometry column (sfc). They are created with st_sf(), as illustrated below, which creates the London example just described:

```r
lnd_point = st_point(c(0.1, 51.5))               # sfg object
lnd_geom = st_sfc(lnd_point, crs = "EPSG:4326")  # sfc object
lnd_attrib = data.frame(                         # data.frame object
  name = "London",
  temperature = 25,
  date = as.Date("2023-06-21")
)
lnd_sf = st_sf(lnd_attrib, geometry = lnd_geom)  # sf object
```

What just happened? First, the coordinates were used to create the simple feature geometry (sfg). Second, the geometry was converted into a simple feature geometry column (sfc), with a CRS. Third, the attributes were stored in a data.frame, which was combined with the sfc object with st_sf(). This results in an sf object, as demonstrated below (some output is omitted):

```r
lnd_sf
#> Simple feature collection with 1 features and 3 fields
#> ...
#>     name temperature       date         geometry
#> 1 London          25 2023-06-21 POINT (0.1 51.5)
class(lnd_sf)
#> [1] "sf"         "data.frame"
```

The result shows that sf objects actually have two classes, sf and data.frame. Simple features are simply data frames (square tables), but with spatial attributes stored in a list column, usually called geometry or geom, as described in Section 2.2.1. This duality is central to the concept of simple features: most of the time an sf object can be treated as, and behaves like, a data.frame. Simple features are, in essence, data frames with a spatial extension.

### 2.2.6 Simple feature geometries (sfg)

The sfg class represents the different simple feature geometry types in R: point, linestring, polygon (and their 'multi' equivalents, such as multipoints) or geometry collection. Usually we are spared the tedious task of creating geometries on our own, since we can simply import an already existing spatial file. However, there is a set of functions to create simple feature geometry objects (sfg) from scratch if needed. The names of these functions are simple and consistent, as they all start with the st_ prefix and end with the name of the geometry type in lowercase letters:

- A point: st_point()
- A linestring: st_linestring()
- A polygon: st_polygon()
- A multipoint: st_multipoint()
- A multilinestring: st_multilinestring()
- A multipolygon: st_multipolygon()
- A geometry collection: st_geometrycollection()

sfg objects can be created from three base R data types:
1. A numeric vector: a single point
2. A matrix: a set of points, where each row represents a point, a multipoint or a linestring
3. A list: a collection of objects such as matrices, multilinestrings or geometry collections

The function st_point() creates single points from numeric vectors:

```r
st_point(c(5, 2))                 # XY point
#> POINT (5 2)
st_point(c(5, 2, 3))              # XYZ point
#> POINT Z (5 2 3)
st_point(c(5, 2, 1), dim = "XYM") # XYM point
#> POINT M (5 2 1)
st_point(c(5, 2, 3, 1))           # XYZM point
#> POINT ZM (5 2 3 1)
```

The results show that the XY (2D coordinates), XYZ (3D coordinates) and XYZM (3D with an additional variable, typically measurement accuracy) point types are created from vectors of length 2, 3 and 4, respectively. The XYM type must be specified using the dim argument (which is short for dimension).

By contrast, use matrices in the case of multipoint (st_multipoint()) and linestring (st_linestring()) objects:

```r
# the rbind function simplifies the creation of matrices
## MULTIPOINT
multipoint_matrix = rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))
st_multipoint(multipoint_matrix)
#> MULTIPOINT ((5 2), (1 3), (3 4), (3 2))
## LINESTRING
linestring_matrix = rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2))
st_linestring(linestring_matrix)
#> LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)
```

Finally, use lists for the creation of multilinestrings, (multi-)polygons and geometry collections:

```r
## POLYGON
polygon_list = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
st_polygon(polygon_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))
## POLYGON with a hole
polygon_border = rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))
polygon_hole = rbind(c(2, 4), c(3, 4), c(3, 3), c(2, 3), c(2, 4))
polygon_with_hole_list = list(polygon_border, polygon_hole)
st_polygon(polygon_with_hole_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5), (2 4, 3 4, 3 3, 2 3, 2 4))
## MULTILINESTRING
multilinestring_list = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), 
                            rbind(c(1, 2), c(2, 4)))
st_multilinestring(multilinestring_list)
#> MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))
## MULTIPOLYGON
multipolygon_list = list(list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))),
                         list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2))))
st_multipolygon(multipolygon_list)
#> MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5)), ((0 2, 1 2, 1 3, 0 3, 0 2)))
## GEOMETRYCOLLECTION
geometrycollection_list = list(st_multipoint(multipoint_matrix),
                               st_linestring(linestring_matrix))
st_geometrycollection(geometrycollection_list)
#> GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2),
#>   LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))
```

### 2.2.7 Simple feature columns (sfc)

One sfg object contains only a single simple feature geometry. A simple feature geometry column (sfc) is a list of sfg objects, which is additionally able to contain information about the coordinate reference system in use. For instance, to combine two simple features into one object with two features, we can use the st_sfc() function. This is important since sfc represents the geometry column in sf data frames:

```r
# sfc POINT
point1 = st_point(c(5, 2))
point2 = st_point(c(1, 3))
points_sfc = st_sfc(point1, point2)
points_sfc
#> Geometry set for 2 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 2 xmax: 5 ymax: 3
#> CRS:           NA
#> POINT (5 2)
#> POINT (1 3)
```
In most cases, an sfc object contains objects of the same geometry type. Therefore, when we convert sfg objects of type polygon into a simple feature geometry column, we end up with an sfc object of type polygon, which can be verified with st_geometry_type(). Equally, a geometry column of multilinestrings results in an sfc object of type multilinestring:

```r
# sfc POLYGON
polygon_list1 = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
polygon1 = st_polygon(polygon_list1)
polygon_list2 = list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2)))
polygon2 = st_polygon(polygon_list2)
polygon_sfc = st_sfc(polygon1, polygon2)
st_geometry_type(polygon_sfc)
#> [1] POLYGON POLYGON
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
# sfc MULTILINESTRING
multilinestring_list1 = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), 
                             rbind(c(1, 2), c(2, 4)))
multilinestring1 = st_multilinestring((multilinestring_list1))
multilinestring_list2 = list(rbind(c(2, 9), c(7, 9), c(5, 6), c(4, 7), c(2, 7)), 
                             rbind(c(1, 7), c(3, 8)))
multilinestring2 = st_multilinestring((multilinestring_list2))
multilinestring_sfc = st_sfc(multilinestring1, multilinestring2)
st_geometry_type(multilinestring_sfc)
#> [1] MULTILINESTRING MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

It is also possible to create an sfc object from sfg objects with different geometry types:

```r
# sfc GEOMETRY
point_multilinestring_sfc = st_sfc(point1, multilinestring1)
st_geometry_type(point_multilinestring_sfc)
#> [1] POINT           MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

As mentioned before, sfc objects can additionally store information on coordinate reference systems (CRS). The default value is NA (Not Available), as can be verified with st_crs():

```r
st_crs(points_sfc)
#> Coordinate Reference System: NA
```

All geometries in sfc objects must have the same CRS. A CRS can be specified with the crs argument of st_sfc() (or st_sf()), which takes a CRS identifier provided as a text string, such as crs = "EPSG:4326" (see Section 7.2 for other CRS representations and details on what this means):

```r
# Set the CRS with an identifier referring to an 'EPSG' CRS code:
points_sfc_wgs = st_sfc(point1, point2, crs = "EPSG:4326")
st_crs(points_sfc_wgs) # print CRS (only first 4 lines of output shown)
#> Coordinate Reference System:
#>   User input: EPSG:4326 
#>   wkt:
#> GEOGCRS["WGS 84",
#> ...
```

### 2.2.8 The sfheaders package

sfheaders is an R package that speeds up the construction, conversion and manipulation of sf objects (Cooley 2020). It focuses on building sf objects from vectors, matrices and data frames, rapidly and without depending on the sf library, and on exposing its underlying C++ code through header files (hence the name, sfheaders). This approach enables others to extend it using compiled and fast-running code. Every core sfheaders function has a corresponding C++ implementation, as described in its Cpp vignette. For most people, the R functions will be more than sufficient to benefit from the computational speed of the package. sfheaders was developed separately from sf, but aims to be fully compatible, creating valid sf objects of the type described in the preceding sections.
The simplest use cases for sfheaders are demonstrated in the code chunks below, with examples of building sfg, sfc and sf objects showing:

- A vector converted to sfg_POINT
- A matrix converted to sfg_LINESTRING
- A data frame converted to sfg_POLYGON

We start by creating the simplest possible sfg object, a single coordinate pair, assigned to a vector named v:

```r
v = c(1, 1)
v_sfg_sfh = sfheaders::sfg_point(obj = v)
v_sfg_sfh # printing without sf loaded
#>      [,1] [,2]
#> [1,]    1    1
#> attr(,"class")
#> [1] "XY"    "POINT" "sfg"
```

The example above shows how the sfg object v_sfg_sfh is printed when sf is not loaded, demonstrating its underlying structure. When sf is loaded (as is the case here), the result of the command is indistinguishable from sf objects:

```r
v_sfg_sf = st_point(v)
print(v_sfg_sf) == print(v_sfg_sfh)
#> POINT (1 1)
#> POINT (1 1)
#> [1] TRUE
```

The next examples show how sfheaders creates sfg objects from matrices and data frames:

```r
# matrices
m = matrix(1:8, ncol = 2)
sfheaders::sfg_linestring(obj = m)
#> LINESTRING (1 5, 2 6, 3 7, 4 8)
# data frames
df = data.frame(x = 1:4, y = 4:1)
sfheaders::sfg_polygon(obj = df)
#> POLYGON ((1 4, 2 3, 3 2, 4 1, 1 4))
```

Reusing the objects v, m and df, we can also build simple feature columns (sfc) as follows (outputs not shown):

```r
sfheaders::sfc_point(obj = v)
sfheaders::sfc_linestring(obj = m)
sfheaders::sfc_polygon(obj = df)
```

Similarly, sf objects can be created as follows:

```r
sfheaders::sf_point(obj = v)
sfheaders::sf_linestring(obj = m)
sfheaders::sf_polygon(obj = df)
```

In each of these examples the CRS (coordinate reference system) is not defined. If you plan on doing any calculations or geometric operations using sf functions, we encourage you to set the CRS (see Chapter 7 for details):

```r
df_sf = sfheaders::sf_polygon(obj = df)
st_crs(df_sf) = "EPSG:4326"
```

sfheaders is also good at 'deconstructing' and 'reconstructing' sf objects, meaning converting geometry columns into data frames that contain data on the coordinates of each vertex and geometry feature (and multi-feature) ids. It is fast and reliable at 'casting' geometry columns to different types, a topic covered in Chapter 5. Benchmarks, in the package's documentation and in test code developed for this book, show it is much faster than the sf package for such operations.

### 2.2.9 Spherical geometry operations with S2

Spherical geometry engines are based on the fact that the world is round, while simple mathematical procedures for geocomputation, such as calculating a straight line between two points or the area enclosed by a polygon, assume planar (projected) geometries. Since sf version 1.0.0, R supports spherical geometry operations 'out of the box' (and by default), thanks to its interface to Google's S2 spherical geometry engine via the s2 interface package. S2 is perhaps best known as an example of a Discrete Global Grid System (DGGS). Another example is the H3 global hexagonal hierarchical spatial index (Bondaruk, Roberts, and Robertson 2020).

Although potentially useful for describing locations anywhere on Earth using character strings, the main benefit of sf's interface to S2 is its provision of drop-in functions for calculations such as distance, buffer and area calculations, as described in sf's built-in documentation which can be opened with the command vignette("sf7").

sf can run in two modes with respect to S2: on and off. By default the S2 geometry engine is turned on, as can be verified with the following command:

```r
sf_use_s2()
#> [1] TRUE
```

An example of the consequences of turning the geometry engine off is shown below, by creating buffers around the india object created earlier in the chapter (note the warnings emitted when S2 is turned off) (Figure 2.12):

```r
india_buffer_with_s2 = st_buffer(india, 1) # 1 meter
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
india_buffer_without_s2 = st_buffer(india, 1) # 1 degree
#> Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
#> endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
#> dist is assumed to be in decimal degrees (arc_degrees).
sf_use_s2(TRUE)
#> Spherical geometry (s2) switched on
```

FIGURE 2.12: Example of the consequences of turning the S2 geometry engine on and off. Both representations of a buffer around India were created with the same command, but the purple polygon object was created with S2 switched on, resulting in a buffer of 1 m. The larger light green polygon was created with S2 switched off, resulting in a buffer of 1 degree, which is not accurate.

The right panel of Figure 2.12 is incorrect, as a buffer of 1 degree does not return an equal distance around the india polygon (for an explanation of this issue, read Section 7.4). Throughout this book we will assume S2 is turned on, unless explicitly stated otherwise.
## 2.3 Raster data

The spatial raster data model represents the world with a continuous grid of cells (often also called pixels; Figure 2.13 A). This data model often refers to so-called regular grids, in which each cell has the same, constant size — and we will focus only on regular grids in this book. However, several other types of grids exist, including rotated, sheared, rectilinear and curvilinear grids (see Chapter 1 of Pebesma and Bivand (2023b) or Chapter 2 of Tennekes and Nowosad (2022)).

The raster data model usually consists of a raster header and a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure 2.13 A).14 The raster header defines the coordinate reference system, the extent and the origin. The origin (or starting point) is frequently the coordinate of the lower-left corner of the matrix (the terra package, however, uses the upper-left corner by default (Figure 2.13 B)). The header defines the extent via the number of columns, the number of rows and the cell size resolution. The resolution can be calculated as follows:

\[
\text{resolution} = \frac{\text{xmax} - \text{xmin}}{\text{ncol}}, \frac{\text{ymax} - \text{ymin}}{\text{nrow}}
\]

Starting from the origin, we can easily access and modify each single cell, either by using the ID of a cell (Figure 2.13 B) or by explicitly specifying the rows and columns. This matrix representation avoids storing explicitly the coordinates of the four corner points of each cell, as would be the case for rectangular vector polygons (in fact it only stores one coordinate, namely the origin). This, and map algebra (Section 4.3.2), makes raster processing much more efficient and faster than vector data processing. However, in contrast to vector data, a cell of one raster layer can only hold a single value.15 The value might be continuous or categorical (Figure 2.13 C).

FIGURE 2.13: Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.

Raster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data. Discrete features such as soil or land-cover classes can also be represented in the raster data model. Both uses of raster datasets are illustrated in Figure 2.14, which shows how the borders of discrete features may become blurred in raster datasets. Depending on the nature of the application, vector representations of discrete features may be more suitable.

FIGURE 2.14: Examples of continuous and categorical rasters.

### 2.3.1 R packages for working with raster data

Over the last two decades, several packages for reading and processing raster datasets have been developed. As outlined in Section 1.6, chief among them was raster, which led to a step change in R's raster capabilities when it was launched in 2010 and was the premier package in this space until the development of terra and stars. Both of these more recently developed packages provide powerful and performant functions for working with raster datasets, and there is substantial overlap between their possible use cases. In this book we focus on terra, which replaces the older and (in most cases) slower raster. Before learning how terra's class system works, this section describes similarities and differences between terra and stars; this knowledge will help you decide which is most appropriate in different situations.

First, terra focuses on the most common raster data model (regular grids), while stars also allows storing less popular models (including regular, rotated, sheared, rectilinear and curvilinear grids). While terra usually handles one or multilayered rasters16, the stars package provides ways to store raster data cubes — a raster object with many layers (e.g., bands), for many moments in time (e.g., months), or with many attributes (e.g., sensor type A and sensor type B). Importantly, in both packages all layers or elements of a data cube must have the same spatial dimensions and extent. Second, both packages allow either reading all of the raster data into memory or just reading the metadata — this is usually done automatically based on the input file size. However, they store raster values very differently: terra is based on C++ code and mostly uses C++ pointers, whereas stars stores values as lists of arrays for smaller rasters, or just a file path for larger ones. Third, stars functions are closely related to the vector objects and functions in sf, while terra uses its own class of objects for vector data, namely SpatVector, but also accepts sf ones.17 Fourth, the two packages take different approaches to how various functions work on their objects. The terra package mostly relies on a large number of built-in functions, where each function has a specific purpose (e.g., resampling or cropping). On the other hand, stars uses some built-in functions (usually with names starting with st_), some existing dplyr functions (e.g., filter() or slice()), and also has its own methods for existing R functions (e.g., split() or aggregate()).

Importantly, it is straightforward to convert objects from terra to stars (using st_as_stars()) or the other way round (using rast()). We also encourage you to read Pebesma and Bivand (2023b) for the most comprehensive introduction to the stars package.
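The conversion mentioned above can be sketched as a round trip (assuming the terra, stars and spDataLarge packages are installed):

```r
library(terra)
library(stars)
r = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
r_stars = st_as_stars(r)  # terra SpatRaster -> stars object
r_terra = rast(r_stars)   # stars object -> terra SpatRaster
class(r_terra)            # back to "SpatRaster"
```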
### 2.3.2 An introduction to terra

The terra package supports raster objects in R. It provides an extensive set of functions to create, read, export, manipulate and process raster datasets. terra's functionality is largely the same as that of the more mature raster package, but there are some differences: terra functions are usually more computationally efficient than their raster equivalents. On the other hand, the raster class system is popular and used by many other packages. You can seamlessly translate between the two types of object to ensure backwards compatibility with older scripts and packages, for example with the functions raster(), stack() and brick() in the raster package (see the previous chapter for more on the evolution of R packages for working with geographic data).

In addition to functions for raster data manipulation, terra provides many low-level functions that can form a foundation for developing new tools for working with raster datasets. terra also lets you work on large raster datasets that are too large to fit into the main memory. In this case, terra provides the possibility to divide the raster into smaller chunks, and processes these iteratively instead of loading the whole raster file into RAM.

For the illustration of terra concepts, we will use datasets from spDataLarge (Nowosad and Lovelace 2023). It consists of a few raster objects and one vector object covering an area of Zion National Park (Utah, USA). For example, srtm.tif is a digital elevation model of this area (for more details, see its documentation with ?srtm). First, let's create a SpatRaster object named my_rast:

```r
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
my_rast = rast(raster_filepath)
class(my_rast)
#> [1] "SpatRaster"
#> attr(,"package")
#> [1] "terra"
```

Typing the name of the raster into the console will print out the raster header (dimensions, resolution, extent, CRS) and some additional information (class, data source, and a summary of the raster values):

```r
my_rast
#> class       : SpatRaster 
#> dimensions  : 457, 465, 1  (nrow, ncol, nlyr)
#> resolution  : 0.000833, 0.000833  (x, y)
#> extent      : -113, -113, 37.1, 37.5  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : srtm.tif 
#> name        : srtm 
#> min value   : 1024 
#> max value   : 2892
```

Dedicated functions report each component: dim() returns the number of rows, columns and layers; ncell() the number of cells (pixels); res() the spatial resolution; ext() the spatial extent; and crs() the coordinate reference system (raster reprojection is covered in Section 7.8). inMemory() reports whether the raster data is stored in memory or on disk, and sources specifies the file location.

### 2.3.3 Basic map-making

Similar to the sf package, terra also provides plot() methods for its own classes:

```r
plot(my_rast)
```

FIGURE 2.15: Basic raster plot.

There are several other approaches for plotting raster data in R that are outside the scope of this section, including:

- The plotRGB() function from the terra package to create a plot based on three layers in a SpatRaster object
- Packages such as tmap to create static and interactive maps of raster and vector objects (see Chapter 9)
- Functions, for example levelplot() from the rasterVis package, to create facets, a common technique for visualizing change over time

### 2.3.4 Raster classes

The SpatRaster class represents rasters in terra. The easiest way to create a raster object in R is to read in a raster file from disk or from a server (Section 8.3.2):

```r
single_raster_file = system.file("raster/srtm.tif", package = "spDataLarge")
single_rast = rast(single_raster_file)
```

The terra package supports numerous drivers with the help of the GDAL library. Rasters from files are usually not read entirely into RAM, with the exception of their header and a pointer to the file itself.

Rasters can also be created from scratch, using the rast() function. This is illustrated in the subsequent code chunk, which results in a new SpatRaster object. The resulting raster consists of 36 cells (6 columns and 6 rows, specified by nrows and ncols) centered around the Prime Meridian and the Equator (see the xmin, xmax, ymin and ymax parameters). Values (vals) are assigned to each cell: 1 to cell 1, 2 to cell 2, and so on. Remember: rast() fills cells row-wise (unlike matrix()) starting at the upper-left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, etc.:

```r
new_raster = rast(nrows = 6, ncols = 6, 
                  xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
                  vals = 1:36)
```

For other ways of creating raster objects, see ?rast. Given the number of rows and columns as well as the extent (xmin, xmax, ymin, ymax), the resolution has to be 0.5. The unit of the resolution is that of the underlying CRS. Here, it is degrees, because the default CRS of raster objects is WGS84. However, one can specify any other CRS with the crs argument.

The SpatRaster class also handles multiple layers, which typically correspond to a single multispectral satellite file or a time-series of rasters:

```r
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
multi_rast = rast(multi_raster_file)
multi_rast
#> class       : SpatRaster 
#> dimensions  : 1428, 1128, 4  (nrow, ncol, nlyr)
#> resolution  : 30, 30  (x, y)
#> extent      : 301905, 335745, 4111245, 4154085  (xmin, xmax, ymin, ymax)
#> coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612) 
#> source      : landsat.tif 
#> names       : landsat_1, landsat_2, landsat_3, landsat_4 
#> min values  :      7550,      6404,      5678,      5252 
#> max values  :     19071,     22051,     25780,     31961
```

nlyr() retrieves the number of layers stored in a SpatRaster object:
object:multilayer raster objects, layers can selected [[ $ operators, example commands multi_rast[[\"landsat_1\"]] multi_rast$landsat_1.\nterra::subset() can also used select layers.\naccepts layer number name second argument:opposite operation, combining several SpatRaster objects one, can done using c function:","code":"\nsingle_raster_file = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nsingle_rast = rast(raster_filepath)\nnew_raster = rast(nrows = 6, ncols = 6, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\nmulti_raster_file = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmulti_rast = rast(multi_raster_file)\nmulti_rast\n#> class : SpatRaster \n#> dimensions : 1428, 1128, 4 (nrow, ncol, nlyr)\n#> resolution : 30, 30 (x, y)\n#> extent : 301905, 335745, 4111245, 4154085 (xmin, xmax, ymin, ymax)\n#> coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612) \n#> source : landsat.tif \n#> names : landsat_1, landsat_2, landsat_3, landsat_4 \n#> min values : 7550, 6404, 5678, 5252 \n#> max values : 19071, 22051, 25780, 31961\nnlyr(multi_rast)\n#> [1] 4\nmulti_rast3 = subset(multi_rast, 3)\nmulti_rast4 = subset(multi_rast, \"landsat_4\")\nmulti_rast34 = c(multi_rast3, multi_rast4)"},{"path":"spatial-class.html","id":"crs-intro","chapter":"2 Geographic data in R","heading":"2.4 Coordinate Reference Systems","text":"\nVector raster spatial data types share concepts intrinsic spatial data.\nPerhaps fundamental Coordinate Reference System (CRS), defines spatial elements data relate surface Earth (bodies).\nCRSs either geographic projected, introduced beginning chapter (see Figure 2.1).\nsection explains type, laying foundations Chapter 7, provides deep dive setting, transforming querying CRSs.","code":""},{"path":"spatial-class.html","id":"geographic-coordinate-reference-systems","chapter":"2 Geographic data in R","heading":"2.4.1 Geographic coordinate reference systems","text":"\nGeographic coordinate reference systems 
identify location Earth’s surface using two values — longitude latitude (Figure 2.17, left panel).\nLongitude location East-West direction angular distance Prime Meridian plane.\nLatitude angular distance North South equatorial plane.\nDistances in geographic CRSs are therefore not measured in meters.\nimportant consequences, demonstrated Section 7.surface Earth geographic coordinate reference systems represented spherical ellipsoidal surface.\nSpherical models assume Earth perfect sphere given radius – advantage simplicity , time, inaccurate Earth exactly sphere.\nEllipsoidal models slightly accurate, defined two parameters: equatorial radius polar radius.\nsuitable Earth compressed: equatorial radius around 11.5 km longer polar radius (Maling 1992).18Ellipsoids part wider component CRSs: datum.\ncontains information ellipsoid use precise relationship coordinates location Earth’s surface.\ntwo types datum — geocentric (WGS84) local (NAD83).\ncan see examples two types datums Figure 2.16.\nBlack lines represent geocentric datum, whose center located Earth’s center gravity, not optimized specific location.\nlocal datum, shown purple dashed line, ellipsoidal surface shifted align surface particular location.\nallow local variations Earth’s surface, example due large mountain ranges, accounted local CRS.\ncan seen Figure 2.16, local datum fitted area Philippines, misaligned rest planet’s surface.\ndatums Figure 2.16 put top geoid - model global mean sea level.19\nFIGURE 2.16: Geocentric local geodetic datums shown top geoid (false color vertical exaggeration 10,000 scale factor). Image geoid adapted work Ince et al. 
(2019).\n","code":""},{"path":"spatial-class.html","id":"projected-coordinate-reference-systems","chapter":"2 Geographic data in R","heading":"2.4.2 Projected coordinate reference systems","text":"\nprojected CRSs based geographic CRS, described previous section, rely map projections convert three-dimensional surface Earth Easting Northing (x y) values projected CRS.\nProjected CRSs based Cartesian coordinates implicitly flat surface (Figure 2.17, right panel).\norigin, x y axes, linear unit measurement meters.transition cannot be done without adding deformations.\nTherefore, properties Earth’s surface distorted process, area, direction, distance, shape.\nprojected coordinate reference system can preserve one two properties.\nProjections often named based property preserve: equal-area preserves area, azimuthal preserve direction, equidistant preserve distance, conformal preserve local shape.three main groups projection types - conic, cylindrical, planar (azimuthal).\nconic projection, Earth’s surface projected onto cone along single line tangency two lines tangency.\nDistortions minimized along tangency lines rise distance lines projection.\nTherefore, best suited maps mid-latitude areas.\ncylindrical projection maps surface onto cylinder.\nprojection also created touching Earth’s surface along single line tangency two lines tangency.\nCylindrical projections used often mapping entire world.\nplanar projection projects data onto flat surface touching globe point along line tangency.\ntypically used mapping polar regions.\nsf_proj_info(type = \"proj\") gives list available projections supported PROJ library.quick summary different projections, types, properties, suitability can found www.geo-projections.com.\nexpand CRSs explain project one CRS another Chapter 7.\nnow, sufficient know:coordinate systems key component geographic objectsKnowing CRS data , whether geographic (lon/lat) projected (typically meters), important consequences R handles spatial geometry operationsCRSs 
sf objects can queried function st_crs(), CRSs terra objects can queried function crs()\nFIGURE 2.17: Examples geographic (WGS 84; left) projected (NAD83 / UTM zone 12N; right) coordinate systems vector data type.\n","code":""},{"path":"spatial-class.html","id":"units","chapter":"2 Geographic data in R","heading":"2.5 Units","text":"important feature CRSs contain information spatial units.\nClearly, vital know whether house’s measurements feet meters, applies maps.\ngood cartographic practice add scale bar distance indicator onto maps demonstrate relationship distances page screen distances ground.\nLikewise, important formally specify units geometry data cells measured provide context, ensure subsequent calculations done context.novel feature geometry data sf objects native support units.\nmeans distance, area geometric calculations sf return values come units attribute, defined units package (Pebesma, Mailund, Hiebert 2016).\nadvantageous, preventing confusion caused different units (CRSs use meters, use feet) providing information dimensionality.\ndemonstrated code chunk , calculates area Luxembourg:\noutput units square meters (m2), showing result represents two-dimensional space.\ninformation, stored attribute (interested readers can discover attributes(st_area(luxembourg))), can feed subsequent calculations use units, population density (measured people per unit area, typically per km2).\nReporting units prevents confusion.\ntake Luxembourg example, units remained unspecified, one incorrectly assume units hectares.\ntranslate huge number digestible size, tempting divide results million (number square meters square kilometer):However, result incorrectly given square meters.\nsolution set correct units units package:Units equal importance case raster data.\nHowever, far sf spatial package supports units, meaning people working raster data approach changes units analysis (example, converting pixel widths imperial decimal units) care.\nmy_rast object (see ) uses 
WGS84 projection decimal degrees units.\nConsequently, resolution also given decimal degrees know , since res() function simply returns numeric vector.used UTM projection, units change., res() command gives back numeric vector without unit, forcing us know unit UTM projection meters.","code":"\nluxembourg = world[world$name_long == \"Luxembourg\", ]\nst_area(luxembourg) # requires the s2 package in recent versions of sf\n#> 2.41e+09 [m^2]\nst_area(luxembourg) / 1000000\n#> 2409 [m^2]\nunits::set_units(st_area(luxembourg), km^2)\n#> 2409 [km^2]\nres(my_rast)\n#> [1] 0.000833 0.000833\nrepr = project(my_rast, \"EPSG:26912\")\nres(repr)\n#> [1] 83.5 83.5"},{"path":"spatial-class.html","id":"ex2","chapter":"2 Geographic data in R","heading":"2.6 Exercises","text":"E1. Use summary() geometry column world data object included spData package. output tell us :geometry type?number countries?coordinate reference system (CRS)?E2. Run code ‘generated’ map world Section 2.2.3 (Basic map-making).\nFind two similarities two differences image computer book.cex argument (see ?plot)?cex set sqrt(world$pop) / 10000?Bonus: experiment different ways visualize global population.E3. Use plot() create maps Nigeria context (see Section 2.2.3).Adjust lwd, col expandBB arguments plot().Challenge: read documentation text() annotate map.E4. Create empty SpatRaster object called my_raster 10 columns 10 rows.\nAssign random values 0 10 new raster plot .E5. Read-raster/nlcd.tif file spDataLarge package.\nkind information can get properties file?E6. 
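The unit conversion shown above reduces to simple arithmetic; this base-R sketch (using the area value from the `st_area()` output, no sf or units package required) shows the factor involved, while `units::set_units()` additionally keeps the unit label correct:

```r
# 1 km^2 = 1000 m x 1000 m = 1e6 m^2, so dividing the value by 1e6 converts it;
# units::set_units() (shown above) does the same while updating the units attribute
area_m2  = 2.41e9        # Luxembourg's area in square meters (st_area() output)
area_km2 = area_m2 / 1e6
area_km2                 # 2410, matching the ~2409 km^2 reported above
```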
Check CRS raster/nlcd.tif file spDataLarge package.\nkind information can learn ?","code":""},{"path":"attr.html","id":"attr","chapter":"3 Attribute data operations","heading":"3 Attribute data operations","text":"","code":""},{"path":"attr.html","id":"prerequisites-1","chapter":"3 Attribute data operations","heading":"Prerequisites","text":"chapter requires following packages installed attached:relies spData, loads datasets used code examples chapter:Also ensure installed tidyr package, tidyverse part, want run data ‘tidying’ operations Section 3.2.5.","code":"\nlibrary(sf) # vector data package introduced in Chapter 2\nlibrary(terra) # raster data package introduced in Chapter 2\nlibrary(dplyr) # tidyverse package for data frame manipulation\nlibrary(spData) # spatial data package introduced in Chapter 2"},{"path":"attr.html","id":"introduction","chapter":"3 Attribute data operations","heading":"3.1 Introduction","text":"\nAttribute data non-spatial information associated geographic (geometry) data.\nbus stop provides simple example: position typically represented latitude longitude coordinates (geometry data), addition name.\nElephant & Castle / New Kent Road stop London, example coordinates -0.098 degrees longitude 51.495 degrees latitude can represented POINT (-0.098 51.495) sfc representation described Chapter 2.\nAttributes, name, POINT feature (use simple features terminology) topic chapter.\nAnother example elevation value (attribute) specific grid cell raster data.\nUnlike vector data model, raster data model stores coordinate grid cell indirectly, meaning distinction attribute spatial information less clear.\nillustrate point, think pixel 3rd row 4th column raster matrix.\nspatial location defined index matrix: move origin four cells x direction (typically east right maps) three cells y direction (typically south ).\nraster’s resolution defines distance x- y-step specified header.\nheader vital component raster datasets specifies pixels relate spatial 
coordinates (see also Chapter 4).chapter teaches manipulate geographic objects based attributes names bus stops vector dataset elevations pixels raster dataset.\nvector data, means techniques subsetting aggregation (see Sections 3.2.1 3.2.3).\nSections 3.2.4 3.2.5 demonstrate join data onto simple feature objects using shared ID create new variables, respectively.\noperations spatial equivalent:\n[ operator base R, example, works equally subsetting objects based attribute spatial objects; can also join attributes two geographic datasets using spatial joins.\ngood news: skills developed chapter cross-transferable.deep dive various types vector attribute operations next section, raster attribute data operations covered.\nCreation raster layers containing continuous categorical attributes extraction cell values one layer (raster subsetting) (Section 3.3.1) demonstrated.\nSection 3.3.2 provides overview ‘global’ raster operations can used summarize entire raster datasets.\nChapter 4 extends methods presented spatial world.","code":""},{"path":"attr.html","id":"vector-attribute-manipulation","chapter":"3 Attribute data operations","heading":"3.2 Vector attribute manipulation","text":"\nGeographic vector datasets well supported R thanks sf class, extends base R’s data.frame.\nLike data frames, sf objects one column per attribute variable (‘name’) one row per observation feature (e.g., per bus station).\nsf objects differ basic data frames geometry column class sfc can contain range geographic entities (single ‘multi’ point, line, polygon features) per row.\ndescribed Chapter 2, demonstrated generic methods plot() summary() work sf objects.\nsf also provides generics allow sf objects behave like regular data frames, shown printing class’s methods:Many (aggregate(), cbind(), merge(), rbind() [) manipulating data frames.\nrbind(), example, binds rows data frames together, one ‘top’ .\n$<- creates new columns.\nkey feature sf objects store spatial non-spatial data way, 
columns data.frame.geometry column sf objects typically called geometry geom name can used.\nfollowing command, example, creates geometry column named g:st_sf(data.frame(n = world$name_long), g = world$geom)sf objects can also extend tidyverse classes data frames, tbl_df tbl.\nThus sf enables full power R’s data analysis capabilities unleashed geographic data, whether use base R tidyverse functions data analysis.\nsf objects can also used high-performance data processing package data.table although, documented issue Rdatatable/data.table#2273, fully compatible sf objects.\nusing capabilities worth re-capping discover basic properties vector data objects.\nLet’s start using base R functions learn world dataset spData package:\nworld contains ten non-geographic columns (one geometry list column) almost 200 rows representing world’s countries.\nfunction st_drop_geometry() keeps attributes data sf object, words removing geometry:Dropping geometry column working attribute data can useful; data manipulation processes can run faster work attribute data geometry columns always needed.\ncases, however, makes sense keep geometry column, explaining column ‘sticky’ (remains attribute operations unless specifically dropped).\nNon-spatial data operations sf objects change object’s geometry appropriate (e.g., dissolving borders adjacent polygons following aggregation).\nBecoming skilled geographic attribute data manipulation means becoming skilled manipulating data frames.many applications, tidyverse package dplyr (Wickham et al. 
2023) offers effective approach working data frames.\nTidyverse compatibility advantage sf predecessor sp, pitfalls avoid (see supplementary tidyverse-pitfalls vignette geocompx.org details).","code":"\nmethods(class = \"sf\") # methods for sf objects, first 12 shown\n#> [1] [ [[<- $<- aggregate \n#> [5] as.data.frame cbind coerce filter \n#> [9] identify initialize merge plot \nclass(world) # it's an sf object and a (tidy) data frame\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\ndim(world) # it is a 2 dimensional object, with 177 rows and 11 columns\n#> [1] 177 11\nworld_df = st_drop_geometry(world)\nclass(world_df)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\"\nncol(world_df)\n#> [1] 10"},{"path":"attr.html","id":"vector-attribute-subsetting","chapter":"3 Attribute data operations","heading":"3.2.1 Vector attribute subsetting","text":"Base R subsetting methods include operator [ function subset().\nkey dplyr subsetting functions filter() slice() subsetting rows, select() subsetting columns.\napproaches preserve spatial components attribute data sf objects, using operator $ dplyr function pull() return single attribute column vector lose geometry data, see.\nsection focuses subsetting sf data frames; details subsetting vectors non-geographic data frames recommend reading section section 2.7 Introduction R (R Core Team 2021) Chapter 4 Advanced R Programming (Wickham 2019), respectively.\n[ operator can subset rows columns.\nIndices placed inside square brackets placed directly data frame object name specify elements keep.\ncommand object[, j] means ‘return rows represented columns represented j’, j typically contain integers TRUEs FALSEs (indices can also character strings, indicating row column names).\nobject[5, 1:3], example, means ’return data containing 5th row columns 1 3: result data frame 1 row 3 columns, fourth geometry column ’s sf object.\nLeaving j empty returns rows columns, world[1:5, ] returns first five rows 11 columns.\nexamples demonstrate 
subsetting base R.\nGuess number rows columns sf data frames returned command check results computer (see end chapter exercises):demonstration utility using logical vectors subsetting shown code chunk .\ncreates new object, small_countries, containing nations whose surface area smaller 10,000 km2.intermediary i_small (short index representing small countries) logical vector can used subset seven smallest countries world surface area.\nconcise command, omits intermediary object, generates result:base R function subset() provides another way achieve result:\nBase R functions mature, stable widely used, making rock solid choice, especially contexts reproducibility reliability key.\ndplyr functions enable ‘tidy’ workflows people (authors book included) find intuitive productive interactive data analysis, especially combined code editors RStudio enable auto-completion column names.\nKey functions subsetting data frames (including sf data frames) dplyr functions demonstrated .select() selects columns name position.\nexample, select two columns, name_long pop, following command:Note: equivalent command base R (world[, c(\"name_long\", \"pop\")]), sticky geom column remains.\nselect() also allows selecting range columns help : operator:can remove specific columns - operator:Subset rename columns time new_name = old_name syntax:worth noting command concise base R equivalent, requires two lines code:select() also works ‘helper functions’ advanced subsetting operations, including contains(), starts_with() num_range() (see help page ?select details).dplyr verbs return data frame, can extract single column vector pull().\ncan get result base R list subsetting operators $ [[, three following commands return numeric vector:slice() row-equivalent select().\nfollowing code chunk, example, selects rows 1 6:filter() dplyr’s equivalent base R’s subset() function.\nkeeps rows matching given criteria, e.g., countries area certain threshold, high average life expectancy, shown following 
examples:standard set comparison operators can used filter() function, illustrated Table 3.1:TABLE 3.1: Comparison operators return Booleans (TRUE/FALSE).","code":"\nworld[1:6, ] # subset rows by position\nworld[, 1:3] # subset columns by position\nworld[1:6, 1:3] # subset rows and columns by position\nworld[, c(\"name_long\", \"pop\")] # columns by name\nworld[, c(T, T, F, F, F, F, F, T, T, F, F)] # by logical indices\nworld[, 888] # an index representing a non-existent column\ni_small = world$area_km2 < 10000\nsummary(i_small) # a logical vector\n#> Mode FALSE TRUE \n#> logical 170 7\nsmall_countries = world[i_small, ]\nsmall_countries = world[world$area_km2 < 10000, ]\nsmall_countries = subset(world, area_km2 < 10000)\nworld1 = select(world, name_long, pop)\nnames(world1)\n#> [1] \"name_long\" \"pop\" \"geom\"\n# all columns between name_long and pop (inclusive)\nworld2 = select(world, name_long:pop)\n# all columns except subregion and area_km2 (inclusive)\nworld3 = select(world, -subregion, -area_km2)\nworld4 = select(world, name_long, population = pop)\nworld5 = world[, c(\"name_long\", \"pop\")] # subset columns by name\nnames(world5)[names(world5) == \"pop\"] = \"population\" # rename column manually\npull(world, pop)\nworld$pop\nworld[[\"pop\"]]\nslice(world, 1:6)\nworld7 = filter(world, area_km2 < 10000) # countries with a small area\nworld7 = filter(world, lifeExp > 82) # with high life expectancy"},{"path":"attr.html","id":"chaining-commands-with-pipes","chapter":"3 Attribute data operations","heading":"3.2.2 Chaining commands with pipes","text":"\nKey workflows using dplyr functions ‘pipe’ operator %>% (since R 4.1.0 native pipe |>), takes name Unix pipe | (Grolemund Wickham 2016).\nPipes enable expressive code: output previous function becomes first argument next function, enabling chaining.\nillustrated , countries Asia filtered world dataset, next object subset columns (name_long continent) first five rows (result shown).chunk shows pipe operator 
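The logical-vector subsetting shown above can be sketched with a small hypothetical data frame (base R only; the names and areas are illustrative, not from the book's datasets):

```r
# A logical vector the same length as the rows keeps only the TRUE rows,
# as in world[i_small, ] above
df = data.frame(name = c("X", "Y", "Z"), area_km2 = c(5000, 20000, 8000))
i_small = df$area_km2 < 10000  # logical index: TRUE, FALSE, TRUE
df[i_small, ]                  # keeps rows X and Z
```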
allows commands written clear order:\nrun top bottom (line--line) left right.\nalternative piped operations nested function calls, harder read:Another alternative split operations multiple self-contained lines, recommended developing new R packages, approach advantage saving intermediate results distinct names can later inspected debugging purposes (approach disadvantages verbose cluttering global environment undertaking interactive analysis):approach advantages disadvantages, importance depend programming style applications.\ninteractive data analysis, focus chapter, find piped operations fast intuitive, especially combined RStudio/VSCode shortcuts creating pipes auto-completing variable names.","code":"\nworld7 = world |>\n filter(continent == \"Asia\") |>\n select(name_long, continent) |>\n slice(1:5)\nworld8 = slice(\n select(\n filter(world, continent == \"Asia\"),\n name_long, continent),\n 1:5)\nworld9_filtered = filter(world, continent == \"Asia\")\nworld9_selected = select(world9_filtered, continent)\nworld9 = slice(world9_selected, 1:5)"},{"path":"attr.html","id":"vector-attribute-aggregation","chapter":"3 Attribute data operations","heading":"3.2.3 Vector attribute aggregation","text":"\nAggregation involves summarizing data one ‘grouping variables’, typically columns data frame aggregated (geographic aggregation covered next chapter).\nexample attribute aggregation calculating number people per continent based country-level data (one row per country).\nworld dataset contains necessary ingredients: columns pop continent, population grouping variable, respectively.\naim find sum() country populations continent, resulting smaller data frame (aggregation form data reduction can useful early step working large datasets).\ncan done base R function aggregate() follows:result non-spatial data frame six rows, one per continent, two columns reporting name population continent (see Table 3.2 results top 3 populous continents).aggregate() generic function means 
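The pipe chaining described above can be reproduced with the native `|>` operator on the built-in `mtcars` dataset, using base-R analogues of `filter()` and `slice()` (a sketch, not the book's `world` example):

```r
# Each function's output becomes the first argument of the next call,
# so operations read top to bottom, left to right (requires R >= 4.1.0)
result = mtcars |>
  subset(cyl == 4) |>  # filter()-like row selection
  head(5)              # slice(1:5)-like row limiting
nrow(result)           # 5
```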
behaves differently depending inputs.\nsf provides method aggregate.sf() activated automatically x sf object argument provided:resulting world_agg2 object spatial object containing 8 features representing continents world (open ocean).\ngroup_by() |> summarize() dplyr equivalent aggregate(), variable name provided group_by() function specifying grouping variable information summarized passed summarize() function, shown :approach may seem complex benefits: flexibility, readability, control new column names.\nflexibility illustrated command , calculates population also area number countries continent:previous code chunk Pop, Area N column names result, sum() n() aggregating functions.\naggregating functions return sf objects rows representing continents geometries containing multiple polygons representing land mass associated islands (works thanks geometric operation ‘union’, explained Section 5.2.7).\nLet’s combine learned far dplyr functions, chaining multiple commands summarize attribute data countries worldwide continent.\nfollowing command calculates population density (mutate()), arranges continents number countries contain (arrange()), keeps 3 populous continents (slice_max()), result presented Table 3.2):TABLE 3.2: top 3 populous continents ordered number countries.","code":"\nworld_agg1 = aggregate(pop ~ continent, FUN = sum, data = world,\n na.rm = TRUE)\nclass(world_agg1)\n#> [1] \"data.frame\"\nworld_agg2 = aggregate(world[\"pop\"], by = list(world$continent), FUN = sum, \n na.rm = TRUE)\nclass(world_agg2)\n#> [1] \"sf\" \"data.frame\"\nnrow(world_agg2)\n#> [1] 8\nworld_agg3 = world |>\n group_by(continent) |>\n summarize(pop = sum(pop, na.rm = TRUE))\nworld_agg4 = world |> \n group_by(continent) |> \n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n())\nworld_agg5 = world |> \n st_drop_geometry() |> # drop the geometry for speed\n select(pop, continent, area_km2) |> # subset the columns of interest \n group_by(continent) |> # group by 
continent and summarize:\n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) |>\n mutate(Density = round(Pop / Area)) |> # calculate population density\n slice_max(Pop, n = 3) |> # keep only the top 3\n arrange(desc(N)) # arrange in order of n. countries"},{"path":"attr.html","id":"vector-attribute-joining","chapter":"3 Attribute data operations","heading":"3.2.4 Vector attribute joining","text":"Combining data different sources common task data preparation.\nJoins combining tables based shared ‘key’ variable.\ndplyr multiple join functions including left_join() inner_join() — see vignette(\"two-table\") full list.\nfunction names follow conventions used database language SQL (Grolemund Wickham 2016, chap. 13); using join non-spatial datasets sf objects focus section.\ndplyr join functions work data frames sf objects, important difference geometry list column.\nresult data joins can either sf data.frame object.\ncommon type attribute join spatial data takes sf object first argument adds columns data.frame specified second argument.\ndemonstrate joins, combine data coffee production world dataset.\ncoffee data data frame called coffee_data spData package (see ?coffee_data details).\nthree columns:\nname_long names major coffee-producing nations coffee_production_2016 coffee_production_2017 contain estimated values coffee production units 60-kg bags year.\n‘left join’, preserves first dataset, merges world coffee_data.input datasets share ‘key variable’ (name_long) join worked without using argument (see ?left_join details).\nresult sf object identical original world object two new variables (column indices 11 12) coffee production.\ncan plotted map, illustrated Figure 3.1, generated plot() function .\nFIGURE 3.1: World coffee production (thousand 60-kg bags) country, 2017. 
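The grouping-and-summarizing pattern above can be illustrated on the built-in `mtcars` data frame with base R's `aggregate()` (a sketch of the same pattern, not the book's `world` example):

```r
# aggregate() groups rows by the right-hand-side variable and applies FUN
# to the left-hand-side column, like group_by() |> summarize() in dplyr
agg = aggregate(mpg ~ cyl, FUN = mean, data = mtcars)
nrow(agg)  # 3: one row per distinct cylinder count (4, 6, 8)
```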
Source: International Coffee Organization.\njoining work, ‘key variable’ must supplied datasets.\ndefault, dplyr uses variables matching names.\ncase, coffee_data world objects contained variable called name_long, explaining message Joining '= join_by(name_long)'.\nmajority cases variable names , two options:Rename key variable one objects match.Use argument specify joining variables.latter approach demonstrated renamed version coffee_data.Note name original object kept, meaning world_coffee new object world_coffee2 identical.\nAnother feature result number rows original dataset.\nAlthough 47 rows data coffee_data, 177 country records kept intact world_coffee world_coffee2:\nrows original dataset match assigned NA values new coffee production variables.\nwant keep countries match key variable?\ncase inner join can used.Note result inner_join() 45 rows compared 47 coffee_data.\nhappened remaining rows?\ncan identify rows match using setdiff() function follows:result shows Others accounts one row present world dataset name Democratic Republic Congo accounts :\nabbreviated, causing join miss .\nfollowing command uses string matching (regex) function stringr package confirm Congo, Dem. Rep. 
.fix issue, create new version coffee_data update name.\ninner_join()ing updated data frame returns result 46 coffee-producing nations.also possible join direction: starting non-spatial dataset adding variables simple features object.\ndemonstrated , starts coffee_data object adds variables original world dataset.\ncontrast previous joins, result another simple feature object, data frame form tidyverse tibble:\noutput join tends match first argument.section covers majority joining use cases.\ninformation, recommend reading chapter Relational data Grolemund Wickham (2016), join vignette geocompkg package accompanies book, documentation describing joins data.table packages.\nAdditionally, spatial joins covered next chapter (Section 4.2.5).","code":"\nworld_coffee = left_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nclass(world_coffee)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world_coffee)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \n#> [4] \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \n#> [10] \"gdpPercap\" \"geom\" \"coffee_production_2016\"\n#> [13] \"coffee_production_2017\"\nplot(world_coffee[\"coffee_production_2017\"])\ncoffee_renamed = rename(coffee_data, nm = name_long)\nworld_coffee2 = left_join(world, coffee_renamed, by = join_by(name_long == nm))\nworld_coffee_inner = inner_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nnrow(world_coffee_inner)\n#> [1] 45\nsetdiff(coffee_data$name_long, world$name_long)\n#> [1] \"Congo, Dem. Rep. 
of\" \"Others\"\ndrc = stringr::str_subset(world$name_long, \"Dem*.+Congo\")\ndrc\n#> [1] \"Democratic Republic of the Congo\"\ncoffee_data$name_long[grepl(\"Congo,\", coffee_data$name_long)] = drc\nworld_coffee_match = inner_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nnrow(world_coffee_match)\n#> [1] 46\ncoffee_world = left_join(coffee_data, world)\n#> Joining with `by = join_by(name_long)`\nclass(coffee_world)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"vec-attr-creation","chapter":"3 Attribute data operations","heading":"3.2.5 Creating attributes and removing spatial information","text":"\nOften, like create new column based already existing columns.\nexample, want calculate population density country.\nneed divide population column, pop, area column, area_km2 unit area square kilometers.\nUsing base R, can type:\nAlternatively, can use one dplyr functions - mutate() transmute().\nmutate() adds new columns penultimate position sf object (last one reserved geometry):difference mutate() transmute() latter drops existing columns (except sticky geometry column).\nunite() tidyr package (provides many useful functions reshaping datasets, including pivot_longer()) pastes together existing columns.\nexample, want combine continent region_un columns new column named con_reg.\nAdditionally, can define separator (: colon :) defines values input columns joined, original columns removed (: TRUE).resulting sf object new column called con_reg representing continent region country, e.g., South America:Americas Argentina South America countries.\ntidyr’s separate() function opposite unite(): splits one column multiple columns using either regular expression character positions.\ndplyr function rename() base R function setNames() useful renaming columns.\nfirst replaces old name new one.\nfollowing command, example, renames lengthy name_long column simply name:\nsetNames() changes column names , requires character vector name 
matching column.\nillustrated , outputs world object, short names:\nattribute data operations preserve geometry simple features.\nSometimes makes sense remove geometry, example speed-aggregation.\nst_drop_geometry(), manually commands select(world, -geom), shown .20","code":"\nworld_new = world # do not overwrite our original data\nworld_new$pop_dens = world_new$pop / world_new$area_km2\nworld_new2 = world |> \n mutate(pop_dens = pop / area_km2)\nworld_unite = world |>\n tidyr::unite(\"con_reg\", continent:region_un, sep = \":\", remove = TRUE)\nworld_separate = world_unite |>\n tidyr::separate(con_reg, c(\"continent\", \"region_un\"), sep = \":\")\nworld |> \n rename(name = name_long)\nnew_names = c(\"i\", \"n\", \"c\", \"r\", \"s\", \"t\", \"a\", \"p\", \"l\", \"gP\", \"geom\")\nworld_new_names = world |>\n setNames(new_names)\nworld_data = world |> st_drop_geometry()\nclass(world_data)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"manipulating-raster-objects","chapter":"3 Attribute data operations","heading":"3.3 Manipulating raster objects","text":"contrast vector data model underlying simple features (represents points, lines polygons discrete entities space), raster data represent continuous surfaces.\nsection shows raster objects work creating scratch, building Section 2.3.2.\nunique structure, subsetting operations raster datasets work different way, demonstrated Section 3.3.1.\nfollowing code recreates raster dataset used Section 2.3.4, result illustrated Figure 3.2.\ndemonstrates rast() function works create example raster named elev (representing elevations).result raster object 6 rows 6 columns (specified nrow ncol arguments), minimum maximum spatial extent x y direction (xmin, xmax, ymin, ymax).\nvals argument sets values cell contains: numeric data ranging 1 36 case.\nRaster objects can also contain categorical values class logical factor variables R.\nfollowing code creates raster datasets shown Figure 3.2:\nraster object 
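The `mutate()`, `unite()` and `separate()` operations above have base-R counterparts, sketched here on a hypothetical two-row data frame:

```r
df = data.frame(continent = c("Europe", "Asia"), region_un = c("West", "East"),
                pop = c(100, 50), area_km2 = c(10, 25))
df$pop_dens = df$pop / df$area_km2                          # mutate()-like
df$con_reg  = paste(df$continent, df$region_un, sep = ":")  # unite()-like
parts = strsplit(df$con_reg, ":", fixed = TRUE)             # separate()-like
df$con_reg[1]  # "Europe:West"
```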
stores corresponding look-table “Raster Attribute Table” (RAT) list data frames, can viewed cats(grain) (see ?cats() information).\nelement list layer raster.\nalso possible use function levels() retrieving adding new replacing existing factor levels.\nFIGURE 3.2: Raster datasets numeric (left) categorical values (right).\n","code":"\nelev = rast(nrows = 6, ncols = 6,\n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\ngrain_order = c(\"clay\", \"silt\", \"sand\")\ngrain_char = sample(grain_order, 36, replace = TRUE)\ngrain_fact = factor(grain_char, levels = grain_order)\ngrain = rast(nrows = 6, ncols = 6, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = grain_fact)\ngrain2 = grain # do not overwrite the original data\nlevels(grain2) = data.frame(value = c(0, 1, 2), wetness = c(\"wet\", \"moist\", \"dry\"))\nlevels(grain2)\n#> [[1]]\n#> value wetness\n#> 1 0 wet\n#> 2 1 moist\n#> 3 2 dry"},{"path":"attr.html","id":"raster-subsetting","chapter":"3 Attribute data operations","heading":"3.3.1 Raster subsetting","text":"Raster subsetting done base R operator [, accepts variety inputs:\nRow-column indexingCell IDsCoordinatesAnother spatial objectHere, show first two options since can considered non-spatial operations.\nneed spatial object subset another output spatial object, refer spatial subsetting.\nTherefore, latter two options shown next chapter (see Section 4.3.1).\nfirst two subsetting options demonstrated commands —\nreturn value top left pixel raster object elev (results shown).Subsetting multilayered raster objects return cell value(s) layer.\nexample, two_layers = c(grain, elev); two_layers[1] returns data frame one row two columns — one layer.\nextract values can also use values().Cell values can modified overwriting existing values conjunction subsetting operation.\nfollowing code chunk, example, sets upper left cell elev 0 (results shown):Leaving square brackets empty shortcut version values() retrieving values 
the raster. Multiple cells can also be modified in this way. Replacing values of multilayered rasters can be done with a matrix with as many columns as layers and as many rows as replaceable cells (results not shown):

```r
# row 1, column 1
elev[1, 1]
# cell ID 1
elev[1]
elev[1, 1] = 0
elev[]
elev[1, c(1, 2)] = 0
two_layers = c(grain, elev) 
two_layers[1] = cbind(c(1), c(4))
two_layers[]
```

### 3.3.2 Summarizing raster objects

**terra** contains functions for extracting descriptive statistics for entire rasters. Printing a raster object to the console, by typing its name, returns the minimum and maximum values of a raster. `summary()` provides common descriptive statistics: minimum, maximum, quartiles and the number of `NA`s for continuous rasters, and the number of cells of each class for categorical rasters. Further summary operations, such as the standard deviation (see below) or custom summary statistics, can be calculated with `global()`. Additionally, the `freq()` function returns the frequency table of categorical values.

Raster value statistics can be visualized in a variety of ways. Specific functions such as `boxplot()`, `density()`, `hist()` and `pairs()` also work with raster objects, as demonstrated in the histogram created with the command below (not shown). In case a desired visualization function does not work with raster objects, one can extract the raster data to be plotted with the help of `values()` (Section 3.3.1).

Descriptive raster statistics belong to the so-called global raster operations. These and other typical raster processing operations are part of the map algebra scheme, covered in the next chapter (Section 4.3.2).

Some function names clash between packages (e.g., a function with the name `extract()` exists in both the **terra** and **tidyr** packages). This may lead to unexpected results when loading these packages in different orders. In addition to calling functions verbosely with their full namespace (e.g., `tidyr::extract()`) or avoiding attaching packages with `library()`, another way to prevent function name clashes is to unload the offending package with `detach()`. The following command, for example, unloads the **terra** package (this can also be done in the package tab which resides by default in the right-bottom pane in RStudio): `detach("package:terra", unload = TRUE, force = TRUE)`.
The `force` argument makes sure that the package will be detached even if other packages depend on it. This, however, may lead to restricted usability of packages that depend on the detached package, and is therefore not recommended.

```r
global(elev, sd)
freq(grain)
#>   layer value count
#> 1     1  clay    10
#> 2     1  silt    13
#> 3     1  sand    13
hist(elev)
```

## 3.4 Exercises

For these exercises we will use the `us_states` and `us_states_df` datasets from the **spData** package. You must have attached the package, and the other packages used in the attribute operations chapter (**sf**, **dplyr**, **terra**), with commands such as `library(spData)` before attempting these exercises:

`us_states` is a spatial object (of class `sf`), containing the geometry and a few attributes (including name, region, area and population) of states within the contiguous United States. `us_states_df` is a data frame (of class `data.frame`) containing the name and additional variables (including median income and poverty level, for the years 2010 and 2015) of US states, including Alaska, Hawaii and Puerto Rico. The data comes from the United States Census Bureau, and is documented in `?us_states` and `?us_states_df`.

E1. Create a new object called `us_states_name` that contains only the `NAME` column from the `us_states` object using either base R (`[`) or tidyverse (`select()`) syntax. What is the class of the new object and what makes it geographic?

E2. Select columns from the `us_states` object which contain population data. Obtain the same result using a different command (bonus: try to find three ways of obtaining the same result). Hint: try to use helper functions, such as `contains` or `matches` from **dplyr** (see `?contains`).

E3. Find all states with the following characteristics (bonus: find and plot them):

- Belong to the Midwest region.
- Belong to the West region, have an area below 250,000 km2 and a 2015 population greater than 5,000,000 residents (hint: you may need to use the functions `units::set_units()` or `as.numeric()`).
- Belong to the South region, have an area larger than 150,000 km2 and a total population in 2015 larger than 7,000,000 residents.

E4. What was the total population in 2015 in the `us_states` dataset? What was the minimum and maximum total population in 2015?

E5. How many states are there in each region?

E6. What was the minimum and maximum total population in 2015 in each region? What was the total population in 2015 in each region?
E7. Add variables from `us_states_df` to `us_states`, and create a new object called `us_states_stats`. What function did you use and why? Which variable is the key in both datasets? What is the class of the new object?

E8. `us_states_df` has two more rows than `us_states`. How can you find them? (hint: try to use the `dplyr::anti_join()` function)

E9. What was the population density in 2015 in each state? What was the population density in 2010 in each state?

E10. How much has population density changed between 2010 and 2015 in each state? Calculate the change in percentages and map them.

E11. Change the columns' names in `us_states` to lowercase. (Hint: the helper functions `tolower()` and `colnames()` may help.)

E12. Using `us_states` and `us_states_df`, create a new object called `us_states_sel`. The new object should have only two variables: `median_income_15` and `geometry`. Change the name of the `median_income_15` column to `Income`.

E13. Calculate the change in the number of residents living below the poverty level between 2010 and 2015 for each state. (Hint: See `?us_states_df` for documentation on the poverty level columns.) Bonus: Calculate the change in the percentage of residents living below the poverty level in each state.

E14. What was the minimum, average and maximum state's number of people living below the poverty line in 2015 for each region? Bonus: Which is the region with the largest increase in people living below the poverty line?

E15. Create a raster from scratch, with nine rows and columns and a resolution of 0.5 decimal degrees (WGS84). Fill it with random numbers. Extract the values of the four corner cells.

E16. What is the most common class of our example raster `grain`?
E17. Plot the histogram and the boxplot of the `dem.tif` file from the **spDataLarge** package (`system.file("raster/dem.tif", package = "spDataLarge")`).

```r
library(sf)
library(dplyr)
library(terra)
library(spData)
data(us_states)
data(us_states_df)
```

# 4 Spatial data operations

## Prerequisites

This chapter requires the same packages used in Chapter 3:

```r
library(sf)
library(terra)
library(dplyr)
library(spData)
```

## 4.1 Introduction

Spatial operations, including spatial joins between vector datasets and local and focal operations on raster datasets, are a vital part of geocomputation. This chapter shows how spatial objects can be modified in a multitude of ways based on their location and shape. Many spatial operations have a non-spatial (attribute) equivalent, so concepts such as subsetting and joining datasets, demonstrated in the previous chapter, are applicable here. This is especially true for vector operations: Section 3.2 on vector attribute manipulation provides the basis for understanding its spatial counterpart, namely spatial subsetting (covered in Section 4.2.1). Spatial joining (Sections 4.2.5, 4.2.6 and 4.2.8) and aggregation (Section 4.2.7) also have non-spatial counterparts, covered in the previous chapter.

Spatial operations differ from non-spatial operations in a number of ways, however: spatial joins, for example, can be done in a number of ways — including matching entities that intersect with or are within a certain distance of the target dataset — while the attribute joins discussed in Section 3.2.4 of the previous chapter can only be done in one way (except when using fuzzy joins, as described in the documentation of the **fuzzyjoin** package). Different types of spatial relationships between objects, including intersects and disjoint, are described in Sections 4.2.2 to 4.2.4.
Another unique aspect of spatial objects is distance: all spatial objects are related through space, and distance calculations can be used to explore the strength of this relationship, as described in the context of vector data in Section 4.2.3.

Spatial operations on raster objects include subsetting, covered in Section 4.3.1. Map algebra covers a range of operations that modify raster cell values, with or without reference to surrounding cell values. The concept of map algebra, vital for many applications, is introduced in Section 4.3.2; local, focal and zonal map algebra operations are covered in Sections 4.3.3, 4.3.4 and 4.3.5, respectively. Global map algebra operations, which generate summary statistics representing an entire raster dataset, and distance calculations on rasters, are discussed in Section 4.3.6. Next, the relation between map algebra and vector operations is discussed in Section 4.3.7. In the final section before the exercises (4.3.8), the process of merging two raster datasets is discussed and demonstrated with reference to a reproducible example.

## 4.2 Spatial operations on vector data

This section provides an overview of spatial operations on vector geographic data represented as simple features in the **sf** package. Section 4.3 presents spatial operations on raster datasets, using classes and functions of the **terra** package.

### 4.2.1 Spatial subsetting

Spatial subsetting is the process of taking a spatial object and returning a new object containing only features that relate in space to another object. Analogous to attribute subsetting (covered in Section 3.2.1), subsets of `sf` data frames can be created with the square bracket (`[`) operator using the syntax `x[y, , op = st_intersects]`, where `x` is an `sf` object from which a subset of rows will be returned, `y` is the 'subsetting object', and `op = st_intersects` is an optional argument that specifies the topological relation (also known as binary predicate) used to do the subsetting. The default topological relation used when an `op` argument is not provided is `st_intersects()`: the command `x[y, ]` is identical to `x[y, , op = st_intersects]`, but not to `x[y, , op = st_disjoint]` (the meaning of these and other topological relations is described in the next section). The `filter()` function from the **tidyverse** can also be used,
but this approach is more verbose, as we will see in the examples below.

To demonstrate spatial subsetting, we will use the `nz` and `nz_height` datasets in the **spData** package, which contain geographic data on the 16 main regions and the 101 highest points in New Zealand, respectively (Figure 4.1), in a projected coordinate reference system. The following code chunk creates an object representing Canterbury, then uses spatial subsetting to return all high points in the region.

FIGURE 4.1: Illustration of spatial subsetting with red triangles representing the 101 high points in New Zealand, clustered near the central Canterbury region (left). The points in Canterbury were created with the `[` subsetting operator (highlighted in gray, right).

Like attribute subsetting, the command `x[y, ]` (equivalent to `nz_height[canterbury, ]`) subsets features of a target `x` using the contents of a source object `y`. Instead of `y` being a vector of class `logical` or `integer`, however, for spatial subsetting both `x` and `y` must be geographic objects. Specifically, objects used for spatial subsetting in this way must have the class `sf` or `sfc`: both `nz` and `nz_height` are geographic vector data frames and have the class `sf`, and the result of the operation is another `sf` object representing the features in the target `nz_height` object that intersect with (in this case, high points that are located within) the `canterbury` region.

Various topological relations can be used for spatial subsetting, which determine the type of spatial relationship that features in the target object must have with the subsetting object to be selected. These include touches, crosses or within, as we will see shortly in Section 4.2.2. The default setting, `st_intersects`, is a 'catch all' topological relation that will return features in the target that touch, cross or are within the source 'subsetting' object. Alternative spatial operators can be specified with the `op =` argument, as demonstrated in the following command, which returns the opposite of `st_intersects()`: points that do not intersect with Canterbury (see Section 4.2.2).

For many applications, this is all you'll need to know about spatial subsetting for vector data: it just works. If you are impatient to learn about more topological relations, beyond `st_intersects()` and `st_disjoint()`, skip to the next section (4.2.2). If you're interested in the details, including other ways of subsetting, read on.

Another way of doing spatial subsetting uses objects returned by topological operators. These objects can be useful in their own right, for example when exploring the graph network of relationships
between contiguous regions, but they can also be used for subsetting, as demonstrated in the code chunk below.

The code chunk below creates an object of class `sgbp` (a sparse geometry binary predicate, a list of length `x` in the spatial operation) and converts it into a logical vector `sel_logical` (containing only `TRUE` and `FALSE` values, something that can also be used by **dplyr**'s filter function). The function `lengths()` identifies which features in `nz_height` intersect with any objects in `y`. In this case 1 is the greatest possible value, but for more complex operations one could use this method to subset only features that intersect with, for example, 2 or more features from the source object.

The same result can also be achieved with the **sf** function `st_filter()`, which was created to increase compatibility between `sf` objects and **dplyr** data manipulation code.

At this point, there are three identical (in all but row names) versions of `canterbury_height`: one created using the `[` operator, one created via an intermediary selection object, and another using **sf**'s convenience function `st_filter()`. The next section explores different types of spatial relations, also known as binary predicates, that can be used to identify whether or not two features are spatially related.

```r
canterbury = nz |> filter(Name == "Canterbury")
canterbury_height = nz_height[canterbury, ]
nz_height[canterbury, , op = st_disjoint]
sel_sgbp = st_intersects(x = nz_height, y = canterbury)
class(sel_sgbp)
#> [1] "sgbp" "list"
sel_sgbp
#> Sparse geometry binary predicate list of length 101, where the
#> predicate was `intersects'
#> first 10 elements:
#>  1: (empty)
#>  2: (empty)
#>  3: (empty)
#>  4: (empty)
#>  5: 1
#>  6: 1
....
sel_logical = lengths(sel_sgbp) > 0
canterbury_height2 = nz_height[sel_logical, ]
canterbury_height3 = nz_height |>
  st_filter(y = canterbury, .predicate = st_intersects)
```
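The `sgbp`-to-logical conversion boils down to plain list operations, which can be sketched in base R with a list standing in for the sparse predicate result (illustrative values, not the real `nz_height` output):

```r
# A plain list standing in for a sparse geometry binary predicate:
# element i holds the indices of features in y that feature i of x
# intersects (integer(0) when there is no match)
sel_sparse = list(integer(0), integer(0), 1L, 1L, integer(0))

# lengths() counts matches per feature; > 0 gives a logical selector
sel_logical = lengths(sel_sparse) > 0
sel_logical
#> [1] FALSE FALSE  TRUE  TRUE FALSE

which(sel_logical) # rows of x that x[sel_logical, ] would keep
#> [1] 3 4
```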
### 4.2.2 Topological relations

Topological relations describe the spatial relationships between objects. "Binary topological relationships", to give them their full name, are logical statements (in that the answer can only be `TRUE` or `FALSE`) about the spatial relationships between two objects defined by ordered sets of points (typically forming points, lines and polygons) in two dimensions (Egenhofer and Herring 1990). This may sound rather abstract and, indeed, the definition and classification of topological relations is based on mathematical foundations first published in book form in 1966 (Spanier 1995), with the field of algebraic topology continuing into the 21st century (Dieck 2008).

Despite their mathematical origins, topological relations can be understood intuitively with reference to visualizations of commonly used functions that test for common types of spatial relationships. Figure 4.2 shows a variety of geometry pairs and their associated relations. The third and fourth pairs in Figure 4.2 (from left to right and then down) demonstrate that, for some relations, order is important. While the relations equals, intersects, crosses, touches and overlaps are symmetrical, meaning that if `function(x, y)` is true, `function(y, x)` will also be true, relations in which the order of the geometries is important, such as contains and within, are not. Notice that each geometry pair has a "DE-9IM" string such as FF2F11212, described in the next section.

FIGURE 4.2: Topological relations between vector geometries, inspired by Figures 1 and 2 in Egenhofer and Herring (1990). The relations for which `function(x, y)` is true are printed for each geometry pair, with `x` represented in pink and `y` represented in blue.
The nature of the spatial relationship for each pair is described by its Dimensionally Extended 9-Intersection Model string. In **sf**, functions testing for different types of topological relations are called 'binary predicates', as described in the vignette *Manipulating Simple Feature Geometries*, which can be viewed with the command `vignette("sf3")`, and in the help page `?geos_binary_pred`. To see how topological relations work in practice, let's create a simple reproducible example, building on the relations illustrated in Figure 4.2 and consolidating knowledge of how vector geometries are represented, from the previous chapter (Section 2.2.4). Note that to create tabular data representing the coordinates (x and y) of the polygon vertices, we use the base R function `cbind()` to create a matrix representing coordinate points, a `POLYGON`, and finally an `sfc` object, as described in Chapter 2.

We create additional geometries to demonstrate spatial relations with the following commands which, when plotted on top of the polygon created above, relate in space to one another, as shown in Figure 4.3. Note the use of the function `st_as_sf()` and the argument `coords` to efficiently convert a data frame containing columns representing coordinates to an `sf` object containing points.

FIGURE 4.3: Points, line and polygon objects arranged to illustrate topological relations.

A simple query is: which of the points in `point_sf` intersect in some way with the polygon `polygon_sfc`? The question can be answered by inspection (points 1 and 3 are touching and within the polygon, respectively). It can also be answered with the spatial predicate `st_intersects()`, as follows. The result matches intuition: positive (`1`) results are returned for the first and third points, and a negative result (represented by an empty vector) for the second, which is outside the polygon's border. What may be unexpected is that the result comes in the form of a list of vectors. This sparse matrix output only registers a relation if one exists, reducing the memory requirements of topological operations on multi-feature objects. As we saw in the previous section, a dense matrix consisting of `TRUE` and `FALSE` values is returned when `sparse = FALSE`. In the output, each row represents a feature in the target (argument `x`) object and each column represents a feature in the selecting object (`y`). In this case, there is only one feature in the `y` object `polygon_sfc`, so the result, which can be used for subsetting as we saw in Section 4.2.1, has only one column.

`st_intersects()` returns `TRUE` even in cases where the features
just touch: intersects is a 'catch-all' topological operation which identifies many types of spatial relation, as illustrated in Figure 4.2. More restrictive questions include which points lie within the polygon, and which features are on or contain a shared boundary with `y`? These can be answered as follows (results not shown). Note that although the first point touches the boundary of the polygon, it is not within it; the third point is within the polygon but does not touch any part of its border. The opposite of `st_intersects()` is `st_disjoint()`, which returns only objects that do not spatially relate in any way to the selecting object (note `[, 1]` converts the result into a vector).

The function `st_is_within_distance()` detects features that almost touch the selection object, thanks to an additional `dist` argument. It can be used to set how close target objects need to be before they are selected. The 'within distance' binary spatial predicate is demonstrated in the code chunk below, the results of which show that every point is within 0.2 units of the polygon. Note that although point 2 is more than 0.2 units of distance from the nearest vertex of `polygon_sfc`, it is still selected when the distance is set to 0.2. This is because distance is measured to the nearest edge, in this case the part of the polygon that lies directly above point 2 in Figure 4.3. (You can verify that the actual distance between point 2 and the polygon is 0.13 with the command `st_distance(point_sf, polygon_sfc)`.)

```r
polygon_matrix = cbind(
  x = c(0, 0, 1, 1, 0),
  y = c(0, 1, 1, 0.5, 0)
)
polygon_sfc = st_sfc(st_polygon(list(polygon_matrix)))
point_df = data.frame(
  x = c(0.2, 0.7, 0.4),
  y = c(0.1, 0.2, 0.8)
)
point_sf = st_as_sf(point_df, coords = c("x", "y"))
st_intersects(point_sf, polygon_sfc)
#> Sparse geometry binary predicate... `intersects'
#>  1: 1
#>  2: (empty)
#>  3: 1
st_intersects(point_sf, polygon_sfc, sparse = FALSE)
#>       [,1]
#> [1,]  TRUE
#> [2,] FALSE
#> [3,]  TRUE
st_within(point_sf, polygon_sfc)
st_touches(point_sf, polygon_sfc)
st_disjoint(point_sf, polygon_sfc, sparse = FALSE)[, 1]
#> [1] FALSE  TRUE FALSE
st_is_within_distance(point_sf, polygon_sfc, dist = 0.2, sparse = FALSE)[, 1]
#> [1] TRUE TRUE TRUE
```

### 4.2.3 Distance relations

While the topological relations presented in the previous section are binary (a feature either intersects with another or it does not), distance relations are continuous. The distance between two `sf` objects is calculated with `st_distance()`, which is also used behind the scenes in Section 4.2.6 for distance-based joins. This is illustrated in the code chunk below, which finds the distance between the highest point in New Zealand and the geographic centroid of the Canterbury region, created in Section 4.2.1.

There are two potentially surprising things about the result:

- It has units, telling us the distance is 100,000 meters, not 100,000 inches, or any other measure of distance
- It is returned as a matrix, even though the result only contains a single value

This second feature hints at another useful feature of `st_distance()`: its ability to return distance matrices between all combinations of features in objects `x` and `y`. This is illustrated in the command below, which finds the distances between the first three features in `nz_height` and the Otago and Canterbury regions of New Zealand, represented in the object `co`. Note that the distance between the second and third features in `nz_height` and the second feature in `co` is zero. This demonstrates the fact that distances between points and polygons refer to the distance to any part of the polygon: the second and third points in `nz_height` are in Otago, which can be verified by plotting them (result not shown):

```r
nz_highest = nz_height |> slice_max(n = 1, order_by = elevation)
canterbury_centroid = st_centroid(canterbury)
st_distance(nz_highest, canterbury_centroid)
#> Units: [m]
#>        [,1]
#> [1,] 115540
co = filter(nz, grepl("Canter|Otag", Name))
st_distance(nz_height[1:3, ], co)
#> Units: [m]
#>        [,1]  [,2]
#> [1,] 123537 15498
#> [2,]  94283     0
#> [3,]  93019     0
plot(st_geometry(co)[2])
plot(st_geometry(nz_height)[2:3], add = TRUE)
```

### 4.2.4 DE-9IM strings

Underlying the binary predicates demonstrated in the previous section is the Dimensionally Extended 9-Intersection Model (DE-9IM). As the cryptic name suggests, this is not an easy topic to understand, but it is worth knowing about because it underlies many spatial operations and enables the creation of custom spatial predicates. The model was originally labelled "DE + 9IM" by its inventors, referring to the "dimension of the intersections of boundaries, interiors, and exteriors of two features" (Clementini and Di Felice 1995), but it is now referred to as DE-9IM (Shen, Chen, and Liu 2018). DE-9IM is applicable to 2-dimensional objects (points, lines and polygons) in Euclidean space, meaning that the model (and the software implementing it, such as GEOS) assumes you are working with data in a projected coordinate reference system, as described in Chapter 7.

To demonstrate how DE-9IM strings work, let's take a look at the various ways that the first geometry pair in Figure 4.2 relate. Figure 4.4 illustrates the 9 intersection model (9IM), which shows the intersections between every combination of each object's interior, boundary and exterior: each component of the first object `x` is arranged as columns and each component of `y` is arranged as rows, and a facetted graphic is created with the intersections between each element highlighted.
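The DE-9IM string for a geometry pair is built by flattening this 3×3 matrix of intersection dimensions row-wise. A base-R sketch of the string construction (using the dimensions for the first geometry pair; this illustrates the encoding, not how GEOS computes it):

```r
# Dimensions of the intersections between interior (I), boundary (B)
# and exterior (E) of x (rows) and y (columns) for the first geometry
# pair in Figure 4.2 (illustrative values)
relation_matrix = matrix(
  c("2", "1", "2",
    "1", "1", "1",
    "2", "1", "2"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("I(x)", "B(x)", "E(x)"), c("I(y)", "B(y)", "E(y)"))
)

# Concatenating the matrix row-wise yields the DE-9IM string
de9im = paste(as.vector(t(relation_matrix)), collapse = "")
de9im
#> [1] "212111212"
```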
FIGURE 4.4: Illustration of how the Dimensionally Extended 9 Intersection Model (DE-9IM) works. Colors in the legend represent the overlap between different components. The thick lines highlight 2-dimensional intersections, e.g., between the boundary of object `x` and the interior of object `y`, shown in the middle top facet.

DE-9IM strings are derived from the dimension of each type of relation. In this case, the red intersections in Figure 4.4 have dimensions of 0 (points), 1 (lines) and 2 (polygons), as shown in Table 4.1.

TABLE 4.1: Table showing the relations between the interiors, boundaries and exteriors of geometries x and y.

Flattening this matrix 'row-wise' (meaning concatenating the first row, then the second, then the third) results in the string 212111212. Another example will serve to demonstrate the system: the relation shown in Figure 4.2 (the third polygon pair in the third column of the 1st row) can be defined in the DE-9IM system as follows:

- The intersections between the *interior* of the larger object `x` and the interior, boundary and exterior of `y` have dimensions of 2, 1 and 2, respectively
- The intersections between the *boundary* of the larger object `x` and the interior, boundary and exterior of `y` have dimensions of F, F and 1, respectively, where 'F' means 'false', the objects are disjoint
- The intersections between the *exterior* of `x` and the interior, boundary and exterior of `y` have dimensions of F, F and 2, respectively: the exterior of the larger object does not touch the interior or boundary of `y`, but the exteriors of the smaller and larger objects cover the same area

These three components, when concatenated, create the string 212, FF1, FF2. This is the same as the result obtained from the function `st_relate()` (see the source code of this chapter to see how the geometries in Figure 4.2 were created).

Understanding DE-9IM strings allows new binary spatial predicates to be developed. The help page `?st_relate` contains function definitions for 'queen' and 'rook' relations, in which polygons share a border or only a point, respectively. 'Queen' relations mean that the 'boundary-boundary' relation (the cell in the second column and second row of Table 4.1, or the 5th element of the DE-9IM string) must not be empty, corresponding to the pattern F***T****, while for 'rook' relations the same element must be 1 (meaning a linear intersection). These are implemented as follows. Building on the object `x` created previously, we can use the newly created functions to find which elements of the grid are a 'queen' or a 'rook' relative to the middle square of the grid, as follows.

FIGURE 4.5: Demonstration of custom binary spatial predicates for finding 'queen' (left) and 'rook' (right) relations to the central square in a grid with 9 geometries.

```r
xy2sfc = function(x, y) st_sfc(st_polygon(list(cbind(x, y))))
x = xy2sfc(x = c(0, 0, 1, 1, 0), y = c(0, 1, 1, 0.5, 0))
y = xy2sfc(x = c(0.7, 0.7, 0.9, 0.7), y = c(0.8, 0.5, 0.5, 0.8))
st_relate(x, y)
#>      [,1]       
#> [1,] "212FF1FF2"
st_queen = function(x, y) st_relate(x, y, pattern = "F***T****")
st_rook = function(x, y) st_relate(x, y, pattern = "F***1****")
grid = st_make_grid(x, n = 3)
grid_sf = st_sf(grid)
grid_sf$queens = lengths(st_queen(grid, grid[5])) > 0
plot(grid, col = grid_sf$queens)
grid_sf$rooks = lengths(st_rook(grid, grid[5])) > 0
plot(grid, col = grid_sf$rooks)
```

### 4.2.5 Spatial joining

Joining two non-spatial datasets relies on a shared 'key' variable, as described in Section 3.2.4. Spatial data joining applies the same concept, but instead relies on spatial relations, described in the previous section. As with attribute data, joining adds new columns to the target object (the argument `x` in joining functions), from a source object (`y`). The process is illustrated by the following example: imagine you have ten points randomly distributed across the Earth's surface and you ask, for the points that are on land, which countries are they in? Implementing this idea in a reproducible example will build your geographic data handling skills and show how spatial joins work. The starting point is to create points that are randomly scattered over the Earth's surface.

This scenario is illustrated in Figure 4.6, which shows that the `random_points` object (top left) lacks attribute data, while the `world` object (top right) has attributes, including country names shown for a sample of countries in the legend. Spatial joins are implemented with `st_join()`, as illustrated in the code chunk below. The output is the `random_joined` object, illustrated in Figure 4.6 (bottom left). Before creating the joined dataset, we use spatial subsetting to create `world_random`, which contains only countries that contain random points, to verify that the number of country names returned in the joined dataset should be four (Figure 4.6, top right panel).
FIGURE 4.6: Illustration of a spatial join. A new attribute variable is added to random points (top left) from the source world object (top right), resulting in the data represented in the final panel.

By default, `st_join()` performs a left join, meaning that the result is an object containing all rows from `x`, including rows with no match in `y` (see Section 3.2.4), but it can also do inner joins by setting the argument `left = FALSE`. Like spatial subsetting, the default topological operator used by `st_join()` is `st_intersects()`, which can be changed by setting the `join` argument (see `?st_join` for details). The example above demonstrates the addition of a column from a polygon layer to a point layer, but the approach works regardless of geometry types. In such cases, for example when `x` contains polygons, each of which matches multiple objects in `y`, spatial joins will result in duplicate features by creating a new row for each match in `y`.

```r
set.seed(2018) # set seed for reproducibility
(bb = st_bbox(world)) # the world's bounds
#>   xmin   ymin   xmax   ymax 
#> -180.0  -89.9  180.0   83.6
random_df = data.frame(
  x = runif(n = 10, min = bb[1], max = bb[3]),
  y = runif(n = 10, min = bb[2], max = bb[4])
)
random_points = random_df |> 
  st_as_sf(coords = c("x", "y"), crs = "EPSG:4326") # set coordinates and CRS
world_random = world[random_points, ]
nrow(world_random)
#> [1] 4
random_joined = st_join(random_points, world["name_long"])
```
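The left-versus-inner distinction mirrors non-spatial joins, and can be sketched in base R with plain data frames standing in for the point and polygon attributes (the values and names below are made up for illustration):

```r
# Stand-ins: four points, two of which fall inside a named country
points_df = data.frame(point_id = 1:4, country = c("A", NA, "B", NA))
countries_df = data.frame(country = c("A", "B"),
                          name_long = c("Country A", "Country B"))

# Left join (st_join's default): all rows of x kept, NAs where no match
left = merge(points_df, countries_df, by = "country", all.x = TRUE)
nrow(left)
#> [1] 4

# Inner join (st_join with left = FALSE): only matching rows kept
inner = merge(points_df, countries_df, by = "country")
nrow(inner)
#> [1] 2
```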
### 4.2.6 Distance-based joins

Sometimes two geographic datasets do not intersect but still have a strong geographic relationship due to their proximity. The datasets `cycle_hire` and `cycle_hire_osm`, already attached in the **spData** package, provide a good example. Plotting them shows that they are often closely related but do not touch, as shown in Figure 4.7, a base version of which is created with the following code. We can check if any points are shared using `st_intersects()`, as shown below.

FIGURE 4.7: The spatial distribution of cycle hire points in London based on official data (blue) and OpenStreetMap data (red).

Imagine that we need to join the capacity variable in `cycle_hire_osm` onto the official 'target' data contained in `cycle_hire`. In this case, a non-overlapping join is needed. The simplest method is to use the binary predicate `st_is_within_distance()`, demonstrated below using a threshold distance of 20 m. One can set the threshold distance in metric units also for unprojected data (e.g., lon/lat CRSs such as WGS84), if the spherical geometry engine (S2) is enabled, as it is in **sf** by default (see Section 2.2.9).

This shows that there are 438 points in the target object `cycle_hire` within the threshold distance of `cycle_hire_osm`. How to retrieve the values associated with the respective `cycle_hire_osm` points? The solution is again `st_join()`, but with an additional `dist` argument (set to 20 m below). Note that the number of rows in the joined result is greater than in the target: some cycle hire stations in `cycle_hire` have multiple matches in `cycle_hire_osm`. To aggregate the values for the overlapping points and return the mean, we can use the aggregation methods learned in Chapter 3, resulting in an object with the same number of rows as the target. The capacity of nearby stations can be verified by comparing a plot of the capacity of the source `cycle_hire_osm` data with the results in the new object (plots not shown).

The result of this join has used a spatial operation to change the attribute data associated with simple features; the geometry associated with each feature has remained unchanged.

```r
plot(st_geometry(cycle_hire), col = "blue")
plot(st_geometry(cycle_hire_osm), add = TRUE, pch = 3, col = "red")
any(st_intersects(cycle_hire, cycle_hire_osm, sparse = FALSE))
#> [1] FALSE
sel = st_is_within_distance(cycle_hire, cycle_hire_osm, 
                            dist = units::set_units(20, "m"))
summary(lengths(sel) > 0)
#>    Mode   FALSE    TRUE 
#> logical     304     438
z = st_join(cycle_hire, cycle_hire_osm, st_is_within_distance, 
            dist = units::set_units(20, "m"))
nrow(cycle_hire)
#> [1] 742
nrow(z)
#> [1] 762
z = z |> 
  group_by(id) |> 
  summarize(capacity = mean(capacity))
nrow(z) == nrow(cycle_hire)
#> [1] TRUE
plot(cycle_hire_osm["capacity"])
plot(z["capacity"])
```
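The join-then-aggregate pattern (duplicate rows per target id collapsed back to one row each) can be sketched with plain data frames in base R; the ids and capacities below are made-up stand-ins for the joined cycle hire data:

```r
# Stand-in for the joined result z: target id 2 matched two source
# stations, duplicating its row
z_df = data.frame(
  id = c(1, 2, 2, 3),
  capacity = c(14, 10, 20, 7)
)

# Collapse back to one row per id, averaging capacity across matches
z_agg = aggregate(capacity ~ id, data = z_df, FUN = mean)
# z_agg has one row per id, with capacities 14, 15 and 7
nrow(z_agg) == length(unique(z_df$id))
#> [1] TRUE
```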
### 4.2.7 Spatial aggregation

As with attribute data aggregation, spatial data aggregation condenses data: aggregated outputs have fewer rows than non-aggregated inputs. Statistical aggregating functions, such as the mean average or sum, summarise multiple values of a variable and return a single value per grouping variable. Section 3.2.3 demonstrated how `aggregate()` and `group_by() |> summarize()` condense data based on attribute variables; this section shows how the same functions work with spatial objects.

Returning to the example of New Zealand, imagine you want to find out the average height of high points in each region: it is the geometry of the source (`y`, or `nz` in this case) that defines how values in the target object (`x`, or `nz_height`) are grouped. This can be done in a single line of code with base R's `aggregate()` method. The result is an `sf` object with the same geometry as the (spatial) aggregating object (`nz`), which you can verify with the command `identical(st_geometry(nz), st_geometry(nz_agg))`. The result of this operation is illustrated in Figure 4.8, which shows the average value of features in `nz_height` within each of New Zealand's 16 regions. The same result can also be generated by piping the output from `st_join()` into the 'tidy' functions `group_by()` and `summarize()`, as follows.

FIGURE 4.8: Average height of the top 101 high points across the regions of New Zealand.

The resulting `nz_agg` objects have the same geometry as the aggregating object `nz`, but with a new column summarizing the values of `x` in each region using the function `mean()`. Other functions could be used instead of `mean()` here, including `median()`, `sd()` and other functions that return a single value per group. Note: one difference between the `aggregate()` and `group_by() |> summarize()` approaches is that the former results in `NA` values for unmatching region names, while the latter preserves region names. The 'tidy' approach is thus more flexible in terms of aggregating functions and the column names of the results. Aggregating operations that also create new geometries are covered in Section 5.2.7.

```r
nz_agg = aggregate(x = nz_height, by = nz, FUN = mean)
nz_agg2 = st_join(x = nz, y = nz_height) |>
  group_by(Name) |>
  summarize(elevation = mean(elevation, na.rm = TRUE))
```
### 4.2.8 Joining incongruent layers

Spatial congruence is an important concept related to spatial aggregation. An aggregating object (which we will refer to as `y`) is congruent with the target object (`x`) if the two objects have shared borders. Often this is the case with administrative boundary data, whereby larger units — such as Middle Layer Super Output Areas (MSOAs) in the UK, or districts in many other European countries — are composed of many smaller units.

Incongruent aggregating objects, by contrast, do not share common borders with the target (Qiu, Zhang, and Zhou 2012). This is problematic for spatial aggregation (and other spatial operations) as illustrated in Figure 4.9: aggregating the centroid of each sub-zone will not return accurate results. Areal interpolation overcomes this issue by transferring values from one set of areal units to another, using a range of algorithms including simple area-weighted approaches and more sophisticated approaches such as 'pycnophylactic' methods (Waldo R. Tobler 1979).

FIGURE 4.9: Illustration of congruent (left) and incongruent (right) areal units with respect to larger aggregating zones (translucent red borders).

The **spData** package contains a dataset named `incongruent` (the colored polygons with black borders in the right panel of Figure 4.9) and a dataset named `aggregating_zones` (the two polygons with the translucent blue border in the right panel of Figure 4.9). Let us assume that the `value` column of `incongruent` refers to the total regional income in million Euros. How can we transfer the values of the underlying nine spatial polygons into the two polygons of `aggregating_zones`?

The simplest useful method for this is area weighted spatial interpolation, which transfers values from the `incongruent` object to a new column in `aggregating_zones` in proportion to the area of overlap: the larger the spatial intersection between input and output features, the larger the corresponding value. This is implemented in `st_interpolate_aw()`, as demonstrated in the code chunk below. In this case it is meaningful to sum up the values of the intersections falling into the aggregating zones, since total income is a so-called spatially extensive variable (which increases with area), assuming income is evenly distributed across the smaller zones (hence the warning message below). This would be different for spatially intensive variables, such as average income or percentages, which do not increase as the area increases. `st_interpolate_aw()` works equally with spatially intensive variables: set the `extensive` parameter to `FALSE` and it will use an average rather than the sum function when doing the aggregation.
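The arithmetic behind area-weighted interpolation of an extensive variable can be sketched in base R, using made-up intersection areas rather than the real `incongruent` dataset:

```r
# Hypothetical example: one aggregating zone overlaps three input
# polygons; each input's value is apportioned by the share of the
# input's area that falls inside the zone
input_values  = c(4.0, 6.0, 2.5)  # e.g., income in million Euros
input_areas   = c(10, 20, 8)      # total area of each input polygon
overlap_areas = c(10, 5, 8)       # area of each input inside the zone

# Spatially extensive variable (like total income): sum the
# area-weighted shares (4 * 1 + 6 * 0.25 + 2.5 * 1)
zone_value = sum(input_values * overlap_areas / input_areas)
zone_value
#> [1] 8
```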
constant or uniform over areas of x\nagg_aw$value\n#> [1] 19.6 25.7"},{"path":"spatial-operations.html","id":"spatial-ras","chapter":"4 Spatial data operations","heading":"4.3 Spatial operations on raster data","text":"This section builds on Section 3.3, which highlights various basic methods for manipulating raster datasets, to demonstrate more advanced and explicitly spatial raster operations, and uses the objects elev and grain manually created in Section 3.3.\nFor the reader’s convenience, these datasets can also be found in the spData package.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))"},{"path":"spatial-operations.html","id":"spatial-raster-subsetting","chapter":"4 Spatial data operations","heading":"4.3.1 Spatial subsetting","text":"The previous chapter (Section 3.3) demonstrated how to retrieve values associated with specific cell IDs or row and column combinations.\nRaster objects can also be extracted by location (coordinates) and other spatial objects.\nTo use coordinates for subsetting, one can ‘translate’ the coordinates into a cell ID with the terra function cellFromXY().\nAn alternative is to use terra::extract() (be careful, there is also a function called extract() in the tidyverse) to extract values.\nBoth methods are demonstrated below to find the value of the cell that covers a point located at coordinates of 0.1, 0.1.\nRaster objects can also be subset with another raster object, as demonstrated in the code chunk below:\nThis amounts to retrieving the values of the first raster object (in this case elev) that fall within the extent of a second raster (here: clip), as illustrated in Figure 4.10.\nFIGURE 4.10: Original raster (left). Raster mask (middle). 
Output of masking a raster (right).\nThe example above returned the values of specific cells, but in many cases spatial outputs are needed from subsetting operations on raster datasets.\nThis can be done by setting the drop argument of the [ operator to FALSE.\nThe code below returns the first two cells of elev, i.e., the first two cells of the top row, as a raster object (only the first 2 lines of the output are shown):\nAnother common use case of spatial subsetting is when a raster with logical (or NA) values is used to mask another raster with the same extent and resolution, as illustrated in Figure 4.10.\nIn that case, the [ and mask() functions can be used (results not shown).\nIn the code chunk below, we have created a mask object called rmask with values randomly assigned to NA and TRUE.\nNext, we want to keep those values of elev which are TRUE in rmask.\nIn other words, we want to mask elev with rmask.\nThe above approach can also be used to replace some values (e.g., values expected to be wrong) with NA.\nThese operations are in fact Boolean local operations since we compare cell-wise two rasters.\nThe next subsection explores these and related operations in more detail.","code":"\nid = cellFromXY(elev, xy = matrix(c(0.1, 0.1), ncol = 2))\nelev[id]\n# the same as\nterra::extract(elev, matrix(c(0.1, 0.1), ncol = 2))\nclip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,\n resolution = 0.3, vals = rep(1, 9))\nelev[clip]\n# we can also use extract\n# terra::extract(elev, ext(clip))\nelev[1:2, drop = FALSE] # spatial subsetting with cell IDs\n#> class : SpatRaster \n#> dimensions : 1, 2, 1 (nrow, ncol, nlyr)\n#> ...\n# create raster mask\nrmask = elev\nvalues(rmask) = sample(c(NA, TRUE), 36, replace = TRUE)\n# spatial subsetting\nelev[rmask, drop = FALSE] # with [ operator\n# we can also use mask\n# mask(elev, rmask)\nelev[elev < 20] = NA"},{"path":"spatial-operations.html","id":"map-algebra","chapter":"4 Spatial data operations","heading":"4.3.2 Map algebra","text":"\nThe term ‘map algebra’ was coined in the late 1970s to describe a “set of conventions, capabilities, and techniques” for the analysis of geographic raster and (although less prominently) vector data (Tomlin 1994).\nIn this context, we define map algebra more narrowly, as operations that modify or summarize raster cell values, with reference to surrounding cells, zones, or statistical functions that apply to every cell.\nMap algebra operations tend to be fast, because raster datasets only implicitly store coordinates, hence the old adage “raster is faster but vector is corrector”.\nThe location of cells in raster datasets can be calculated by using their matrix position and the resolution and origin of the dataset (stored in the header).\nFor the processing, however, the geographic position of a cell is barely relevant as long as we make sure that the cell position is still the same after the processing.\nAdditionally, if two or more raster datasets share the same extent, projection and resolution, one could treat them as matrices for the processing.\nThis is the way that map algebra works with the terra package.\nFirst, the headers of the raster datasets are queried and (in cases where map algebra operations work on more than one dataset) checked to ensure the datasets are compatible.\nSecond, map algebra retains the so-called one-to-one locational correspondence, meaning that cells cannot move.\nThis differs from matrix algebra, in which values change position, for example when multiplying or dividing matrices.\nMap algebra (or cartographic modeling with raster data) divides raster operations into four subclasses (Tomlin 1990), with each working on one or several grids simultaneously:\nLocal or per-cell operations\nFocal or neighborhood operations.\nMost often the output cell value is the result of a 3 x 3 input cell block\nZonal operations are similar to focal operations, but the surrounding pixel grid on which new values are computed can have irregular sizes and shapes\nGlobal or per-raster operations.\nThat means the output cell derives its value potentially from one or several entire rasters\nThis typology classifies map algebra operations by the number of cells used for each pixel processing step and the type of the output.\nFor the sake of completeness, we should mention that raster operations can also be classified by discipline such as terrain, hydrological analysis, or image classification.\nThe following sections explain how each type of map algebra operations can be used, with reference to worked examples.","code":""},{"path":"spatial-operations.html","id":"local-operations","chapter":"4 Spatial data operations","heading":"4.3.3 Local operations","text":"\nLocal operations comprise all cell-by-cell operations in one or several layers.\nThis includes adding or subtracting values from a raster, squaring and multiplying rasters.\nRaster algebra also allows logical operations such as finding all raster cells that are greater than a specific value (5 in our example below).\nThe terra package 
supports these operations, as demonstrated below (Figure 4.11):\nFIGURE 4.11: Examples of different local operations of the elev raster object: adding two rasters, squaring, applying logarithmic transformation, and performing a logical operation.\nAnother good example of local operations is the classification of intervals of numeric values into groups, such as grouping a digital elevation model into low (class 1), middle (class 2) and high elevations (class 3).\nUsing the classify() command, we need first to construct a reclassification matrix, where the first column corresponds to the lower and the second column to the upper end of the class.\nThe third column represents the new value for the specified ranges in columns one and two.\nHere, we assign the raster values in the ranges 0–12, 12–24 and 24–36 to be reclassified to take values 1, 2 and 3, respectively.\nThe classify() function can also be used when we want to reduce the number of classes in our categorical rasters.\nWe will perform several additional reclassifications in Chapter 14.\nApart from applying arithmetic operators directly, one can also use the app(), tapp() and lapp() functions.\nThey are more efficient, and hence they are preferable in the presence of large raster datasets.\nAdditionally, they allow you to save an output file directly.\nThe app() function applies a function to each cell of a raster and is used to summarize (e.g., calculating the sum) the values of multiple layers into one layer.\ntapp() is an extension of app(), allowing us to select a subset of layers (see the index argument) for which we want to perform a certain operation.\nFinally, the lapp() function allows us to apply a function to each cell using layers as arguments – an application of lapp() is presented below.\nThe calculation of the normalized difference vegetation index (NDVI) is a well-known local (pixel-by-pixel) raster operation.\nIt returns a raster with values between -1 and 1; positive values indicate the presence of living plants (mostly > 0.2).\nNDVI is calculated from red and near-infrared (NIR) bands of remotely sensed imagery, typically from satellite systems such as Landsat or Sentinel.\nVegetation absorbs light heavily in the visible light spectrum, especially in the red channel, while reflecting NIR light.\nThis is the NDVI formula:\\[\n\\begin{split}\nNDVI&= \\frac{\\text{NIR} - \\text{Red}}{\\text{NIR} + \\text{Red}}\\\\\n\\end{split}\n\\]Let’s calculate NDVI for the multispectral satellite file of Zion National Park.\nOur raster object has four satellite bands from the Landsat 8 satellite — blue, green, red, and near-infrared (NIR).\nImportantly, Landsat level-2 products are stored as integers to save disk space, and thus we need to convert them to floating-point numbers before doing any calculations.\nFor that purpose, we need to apply a scaling factor (0.0000275) and add an offset (-0.2) to the original values.21\nThe proper values now should range between 0 and 1.\nThis is not the case here, probably due to the presence of clouds and other atmospheric effects, and thus we need to replace values below 0 with 0.\nThe next step is to implement the NDVI formula in an R function:\nThis function accepts two numerical arguments, nir and red, and returns a numerical vector with NDVI values.\nIt can be used as the fun argument of lapp().\nWe just need to remember that our function expects two bands (not four as in the original raster), and they need to be in the NIR, red order.\nThat is why we subset the input raster with multi_rast[[c(4, 3)]] before doing any calculations.\nThe result, shown in the right panel of Figure 4.12, can be compared to the RGB image of the same area (left panel of the same Figure).\nIt allows us to see that the largest NDVI values are connected to northern areas of dense forest, while the lowest values are related to the lake in the north and snowy mountain ridges.\nFIGURE 4.12: RGB image (left) and NDVI values (right) calculated for the example satellite file of Zion National Park\nPredictive mapping is another interesting application of local raster operations.\nThe response variable corresponds to measured or observed points in space, for example, species richness, the presence of landslides, tree disease or crop yield.\nConsequently, we can easily retrieve space- or airborne predictor variables from various rasters (elevation, pH, precipitation, temperature, land cover, soil class, etc.).\nSubsequently, we model our response as a function of our predictors using lm(), glm(), gam() or a machine-learning technique.\nSpatial predictions on raster objects can therefore be made by applying estimated coefficients to the predictor raster values, and summing the output raster values (see Chapter 15).","code":"\nelev + elev\nelev^2\nlog(elev)\nelev > 5\nrcl = matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3, 
byrow = TRUE)\nrcl\n#> [,1] [,2] [,3]\n#> [1,] 0 12 1\n#> [2,] 12 24 2\n#> [3,] 24 36 3\nrecl = classify(elev, rcl = rcl)\nmulti_raster_file = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmulti_rast = rast(multi_raster_file)\nmulti_rast = (multi_rast * 0.0000275) - 0.2\nmulti_rast[multi_rast < 0] = 0\nndvi_fun = function(nir, red){\n (nir - red) / (nir + red)\n}\nndvi_rast = lapp(multi_rast[[c(4, 3)]], fun = ndvi_fun)"},{"path":"spatial-operations.html","id":"focal-operations","chapter":"4 Spatial data operations","heading":"4.3.4 Focal operations","text":"\nWhile local functions operate on one cell, though possibly from multiple layers, focal operations take into account a central (focal) cell and its neighbors.\nThe neighborhood (also named kernel, filter or moving window) under consideration is typically of size 3-by-3 cells (that is the central cell and its eight surrounding neighbors), but can take on any other (not necessarily rectangular) shape as defined by the user.\nA focal operation applies an aggregation function to all cells within the specified neighborhood, uses the corresponding output as the new value for the central cell, and moves on to the next central cell (Figure 4.13).\nOther names for this operation are spatial filtering and convolution (Burrough, McDonnell, and Lloyd 2015).\nIn R, we can use the focal() function to perform spatial filtering.\nWe define the shape of the moving window with a matrix whose values correspond to weights (see the w parameter in the code chunk below).\nSecondly, the fun parameter lets us specify the function we wish to apply to this neighborhood.\nHere, we choose the minimum, but any other summary function, including sum(), mean(), or var() can be used.\nThe function also accepts additional arguments, for example, whether it should remove NAs in the process (na.rm = TRUE) or not (na.rm = FALSE).\nFIGURE 4.13: Input raster (left) and resulting output raster (right) due to a focal operation - finding the minimum value in 3-by-3 moving windows.\nWe can quickly check if the output meets our expectations.\nIn our example, the minimum value has to be always the upper left corner of the moving window (remember we have created the input raster by row-wise incrementing the cell values by one starting at the upper left corner).\nIn this example, the weighting matrix consists only of 1s, meaning each cell has the same weight on the output, but this can be changed.\nFocal functions or filters play a dominant role in image processing.\nLow-pass or smoothing filters use the mean function to remove extremes.\nIn the case of categorical data, we can replace the mean with the mode, which is the most common value.\nBy contrast, high-pass filters accentuate features.\nThe line detection Laplace and Sobel filters might serve as an example here.\nCheck the focal() help page for how to use them in R (this will also be used in the exercises at the end of this chapter).\nTerrain processing, the calculation of topographic characteristics such as slope, aspect and flow directions, relies on focal functions.\nterrain() can be used to calculate these metrics, although some terrain algorithms, including the Zevenbergen and Thorne method to compute slope, are not implemented in this terra function.\nMany other algorithms — including curvatures, contributing areas and wetness indices — are implemented in open source desktop geographic information system (GIS) software.\nChapter 10 shows how to access such GIS functionality from within R.","code":"\nr_focal = focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)"},{"path":"spatial-operations.html","id":"zonal-operations","chapter":"4 Spatial data operations","heading":"4.3.5 Zonal operations","text":"\nJust like focal operations, zonal operations apply an aggregation function to multiple raster cells.\nHowever, a second raster, usually with categorical values, defines the zonal filters (or ‘zones’) in the case of zonal operations, as opposed to a neighborhood window in the case of focal operations presented in the previous section.\nConsequently, raster cells defining the zonal filter do not necessarily have to be neighbors.\nOur grain size raster is a good example, as illustrated in the right panel of Figure 3.2: different grain sizes are spread irregularly throughout the raster.\nFinally, the result of a zonal operation is a summary table grouped by zone, which is why this operation is also known as zonal statistics in the GIS world.\nThis is in contrast to focal operations which return a raster object by default.\nThe following code chunk uses the zonal() function to calculate the mean elevation associated with each grain size class.\nThis returns the statistics for each category, here the mean altitude for each grain size class.\nNote that it is also possible to get a raster with calculated statistics for each zone by setting the as.raster argument to TRUE.","code":"\nz = zonal(elev, grain, fun = \"mean\")\nz\n#> 
grain elev\n#> 1 clay 14.8\n#> 2 silt 21.2\n#> 3 sand 18.7"},{"path":"spatial-operations.html","id":"global-operations-and-distances","chapter":"4 Spatial data operations","heading":"4.3.6 Global operations and distances","text":"Global operations are a special case of zonal operations with the entire raster dataset representing a single zone.\nThe most common global operations are descriptive statistics for the entire raster dataset such as the minimum or maximum – we already discussed those in Section 3.3.2.\nAside from that, global operations are also useful for the computation of distance and weight rasters.\nIn the first case, one can calculate the distance from each cell to specific target cells.\nFor example, one might want to compute the distance to the nearest coast (see also terra::distance()).\nWe might also want to consider topography, that means, we are not only interested in the pure distance but would like also to avoid the crossing of mountain ranges when going to the coast.\nTo do so, we can weight the distance with elevation so that each additional altitudinal meter ‘prolongs’ the Euclidean distance (in Exercises 8 and 9 at the end of this chapter you will do exactly that).\nVisibility and viewshed computations also belong to the family of global operations (in the exercises of Chapter 10, you will compute a viewshed raster).","code":""},{"path":"spatial-operations.html","id":"map-algebra-counterparts-in-vector-processing","chapter":"4 Spatial data operations","heading":"4.3.7 Map algebra counterparts in vector processing","text":"Many map algebra operations have a counterpart in vector processing (Liu and Mason 2009).\nComputing a distance raster (global operation) while only considering a maximum distance (logical focal operation) is the equivalent of a vector buffer operation (Section 5.2.5).\nReclassifying raster data (either local or zonal function depending on the input) is equivalent to dissolving vector data (Section 4.2.5).\nOverlaying two rasters (local operation), where one contains NULL or NA values representing a mask, is similar to vector clipping (Section 5.2.5).\nQuite similar to spatial clipping is intersecting two layers (Section 4.2.1).\nThe difference is that these two layers (vector or raster) simply share an overlapping area (see Figure 5.8 for an example).\nHowever, be careful with the wording.\nSometimes the same words have slightly different meanings for raster and vector data models.\nWhile aggregating polygon geometries means dissolving boundaries, for raster data geometries it means increasing cell sizes and thereby reducing spatial resolution.\nZonal operations dissolve the cells of one raster in accordance with the zones (categories) of another raster dataset using an aggregating function.","code":""},{"path":"spatial-operations.html","id":"merging-rasters","chapter":"4 Spatial data operations","heading":"4.3.8 Merging rasters","text":"\nSuppose we would like to compute the NDVI (see Section 4.3.3), and additionally want to compute terrain attributes from elevation data for observations within a study area.\nSuch computations rely on remotely sensed information.\nThe corresponding imagery is often divided into scenes covering a specific spatial extent, and frequently, a study area covers more than one scene.\nThen, we would need to merge the scenes covered by our study area.\nIn the easiest case, we can just merge these scenes, that is put them side by side.\nThis is possible, for example, with digital elevation data.\nIn the following code chunk we first download the SRTM elevation data for Austria and Switzerland (for the country codes, see the geodata function country_codes()).\nIn a second step, we merge the two rasters into one.\nterra’s merge() command combines two images, and in case they overlap, it uses the value of the first raster.\nThe merging approach is of little use when the overlapping values do not correspond to each other.\nThis is frequently the case when you want to combine spectral imagery from scenes that were taken on different dates.\nThe merge() command will still work but you will see a clear border in the resulting image.\nOn the other hand, the mosaic() command lets you define a function for the overlapping area.\nFor instance, we could compute the mean value – this might smooth the clear border in the merged result but it will most likely not make it disappear.\nFor a more detailed introduction to remote sensing with R, see Wegmann, Leutner, and Dech (2016).","code":"\naut = geodata::elevation_30s(country = \"AUT\", path = tempdir())\nch = geodata::elevation_30s(country = \"CHE\", path = tempdir())\naut_ch = merge(aut, ch)"},{"path":"spatial-operations.html","id":"exercises-2","chapter":"4 Spatial data operations","heading":"4.4 Exercises","text":"E1. 
It was established in Section 4.2 that Canterbury was the region of New Zealand containing most of the 101 highest points in the country.\nHow many of these high points does the Canterbury region contain?\nBonus: plot the result using the plot() function to show all of New Zealand, the canterbury region highlighted in yellow, high points in Canterbury represented by red crosses (hint: pch = 7) and high points in other parts of New Zealand represented by blue circles. See the help page ?points for details with an illustration of different pch values.\nE2. Which region has the second highest number of nz_height points, and how many does it have?\nE3. Generalizing the question to all regions: how many of New Zealand’s 16 regions contain points which belong to the top 101 highest points in the country? Which regions?\nBonus: create a table listing these regions in order of the number of points and their name.\nE4. Test your knowledge of spatial predicates by finding out and plotting how US states relate to each other and other spatial objects.\nThe starting point of this exercise is to create an object representing Colorado state in the USA. Do this with the command\ncolorado = us_states[us_states$NAME == \"Colorado\",] (base R) or with the filter() function (tidyverse) and plot the resulting object in the context of US states.\nCreate a new object representing all the states that geographically intersect with Colorado and plot the result (hint: the most concise way to do this is with the subsetting method [).\nCreate another object representing all the objects that touch (have a shared boundary with) Colorado and plot the result (hint: remember you can use the argument op = st_intersects and other spatial relations during spatial subsetting operations in base R).\nBonus: create a straight line from the centroid of the District of Columbia near the East coast to the centroid of California near the West coast of the USA (hint: functions st_centroid(), st_union() and st_cast() described in Chapter 5 may help) and identify which states this long East-West line crosses.\nE5. Use dem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\")), and reclassify the elevation into three classes: low (<300), medium and high (>500).\nSecondly, read the NDVI raster (ndvi = rast(system.file(\"raster/ndvi.tif\", package = \"spDataLarge\"))) and compute the mean NDVI and the mean elevation for each altitudinal class.\nE6. Apply a line detection filter to rast(system.file(\"ex/logo.tif\", package = \"terra\")).\nPlot the result.\nHint: Read ?terra::focal().\nE7. Calculate the Normalized Difference Water Index (NDWI; (green - nir)/(green + nir)) of a Landsat image.\nUse the Landsat image provided by the spDataLarge package (system.file(\"raster/landsat.tif\", package = \"spDataLarge\")).\nAlso, calculate a correlation between NDVI and NDWI for this area (hint: you can use the layerCor() function).\nE8. A StackOverflow post (stackoverflow.com/questions/35555709) shows how to compute distances to the nearest coastline using raster::distance().\nTry to do something similar but with terra::distance(): retrieve a digital elevation model of Spain, and compute a raster which represents distances to the coast across the country (hint: use geodata::elevation_30s()).\nConvert the resulting distances from meters to kilometers.\nNote: it may be wise to increase the cell size of the input raster to reduce compute time during this operation (aggregate()).\nE9. Try to modify the approach used in the above exercise by weighting the distance raster with the elevation raster; every 100 altitudinal meters should increase the distance to the coast by 10 km.\nNext, compute and visualize the difference between the raster created using the Euclidean distance (E7) and the raster weighted by elevation.","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(spData)"},{"path":"geometry-operations.html","id":"geometry-operations","chapter":"5 Geometry operations","heading":"5 Geometry operations","text":"","code":""},{"path":"geometry-operations.html","id":"prerequisites-3","chapter":"5 Geometry operations","heading":"Prerequisites","text":"This chapter uses the same packages as Chapter 4 but with the addition of spDataLarge, which was installed in Chapter 2:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)"},{"path":"geometry-operations.html","id":"introduction-2","chapter":"5 Geometry operations","heading":"5.1 Introduction","text":"So far the book has explained the structure of geographic datasets (Chapter 2), and how to manipulate them based on their non-geographic attributes (Chapter 3) and spatial relations (Chapter 4).\nThis chapter focuses on manipulating the geographic elements of spatial objects, for example by creating buffers, simplifying and converting vector geometries, and aggregating and resampling raster data.\nAfter reading it — and attempting the exercises at the end — you should understand and have control over 
the geometry column in sf objects and the extent and geographic location of the pixels represented in rasters in relation to other geographic objects.\nSection 5.2 covers transforming vector geometries with ‘unary’ and ‘binary’ operations.\nUnary operations work on a single geometry in isolation, including simplification (of lines and polygons), the creation of buffers and centroids, and shifting/scaling/rotating single geometries using ‘affine transformations’ (Sections 5.2.1 to 5.2.4).\nBinary transformations modify one geometry based on the shape of another, including clipping and geometry unions, covered in Sections 5.2.5 to 5.2.7.\nType transformations (from a polygon to a line, for example) are demonstrated in Section 5.2.8.\nSection 5.3 covers geometric transformations on raster objects.\nThis involves changing the size and number of the underlying pixels, and assigning them new values.\nIt teaches how to change the resolution (also called raster aggregation and disaggregation), the extent and the origin of a raster.\nThese operations are especially useful if one would like to align raster datasets from diverse sources.\nAligned raster objects share a one-to-one correspondence between pixels, allowing them to be processed using map algebra operations, described in Section 4.3.2.\nThe interaction between raster and vector objects is covered in Chapter 6.\nIt presents how raster values can be ‘masked’ and ‘extracted’ by vector geometries.\nImportantly it also shows how to ‘polygonize’ rasters and ‘rasterize’ vector datasets, making the two data models more interchangeable.","code":""},{"path":"geometry-operations.html","id":"geo-vec","chapter":"5 Geometry operations","heading":"5.2 Geometric operations on vector data","text":"This section is about operations that in some way change the geometry of vector (sf) objects.\nIt is more advanced than the spatial data operations presented in the previous chapter (Section 4.2), because here we drill down into the geometry:\nthe functions discussed in this section work on objects of class sfc in addition to objects of class sf.","code":""},{"path":"geometry-operations.html","id":"simplification","chapter":"5 Geometry operations","heading":"5.2.1 Simplification","text":"\nSimplification is a process for generalization of vector objects (lines and polygons) usually for use in smaller scale maps.\nAnother reason for simplifying objects is to reduce the amount of memory, disk space and network bandwidth they consume:\nit may be wise to simplify complex geometries before publishing them in interactive maps.\nThe sf package provides st_simplify(), which uses the Douglas-Peucker algorithm to reduce the vertex count.\nst_simplify() uses dTolerance to control the level of generalization in map units (see Douglas and Peucker 1973 for details).\nFigure 5.1 illustrates simplification of a LINESTRING geometry representing the river Seine and tributaries.\nThe simplified geometry was created by the following command:\nFIGURE 5.1: Comparison of the original and simplified geometry of the seine object.\nThe resulting seine_simp object is a copy of the original seine but with fewer vertices.\nThis is apparent, with the result being visually simpler (Figure 5.1, right) and consuming less memory than the original object, as verified below:\nSimplification is also applicable for polygons.\nThis is illustrated using us_states, representing the contiguous United States.\nA limitation with st_simplify() is that it simplifies objects on a per-geometry basis.\nThis means the ‘topology’ is lost, resulting in overlapping and ‘holey’ areal units as illustrated in Figure 5.2 (right top panel).\nms_simplify() from rmapshaper provides an alternative.\nBy default it uses the Visvalingam algorithm, which overcomes some limitations of the Douglas-Peucker algorithm (Visvalingam and Whyatt 1993).\n\nThe following code chunk uses this function to simplify us_states.\nThe result has only 1% of the vertices of the input (set using the argument keep) but the number of objects remains intact because we set keep_shapes = TRUE:22\nAn alternative to the process of simplification is smoothing the boundaries of polygon and linestring geometries, which is implemented in the smoothr package.\nSmoothing interpolates the edges of geometries and does not necessarily lead to fewer vertices, but can be especially useful when working with geometries that arise from spatially vectorizing a raster (a topic covered in Chapter 6).\nsmoothr implements three techniques for smoothing: a Gaussian kernel regression, Chaikin’s corner cutting algorithm, and spline interpolation, which are all described in the package vignette and website.\nNote that similar to st_simplify(), the smoothing algorithms don’t preserve the ‘topology’.\nThe workhorse function of smoothr is smooth(), where the method argument specifies what smoothing technique to use.\nBelow is an example of using Gaussian kernel regression to smooth the borders of US states by using 
method=ksmooth.\nThe smoothness argument controls the bandwidth of the Gaussian that is used to smooth the geometry and has a default value of 1.\nFinally, a visual comparison of the original dataset with the simplified and smoothed versions is shown in Figure 5.2.\nDifferences can be observed between the outputs of the Douglas-Peucker (st_simplify), Visvalingam (ms_simplify), and Gaussian kernel regression (smooth(method=ksmooth)) algorithms.\nFIGURE 5.2: Polygon simplification in action, comparing the original geometry of the contiguous United States with simplified versions, generated with functions from the sf (top-right), rmapshaper (bottom-left), and smoothr (bottom-right) packages.\n","code":"\nseine_simp = st_simplify(seine, dTolerance = 2000) # 2000 m\nobject.size(seine)\n#> 18096 bytes\nobject.size(seine_simp)\n#> 9112 bytes\nus_states_simp1 = st_simplify(us_states, dTolerance = 100000) # 100 km\n# proportion of points to retain (0-1; default 0.05)\nus_states_simp2 = rmapshaper::ms_simplify(us_states, keep = 0.01,\n keep_shapes = TRUE)\nus_states_simp3 = smoothr::smooth(us_states, method = \"ksmooth\", smoothness = 6)"},{"path":"geometry-operations.html","id":"centroids","chapter":"5 Geometry operations","heading":"5.2.2 Centroids","text":"\nCentroid operations identify the center of geographic objects.\nLike statistical measures of central tendency (including mean and median definitions of ‘average’), there are many ways to define the geographic center of an object.\nAll of them create single point representations of more complex vector objects.\nThe most commonly used centroid operation is the geographic centroid.\nThis type of centroid operation (often referred to as ‘the centroid’) represents the center of mass in a spatial object (think of balancing a plate on your finger).\nGeographic centroids have many uses, for example to create a simple point representation of complex geometries, or to estimate distances between polygons.\nThey can be calculated with the sf function st_centroid() as demonstrated in the code below, which generates the geographic centroids of regions in New Zealand and tributaries to the River Seine, illustrated with black points in Figure 5.3.\nSometimes the geographic centroid falls outside the boundaries of their parent objects (think of a doughnut).\nIn such cases point on surface operations can be used to guarantee the point will be in the parent object (e.g., for labeling irregular multipolygon objects such as island states), as illustrated by the red points in Figure 5.3.\nNotice that these red points always lie on their parent objects.\nThey were created with st_point_on_surface() as follows:23\nFIGURE 5.3: Centroids (black points) and ‘points on surface’ (red points) of New Zealand’s regions (left) and the Seine (right) datasets.\nOther types of centroids exist, including the Chebyshev center and the visual center.\nWe will not explore how to calculate these using R here, as we’ll see in Chapter 11.","code":"\nnz_centroid = st_centroid(nz)\nseine_centroid = st_centroid(seine)\nnz_pos = st_point_on_surface(nz)\nseine_pos = st_point_on_surface(seine)"},{"path":"geometry-operations.html","id":"buffers","chapter":"5 Geometry operations","heading":"5.2.3 Buffers","text":"\nBuffers are polygons representing the area within a given distance of a geometric feature:\nregardless of whether the input is a point, line or polygon, the output is a polygon.\nUnlike simplification (which is often used for visualization and reducing file size) buffering tends to be used for geographic data analysis.\nHow many points are within a given distance of this line?\nWhich demographic groups are within travel distance of this new shop?\nThese kinds of questions can be answered and visualized by creating buffers around the geographic entities of interest.\nFigure 5.4 illustrates buffers of different sizes (5 and 50 km) surrounding the river Seine and tributaries.\nThese buffers were created with the commands below, which show that the command st_buffer() requires at least two arguments: an input geometry and a distance, provided in the units of the CRS (in this case meters).\nFIGURE 5.4: Buffers around the Seine dataset of 5 km (left) and 50 km (right). 
Note the colors, which reflect the fact that one buffer is created per geometry feature.\nst_buffer() has a few additional arguments.\nThe most important ones are:\nnQuadSegs (when the GEOS engine is used), which means ‘number of segments per quadrant’ and is set by default to 30 (meaning circles created by buffers are composed of \\(4 \\times 30 = 120\\) lines).\nUnusual cases where it may be useful to include this argument are when the memory consumed by the output of a buffer operation is a major concern (in which case it should be reduced) or when very high precision is needed (in which case it should be increased)\nmax_cells (when the S2 engine is used), the larger the value, the more smooth the buffer will be, but the calculations will take longer\nendCapStyle and joinStyle (when the GEOS engine is used), which control the appearance of the buffer’s edges\nsingleSide (when the GEOS engine is used), which controls whether the buffer is created on one or both sides of the input geometry","code":"\nseine_buff_5km = st_buffer(seine, dist = 5000)\nseine_buff_50km = st_buffer(seine, dist = 50000)"},{"path":"geometry-operations.html","id":"affine-transformations","chapter":"5 Geometry operations","heading":"5.2.4 Affine transformations","text":"\nAffine transformation is any transformation that preserves lines and parallelism.\nHowever, angles or length are not necessarily preserved.\nAffine transformations include, among others, shifting (translation), scaling and rotation.\nAdditionally, it is possible to use any combination of these.\nAffine transformations are an essential part of geocomputation.\nFor example, shifting is needed for labels placement, scaling is used in non-contiguous area cartograms (see Section 9.6), and many affine transformations are applied when reprojecting or improving the geometry that was created based on a distorted or wrongly projected map.\nThe sf package implements affine transformation for objects of classes sfg and sfc.\nShifting moves every point by the same distance in map units.\nIt could be done by adding a numerical vector to a vector object.\nFor example, the code below shifts all y-coordinates by 100,000 meters to the north, but leaves the x-coordinates untouched (Figure 5.5, left panel).\nScaling enlarges or shrinks objects by a factor.\nIt can be applied either globally or locally.\nGlobal scaling increases or decreases all coordinates values in relation to the origin coordinates, while keeping all geometries topological relations intact.\nIt can be done by subtraction or multiplication of a sfg or sfc object.\nLocal scaling treats geometries independently and requires points around which geometries are going to be scaled, e.g., centroids.\nIn the example below, each geometry is shrunk by a factor of two around the centroids (Figure 5.5, middle panel).\nTo achieve that, each object is firstly shifted in a way that its center has coordinates of 0, 0 ((nz_sfc - nz_centroid_sfc)).\nNext, the sizes of the geometries are reduced by half (* 0.5).\nFinally, each object’s centroid is moved back to the input data coordinates (+ nz_centroid_sfc).\nRotation of two-dimensional coordinates requires a rotation matrix:\\[\nR =\n\\begin{bmatrix}\n\\cos \\theta & -\\sin \\theta \\\\ \n\\sin \\theta & \\cos \\theta \\\\\n\\end{bmatrix}\n\\]It rotates points in a clockwise direction.\nThe rotation matrix can be implemented in R as:\nThe rotation function accepts one argument - the rotation angle in degrees.\nRotation could be done around selected points, such as centroids (Figure 5.5, right panel).\nSee vignette(\"sf3\") for more examples.\nFIGURE 5.5: Illustrations of affine transformations: shift, scale and rotate.\nFinally, the newly created geometries can replace the old ones with the st_set_geometry() function:","code":"\nnz_sfc = st_geometry(nz)\nnz_shift = nz_sfc + c(0, 100000)\nnz_centroid_sfc = st_centroid(nz_sfc)\nnz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc\nrotation = function(a){\n r = a * pi / 180 #degrees to radians\n matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)\n} \nnz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc\nnz_scale_sf = st_set_geometry(nz, nz_scale)"},{"path":"geometry-operations.html","id":"clipping","chapter":"5 Geometry operations","heading":"5.2.5 Clipping","text":"\nSpatial clipping is a form of spatial subsetting that involves changes to the geometry columns of at least some of the affected features.\nClipping can only apply to features more complex than points:\nlines, polygons and their ‘multi’ equivalents.\nTo illustrate the concept we will start with a simple example:\ntwo overlapping circles with a center point one unit away from each other and a radius of one (Figure 5.6).\nFIGURE 5.6: Overlapping circles.\nImagine you want to select not one circle or the other, but the space covered by both x and y.\nThis can be done using the function st_intersection(), illustrated using objects named x and y 
FIGURE 5.7: Overlapping circles with a gray color indicating the intersection between them.

The subsequent code chunk demonstrates how this works for all combinations of the 'Venn' diagram representing `x` and `y`, inspired by Figure 5.1 of the book *R for Data Science* (Grolemund and Wickham 2016):

```r
b = st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points
b = st_buffer(b, dist = 1) # convert points to circles
plot(b, border = "grey")
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y"), cex = 3) # add text
x = b[1]
y = b[2]
x_and_y = st_intersection(x, y)
plot(b, border = "grey")
plot(x_and_y, col = "lightgrey", border = "grey", add = TRUE) # intersecting area
```

FIGURE 5.8: Spatial equivalents of logical operators.
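The chunk above only computes the intersection; the remaining 'Venn' combinations shown in Figure 5.8 can be reproduced with **sf**'s other binary geometry operations. A minimal sketch, recreating the two circles so it is self-contained; `st_difference()`, `st_union()` and `st_sym_difference()` are sf's spatial equivalents of NOT, OR and XOR:

```r
library(sf)
# recreate the two overlapping unit circles used in the example
b = st_buffer(st_sfc(st_point(c(0, 1)), st_point(c(1, 1))), dist = 1)
x = b[1]
y = b[2]
x_and_y = st_intersection(x, y)     # x AND y (the lens in the middle)
x_not_y = st_difference(x, y)       # x NOT y (left crescent)
y_not_x = st_difference(y, x)       # y NOT x (right crescent)
x_or_y  = st_union(x, y)            # x OR y  (both circles merged)
x_xor_y = st_sym_difference(x, y)   # x XOR y (both crescents, no lens)
```

Each result is a single geometry; plotting them side by side reproduces the panels of Figure 5.8.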
### 5.2.6 Subsetting and clipping

Clipping objects can change their geometry, but it can also subset objects, returning only features that intersect (or partly intersect) with a clipping/subsetting object. To illustrate this point, we will subset points that cover the bounding box of the circles `x` and `y` in Figure 5.8. Some points will be inside just one circle, some will be inside both and some will be inside neither. `st_sample()` is used below to generate a simple random distribution of points within the extent of circles `x` and `y`, resulting in the output illustrated in Figure 5.9 and raising the question: how to subset the points to return only the points that intersect with *both* `x` and `y`?

FIGURE 5.9: Randomly distributed points within the bounding box enclosing circles `x` and `y`. Points that intersect with both objects `x` and `y` are highlighted.

The code chunk below demonstrates three ways to achieve the same result. We can use the intersection of `x` and `y` (represented by `x_and_y` in the previous code chunk) as a subsetting object directly, as shown in the first approach below. We can also find the *intersection* between the input points represented by `p` and the subsetting/clipping object `x_and_y`, as demonstrated in the second approach. This second approach will return features that partly intersect with `x_and_y`, but with modified geometries for spatially extensive features that cross the border of the subsetting object. The third approach is to create a subsetting object using the binary spatial predicate `st_intersects()`, introduced in the previous chapter. The results are identical (except for superficial differences in attribute names), but the implementations differ substantially:

```r
bb = st_bbox(st_union(x, y))
box = st_as_sfc(bb)
set.seed(2024)
p = st_sample(x = box, size = 10)
x_and_y = st_intersection(x, y)
# way #1
p_xy1 = p[x_and_y]
# way #2
p_xy2 = st_intersection(p, x_and_y)
# way #3
sel_p_xy = st_intersects(p, x, sparse = FALSE)[, 1] &
  st_intersects(p, y, sparse = FALSE)[, 1]
p_xy3 = p[sel_p_xy]
# plot the inputs and the result (Figure 5.9)
plot(box, border = "grey", lty = 2)
plot(x, add = TRUE, border = "grey")
plot(y, add = TRUE, border = "grey")
plot(p, add = TRUE, cex = 3.5)
plot(p_xy1, cex = 5, col = "red", add = TRUE)
text(x = c(-0.5, 1.5), y = 1, labels = c("x", "y"), cex = 3)
```

Although the example above is rather contrived, and provided for educational rather than applied purposes, we encourage the reader to reproduce the results to deepen their understanding of handling geographic vector objects in R. It also raises an important question: which implementation to use? Generally, more concise implementations should be favored, meaning the first approach above. We will return to the question of choosing between different implementations of the same technique or algorithm in Chapter 11.
### 5.2.7 Geometry unions

As we saw in Section 3.2.3, spatial aggregation can silently dissolve the geometries of touching polygons in the same group. This is demonstrated in the code chunk below, in which the 48 contiguous US states plus the District of Columbia (`us_states`) are aggregated into four regions using base and **dplyr** functions (see the results in Figure 5.10):

```r
regions = aggregate(x = us_states[, "total_pop_15"], by = list(us_states$REGION),
                    FUN = sum, na.rm = TRUE)
regions2 = us_states |>
  group_by(REGION) |>
  summarize(pop = sum(total_pop_15, na.rm = TRUE))
```

FIGURE 5.10: Spatial aggregation on contiguous polygons, illustrated by aggregating the population of US states into regions, with population represented by color. Note the operation automatically dissolves boundaries between states.

What is going on in terms of the geometries? Behind the scenes, both `aggregate()` and `summarize()` combine the geometries and dissolve the boundaries between them using `st_union()`. This is demonstrated in the code chunk below, which creates a united western US:

```r
us_west = us_states[us_states$REGION == "West", ]
us_west_union = st_union(us_west)
```

The function can also take two geometries and unite them, as demonstrated in the code chunk below, which creates a united western block incorporating Texas (challenge: reproduce and plot the result):

```r
texas = us_states[us_states$NAME == "Texas", ]
texas_union = st_union(us_west_union, texas)
```
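To see that `st_union()` really does dissolve internal borders rather than merely grouping features, here is a minimal self-contained sketch (the two adjacent unit squares below are hypothetical toy data, not from `us_states`):

```r
library(sf)
# two adjacent unit squares sharing the edge x = 1
squares = st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
  st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
)
merged = st_union(squares)
length(merged)   # 1: a single geometry
st_area(merged)  # 2: the shared border is dissolved, total area is preserved
```

The union of the two features is a single polygon whose area equals the sum of the inputs — the internal boundary has disappeared, exactly as in the regional aggregation above.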
### 5.2.8 Type transformations

Geometry casting is a powerful operation that enables transformation of the geometry type. It is implemented in the `st_cast()` function from the **sf** package. Importantly, `st_cast()` behaves differently on single simple feature geometry (`sfg`) objects, simple feature geometry column (`sfc`) objects and simple features (`sf`) objects.

Let's create a multipoint to illustrate how geometry casting works on simple feature geometry (`sfg`) objects:

```r
multipoint = st_multipoint(matrix(c(1, 3, 5, 1, 3, 1), ncol = 2))
```

In this case, `st_cast()` can be useful to transform the new object into a linestring or a polygon (Figure 5.11):

```r
linestring = st_cast(multipoint, "LINESTRING")
polyg = st_cast(multipoint, "POLYGON")
```

FIGURE 5.11: Examples of a linestring and a polygon casted from a multipoint geometry.

Conversion from multipoint to linestring is a common operation that creates a line object from ordered point observations, such as GPS measurements or geotagged media. This, in turn, allows us to perform spatial operations such as the calculation of the length of the path traveled. Conversion from multipoint or linestring to polygon is often used to calculate an area, for example from a set of GPS measurements taken around a lake or from the corners of a building lot.

The transformation process can also be reversed using `st_cast()`:

```r
multipoint_2 = st_cast(linestring, "MULTIPOINT")
multipoint_3 = st_cast(polyg, "MULTIPOINT")
all.equal(multipoint, multipoint_2)
#> [1] TRUE
all.equal(multipoint, multipoint_3)
#> [1] TRUE
```

Geometry casting on simple feature geometry column (`sfc`) and simple features objects works the same as for `sfg` objects in most cases. One important difference is the conversion between multi-types and non-multi-types: as a result of this process, multi-objects of `sfc` or `sf` are split into many non-multi-objects.

Let's say we have the following `sf` objects:

- POI - POINT type (with one point by definition)
- MPOI - MULTIPOINT type with four points
- LIN - LINESTRING type with one linestring containing five points
- MLIN - MULTILINESTRING type with two linestrings (one with five points and one with two points)
- POL - POLYGON type with one polygon (created using five points)
- MPOL - MULTIPOLYGON type consisting of two polygons (each consisting of five points)
- GC - GEOMETRYCOLLECTION type with two geometries, a MULTIPOINT (four points) and a LINESTRING (five points)

Table 5.1 shows possible geometry type transformations on the simple feature objects listed above. Single simple feature geometries (represented by the first column in the table) can be transformed into multiple geometry types, represented by the columns of Table 5.1. Some of the transformations are not possible: you cannot convert a single point into a multilinestring or a polygon, for example, which explains why cells [1, 4:5] of the table contain NA. Some transformations split single features of the input into multiple sub-features, 'expanding' `sf` objects (adding new rows with duplicate attribute values). When a multipoint geometry consisting of five pairs of coordinates is transformed into a 'POINT' geometry, for example, the output will contain five features.

TABLE 5.1: Geometry casting on simple feature geometries (see Section 2.1) with input type by row and output type by column. Note: values like (1) represent the number of features; NA means the operation is not possible.

Let's try to apply geometry type transformations on a new object, `multilinestring_sf`, as an example (on the left in Figure 5.12):

```r
multilinestring_list = list(matrix(c(1, 4, 5, 3), ncol = 2),
                            matrix(c(4, 4, 4, 1), ncol = 2),
                            matrix(c(2, 4, 2, 2), ncol = 2))
multilinestring = st_multilinestring(multilinestring_list)
multilinestring_sf = st_sf(geom = st_sfc(multilinestring))
multilinestring_sf
#> Simple feature collection with 1 feature and 0 fields
#> Geometry type: MULTILINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 5
#> CRS:           NA
#>                            geom
#> 1 MULTILINESTRING ((1 5, 4 3)...
```

You can imagine it as a road or river network. The new object has only one row that defines all of the lines. This restricts the number of operations that can be done, for example it prevents adding names to each line segment or calculating lengths of single lines.
The `st_cast()` function can be used in this situation, as it separates the one multilinestring into three linestrings:

```r
linestring_sf2 = st_cast(multilinestring_sf, "LINESTRING")
linestring_sf2
#> Simple feature collection with 3 features and 0 fields
#> Geometry type: LINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 5
#> CRS:           NA
#>                    geom
#> 1 LINESTRING (1 5, 4 3)
#> 2 LINESTRING (4 4, 4 1)
#> 3 LINESTRING (2 2, 4 2)
```

FIGURE 5.12: Examples of type casting between MULTILINESTRING (left) and LINESTRING (right).

The newly created object allows for attribute creation (see Section 3.2.5) and length measurements:

```r
linestring_sf2$name = c("Riddle Rd", "Marshall Ave", "Foulke St")
linestring_sf2$length = st_length(linestring_sf2)
linestring_sf2
#> Simple feature collection with 3 features and 2 fields
#> Geometry type: LINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 4 ymax: 5
#> CRS:           NA
#>                    geom         name length
#> 1 LINESTRING (1 5, 4 3)    Riddle Rd   3.61
#> 2 LINESTRING (4 4, 4 1) Marshall Ave   3.00
#> 3 LINESTRING (2 2, 4 2)    Foulke St   2.00
```
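GEOMETRYCOLLECTION objects such as GC above cannot be cast directly to every type, but their components can be pulled out with sf's `st_collection_extract()`. A short sketch, assuming a collection matching the GC description above (a four-point MULTIPOINT plus a five-point LINESTRING):

```r
library(sf)
# a geometry collection like GC: MULTIPOINT (4 points) + LINESTRING (5 points)
gc = st_geometrycollection(list(
  st_multipoint(rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))),
  st_linestring(rbind(c(-1, 5), c(0, 4), c(1, 3), c(2, 2), c(3, 2)))
))
gc_sfc = st_sfc(gc)
pts = st_collection_extract(gc_sfc, "POINT")      # keeps the (multi)point part
lns = st_collection_extract(gc_sfc, "LINESTRING") # keeps the linestring part
```

The extracted parts are ordinary `sfc` objects, so the usual `st_cast()` transformations of Table 5.1 apply to them afterwards.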
## 5.3 Geometric operations on raster data

Geometric raster operations include the shift, flipping, mirroring, scaling, rotation or warping of images. These operations are necessary for a variety of applications including georeferencing, used to allow images to be overlaid on an accurate map with a known CRS (Liu and Mason 2009). A variety of georeferencing techniques exist, including:

- Georectification based on known ground control points
- Orthorectification, which also accounts for local topography
- Image registration, used to combine images of the same thing shot from different sensors by aligning one image with another (in terms of coordinate system and resolution)

R is rather unsuitable for the first two techniques since they often require manual intervention, which is why they are usually done with the help of dedicated GIS software (see also Chapter 10). On the other hand, aligning several images is possible in R, and this section shows, among other things, how to do so. This often includes changing the extent, resolution and origin of an image. A matching projection is of course also required, but that was already covered in Section 7.8.

In any case, there are other reasons to perform a geometric operation on a single raster image. For instance, in Chapter 14 we define metropolitan areas in Germany as 20 km² pixels with more than 500,000 inhabitants. The original inhabitant raster, however, has a resolution of 1 km², which is why we will decrease (aggregate) the resolution by a factor of 20 (see Section 14.5). Another reason for aggregating a raster is simply to decrease run-time or save disk space. Of course, this approach is only recommended if the task at hand allows a coarser resolution of raster data.

### 5.3.1 Geometric intersections

In Section 4.3.1 we have shown how to extract values from a raster overlaid by other spatial objects. To retrieve a spatial output, we can use almost the same subsetting syntax. The only difference is that we have to make clear that we would like to keep the matrix structure by setting the `drop` argument to `FALSE`. This will return a raster object containing the cells whose midpoints overlap with `clip`. For the same operation, we can also use the `intersect()` and `crop()` commands:
```r
elev = rast(system.file("raster/elev.tif", package = "spData"))
clip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,
            resolution = 0.3, vals = rep(1, 9))
elev[clip, drop = FALSE]
#> class       : SpatRaster 
#> dimensions  : 2, 1, 1  (nrow, ncol, nlyr)
#> resolution  : 0.5, 0.5  (x, y)
#> extent      : 1, 1.5, -0.5, 0.5  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source(s)   : memory
#> varname     : elev 
#> name        : elev 
#> min value   :   18 
#> max value   :   24
```

### 5.3.2 Extent and origin

When merging or performing map algebra on rasters, their resolution, projection, origin and/or extent have to match. Otherwise, how should we add the values of one raster with a resolution of 0.2 decimal degrees to a second raster with a resolution of 1 decimal degree? The same problem arises when we would like to merge satellite imagery from different sensors with different projections and resolutions. We can deal with such mismatches by aligning the rasters.

In the simplest case, two images only differ with regard to their extent. The following code adds one row and two columns to each side of the raster while setting all new values to NA (Figure 5.13):

```r
elev = rast(system.file("raster/elev.tif", package = "spData"))
elev_2 = extend(elev, c(1, 2))
```

FIGURE 5.13: Original raster (left) and the same raster (right) extended by one row on the top and bottom and two columns on the left and right.

Performing an algebraic operation on two objects with differing extents in R, the **terra** package returns an error:
```r
elev_3 = elev + elev_2
#> Error: [+] extents do not match
```

However, we can align the extent of two rasters with `extend()`. Instead of telling the function how many rows or columns should be added (as done above), we allow it to figure this out by using another raster object. Here, we extend the `elev` object to the extent of `elev_2`. The values of the newly added rows and columns are set to NA:

```r
elev_4 = extend(elev, elev_2)
```

The origin of a raster is the cell corner closest to the coordinates (0, 0). The `origin()` function returns the coordinates of the origin. In the example below a cell corner exists with coordinates (0, 0), but that is not necessarily the case. If two rasters have different origins, their cells do not overlap completely, which would make map algebra impossible. To change the origin, use `origin()`. Figure 5.14 reveals the effect of changing the origin in this way:

```r
origin(elev_4)
#> [1] 0 0
# change the origin
origin(elev_4) = c(0.25, 0.25)
```

FIGURE 5.14: Rasters with identical values but different origins.

Note that changing the resolution (next section) frequently also changes the origin.

### 5.3.3 Aggregation and disaggregation

Raster datasets can also differ with regard to their resolution. To match resolutions, one can either decrease (`aggregate()`) or increase (`disagg()`) the resolution of one raster. As an example, we here change the spatial resolution of `dem` (found in the **spDataLarge** package) by a factor of 5 (Figure 5.15). Additionally, the output cell value is going to correspond to the mean of the input cells (note that one could use other functions as well, such as `median()`, `sum()`, etc.).

FIGURE 5.15: Original raster (left). Aggregated raster (right).

Table 5.2 compares the properties of the original and aggregated raster. Notice that "decreasing" the resolution with `aggregate()` actually increases the resolution values from \((30.85, 30.85)\) to \((154.25, 154.25)\). This is done by decreasing the number of rows (`nrow`) and columns (`ncol`) (see Section 2.3). The extent is slightly adjusted to accommodate the new grid size.

TABLE 5.2: Properties of the original and aggregated raster.

The `disagg()` function increases the resolution of raster objects. It comes with two methods for computing the values of the newly created cells: the default method (`method = "near"`) simply gives all output cells the value of the input cell, and hence duplicates values, which translates into a 'blocky' output. The `bilinear` method uses the four nearest pixel centers of the input image (salmon colored points in Figure 5.16) to compute an average weighted by distance (arrows in Figure 5.16). The value of the output cell is represented by a square in the upper left corner of Figure 5.16.

FIGURE 5.16: The distance-weighted average of the four closest input cells determines the output when using the bilinear method for disaggregation.

Comparing the values of `dem` and `dem_disagg` tells us that they are not identical (you can also use `compareGeom()` or `all.equal()`). However, this was hardly to be expected, since disaggregating is a simple interpolation technique. It is important to keep in mind that disaggregating results in a finer resolution; the corresponding values, however, are only as accurate as their lower resolution source:
```r
dem = rast(system.file("raster/dem.tif", package = "spDataLarge"))
dem_agg = aggregate(dem, fact = 5, fun = mean)
dem_disagg = disagg(dem_agg, fact = 5, method = "bilinear")
identical(dem, dem_disagg)
#> [1] FALSE
```

### 5.3.4 Resampling

The above methods of aggregation and disaggregation are only suitable when we want to change the resolution of a raster by the aggregation/disaggregation factor. However, what to do when we have two rasters with different resolutions and origins? This is the role of resampling — a process of computing values for new pixel locations. In short, this process takes the values of the original raster and recalculates new values for a target raster with custom resolution and origin (Figure 5.17).

FIGURE 5.17: Resampling of an original (input) raster into a target raster with custom resolution and origin.

There are several methods for estimating values for a raster with different resolutions/origins, as shown in Figure 5.18. The main resampling methods include:

- Nearest neighbor: assigns the value of the nearest cell of the original raster to the cell of the target one. This is a fast, simple technique that is usually suitable for resampling categorical rasters
- Bilinear interpolation: assigns a weighted average of the four nearest cells from the original raster to the cell of the target one (Figure 5.16). This is the fastest method appropriate for continuous rasters
- Cubic interpolation: uses the values of the 16 nearest cells of the original raster to determine the output cell value, applying third-order polynomial functions. Used for continuous rasters; it results in a smoother surface compared to bilinear interpolation, but is computationally more demanding
- Cubic spline interpolation: also uses the values of the 16 nearest cells of the original raster to determine the output cell value, but applies cubic splines (piecewise third-order polynomial functions). Used for continuous rasters
- Lanczos windowed sinc resampling: uses the values of the 36 nearest cells of the original raster to determine the output cell value. Used for continuous rasters
The explanation above highlights that only nearest neighbor resampling is suitable for categorical rasters, while all the methods can be used (with different outcomes) for continuous rasters. Please note also that the methods gain in both complexity and processing time from top to bottom. Moreover, resampling can be done using statistics (e.g., minimum or mode) of all contributing cells.

To apply resampling, the **terra** package provides the `resample()` function. It accepts an input raster (`x`), a raster with the target spatial properties (`y`), and a resampling method (`method`). We need a raster with the target spatial properties to see how the `resample()` function works. For this example, we create `target_rast`, but in practice you would often use an already existing raster object:

```r
target_rast = rast(xmin = 794650, xmax = 798250,
                   ymin = 8931750, ymax = 8935350,
                   resolution = 300, crs = "EPSG:32717")
```

Next, we provide our two raster objects as the first two arguments and one of the resampling methods described above:

```r
dem_resampl = resample(dem, y = target_rast, method = "bilinear")
```

Figure 5.18 shows a comparison of different resampling methods applied to the `dem` object.

FIGURE 5.18: Visual comparison of the original raster and five different resampling methods.

The `resample()` function also offers additional resampling methods, including `sum`, `min`, `q1`, `med`, `q3`, `max`, `average`, `mode`, and `rms`. These calculate a given statistic based on the values of all non-NA contributing grid cells. For example, `sum` is useful when each raster cell represents a spatially extensive variable (e.g., number of people): as an effect of using `sum`, the resampled raster should have the same total number of people as the original one. As you will see in Section 7.8, raster reprojection is a special case of resampling in which the target raster has a different CRS than the original raster.

Most geometry operations in **terra** are user-friendly, rather fast, and work on large raster objects. However, there are cases when **terra** is not the most performant option, for extensive rasters or many raster files, and some alternatives should be considered. The most established alternatives come from the GDAL library. It contains several utility functions, including:

- `gdalinfo` - lists various information about a raster file, including its resolution, CRS, bounding box, and more
- `gdal_translate` - converts raster data between different file formats
- `gdal_rasterize` - converts vector data into raster files
- `gdalwarp` - allows raster mosaicing, resampling, cropping, and reprojecting
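As an illustration of the last point, here is a hypothetical `gdalwarp` call performing roughly the same job as the `resample()` chunk above. The file names are made up for illustration, and the GDAL command line tools must be installed for this to run:

```shell
# Resample a DEM to a 300 m grid in UTM zone 17S using bilinear interpolation.
# input_dem.tif / output_dem_300m.tif are placeholder file names.
gdalwarp -t_srs EPSG:32717 -tr 300 300 -r bilinear \
  input_dem.tif output_dem_300m.tif
```

`-t_srs` sets the target CRS, `-tr` the target resolution (x and y), and `-r` the resampling algorithm, mirroring the `crs`, `resolution`, and `method` choices made in R above.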
## 5.4 Exercises

E1. Generate and plot simplified versions of the `nz` dataset. Experiment with different values of `keep` (ranging from 0.5 to 0.00005) for `ms_simplify()` and `dTolerance` (from 100 to 100,000) for `st_simplify()`. At what value does the form of the result start to break down for each method, making New Zealand unrecognizable? Advanced: What is different about the geometry type of the results from `st_simplify()` compared with the geometry type of `ms_simplify()`? What problems does this create and how can they be resolved?

E2. In the first exercise in the chapter on spatial data operations, it was established that the Canterbury region had 70 of the 101 highest points in New Zealand. Using `st_buffer()`, how many points in `nz_height` are within 100 km of Canterbury?

E3. Find the geographic centroid of New Zealand. How far is it from the geographic centroid of Canterbury?

E4. Most world maps have a north-up orientation. A world map with a south-up orientation could be created by a reflection (one of the affine transformations not mentioned in this chapter) of the `world` object's geometry. Write code to do so. Hint: you can use the `rotation()` function from this chapter for this transformation. Bonus: create an upside-down map of your country.

E5. Run the code in Section 5.2.6. With reference to the objects created in that section, subset the point in `p` that is contained within `x` *and* `y`:

- Using base subsetting operators
- Using an intermediary object created with `st_intersection()`

E6. Calculate the length of the boundary lines of US states in meters. Which state has the longest border and which has the shortest? Hint: the `st_length` function computes the length of a LINESTRING or MULTILINESTRING geometry.

E7.
Read the srtm.tif file into R (`srtm = rast(system.file("raster/srtm.tif", package = "spDataLarge"))`). This raster has a resolution of 0.00083 × 0.00083 degrees. Change its resolution to 0.01 × 0.01 degrees using all of the methods available in the **terra** package. Visualize the results. Can you notice any differences between the results of these resampling methods?

# 6 Raster-vector interactions

## Prerequisites

This chapter requires the following packages:

```r
library(sf)
library(terra)
library(dplyr)
```

## 6.1 Introduction

This chapter focuses on interactions between raster and vector geographic data models, introduced in Chapter 2. It includes several main techniques: raster cropping and masking using vector objects (Section 6.2), extracting raster values using different types of vector data (Section 6.3), and raster-vector conversion (Sections 6.4 and 6.5). The above concepts are demonstrated using data from previous chapters to understand their potential real-world applications.

## 6.2 Raster cropping

Many geographic data projects involve integrating data from many different sources, such as remote sensing images (rasters) and administrative boundaries (vectors). Often the extent of input raster datasets is larger than the area of interest. In this case, raster cropping and masking are useful for unifying the spatial extent of input data. Both operations reduce object memory use and the associated computational resources for subsequent analysis steps, and may be a necessary preprocessing step before creating attractive maps involving raster data.

We will use two objects to illustrate raster cropping:

- A SpatRaster object `srtm` representing elevation (meters above sea level) in southwestern Utah
- A vector (sf) object `zion` representing Zion National Park
Both the target and cropping objects must have the same projection. The following code chunk therefore not only reads the datasets from the **spDataLarge** package installed in Chapter 2, it also 'reprojects' `zion` (a topic covered in Chapter 7):

```r
srtm = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
zion = read_sf(system.file("vector/zion.gpkg", package = "spDataLarge"))
zion = st_transform(zion, st_crs(srtm))
```

We use `crop()` from the **terra** package to crop the `srtm` raster. The function reduces the rectangular extent of the object passed to its first argument based on the extent of the object passed to its second argument. This functionality is demonstrated in the command below, which generates Figure 6.1(B):

```r
srtm_cropped = crop(srtm, zion)
```

Related to `crop()` is the **terra** function `mask()`, which sets values outside of the bounds of the object passed to its second argument to NA. The following command therefore masks every cell outside of the Zion National Park boundaries (Figure 6.1(C)):

```r
srtm_masked = mask(srtm, zion)
```

Importantly, we want to use both `crop()` and `mask()` together in most cases. This combination of functions would (a) limit the raster's extent to our area of interest and then (b) replace all of the values outside of the area with NA:

```r
srtm_cropped = crop(srtm, zion)
srtm_final = mask(srtm_cropped, zion)
```

Changing the settings of `mask()` yields different results. Setting `inverse = TRUE` will mask everything *inside* the bounds of the park (see `?mask` for details) (Figure 6.1(D)), while setting `updatevalue = 0` will set all pixels outside the national park to 0:

```r
srtm_inv_masked = mask(srtm, zion, inverse = TRUE)
```

FIGURE 6.1: Raster cropping and raster masking.

## 6.3 Raster extraction

Raster extraction is the process of identifying and returning the values associated with a 'target' raster at specific locations, based on a (typically vector) geographic 'selector' object. The results depend on the type of selector used (points, lines or polygons) and on the arguments passed to the `terra::extract()` function. The reverse of raster extraction — assigning raster cell values based on vector objects — is rasterization, described in Section 6.4.

The basic example is extracting the value of a raster cell at specific points. For this purpose, we will use `zion_points`, which contains a sample of 30 locations within Zion National Park (Figure 6.2).
The following command extracts elevation values from `srtm` and creates a data frame with the points' IDs (one value per vector row) and the related `srtm` value for each point. We can then add the resulting object to our `zion_points` dataset with the `cbind()` function:

```r
data("zion_points", package = "spDataLarge")
elevation = terra::extract(srtm, zion_points)
zion_points = cbind(zion_points, elevation)
```

FIGURE 6.2: Locations of points used for raster extraction.

Raster extraction also works with line selectors. In that case, it extracts one value for each raster cell touched by the line. However, the line extraction approach is not recommended for obtaining values along transects, as it is hard to get the correct distance between each pair of extracted raster values. In this case, a better approach is to split the line into many points and then extract the values for these points. To demonstrate this, the code below creates `zion_transect`, a straight line going from northwest to southeast of Zion National Park, illustrated in Figure 6.3(A) (see Section 2.2 for a recap on the vector data model):

```r
zion_transect = cbind(c(-113.2, -112.9), c(37.45, 37.2)) |>
  st_linestring() |>
  st_sfc(crs = crs(srtm)) |>
  st_sf(geometry = _)
```

The utility of extracting heights from a linear selector is illustrated by imagining that you are planning a hike. The method demonstrated below provides an 'elevation profile' of the route (the line does not need to be straight), useful for estimating how long the hike will take due to long climbs.

The first step is to add a unique `id` for each transect. Next, with the `st_segmentize()` function we can add points along our line(s) with a provided density (`dfMaxLength`) and convert them into points with `st_cast()`:

```r
zion_transect$id = 1:nrow(zion_transect)
zion_transect = st_segmentize(zion_transect, dfMaxLength = 250)
zion_transect = st_cast(zion_transect, "POINT")
```

Now, we have a large set of points, and we want to derive the distance between the first point in each transect and each of the subsequent points. In this case, we have only one transect, but the code should, in principle, work on any number of transects:

```r
zion_transect = zion_transect |>
  group_by(id) |>
  mutate(dist = st_distance(geometry)[, 1])
```

Finally, we can extract elevation values for each point in our transects and combine this information with our main object:

```r
zion_elev = terra::extract(srtm, zion_transect)
zion_transect = cbind(zion_transect, zion_elev)
```

The resulting `zion_transect` can be used to create elevation profiles, as illustrated in Figure 6.3(B).

FIGURE 6.3: Location of a line used for (A) raster extraction and (B) the elevation along this line.

The final type of geographic vector object for raster extraction is polygons. Like lines, polygons tend to return many raster values per polygon. This is demonstrated in the command below, which results in a data frame with column names ID (the row number of the polygon) and srtm (the associated elevation values):

```r
zion_srtm_values = terra::extract(x = srtm, y = zion)
```

Such results can be used to generate summary statistics for raster values per polygon, for example to characterize a single region or to compare many regions. This is shown in the code below, which creates the object `zion_srtm_df` containing summary statistics of elevation values in Zion National Park (see Figure 6.4(A)):
```r
group_by(zion_srtm_values, ID) |>
  summarize(across(srtm, list(min = min, mean = mean, max = max)))
#> # A tibble: 1 × 4
#>      ID srtm_min srtm_mean srtm_max
#>   <dbl>    <int>     <dbl>    <int>
#> 1     1     1122     1818.     2661
```

The preceding code chunk used **dplyr** to provide summary statistics for cell values per polygon ID, as described in Chapter 3. The results provide useful summaries, for example that the maximum height in the park is around 2,661 meters above sea level (other summary statistics, such as the standard deviation, can also be calculated in this way). Because there is only one polygon in the example, a data frame with a single row is returned; however, the method works when multiple selector polygons are used.

A similar approach works for counting occurrences of categorical raster values within polygons. This is illustrated with a land cover dataset (`nlcd`) from the **spDataLarge** package in Figure 6.4(B), and demonstrated in the code below.

FIGURE 6.4: Area used for (A) continuous and (B) categorical raster extraction.

Although the **terra** package offers rapid extraction of raster values within polygons, `extract()` can still be a bottleneck when processing large polygon datasets. The **exactextractr** package offers a significantly faster alternative for extracting pixel values through the `exact_extract()` function. The `exact_extract()` function also computes, by default, the fraction of each raster cell overlapped by the polygon, which is more precise (see the note below for details).
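A minimal sketch of the **exactextractr** alternative mentioned above, assuming the exactextractr package is installed alongside sf and terra; `exact_extract()` accepts one or more summary operation names and returns one row per polygon:

```r
library(sf)
library(terra)
library(exactextractr)  # assumed to be installed
srtm = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
zion = read_sf(system.file("vector/zion.gpkg", package = "spDataLarge"))
zion = st_transform(zion, st_crs(srtm))
# coverage-fraction-weighted summaries, analogous to the dplyr chunk above
zion_exact = exact_extract(srtm, zion, c("min", "mean", "max"),
                           progress = FALSE)
```

Unlike `terra::extract()`, the summaries here weight each cell by the fraction of it covered by the polygon, so values near the park boundary contribute proportionally.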
```r
nlcd = rast(system.file("raster/nlcd.tif", package = "spDataLarge"))
zion2 = st_transform(zion, st_crs(nlcd))
zion_nlcd = terra::extract(nlcd, zion2)
zion_nlcd |>
  group_by(ID, levels) |>
  count()
#> # A tibble: 7 × 3
#> # Groups:   ID, levels [7]
#>      ID levels         n
#>   <dbl> <fct>      <int>
#> 1     1 Developed   4205
#> 2     1 Barren     98285
#> 3     1 Forest    298299
#> 4     1 Shrubland 203700
#> # ℹ 3 more rows
```

## 6.4 Rasterization

Rasterization is the conversion of vector objects into their representation in raster objects. Usually, the output raster is used for quantitative analysis (e.g., analysis of terrain) or modeling. As we saw in Chapter 2, the raster data model has some characteristics that make it conducive to certain methods. Furthermore, the process of rasterization can help simplify datasets because the resulting values all have the same spatial resolution: rasterization can be seen as a special type of geographic data aggregation.

The **terra** package contains the function `rasterize()` for doing this work. Its first two arguments are `x`, a vector object to be rasterized, and `y`, a 'template raster' object defining the extent, resolution and CRS of the output. The geographic resolution of the template raster has a major impact on the results: if it is too low (cell size is too large), the result may miss the full geographic variability of the vector data; if it is too high, computational times may be excessive. There are no simple rules to follow when deciding an appropriate geographic resolution, which is heavily dependent on the intended use of the results. Often the target resolution is imposed on the user, for example when the output of rasterization needs to be aligned to an existing raster.

To demonstrate rasterization in action, we will use a template raster that has the same extent and CRS as the input vector data `cycle_hire_osm_projected` (a dataset on cycle hire points in London, illustrated in Figure 6.5(A)) and a spatial resolution of 1000 meters:

```r
cycle_hire_osm = spData::cycle_hire_osm
cycle_hire_osm_projected = st_transform(cycle_hire_osm, "EPSG:27700")
raster_template = rast(ext(cycle_hire_osm_projected), resolution = 1000,
                       crs = crs(cycle_hire_osm_projected))
```

Rasterization is a very flexible operation: the results depend not only on the nature of the template raster, but also on the type of input vector (e.g., points, polygons) and on a variety of arguments taken by the `rasterize()` function.

To illustrate this flexibility, we will try three different approaches to rasterization. First, we create a raster representing the presence or absence of cycle hire points (known as presence/absence rasters).
In this case `rasterize()` requires no argument in addition to `x` and `y`, the aforementioned vector and raster objects (results illustrated in Figure 6.5(B)):

```r
ch_raster1 = rasterize(cycle_hire_osm_projected, raster_template)
```

The `fun` argument specifies summary statistics used to convert multiple observations in close proximity into associated cells in the raster object. By default `fun = "last"` is used, but other options such as `fun = "length"` can be used, in this case to count the number of cycle hire points in each grid cell (the results of this operation are illustrated in Figure 6.5(C)):

```r
ch_raster2 = rasterize(cycle_hire_osm_projected, raster_template,
                       fun = "length")
```

The new output, `ch_raster2`, shows the number of cycle hire points in each grid cell. The cycle hire locations have different numbers of bicycles described by the `capacity` variable, raising the question: what is the capacity in each grid cell? To calculate that, we must `sum` the field (`"capacity"`), resulting in the output illustrated in Figure 6.5(D), calculated with the following command (other summary functions such as `mean` could also be used):

```r
ch_raster3 = rasterize(cycle_hire_osm_projected, raster_template,
                       field = "capacity", fun = sum, na.rm = TRUE)
```

FIGURE 6.5: Examples of point rasterization.

Another dataset based on California's polygons and borders (created below) illustrates the rasterization of lines. After casting the polygon objects into a multilinestring, a template raster is created with a resolution of 0.5 degrees:

```r
california = dplyr::filter(us_states, NAME == "California")
california_borders = st_cast(california, "MULTILINESTRING")
raster_template2 = rast(ext(california), resolution = 0.5,
                        crs = st_crs(california)$wkt)
```

When considering line or polygon rasterization, one useful additional argument is `touches`. By default it is `FALSE`, but when changed to `TRUE`, all cells that are touched by a line or polygon border get a value. Line rasterization with `touches = TRUE` is demonstrated in the code below (Figure 6.6(A)):

```r
california_raster1 = rasterize(california_borders, raster_template2,
                               touches = TRUE)
```

Compare it to polygon rasterization, with `touches = FALSE` by default, which selects only raster cells whose centroids are inside the selector polygon, as illustrated in Figure 6.6(B):

```r
california_raster2 = rasterize(california, raster_template2)
```

FIGURE 6.6: Examples of line and polygon rasterization.
\"California\")\ncalifornia_borders = st_cast(california, \"MULTILINESTRING\")\nraster_template2 = rast(ext(california), resolution = 0.5,\n crs = st_crs(california)$wkt)\ncalifornia_raster1 = rasterize(california_borders, raster_template2,\n touches = TRUE)\ncalifornia_raster2 = rasterize(california, raster_template2) "},{"path":"raster-vector.html","id":"spatial-vectorization","chapter":"6 Raster-vector interactions","heading":"6.5 Spatial vectorization","text":"\nSpatial vectorization counterpart rasterization (Section 6.4), opposite direction.\ninvolves converting spatially continuous raster data spatially discrete vector data points, lines polygons.\nsimplest form vectorization convert centroids raster cells points.\n.points() exactly non-NA raster grid cells (Figure 6.7).\nNote, also used st_as_sf() convert resulting object sf class.\nFIGURE 6.7: Raster point representation elev object.\n\nAnother common type spatial vectorization creation contour lines representing lines continuous height temperatures (isotherms), example.\nuse real-world digital elevation model (DEM) artificial raster elev produces parallel lines (task reader: verify explain happens).\nContour lines can created terra function .contour(), wrapper around built-R function filled.contour(), demonstrated (shown):Contours can also added existing plots functions contour(), rasterVis::contourplot().\n\nillustrated Figure 6.8, isolines can labeled.\nFIGURE 6.8: Digital elevation model hillshading, showing southern flank Mt. 
Mongón overlaid with contour lines.\n\nThe final type of vectorization involves the conversion of rasters to polygons.\nThis can be done with terra::as.polygons(), which converts each raster cell into a polygon consisting of five coordinates, all of which are stored in memory (explaining why rasters are often fast compared with vectors!).\nThis is illustrated below by converting the grain object into polygons and subsequently dissolving borders between polygons with the same attribute values (also see the dissolve argument in as.polygons()).\nFIGURE 6.9: Vectorization of (A) a raster into (B) polygons (dissolve = FALSE) and aggregated polygons (dissolve = TRUE).\nThe aggregated polygons of the grain dataset have rectilinear boundaries which arise from being defined by connecting rectangular pixels.\nThe smoothr package described in Chapter 5 can be used to smooth the edges of the polygons.\nSince smoothing removes sharp edges in the polygon boundaries, the smoothed polygons will not have the exact spatial coverage of the original pixels.\nCaution should therefore be taken when using the smoothed polygons for further analysis.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nelev_point = as.points(elev) |> \n st_as_sf()\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ncl = as.contour(dem) |> \n st_as_sf()\nplot(dem, axes = FALSE)\nplot(cl, add = TRUE)\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))\ngrain_poly = as.polygons(grain) |> \n st_as_sf()"},{"path":"raster-vector.html","id":"exercises-4","chapter":"6 Raster-vector interactions","heading":"6.6 Exercises","text":"The following exercises use a vector dataset (zion_points) and a raster dataset (srtm) from the spDataLarge package.\nThey also use a polygonal ‘convex hull’ derived from the vector dataset (ch) to represent the area of interest:\nE1. Crop the srtm raster using (1) the zion_points dataset and (2) the ch dataset.\nAre there any differences in the output maps?\nNext, mask srtm using these two datasets.\nCan you see any difference now?\nHow can you explain that?\nE2. 
Firstly, extract the values of srtm at the points represented in zion_points.\nNext, extract average values of srtm using a 90 m buffer around each point from zion_points and compare these two sets of values.\nWhen would extracting values by buffers be more suitable than by points alone?\nBonus: Implement extraction using the exactextractr package and compare the results.\nE3. Subset points higher than 3100 meters in New Zealand (the nz_height object) and create a template raster with a resolution of 3 km for the extent of the new point dataset.\nUsing these two new objects:\nCount the numbers of the highest points in each grid cell.\nFind the maximum elevation in each grid cell.\nE4. Aggregate the raster counting high points in New Zealand (created in the previous exercise), reduce its geographic resolution by half (so cells are 6 x 6 km) and plot the result.\nResample the lower resolution raster back to the original resolution of 3 km. How have the results changed?\nName two advantages and disadvantages of reducing raster resolution.\nE5. Polygonize the grain dataset and filter all squares representing clay.\nName two advantages and disadvantages of vector data over raster data.\nWhen would it be useful to convert rasters to vectors in your work?","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(spData)\nzion_points_path = system.file(\"vector/zion_points.gpkg\", package = \"spDataLarge\")\nzion_points = read_sf(zion_points_path)\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nch = st_combine(zion_points) |>\n st_convex_hull() |> \n st_as_sf()"},{"path":"reproj-geo-data.html","id":"reproj-geo-data","chapter":"7 Reprojecting geographic data","heading":"7 Reprojecting geographic data","text":"","code":""},{"path":"reproj-geo-data.html","id":"prerequisites-5","chapter":"7 Reprojecting geographic data","heading":"Prerequisites","text":"This chapter requires the following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)"},{"path":"reproj-geo-data.html","id":"reproj-intro","chapter":"7 Reprojecting geographic data","heading":"7.1 Introduction","text":"Section 2.4 introduced coordinate reference systems (CRSs), with a focus on the two major types: geographic (‘lon/lat’, with units in degrees longitude and latitude) and projected 
(typically with units of meters from a datum) coordinate systems.\nThis chapter builds on that knowledge and goes further.\nIt demonstrates how to set and transform geographic data from one CRS to another and, furthermore, highlights specific issues that can arise due to ignoring CRSs that you should be aware of, especially if your data is stored with lon/lat coordinates.\nIn many projects there is no need to worry about, let alone convert between, different CRSs.\nNonetheless, it is important to know if your data is in a projected or geographic coordinate reference system, and the consequences of this for geometry operations.\nIf you know this information, CRSs should just work behind the scenes: people often suddenly need to learn about CRSs when things go wrong.\nHaving a clearly defined CRS that your project works in, and understanding how and why to use different CRSs, can ensure that things don’t go wrong.\nFurthermore, learning about coordinate systems will deepen your knowledge of geographic datasets and how to use them effectively.\nThis chapter teaches the fundamentals of CRSs, demonstrates the consequences of using different CRSs (including what can go wrong), and shows how to ‘reproject’ datasets from one coordinate system to another.\nThe next section introduces CRSs in R, followed by Section 7.3 which shows how to get and set the CRSs associated with spatial objects.\nSection 7.4 demonstrates the importance of knowing what CRS your data is in with reference to a worked example of creating buffers.\nWe tackle the questions of when to reproject and which CRS to use in Section 7.5 and Section 7.6, respectively.\nFinally, we cover reprojecting vector and raster objects in Sections 7.7 and 7.8 and modifying map projections in Section 7.9.","code":""},{"path":"reproj-geo-data.html","id":"crs-in-r","chapter":"7 Reprojecting geographic data","heading":"7.2 Coordinate reference systems","text":"\nMost modern geographic tools that require CRS conversions, including core R-spatial packages and desktop GIS software such as QGIS, interface with PROJ, an open source C++ library that “transforms coordinates from one coordinate reference system (CRS) to another”.\nCRSs can be described in many ways, including the following:\nSimple yet potentially ambiguous statements such as “it’s in lon/lat coordinates”\nFormalized yet now outdated ‘proj4 strings’ (also known as ‘proj-strings’) such as +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs\nWith an identifying ‘authority:code’ text string such as EPSG:4326\nEach refers to the same thing: the ‘WGS84’ coordinate 
system that forms the basis of Global Positioning System (GPS) coordinates and many other datasets.\nBut which one is correct?\nThe short answer is that the third way to identify CRSs is preferable: EPSG:4326 is understood by the sf (and by extension stars) and terra packages covered in this book, plus many other software projects for working with geographic data, including QGIS and PROJ.\nEPSG:4326 is future-proof.\nFurthermore, although it is machine readable, “EPSG:4326” is short, easy to remember and highly ‘findable’ online (searching for EPSG:4326 yields a dedicated page on the website epsg.io, for example).\nThe more concise identifier 4326 is also understood by sf, but we recommend the more explicit AUTHORITY:CODE representation to prevent ambiguity and to provide context.\nThe longer answer is that none of the three descriptions is sufficient, and more detail is needed for unambiguous CRS handling and transformations: due to the complexity of CRSs, it is not possible to capture all the relevant information in such short text strings.\nFor this reason, the Open Geospatial Consortium (OGC, which also developed the simple features specification that the sf package implements) developed an open standard format for describing CRSs called WKT (Well-Known Text).\nThis is detailed in a 100+ page document that “defines the structure and content of a text string implementation of the abstract model for coordinate reference systems described in ISO 19111:2019” (Open Geospatial Consortium 2019).\nThe WKT representation of the WGS84 CRS, which has the identifier EPSG:4326, is as follows:\nThe output of the command shows how the CRS identifier (also known as a Spatial Reference Identifier or SRID) works: it is simply a look-up, providing a unique identifier associated with a more complete WKT representation of the CRS.\nThis raises the question: what happens if there is a mismatch between the identifier and the longer WKT representation of a CRS?\nOn this point the Open Geospatial Consortium (2019) is clear: the verbose WKT representation takes precedence over the identifier:\nShould any attributes or values given in the cited identifier be in conflict with attributes or values given explicitly in the WKT description, the WKT values shall prevail.\nThe convention of referring to CRSs by identifiers in the form AUTHORITY:CODE, which is also used by geographic software written in other languages, allows a wide range of formally defined coordinate systems to be referred to.\nThe most commonly used authority in CRS identifiers is EPSG, an acronym for the European Petroleum Survey Group, which published a standardized 
list of CRSs (the EPSG was taken over by the Geomatics Committee of the International Association of Oil & Gas Producers in 2005).\nOther authorities can also be used in CRS identifiers.\nESRI:54030, for example, refers to ESRI’s implementation of the Robinson projection, which has the following WKT string (only the first eight lines are shown):\nWKT strings are exhaustive, detailed, and precise, allowing for unambiguous CRS storage and transformations.\nThey contain all the relevant information about any given CRS, including its datum and ellipsoid, prime meridian, projection, and units.\nRecent PROJ versions (6+) still allow the use of proj-strings to define coordinate operations, but some proj-string keys (+nadgrids, +towgs84, +k, +init=epsg:) are either no longer supported or are discouraged.\nAdditionally, only three datums (i.e., WGS84, NAD83, and NAD27) can be directly set in a proj-string.\nLonger explanations of the evolution of CRS definitions and the PROJ library can be found in Bivand (2021), chapter 2 of Pebesma and Bivand (2023b), and a blog post by Floris Vanderhaeghe, available at inbo.github.io/tutorials/tutorials/spatial_crs_coding/.\nAlso, as outlined in the PROJ documentation, there are different versions of the WKT CRS format, including WKT1 and two variants of WKT2, the latter of which (WKT2, 2018 specification) corresponds to ISO 19111:2019 (Open Geospatial Consortium 2019).","code":"\nst_crs(\"EPSG:4326\")\n#> Coordinate Reference System:\n#> User input: EPSG:4326 \n#> wkt:\n#> GEOGCRS[\"WGS 84\",\n#> ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n#> MEMBER[\"World Geodetic System 1984 (Transit)\"],\n#> MEMBER[\"World Geodetic System 1984 (G730)\"],\n#> MEMBER[\"World Geodetic System 1984 (G873)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1150)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1674)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1762)\"],\n#> MEMBER[\"World Geodetic System 1984 (G2139)\"],\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]],\n#> ENSEMBLEACCURACY[2.0]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> CS[ellipsoidal,2],\n#> AXIS[\"geodetic latitude (Lat)\",north,\n#> ORDER[1],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> 
AXIS[\"geodetic longitude (Lon)\",east,\n#> ORDER[2],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> USAGE[\n#> SCOPE[\"Horizontal component of 3D system.\"],\n#> AREA[\"World.\"],\n#> BBOX[-90,-180,90,180]],\n#> ID[\"EPSG\",4326]]\nst_crs(\"ESRI:54030\")\n#> Coordinate Reference System:\n#> User input: ESRI:54030 \n#> wkt:\n#> PROJCRS[\"World_Robinson\",\n#> BASEGEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n...."},{"path":"reproj-geo-data.html","id":"crs-setting","chapter":"7 Reprojecting geographic data","heading":"7.3 Querying and setting coordinate systems","text":"\nLet’s look CRSs stored R spatial objects can queried set.\nFirst, look getting setting CRSs vector geographic data objects, starting following example:new object, new_vector, data frame class sf represents countries worldwide (see help page ?spData::world details).\nCRS can retrieved sf function st_crs().\noutput list containing two main components:User input (case WGS 84, synonym EPSG:4326 case taken input file), corresponding CRS identifiers described abovewkt, containing full WKT string relevant information CRSThe input element flexible, depending input file user input, can contain AUTHORITY:CODE representation (e.g., EPSG:4326), CRS’s name (e.g., WGS 84), even proj-string definition.\nwkt element stores WKT representation, used saving object file coordinate operations.\n, can see new_vector object WGS84 ellipsoid, uses Greenwich prime meridian, latitude longitude axis order.\ncase, also additional elements, USAGE explaining area suitable use CRS, ID pointing CRS’s identifier: EPSG:4326.\nst_crs function also one helpful feature – can retrieve additional information used CRS.\nexample, try run:st_crs(new_vector)$IsGeographic check CRS geographic notst_crs(new_vector)$units_gdal find CRS unitsst_crs(new_vector)$srid extract ‘SRID’ identifier (available)st_crs(new_vector)$proj4string extract 
proj-string representation.\nIn cases when a CRS is missing or the wrong CRS is set, the st_set_crs() function can be used (in this case the WKT string remains unchanged because the CRS was already set correctly when the file was read in):\nGetting and setting CRSs works in a similar way for raster geographic data objects.\nThe crs() function from the terra package accesses CRS information from a SpatRaster object (note the use of the cat() function to print it nicely).\nThe output is the WKT string representation of the CRS.\nThe same function, crs(), can also be used to set a CRS for raster objects.\nHere, we can use either the identifier (recommended in most cases) or the complete WKT string representation.\nAlternative methods to set the crs include proj-strings or CRSs extracted from other existing objects with crs(), although these approaches may be less future-proof.\nImportantly, the st_crs() and crs() functions do not alter the coordinates’ values or the geometries.\nTheir role is only to set the metadata information about the object’s CRS.\nIn some cases the CRS of a geographic object is unknown, as is the case in the london dataset created in the code chunk below, building on the example of London introduced in Section 2.2:\nThe output NA shows that sf does not know what the CRS is and is unwilling to guess (NA literally means ‘not available’).\nUnless a CRS is manually specified or is loaded from a source that has CRS metadata, sf makes no explicit assumptions about which coordinate system is used, other than to say “I don’t know”.\nThis behavior makes sense given the diversity of available CRSs, but differs from approaches such as the GeoJSON file format specification, which makes the simplifying assumption that all coordinates have a lon/lat CRS: EPSG:4326.\nDatasets without a specified CRS can cause problems: geographic coordinates need a coordinate reference system, and software can only make good decisions around plotting and geometry operations if it knows what type of CRS it is working with.\nThus, it is important to always check the CRS of a dataset and to set it if it is missing.","code":"\nvector_filepath = system.file(\"shapes/world.gpkg\", package = \"spData\")\nnew_vector = read_sf(vector_filepath)\nst_crs(new_vector) # get CRS\n#> Coordinate Reference System:\n#> User input: WGS 84 \n#> wkt:\n#> ...\nnew_vector = st_set_crs(new_vector, \"EPSG:4326\") # set CRS\nraster_filepath = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nmy_rast = rast(raster_filepath)\ncat(crs(my_rast)) # get CRS\n#> GEOGCRS[\"WGS 
84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n....\ncrs(my_rast) = \"EPSG:26912\" # set CRS\nlondon = data.frame(lon = -0.1, lat = 51.5) |> \n st_as_sf(coords = c(\"lon\", \"lat\"))\nst_is_longlat(london)\n#> [1] NA\nlondon_geo = st_set_crs(london, \"EPSG:4326\")\nst_is_longlat(london_geo)\n#> [1] TRUE"},{"path":"reproj-geo-data.html","id":"geom-proj","chapter":"7 Reprojecting geographic data","heading":"7.4 Geometry operations on projected and unprojected data","text":"Since sf version 1.0.0, R’s ability work geographic vector datasets lon/lat CRSs improved substantially, thanks integration S2 spherical geometry engine introduced Section 2.2.9.\nshown Figure 7.1, sf uses either GEOS S2 depending type CRS whether S2 disabled (enabled default).\nGEOS always used projected data data CRS; geographic data, S2 used default can disabled sf::sf_use_s2(FALSE).\nFIGURE 7.1: Behavior geometry operations sf package depending input data’s CRS.\ndemonstrate importance CRSs, create buffer 100 km around london object previous section.\nalso create deliberately faulty buffer ‘distance’ 1 degree, roughly equivalent 100 km (1 degree 111 km equator).\ndiving code, may worth skipping briefly ahead peek Figure 7.2 get visual handle outputs able reproduce following code chunks .first stage create three buffers around london london_geo objects created boundary distances 1 degree 100 km (100,000 m, can expressed 1e5 scientific notation) central London:first line , sf assumes input projected generates result buffer units degrees, problematic, see.\nsecond line, sf silently uses spherical geometry engine S2, introduced Chapter 2, calculate extent buffer using default value max_cells = 1000 — set 100 line three — consequences become apparent shortly.\nhighlight impact sf’s use S2 geometry engine unprojected (geographic) coordinate 
systems, we will temporarily disable it with the command sf_use_s2() (which is on, TRUE, by default), in the code chunk below.\nLike london_buff_no_crs, the new buffer of the london_geo object is a geographic abomination: it has units of degrees, which makes no sense in the vast majority of cases:\nThe warning message above hints at issues with performing planar geometry operations on lon/lat data.\nWhen spherical geometry operations are turned off, with the command sf::sf_use_s2(FALSE), buffers (and other geometric operations) may result in worthless outputs because they use units of latitude and longitude, a poor substitute for proper units of distance such as meters.\nDo not interpret the warning about the geographic (longitude/latitude) CRS as meaning “the CRS is not set”: it almost always is!\nIt is better understood as a suggestion to reproject the data onto a projected CRS.\nThis suggestion does not always need to be heeded: performing spatial and geometric operations makes little or no difference in some cases (e.g., spatial subsetting).\nBut for operations involving distances such as buffering, the only way to ensure a good result (without using spherical geometry engines) is to create a projected copy of the data and run the operation on that.\nThis is done in the code chunk below.\nThe result is a new object that is identical to london, but created using a suitable CRS (the British National Grid, which has the EPSG code 27700 in this case) that has units of meters.\nWe can verify that the CRS has changed using st_crs() as follows (some of the output has been replaced by ...):\nNotable components of the CRS description include the EPSG code (EPSG: 27700) and the detailed wkt string (of which only the first five lines are shown).\nThe fact that the units of the CRS, described in the LENGTHUNIT field, are meters (rather than degrees) tells us that this is a projected CRS: st_is_longlat(london_proj) now returns FALSE and geometry operations on london_proj will work without a warning.\nBuffer operations on london_proj will use GEOS and results will be returned with proper units of distance.\nThe following line of code creates a buffer around projected data of exactly 100 km:\nThe geometries of the three london_buff* objects that were created in the preceding code with a specified CRS (london_buff_s2, london_buff_lonlat and london_buff_projected) are illustrated in Figure 7.2.\nFIGURE 7.2: Buffers around London showing results created with the S2 spherical geometry engine on lon/lat data (left), projected data (middle) and lon/lat data without using spherical geometry (right). 
The left plot illustrates the result of buffering unprojected data with sf, which calls Google’s S2 spherical geometry engine by default with max cells set to 1000 (thin line). The thick, blocky line illustrates the result of the same operation with max cells set to 100.\nIt is clear from Figure 7.2 that buffers based on S2 and on properly projected CRSs are not ‘squashed’, meaning that every part of the buffer boundary is equidistant to London.\nThe results generated from lon/lat CRSs when S2 is not used, either because the input lacks a CRS or because sf_use_s2() is turned off, are heavily distorted, with the result elongated in the north-south axis, highlighting the dangers of using algorithms that assume projected data on lon/lat inputs (as GEOS does).\nThe results generated using S2 are also distorted, however, although less dramatically.\nBoth buffer boundaries in Figure 7.2 (left) are jagged, although this may only be apparent or relevant for the thick boundary representing a buffer created with the s2 argument max_cells set to 100.\nThe lesson is that results obtained from lon/lat data via S2 will be different from results obtained from using projected data.\nThe difference between S2-derived buffers and GEOS-derived buffers on projected data reduces as the value of max_cells increases: the ‘right’ value for this argument may depend on many factors, and the default value of 1000 is often a reasonable one.\nWhen choosing max_cells values, speed of computation should be balanced against resolution of results.\nIn situations where smooth curved boundaries are advantageous, transforming to a projected CRS before buffering (or performing other geometry operations) may be appropriate.\nThe importance of CRSs (primarily whether they are projected or geographic) and the impact of sf’s default setting to use S2 for buffers on lon/lat data is clear from the example above.\nThe subsequent sections go into more depth, exploring which CRS to use when projected CRSs are needed and the details of reprojecting vector and raster objects.","code":"\nlondon_buff_no_crs = st_buffer(london, dist = 1) # incorrect: no CRS\nlondon_buff_s2 = st_buffer(london_geo, dist = 100000) # silent use of s2\nlondon_buff_s2_100_cells = st_buffer(london_geo, dist = 100000, max_cells = 100) \nsf::sf_use_s2(FALSE)\n#> Spherical geometry (s2) switched off\nlondon_buff_lonlat = st_buffer(london_geo, dist = 1) # incorrect result\n#> Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =\n#> 
endCapStyle, : st_buffer does not correctly buffer longitude/latitude data\n#> dist is assumed to be in decimal degrees (arc_degrees).\nsf::sf_use_s2(TRUE)\n#> Spherical geometry (s2) switched on\nlondon_proj = data.frame(x = 530000, y = 180000) |> \n st_as_sf(coords = c(\"x\", \"y\"), crs = \"EPSG:27700\")\nst_crs(london_proj)\n#> Coordinate Reference System:\n#> User input: EPSG:27700 \n#> wkt:\n#> PROJCRS[\"OSGB36 / British National Grid\",\n#> BASEGEOGCRS[\"OSGB36\",\n#> DATUM[\"Ordnance Survey of Great Britain 1936\",\n#> ELLIPSOID[\"Airy 1830\",6377563.396,299.3249646,\n#> LENGTHUNIT[\"metre\",1]]],\n....\nlondon_buff_projected = st_buffer(london_proj, 100000)"},{"path":"reproj-geo-data.html","id":"whenproject","chapter":"7 Reprojecting geographic data","heading":"7.5 When to reproject?","text":"\nThe previous section showed how to set the CRS manually, with st_set_crs(london, \"EPSG:4326\").\nIn real-world applications, however, CRSs are usually set automatically when data is read in.\nIn many projects the main CRS-related task is to transform objects from one CRS into another.\nBut when should data be transformed?\nAnd into which CRS?\nThere are no clear-cut answers to these questions, and CRS selection always involves trade-offs (Maling 1992).\nHowever, the general principles provided in this section can help you decide.\nFirst, it’s worth considering when to transform.\nIn some cases transformation to a geographic CRS is essential, such as when publishing data online with the leaflet package.\nAnother case is when two objects with different CRSs must be compared or combined, as shown when we try to find the distance between two sf objects with different CRSs:\nTo make the london and london_proj objects geographically comparable, one of them must be transformed into the CRS of the other.\nBut which CRS should be used?\nThe answer depends on context: many projects, especially those involving web mapping, require outputs in EPSG:4326, in which case it is worth transforming the projected object.\nIf, however, the project requires planar geometry operations rather than a spherical geometry engine (e.g., to create buffers with smooth edges), it may be worth transforming data with a geographic CRS into an equivalent object with a projected CRS, such as the British National Grid (EPSG:27700).\nThat is the subject of Section 
7.7.","code":"\nst_distance(london_geo, london_proj)\n# > Error: st_crs(x) == st_crs(y) is not TRUE"},{"path":"reproj-geo-data.html","id":"which-crs","chapter":"7 Reprojecting geographic data","heading":"7.6 Which CRS to use?","text":"\nThe question of which CRS to use is tricky, and there is rarely a ‘right’ answer:\n“There exist no all-purpose projections, all involve distortion when far from the center of the specified frame” (Bivand, Pebesma, and Gómez-Rubio 2013).\nAdditionally, you should not be attached to just one projection for every task.\nIt is possible to use one projection for some part of the analysis, another projection for a different part, and even another for visualization.\nAlways try to pick the CRS that serves your goal best!\nWhen selecting geographic CRSs, the answer is often WGS84.\nIt is used not only for web mapping, but also because GPS datasets and thousands of raster and vector datasets are provided in this CRS by default.\nWGS84 is the most common CRS in the world, so it is worth knowing its EPSG code: 4326.\nThis ‘magic number’ can be used to convert objects with unusual projected CRSs into something that is widely understood.\nWhat about when a projected CRS is required?\nIn some cases, it is not something that we are free to decide:\n“often the choice of projection is made by a public mapping agency” (Bivand, Pebesma, and Gómez-Rubio 2013).\nThis means that when working with local data sources, it is likely preferable to work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate.\nThe example of London was easy to answer because (a) the British National Grid (with its associated EPSG code 27700) is well known and (b) the original dataset (london) already had that CRS.\nA commonly used default is the Universal Transverse Mercator (UTM), a set of CRSs that divides the Earth into 60 longitudinal wedges and 20 latitudinal segments.\nAlmost every place on Earth has a UTM code, such as “60H” which refers to northern New Zealand where R was invented.\nUTM EPSG codes run sequentially from 32601 to 32660 for northern hemisphere locations and from 32701 to 32760 for southern hemisphere locations.\nTo show how the system works, let’s create a function, lonlat2UTM(), to calculate the EPSG code associated with any point on the planet as follows:\nThe following command uses this function to identify the UTM zone and associated EPSG code for Auckland and London:\nThe transverse Mercator projection used by UTM CRSs is conformal but distorts areas and distances with increasing severity with distance from the center of the UTM zone.\nDocumentation from the GIS software Manifold therefore suggests 
restricting the longitudinal extent of projects using UTM zones to 6 degrees of the central meridian (manifold.net).\nTherefore, we recommend using UTM only when your focus is on preserving angles over a relatively small area!\nCurrently, there are also tools helping us to select a proper CRS, which include the crsuggest package (K. Walker (2022)).\nThe main function in this package, suggest_crs(), takes a spatial object with a geographic CRS and returns a list of possible projected CRSs that could be used for the given area.\nAnother helpful tool is the webpage https://jjimenezshaw.github.io/crs-explorer/ which lists CRSs based on a selected location and type.\nImportant note: while these tools are helpful in many situations, you need to be aware of the properties of the recommended CRS before you apply it.\nIn cases where an appropriate CRS is not immediately clear, the choice of CRS should depend on the properties that are most important to preserve in the subsequent maps and analysis.\nAll CRSs are either equal-area, equidistant, conformal (with shapes remaining unchanged), or some combination of compromises of those (Section 2.4.2).\nCustom CRSs with local parameters can be created for a region of interest, and multiple CRSs can be used in projects when no single CRS suits all tasks.\n‘Geodesic calculations’ can provide a fall-back if no CRSs are appropriate (see proj.org/geodesic.html).\nRegardless of the projected CRS used, the results may not be accurate for geometries covering hundreds of kilometers.\nWhen deciding on a custom CRS, we recommend the following:\nA Lambert azimuthal equal-area (LAEA) projection for a custom local projection (set the latitude and longitude of origin to the center of the study area), which is an equal-area projection at all locations but distorts shapes beyond thousands of kilometers\nAzimuthal equidistant (AEQD) projections for a specifically accurate straight-line distance between a point and the center point of the local projection\nLambert conformal conic (LCC) projections for regions covering thousands of kilometers, with the cone set to keep distance and area properties reasonable between the secant lines\nStereographic (STERE) projections for polar regions, but taking care not to rely on area and distance calculations thousands of kilometers from the center\nOne possible approach to automatically select a projected CRS specific to a local dataset is to create an AEQD projection for the center-point of the study area.\nThis involves creating a custom CRS (with no EPSG code) with units of meters based on the center point of a dataset.\nNote that this 
approach should be used with caution: no other datasets will be compatible with the custom CRS created, and results may not be accurate when used on extensive datasets covering hundreds of kilometers.\nThe principles outlined in this section apply equally to vector and raster datasets.\nSome features of CRS transformation, however, are unique to each geographic data model.\nWe will cover the particularities of vector data transformation in Section 7.7 and raster transformation in Section 7.8.\nNext, Section 7.9 shows how to create custom map projections.","code":"\nlonlat2UTM = function(lonlat) {\n utm = (floor((lonlat[1] + 180) / 6) %% 60) + 1\n if (lonlat[2] > 0) {\n utm + 32600\n } else{\n utm + 32700\n }\n}\nlonlat2UTM(c(174.7, -36.9))\n#> [1] 32760\nlonlat2UTM(st_coordinates(london))\n#> [1] 32630"},{"path":"reproj-geo-data.html","id":"reproj-vec-geom","chapter":"7 Reprojecting geographic data","heading":"7.7 Reprojecting vector geometries","text":"\nChapter 2 demonstrated how vector geometries are made up of points, and how points form the basis of more complex objects such as lines and polygons.\nReprojecting vectors thus consists of transforming the coordinates of these points, which form the vertices of lines and polygons.\nSection 7.5 contains an example in which at least one sf object must be transformed into an equivalent object with a different CRS to calculate the distance between the two objects.\nNow that a transformed version of london has been created, using the sf function st_transform(), the distance between the two representations of London can be found.\nIt may come as a surprise that london and london2 are just over 2 km apart!\nFunctions for querying and reprojecting CRSs are demonstrated below with reference to cycle_hire_osm, an sf object from spData that represents ‘docking stations’ where you can hire bicycles in London.\nThe CRS of sf objects can be queried, and, as we learned in Section 7.1, set with the function st_crs().\nThe output is printed as multiple lines of text containing information about the coordinate system:\nAs we saw in Section 7.3, the main CRS components, User input and wkt, are printed as a single entity. 
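The UTM zone arithmetic encoded in the book's lonlat2UTM() R helper (Section 7.6 above) is language-agnostic and easy to sanity-check outside R. The sketch below reimplements the same formula in Python purely for illustration; the function name lonlat_to_utm_epsg is invented here and is not part of any package.

```python
import math

def lonlat_to_utm_epsg(lon: float, lat: float) -> int:
    """EPSG code of the UTM zone containing (lon, lat).

    Mirrors the arithmetic of the R helper lonlat2UTM():
    zone = (floor((lon + 180) / 6) mod 60) + 1, offset by
    32600 (northern hemisphere) or 32700 (southern hemisphere).
    """
    zone = (math.floor((lon + 180) / 6) % 60) + 1
    return zone + (32600 if lat > 0 else 32700)

# Auckland falls in UTM zone 60 south, London in zone 30 north
print(lonlat_to_utm_epsg(174.7, -36.9))  # 32760
print(lonlat_to_utm_epsg(-0.1, 51.5))    # 32630
```

Running either version on the same coordinates should return identical EPSG codes, which is a quick way to verify a port of this logic.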
The output of st_crs() is in fact a named list of class crs with two elements, single character strings named input and wkt, as shown in the output of the following code chunk:\nAdditional elements can be retrieved with the $ operator, including Name, proj4string and epsg (see ?st_crs and the CRS and transformation tutorial on the GDAL website for details):\nAs mentioned in Section 7.2, the WKT representation, stored in the $wkt element of the crs_lnd object, is the ultimate source of truth.\nThis means that the outputs of the previous code chunk are queries from the wkt representation provided by PROJ, rather than inherent attributes of the object and its CRS.\nBoth the wkt and User Input elements of the CRS are changed when the object’s CRS is transformed.\nIn the code chunk below, we create a new version of cycle_hire_osm with a projected CRS (only the first 4 lines of the CRS output are shown for brevity):\nThe resulting object has a new CRS with the EPSG code 27700.\nBut how to find out more details about this EPSG code, or any other code?\nOne option is to search for it online; another option is to look at the properties of the CRS object:\nThe result shows that the EPSG code 27700 represents the British National Grid, a result that could also be found by searching online for “EPSG 27700”.","code":"\nlondon2 = st_transform(london_geo, \"EPSG:27700\")\nst_distance(london2, london_proj)\n#> Units: [m]\n#> [,1]\n#> [1,] 2016\nst_crs(cycle_hire_osm)\n#> Coordinate Reference System:\n#> User input: EPSG:4326 \n#> wkt:\n#> GEOGCS[\"WGS 84\",\n#> DATUM[\"WGS_1984\",\n#> SPHEROID[\"WGS 84\",6378137,298.257223563,\n....\ncrs_lnd = st_crs(london_geo)\nclass(crs_lnd)\n#> [1] \"crs\"\nnames(crs_lnd)\n#> [1] \"input\" \"wkt\"\ncrs_lnd$Name\n#> [1] \"WGS 84\"\ncrs_lnd$proj4string\n#> [1] \"+proj=longlat +datum=WGS84 +no_defs\"\ncrs_lnd$epsg\n#> [1] 4326\ncycle_hire_osm_projected = st_transform(cycle_hire_osm, \"EPSG:27700\")\nst_crs(cycle_hire_osm_projected)\n#> Coordinate Reference System:\n#> User input: EPSG:27700 \n#> wkt:\n#> PROJCRS[\"OSGB36 / British National Grid\",\n#> ...crs_lnd_new = st_crs(\"EPSG:27700\")\ncrs_lnd_new$Name\n#> [1] \"OSGB36 / British National Grid\"\ncrs_lnd_new$proj4string\n#> [1] \"+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000\n+y_0=-100000 +ellps=airy +units=m +no_defs\"\ncrs_lnd_new$epsg\n#> [1] 
27700"},{"path":"reproj-geo-data.html","id":"reproj-ras","chapter":"7 Reprojecting geographic data","heading":"7.8 Reprojecting raster geometries","text":"\nThe projection concepts described in the previous section apply equally to rasters.\nHowever, there are important differences in the reprojection of vectors and rasters:\ntransforming a vector object involves changing the coordinates of every vertex, but this does not apply to raster data.\nRasters are composed of rectangular cells of the same size (expressed in map units, such as degrees or meters), so it is usually impracticable to transform the coordinates of pixels separately.\nThus, raster reprojection involves creating a new raster object, often with a different number of columns and rows than the original.\nThe attributes must subsequently be re-estimated, allowing the new pixels to be ‘filled’ with appropriate values.\nIn other words, raster reprojection can be thought of as two separate spatial operations: a vector reprojection of the raster extent to another CRS (Section 7.7), and the computation of new pixel values through resampling (Section 5.3.4).\nThus, in most cases when both raster and vector data are used, it is better to avoid reprojecting rasters and to reproject vectors instead.\nThe raster reprojection process is done with project() from the terra package.\nLike the st_transform() function demonstrated in the previous section, project() takes a spatial object (a raster dataset in this case) and some CRS representation as its second argument.\nAs a side note, the second argument can also be an existing raster object with a different CRS.\nLet’s take a look at two examples of raster transformation: using categorical and continuous data.\nLand cover data are usually represented by categorical maps.\nThe nlcd.tif file provides information for a small area in Utah, USA, obtained from the National Land Cover Database 2011 in the NAD83 / UTM zone 12N CRS, as shown in the output of the code chunk below (only the first line of the output is shown).\nIn this region, eight land cover classes were distinguished (a full list of NLCD2011 land cover classes can be found at mrlc.gov):\nWhen reprojecting categorical rasters, the estimated values must be the same as those of the original.\nThis can be done using the nearest neighbor method (near), which sets each new cell value to the value of the nearest cell (center) of the input raster.\nAn example is reprojecting cat_raster to WGS84, a geographic CRS well suited for web mapping.\nThe first step is to obtain the definition of this 
CRS.\nsecond step reproject raster project() function , case categorical data, uses nearest neighbor method (near).Many properties new object differ previous one, including number columns rows (therefore number cells), resolution (transformed meters degrees), extent, illustrated Table 7.1 (note number categories increases 8 9 addition NA values, new category created — land cover classes preserved).TABLE 7.1: Key attributes original (cat_raster) projected (cat_raster_wgs84) categorical raster datasets.Reprojecting numeric rasters (numeric case integer values) follows almost identical procedure.\ndemonstrated srtm.tif spDataLarge Shuttle Radar Topography Mission (SRTM), represents height meters sea level (elevation) WGS84 CRS:reproject dataset projected CRS, nearest neighbor method appropriate categorical data.\nInstead, use bilinear method computes output cell value based four nearest cells original raster.36\nvalues projected dataset distance-weighted average values four cells:\ncloser input cell center output cell, greater weight.\nfollowing commands create text string representing WGS 84 / UTM zone 12N, reproject raster CRS, using bilinear method (output shown).Raster reprojection numeric variables also leads changes values spatial properties, number cells, resolution, extent.\nchanges demonstrated Table 7.2.37TABLE 7.2: Key attributes original (con_raster) projected (con_raster_ea) continuous raster datasets.","code":"\ncat_raster = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\ncrs(cat_raster)\n#> PROJCRS[\"NAD83 / UTM zone 12N\",\n#> ...\nunique(cat_raster)\n#> levels\n#> 1 Water\n#> 2 Developed\n#> 3 Barren\n#> 4 Forest\n#> 5 Shrubland\n#> 6 Herbaceous\n#> 7 Cultivated\n#> 8 Wetlands\ncat_raster_wgs84 = project(cat_raster, \"EPSG:4326\", method = \"near\")\ncon_raster = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\ncat(crs(con_raster))\n#> GEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> 
ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n....\ncon_raster_ea = project(con_raster, \"EPSG:32612\", method = \"bilinear\")\ncat(crs(con_raster_ea))"},{"path":"reproj-geo-data.html","id":"mapproj","chapter":"7 Reprojecting geographic data","heading":"7.9 Custom map projections","text":"\nEstablished CRSs captured AUTHORITY:CODE identifiers EPSG:4326 well suited many applications.\nHowever, desirable use alternative projections create custom CRSs cases.\nSection 7.6 mentioned reasons using custom CRSs provided several possible approaches.\n, show apply ideas R.One take existing WKT definition CRS, modify elements, use new definition reprojecting.\ncan done spatial vectors st_crs() st_transform(), spatial rasters crs() project(), demonstrated following example transforms zion object custom azimuthal equidistant (AEQD) CRS.Using custom AEQD CRS requires knowing coordinates center point dataset degrees (geographic CRS).\ncase, information can extracted calculating centroid zion area transforming WGS84.Next, can use newly obtained values update WKT definition AEQD CRS seen .\nNotice modified just two values – \"Central_Meridian\" longitude \"Latitude_Of_Origin\" latitude centroid.approach’s last step transform original object (zion) new custom CRS (zion_aeqd).Custom projections can also made interactively, example, using Projection Wizard web application (Šavrič, Jenny, Jenny 2016).\nwebsite allows select spatial extent data distortion property, returns list possible projections.\nlist also contains WKT definitions projections can copy use reprojections.\nAlso, see Open Geospatial Consortium (2019) details creating custom CRS definitions WKT strings.\nPROJ strings can also used create custom projections, accepting limitations inherent projections, especially geometries covering large geographic areas, mentioned Section 7.2.\nMany projections developed can set 
+proj= element PROJ strings, dozens projects described detail PROJ website alone.mapping world preserving area relationships, Mollweide projection, illustrated Figure 7.3, popular often sensible choice (Jenny et al. 2017).\nuse projection, need specify using proj-string element, \"+proj=moll\", st_transform function:\nFIGURE 7.3: Mollweide projection world.\noften desirable minimize distortion spatial properties (area, direction, distance) mapping world.\nOne popular projections achieve Winkel tripel, illustrated Figure 7.4.38\nresult created following command:\nFIGURE 7.4: Winkel tripel projection world.\nMoreover, proj-string parameters can modified CRS definitions, example center projection can adjusted using +lon_0 +lat_0 parameters.\ncode transforms coordinates Lambert azimuthal equal-area projection centered longitude latitude New York City (Figure 7.5).\nFIGURE 7.5: Lambert azimuthal equal-area projection world centered New York City.\ninformation CRS modifications can found Using PROJ documentation.","code":"\nzion = read_sf(system.file(\"vector/zion.gpkg\", package = \"spDataLarge\"))\nzion_centr = st_centroid(zion)\nzion_centr_wgs84 = st_transform(zion_centr, \"EPSG:4326\")\nst_as_text(st_geometry(zion_centr_wgs84))\n#> [1] \"POINT (-113 37.3)\"\nmy_wkt = 'PROJCS[\"Custom_AEQD\",\n GEOGCS[\"GCS_WGS_1984\",\n DATUM[\"WGS_1984\",\n SPHEROID[\"WGS_1984\",6378137.0,298.257223563]],\n PRIMEM[\"Greenwich\",0.0],\n UNIT[\"Degree\",0.0174532925199433]],\n PROJECTION[\"Azimuthal_Equidistant\"],\n PARAMETER[\"Central_Meridian\",-113.0263],\n PARAMETER[\"Latitude_Of_Origin\",37.29818],\n UNIT[\"Meter\",1.0]]'\nzion_aeqd = st_transform(zion, my_wkt)\nworld_mollweide = st_transform(world, crs = \"+proj=moll\")\nworld_wintri = st_transform(world, crs = \"+proj=wintri\")\nworld_laea2 = st_transform(world,\n crs = \"+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40\")"},{"path":"reproj-geo-data.html","id":"exercises-5","chapter":"7 Reprojecting geographic 
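The centering logic used for the New York City example above can be wrapped in a small helper.
This is a sketch only: `laea_proj()` is our own illustrative function, not part of **sf** or **terra**.

```r
# Sketch only: laea_proj() is our own helper (not part of sf/terra).
# It builds a proj-string for a Lambert azimuthal equal-area projection
# centered on the given longitude and latitude.
laea_proj = function(lon_0, lat_0) {
  sprintf("+proj=laea +x_0=0 +y_0=0 +lon_0=%g +lat_0=%g", lon_0, lat_0)
}
laea_proj(-74, 40)
#> [1] "+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40"
```

The result can be passed directly to `st_transform()`, e.g., `st_transform(world, crs = laea_proj(-74, 40))`.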
## 7.10 Exercises

E1. Create a new object called `nz_wgs` by transforming the `nz` object into the WGS84 CRS.
Create an object of class `crs` for both and use this to query their CRSs.
With reference to the bounding box of each object, what units does each CRS use?
Remove the CRS from `nz_wgs` and plot the result: what is wrong with this map of New Zealand and why?

E2. Transform the `world` dataset to the transverse Mercator projection (`"+proj=tmerc"`) and plot the result.
What has changed and why?
Try to transform it back into WGS 84 and plot the new object.
Why does the new object differ from the original one?

E3. Transform the continuous raster (`con_raster`) into NAD83 / UTM zone 12N using the nearest neighbor interpolation method.
What has changed?
How does it influence the results?

E4. Transform the categorical raster (`cat_raster`) into WGS 84 using the bilinear interpolation method.
What has changed?
How does it influence the results?

# 8 Geographic data I/O

## Prerequisites

This chapter requires the following packages:

```r
library(sf)
library(terra)
library(dplyr)
library(spData)
```

## 8.1 Introduction

This chapter is about reading and writing geographic data.
Geographic data input is essential for geocomputation: real-world applications are impossible without data.
Data output is also vital, enabling others to use valuable new or improved datasets resulting from your work.
Taken together, these processes of input/output can be referred to as data I/O.

Geographic data I/O is often done in haste at the beginning and end of projects, or otherwise ignored.
However, data import and export are fundamental to the success or otherwise of projects: small I/O mistakes made at the beginning of projects (e.g., using an out-of-date dataset) can lead to large problems down the line.

There are many geographic file formats, each of which has advantages and disadvantages, as described in Section 8.2.
Reading and writing these file formats is covered in Sections 8.3 and 8.4, respectively.
In terms of where to find data, Section 8.5 describes geoportals and how to import data from them.
Dedicated packages to ease geographic data import, from sources including OpenStreetMap, are described in Section 8.6.
If you want to put your data 'into production' in web services (and want to make sure your data adheres to established standards), geographic metadata are important, as described in Section 8.7.
Another possibility is to obtain spatial data by using web services, as outlined in Section 8.8.
The final Section 8.9 demonstrates methods for saving visual outputs (maps), in preparation for Chapter 9 on visualization.

## 8.2 File formats

Geographic datasets are usually stored as files or in spatial databases.
File formats can either store vector or raster data, while spatial databases such as PostGIS can store both (see also Section 10.7).
Today the variety of file formats may seem bewildering, but there has been much consolidation and standardization since the beginnings of GIS software in the 1960s, when the first widely distributed program (SYMAP) for spatial analysis was created at Harvard University (Coppock and Rhind 1991).

GDAL (pronounced "goo-dal", with the double "o" making a reference to object-orientation), the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between geographic file formats since its release in 2000.
GDAL provides a unified and high-performance interface for reading and writing many raster and vector data formats.
Many open and proprietary GIS programs, including GRASS GIS, ArcGIS and QGIS, use GDAL behind their GUIs for doing the legwork of ingesting and spitting out geographic data in appropriate formats.

GDAL provides access to more than 200 vector and raster data formats.
Table 8.1 presents some basic information about selected and often used spatial file formats.

Table 8.1: Selected spatial file formats.

An important development in ensuring the standardization and open-sourcing of file formats was the founding of the Open Geospatial Consortium (OGC) in 1994.
Beyond defining the simple features data model (see Section 2.2.1), the OGC also coordinates the development of open standards, for example as used in file formats such as GML, KML and GeoPackage.
Open file formats of the kind endorsed by the OGC have several advantages over proprietary formats: the standards are published, they ensure transparency, and they open up the possibility for users to further develop and adjust the file formats to their specific needs.

The ESRI Shapefile is the most popular vector data exchange format; however, it is not an open format (though its specification is open).
It was developed in the early 1990s and has a number of limitations.
First of all, it is a multi-file format, which consists of at least three files.
It only supports 255 columns, column names are restricted to ten characters and the file size limit is 2 GB.
Furthermore, the ESRI Shapefile does not support all possible geometry types; for example, it is unable to distinguish between a polygon and a multipolygon.
Despite these limitations, a viable alternative had been missing for a long time.
In recent years, GeoPackage emerged, and it seems to be a suitable replacement candidate for the ESRI Shapefile.
GeoPackage is a format for exchanging geospatial information and an OGC standard.
The GeoPackage standard describes the rules on how to store geospatial information in a tiny SQLite container.
Hence, GeoPackage is a lightweight spatial database container, which allows the storage of vector and raster data, but also of non-spatial data and extensions.

Aside from GeoPackage, there are other geospatial data exchange formats worth checking out (Table 8.1).
The GeoTIFF format seems to be the most prominent raster data format.
It allows spatial information, such as the CRS, to be embedded within a TIFF file.
Similar to the ESRI Shapefile, this format was first developed in the 1990s, but as an open format.
Additionally, GeoTIFF is still being expanded and improved.
One of the most significant recent additions to the GeoTIFF format is its variant called COG (Cloud Optimized GeoTIFF).
Raster objects saved as COGs can be hosted on HTTP servers, so other people can read only parts of the file without downloading the whole file (see Sections 8.3.2 and 8.4.2).

There are many geographic file formats beyond those shown in Table 8.1, and new data formats capable of representing geographic data are still being developed.
Recent examples are formats based on the GeoArrow and Zarr specifications.
GDAL's documentation provides a good resource for learning about all of its vector and raster drivers.
Furthermore, some data formats can store data models (types) beyond the vector and raster data models introduced in Section 2.2.1.
This includes the LAS and LAZ formats for storing lidar point clouds, and NetCDF and HDF for storing multidimensional arrays.

Spatial data are also often stored using tabular (non-spatial) text formats, including CSV files and Excel spreadsheets.
For example, this can be convenient to share spatial samples with people who do not use GIS tools, or to exchange data with other software that does not accept spatial data formats.
However, this approach has downsides: it is challenging for storing geometries more complex than POINTs, and it omits important spatial metadata such as the CRS.

## 8.3 Data input (I)

Executing commands such as `sf::read_sf()` (the main function we use for loading vector data) or `terra::rast()` (the main function used for loading raster data) silently sets off a chain of events that reads data from files.
Many R packages provide example datasets (e.g., the dataset `spData::world` that we used in earlier chapters) and functions to get geographic datasets from a range of data sources.
All of them load the data into R or, more precisely, assign objects to your workspace.
This means that objects imported into R are stored in RAM, can be listed with `ls()` (and are viewable in the 'Environment' panels of development environments), and can be accessed from the `.GlobalEnv` of the R session.

### 8.3.1 Vector data

Spatial vector data comes in a wide variety of file formats.
Most popular representations such as `.geojson` and `.gpkg` files can be imported directly into R with the **sf** function `read_sf()` (or the equivalent `st_read()`), which uses GDAL's vector drivers behind the scenes.
The `st_drivers()` function returns a data frame containing `name` and `long_name` in the first two columns, and features of each driver available to GDAL (and therefore **sf**), including the ability to write data and store raster data, in the subsequent columns, as illustrated for key file formats in Table 8.3.

Table 8.3: Popular drivers/formats for reading/writing vector data.

The following commands show the first three drivers reported by the computer's GDAL installation (results can vary depending on the GDAL version installed) and a summary of their features.
Note that the majority of drivers can write data, while only a dozen or so formats can efficiently represent raster data in addition to vector data (see `?st_drivers()` for details):

```r
sf_drivers = st_drivers()
head(sf_drivers, n = 3)
summary(sf_drivers[-c(1:2)])
```

The first argument of `read_sf()` is `dsn`, which should be a text string or an object containing a single text string.
The content of a text string can vary between different drivers.
In most cases, as with the ESRI Shapefile (`.shp`) or the GeoPackage format (`.gpkg`), the `dsn` would be a file name.
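In practice the driver is usually inferred from the file extension.
As a purely illustrative sketch (`guess_driver()` and its lookup table are our own — the real resolution is delegated to GDAL and covers far more formats):

```r
# Hypothetical sketch of extension-based driver guessing, similar in spirit
# to what happens when read_sf() is given a file path. The mapping below is
# our own illustration, not sf's actual mechanism.
guess_driver = function(dsn) {
  ext = tolower(tools::file_ext(dsn))
  drivers = c(gpkg = "GPKG", shp = "ESRI Shapefile",
              geojson = "GeoJSON", kml = "LIBKML", csv = "CSV")
  unname(drivers[ext])
}
guess_driver("world.gpkg")
#> [1] "GPKG"
```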
`read_sf()` guesses the driver based on the file extension, as illustrated for a `.gpkg` file below:

```r
f = system.file("shapes/world.gpkg", package = "spData")
world = read_sf(f)
```

For some drivers, `dsn` could be provided as a folder name, access credentials for a database, or a GeoJSON string representation (see the examples on the `read_sf()` help page for more details).

Some vector driver formats can store multiple data layers.
By default, `read_sf()` automatically reads the first layer of the file specified in `dsn`; however, using the `layer` argument you can specify any other layer.

The `read_sf()` function also allows reading just parts of the file into RAM with two possible mechanisms.
The first one is related to the `query` argument, which allows specifying what part of the data to read with an OGR SQL query text.
An example below extracts data for Tanzania only (Figure 8.1A).
It is done by specifying that we want to get all columns (`SELECT *`) from the `"world"` layer for which `name_long` equals `"Tanzania"`:

```r
tanzania = read_sf(f, query = 'SELECT * FROM world WHERE name_long = "Tanzania"')
```

If you do not know the names of the available columns, a good approach is to just read one row of the data with `'SELECT * FROM world WHERE FID = 1'`.
`FID` represents a feature ID – most often, it is a row number; however, its values depend on the file format used.
For example, `FID` starts from 0 in the ESRI Shapefile, from 1 in some other file formats, or can even be arbitrary.

The second mechanism uses the `wkt_filter` argument.
This argument expects well-known text representing a study area for which we want to extract the data.
Let's try it using a small example – we want to read the polygons from our file that intersect with a buffer of 50,000 meters around Tanzania's borders.
To do it, we need to prepare our "filter" by (a) creating the buffer (Section 5.2.3), (b) converting the **sf** buffer object into an `sfc` geometry object with `st_geometry()`, and (c) translating geometries into their well-known text representation with `st_as_text()`:

```r
tanzania_buf = st_buffer(tanzania, 50000)
tanzania_buf_geom = st_geometry(tanzania_buf)
tanzania_buf_wkt = st_as_text(tanzania_buf_geom)
```

Now, we can apply this "filter" using the `wkt_filter` argument:

```r
tanzania_neigh = read_sf(f, wkt_filter = tanzania_buf_wkt)
```

Our result, shown in Figure 8.1(B), contains Tanzania and every country within its 50-km buffer.

Figure 8.1: Reading a subset of the vector data using (A) a query and (B) a wkt filter.

Naturally, some options are specific to certain drivers.
For example, think about coordinates stored in a spreadsheet format (`.csv`).
To read in such files as spatial objects, we naturally have to specify the names of the columns (`X` and `Y` in our example below) representing the coordinates.
We can do this with the help of the `options` parameter.
To find out about possible options, please refer to the 'Open Options' section of the corresponding GDAL driver description.
For the comma-separated value (csv) format, visit https://gdal.org/drv_csv.html.

```r
cycle_hire_txt = system.file("misc/cycle_hire_xy.csv", package = "spData")
cycle_hire_xy = read_sf(cycle_hire_txt,
                        options = c("X_POSSIBLE_NAMES=X", "Y_POSSIBLE_NAMES=Y"))
```

Instead of columns describing 'XY' coordinates, a single column can also contain the geometry information.
Well-known text (WKT), well-known binary (WKB), and the GeoJSON formats are examples of this.
For instance, the `world_wkt.csv` file has a column named `WKT` representing polygons of the world's countries.
We will again use the `options` parameter to indicate this:

```r
world_txt = system.file("misc/world_wkt.csv", package = "spData")
world_wkt = read_sf(world_txt, options = "GEOM_POSSIBLE_NAMES=WKT")
```

As a final example, we will show how `read_sf()` also reads KML files.
A KML file stores geographic information in XML format, a data format for the creation of web pages and the transfer of data in an application-independent way (Nolan and Lang 2014).
Here, we access a KML file from the web.
This file contains more than one layer.
`st_layers()` lists all available layers.
We choose the first layer, `Placemarks`, and say so with the help of the `layer` parameter in `read_sf()`:

```r
u = "https://developers.google.com/kml/documentation/KML_Samples.kml"
download.file(u, "KML_Samples.kml")
st_layers("KML_Samples.kml")
#> Driver: LIBKML 
#> Available layers:
#>          layer_name geometry_type features fields crs_name
#> 1        Placemarks                      3     11   WGS 84
#> 2 Styles and Markup                      1     11   WGS 84
#> 3  Highlighted Icon                      1     11   WGS 84
....
kml = read_sf("KML_Samples.kml", layer = "Placemarks")
```

All the examples presented in this section so far have used the **sf** package for geographic data import.
It is fast and flexible, but it may be worth looking at other packages such as **duckdb**, an R interface to the DuckDB database system, which has a spatial extension.

### 8.3.2 Raster data

Similar to vector data, raster data comes in many file formats, with some supporting multilayer files.
**terra**'s `rast()` command reads in a single layer when a file with just one layer is provided:

```r
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
single_layer = rast(raster_filepath)
```

It also works in case you want to read a multilayer file:

```r
multilayer_filepath = system.file("raster/landsat.tif", package = "spDataLarge")
multilayer_rast = rast(multilayer_filepath)
```

All of the previous examples read spatial information from files stored on your hard drive.
However, GDAL also allows reading data directly from online resources, such as HTTP/HTTPS/FTP web resources.
The only thing we need to do is to add a `/vsicurl/` prefix before the path to the file.
Let's try it by connecting to the global monthly snow probability at 500-m resolution for the period 2000-2012.
Snow probability for December is stored as a Cloud Optimized GeoTIFF (COG) file (see Section 8.2) at zenodo.org.
To read the online file, we just need to provide its URL together with the `/vsicurl/` prefix:

```r
myurl = paste0("/vsicurl/https://zenodo.org/record/5774954/files/",
               "clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif")
snow = rast(myurl)
snow
#> class       : SpatRaster 
#> dimensions  : 35849, 86400, 1  (nrow, ncol, nlyr)
#> resolution  : 0.00417, 0.00417  (x, y)
#> extent      : -180, 180, -62, 87.4  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif 
#> name        : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0
```

Due to the fact that the input data is a COG, we are not actually reading this file into RAM, but rather creating a connection to it without obtaining any values.
Its values will be read only if we apply a value-based operation (e.g., `crop()` or `extract()`).
This also allows us to read just a tiny portion of the data without downloading the entire file.
For example, we can get the snow probability for December in Reykjavik (70%) by specifying its coordinates and applying the `extract()` function:

```r
rey = data.frame(lon = -21.94, lat = 64.15)
snow_rey = extract(snow, rey)
snow_rey
#> ID clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0
#> 1  1                                                       70
```

This way, we just downloaded a single value instead of the whole, large GeoTIFF file.
The above example shows just one simple (but useful) case, and there is more to explore.
The `/vsicurl/` prefix works not only for raster but also for vector file formats.
It allows reading vectors directly from online storage with `read_sf()` just by adding the prefix before the vector file URL.

Importantly, `/vsicurl/` is not the only prefix provided by GDAL – many more exist, such as `/vsizip/` to read spatial files from ZIP archives without decompressing them beforehand, or `/vsis3/` for on-the-fly reading of files available in AWS S3 buckets.
You can learn more at https://gdal.org/user/virtual_file_systems.html.

As with vector data, raster datasets can also be stored in and read from spatial databases, notably PostGIS.
See Section 10.7 for more details.

## 8.4 Data output (O)

Writing geographic data allows you to convert from one format to another and to save newly created objects.
Depending on the data type (vector or raster), object class (e.g., `sf` or `SpatRaster`), and the type and amount of stored information (e.g., object size, range of values), it is important to know how to store spatial files in the most efficient way.
The next two sections will demonstrate how to do this.

### 8.4.1 Vector data

The counterpart of `read_sf()` is `write_sf()`.
It allows you to write **sf** objects to a wide range of geographic vector file formats, including the most common, such as `.geojson`, `.shp` and `.gpkg`.
Based on the file name, `write_sf()` decides automatically which driver to use.
The speed of the writing process depends also on the driver:

```r
write_sf(obj = world, dsn = "world.gpkg")
```

Note: if you try to write to the same data source again, the function will overwrite the file:

```r
write_sf(obj = world, dsn = "world.gpkg")
```

Instead of overwriting the file, we could add a new layer to the file by specifying the `layer` argument.
This is supported by several spatial formats, including GeoPackage:

```r
write_sf(obj = world, dsn = "world_many_layers.gpkg", layer = "second_layer")
```

Alternatively, you can use `st_write()`, since it is equivalent to `write_sf()`.
However, it has different defaults – it does not overwrite files (it returns an error when you try to do so) and it shows a short summary of the written file format and the object:

```r
st_write(obj = world, dsn = "world2.gpkg")
#> Writing layer `world2' to data source `world2.gpkg' using driver `GPKG'
#> Writing 177 features with 10 fields and geometry type Multi Polygon.
```
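The silent-overwrite behavior described above can also be guarded against explicitly.
A minimal base-R sketch (`write_if_absent()` is our own helper, not part of **sf**):

```r
# Our own sketch (not part of sf): run a writing function only if the target
# file does not already exist, mimicking st_write()'s refusal to overwrite.
write_if_absent = function(path, writer) {
  if (file.exists(path)) stop("File already exists: ", path)
  writer(path)
  invisible(path)
}
```

For example, `write_if_absent("world.gpkg", function(p) write_sf(world, p))` fails fast instead of silently replacing an existing file.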
The `layer_options` argument can also be used for many different purposes.
One of them is to write spatial data to a text file.
This can be done by specifying `GEOMETRY` inside of `layer_options`.
It could be either `AS_XY` for simple point datasets (it creates two new columns for the coordinates) or `AS_WKT` for more complex spatial data (one new column is created which contains the well-known text representation of the spatial objects):

```r
write_sf(cycle_hire_xy, "cycle_hire_xy.csv", layer_options = "GEOMETRY=AS_XY")
write_sf(world_wkt, "world_wkt.csv", layer_options = "GEOMETRY=AS_WKT")
```

### 8.4.2 Raster data

The `writeRaster()` function saves `SpatRaster` objects to files on disk.
The function expects input regarding the output data type and file format, but also accepts GDAL options specific to a selected file format (see `?writeRaster` for more details).

The **terra** package offers seven data types when saving a raster: `INT1U`, `INT2S`, `INT2U`, `INT4S`, `INT4U`, `FLT4S`, and `FLT8S`, which determine the bit representation of the raster object written to disk (Table 8.5).
Which data type to use depends on the range of the values of your raster object.
The more values a data type can represent, the larger the file will be on disk.
Unsigned integers (`INT1U`, `INT2U`, `INT4U`) are suitable for categorical data, while floats (`FLT4S` and `FLT8S`) usually represent continuous data.
`writeRaster()` uses `FLT4S` as the default.
This works in most cases, but the size of the output file will be unnecessarily large if you save binary or categorical data.
Therefore, we recommend using the data type that needs the least storage space but is still able to represent all values (check the range of values with the `summary()` function).

Table 8.5: Data types supported by the terra package.

By default, the output file format is derived from the filename.
Naming a file `*.tif` will create a GeoTIFF file, as demonstrated below:

```r
writeRaster(single_layer, filename = "my_raster.tif", datatype = "INT2U")
```
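The advice above — pick the smallest data type that still represents all values — can be sketched as a helper.
This is our own illustration (`choose_datatype()` is not a **terra** function, and the ranges below are simplified approximations of **terra**'s integer types):

```r
# Our own sketch (not part of terra): pick a compact datatype from a value
# range, following the "smallest sufficient type" advice. Ranges are
# simplified approximations.
choose_datatype = function(vmin, vmax, integer = TRUE) {
  if (!integer) return("FLT4S")
  if (vmin >= 0 && vmax <= 255) "INT1U"
  else if (vmin >= -32767 && vmax <= 32767) "INT2S"
  else if (vmin >= 0 && vmax <= 65534) "INT2U"
  else "INT4S"
}
choose_datatype(0, 100)
#> [1] "INT1U"
```

The result could then be passed to `writeRaster()`'s `datatype` argument.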
For other raster file formats, additional options can be set by providing GDAL parameters with the `gdal` argument of `writeRaster()`.
GeoTIFF files are written in **terra**, by default, with LZW compression (`gdal = c("COMPRESS=LZW")`).
To change or disable the compression, we need to modify this argument:

```r
writeRaster(x = single_layer, filename = "my_raster.tif",
            gdal = c("COMPRESS=NONE"), overwrite = TRUE)
```

Additionally, we can save our raster object as a COG (Cloud Optimized GeoTIFF, Section 8.2) with the `filetype = "COG"` option:

```r
writeRaster(x = single_layer, filename = "my_raster.tif",
            filetype = "COG", overwrite = TRUE)
```

To learn more about the compression of GeoTIFF files, we recommend Paul Ramsey's comprehensive blog post, GeoTiff Compression for Dummies, which can be found online.

## 8.5 Geoportals

A vast and ever-increasing amount of geographic data is available on the internet, much of which is free to access and use (with appropriate credit given to its providers).
In some ways there is now too much data, in the sense that there are often multiple places to access the same dataset, and some datasets are of poor quality.
In this context, it is vital to know where to look, so this section covers some of the most important sources.
Various 'geoportals' (web services providing geospatial datasets, such as Data.gov) are a good place to start, providing a wide range of data but often only for specific locations (as illustrated in the updated Wikipedia page on the topic).

Some global geoportals overcome this issue.
The GEOSS portal and the Copernicus Data Space Ecosystem, for example, contain many raster datasets with global coverage.
A wealth of vector datasets can be accessed from the SEDAC portal run by the National Aeronautics and Space Administration (NASA) and the European Union's INSPIRE geoportal, with global and regional coverage.

Most geoportals provide a graphical interface allowing datasets to be queried based on characteristics such as spatial and temporal extent, the United States Geological Survey's EarthExplorer being a prime example.
Exploring datasets interactively in a browser is an effective way of understanding available layers.
Downloading data is best done with code, however, from reproducibility and efficiency perspectives.
Downloads can be initiated from the command line using a variety of techniques, primarily via URLs and APIs (see the Copernicus APIs, for example).
Files hosted on static URLs can be downloaded with `download.file()`, as illustrated in the code chunk below which accesses PeRL: Permafrost Region Pond and Lake Database from pangaea.de:

```r
download.file(url = "https://hs.pangaea.de/Maps/PeRL/PeRL_permafrost_landscapes.zip",
              destfile = "PeRL_permafrost_landscapes.zip", 
              mode = "wb")
unzip("PeRL_permafrost_landscapes.zip")
canada_perma_land = read_sf("PeRL_permafrost_landscapes/canada_perma_land.shp")
```

## 8.6 Geographic data packages

Many R packages have been developed for accessing geographic data, some of which are presented in Table 8.6.
These provide interfaces to one or more spatial libraries or geoportals and aim to make data access even quicker from the command line.

Table 8.6: Selected R packages for geographic data retrieval.

It should be emphasized that Table 8.6 represents only a small number of the available geographic data packages.
For example, a large number of R packages exist to obtain various socio-demographic data, such as **tidycensus** and **tigris** (USA), **cancensus** (Canada), **eurostat** and **giscoR** (European Union), or **idbr** (international databases) – read Analyzing US Census Data (K. E. Walker 2022) to find examples of how to analyze such data.
Similarly, several R packages exist giving access to spatial data for various regions and countries, such as **bcdata** (Province of British Columbia), **geobr** (Brazil), **RCzechia** (Czech Republic), or **rgugik** (Poland).

Each data package has its own syntax for accessing data.
This diversity is demonstrated in the subsequent code chunks, which show how to get data using three packages from Table 8.6.
Country borders are often useful, and these can be accessed with the `ne_countries()` function from the **rnaturalearth** package (Massicotte and South 2023) as follows:

```r
library(rnaturalearth)
usa_sf = ne_countries(country = "United States of America", returnclass = "sf")
```

Country borders can also be accessed with other packages, such as **geodata**, **giscoR**, or **rgeoboundaries**.

The second example downloads a series of rasters containing global monthly precipitation sums with a spatial resolution of 10 minutes (~18.5 km at the equator) using the **geodata** package (Hijmans 2023a).
The result is a multilayer object of class `SpatRaster`:

```r
library(geodata)
worldclim_prec = worldclim_global("prec", res = 10, path = tempdir())
class(worldclim_prec)
```

The third example uses the **osmdata** package (Padgham et al. 2023) to find parks from the OpenStreetMap (OSM) database.
As illustrated in the code chunk below, queries begin with the function `opq()` (short for OpenStreetMap query), the first argument of which is a bounding box, or text string representing a bounding box (the city of Leeds, in this case).
The result is passed to a function for selecting which OSM elements we're interested in (parks, in this case), represented by key-value pairs.
Next, they are passed to the function `osmdata_sf()`, which does the work of downloading the data and converting it into a list of **sf** objects (see `vignette('osmdata')` for further details):

```r
library(osmdata)
parks = opq(bbox = "leeds uk") |> 
  add_osm_feature(key = "leisure", value = "park") |> 
  osmdata_sf()
```

A limitation of the **osmdata** package is that it is rate-limited, meaning that it cannot download large OSM datasets (e.g., all the OSM data for a large city).
To overcome this limitation, the **osmextract** package was developed, which can be used to download and import binary `.pbf` files containing compressed versions of the OSM database for predefined regions.

OpenStreetMap is a vast global database of crowd-sourced data, and it is growing daily; a wider ecosystem of tools enables easy access to the data, such as the Overpass turbo web service for rapid development and testing of OSM queries, and osm2pgsql for importing the data into a PostGIS database.
Although the quality of datasets derived from OSM varies, the data source and the wider OSM ecosystem have many advantages: they provide datasets that are available globally, free of charge, and are constantly improving thanks to an army of volunteers.
Using OSM encourages 'citizen science' and contributions back to the digital commons (you can start editing data representing a part of the world you know well at www.openstreetmap.org).
Further examples of OSM data in action are provided in Chapters 10, 13 and 14.
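Several examples in this section and in Section 8.5 download files from the web.
For reproducible scripts that are re-run often, a small caching wrapper avoids repeated downloads; the sketch below is our own helper built on base R's `download.file()`, not part of any of the packages above:

```r
# Our own sketch: download a file only if it is not already present locally,
# avoiding repeated downloads in scripted, reproducible workflows.
download_once = function(url, destfile = basename(url), mode = "wb") {
  if (!file.exists(destfile)) {
    download.file(url, destfile, mode = mode)
  }
  destfile
}
```

Called twice with the same `destfile`, only the first call touches the network.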
contributions back digital commons (can start editing data representing part world know well www.openstreetmap.org).\nexamples OSM data action provided Chapters 10, 13 14.Sometimes, packages come built-datasets.\ncan accessed four ways: attaching package (package uses ‘lazy loading’ spData ), data(dataset, package = mypackage), referring dataset mypackage::dataset, system.file(filepath, package = mypackage) access raw data files.\nfollowing code chunk illustrates latter two options using world dataset (already loaded attaching parent package library(spData)):47The last example, system.file(\"shapes/world.gpkg\", package = \"spData\"), returns path world.gpkg file, stored inside \"shapes/\" folder spData package.\nAnother way obtain spatial information perform geocoding – transform description location, usually address, coordinates.\nusually done sending query online service getting location result.\nMany services exist differ used method geocoding, usage limitations, costs, application programming interface (API) key requirements.\nR several packages geocoding; however, tidygeocoder seems allow connect largest number geocoding services consistent interface.\ntidygeocoder main function geocode, takes data frame addresses adds coordinates \"lat\" \"long\".\nfunction also allows select geocoding service method argument many additional parameters.Let’s try package searching coordinates John Snow blue plaque located building Soho district London.resulting data frame can converted sf object st_as_sf().tidygeocoder also allows performing opposite process called reverse geocoding used get set information (name, address, etc.) 
based pair coordinates.\nGeographic data can also imported R various ‘bridges’ geographic software, described Chapter 10.","code":"\nlibrary(rnaturalearth)\nusa_sf = ne_countries(country = \"United States of America\", returnclass = \"sf\")\nlibrary(geodata)\nworldclim_prec = worldclim_global(\"prec\", res = 10, path = tempdir())\nclass(worldclim_prec)\nlibrary(osmdata)\nparks = opq(bbox = \"leeds uk\") |> \n add_osm_feature(key = \"leisure\", value = \"park\") |> \n osmdata_sf()\nworld2 = spData::world\nworld3 = read_sf(system.file(\"shapes/world.gpkg\", package = \"spData\"))\nlibrary(tidygeocoder)\ngeo_df = data.frame(address = \"54 Frith St, London W1D 4SJ, UK\")\ngeo_df = geocode(geo_df, address, method = \"osm\")\ngeo_df\ngeo_sf = st_as_sf(geo_df, coords = c(\"long\", \"lat\"), crs = \"EPSG:4326\")"},{"path":"read-write.html","id":"geographic-metadata","chapter":"8 Geographic data I/O","heading":"8.7 Geographic metadata","text":"Geographic metadata cornerstone geographic information management, used describe datasets, data structures services.\nhelp make datasets FAIR (Findable, Accessible, Interoperable, Reusable) defined ISO/OGC standards, particular ISO 19115 standard underlying schemas.\nstandards widely used within spatial data infrastructures, handled metadata catalogs.Geographic metadata can managed geometa, package allows writing, reading validating geographic metadata according ISO/OGC standards.\nalready supports various international standards geographic metadata information, ISO 19110 (feature catalogue), ISO 19115-1 19115-2 (geographic metadata vector gridded/imagery datasets), ISO 19119 (geographic metadata service), ISO 19136 (Geographic Markup Language) providing methods read, validate write geographic metadata R using ISO/TS 19139 (XML) technical specification.\n\nGeographic metadata can created geometa follows, creates saves metadata file:package comes examples extended packages geoflow ease automate management metadata.field standard 
geographic information management, the distinction between data and metadata is less clear.\nThe Geography Markup Language (GML) standard file format covers both data and metadata, for example.\nThe geometa package allows exporting geometry objects modeled with sf as GML (ISO 19136) objects.\nThis functionality allows the use of geographic metadata (enabling the inclusion of metadata with detailed geographic and temporal extents, rather than simple bounding boxes, for example) and the provision of services that extend the GML standard (e.g., the Open Geospatial Consortium Web Coverage Service, OGC-WCS).","code":"\nlibrary(geometa)\n# create a metadata object\nmd = ISOMetadata$new()\n#... fill the metadata 'md' object\n# validate metadata\nmd$validate()\n# XML representation of the ISOMetadata\nxml = md$encode()\n# save metadata\nmd$save(\"my_metadata.xml\")\n# read a metadata from an XML file\nmd = readISO19139(\"my_metadata.xml\")"},{"path":"read-write.html","id":"geographic-web-services","chapter":"8 Geographic data I/O","heading":"8.8 Geographic web services","text":"\nIn an effort to standardize web APIs for accessing spatial data, the Open Geospatial Consortium (OGC) created a number of standard specifications for web services (collectively known as OWS, short for OGC Web Services).\nThese services complement and use the core standards developed to model geographic information, such as the ISO/OGC Spatial Schema (ISO 19107:2019) and Simple Features (ISO 19125-1:2004), and formats for data, such as the Geography Markup Language (GML).\nThese specifications cover common access services for data and metadata.\nVector data can be accessed with the Web Feature Service (WFS), whereas grid/imagery data can be accessed with the Web Coverage Service (WCS).\nMap image representations, such as tiles, can be accessed with the Web Map Service (WMS) and the Web Map Tile Service (WMTS).\nMetadata is also covered by means of the Catalogue Service for the Web (CSW).\nFinally, standard processing is handled through the Web Processing Service (WPS) and the Web Coverage Processing Service (WCPS).Various open-source projects have adopted these protocols, such as GeoServer and MapServer for data handling, and GeoNetwork and PyCSW for metadata handling, leading to a standardization of queries.\nIntegrated tools for Spatial Data Infrastructures (SDIs), such as GeoNode, GeOrchestra and Examind, have also adopted these standard web services, either directly or by using the previously mentioned open-source tools.Like other web APIs, OWS APIs use a ‘base URL’, an ‘endpoint’ and ‘URL query arguments’ following a ? to request data (see the best-practices-api-packages vignette in the httr package).There are many requests that can be made to an OWS service. The examples below illustrate how requests can be made directly with httr or more straightforwardly with the ows4R package (OGC Web-Services for R).Let’s start the examples using the httr package, which can be useful for understanding how web services work.\nOne of the most fundamental requests is getCapabilities, demonstrated with the httr functions GET() and modify_url() below.\nThe following code chunk demonstrates how API queries can be constructed and dispatched, in this case to discover the capabilities of a service run by the Fisheries and Aquaculture Division of the Food and Agriculture Organization of the United Nations (UN-FAO).The code chunk demonstrates how API requests can be constructed programmatically with the GET() function, which takes a base URL and a list of query parameters that can easily be extended.\nThe result of the request is saved in res, an object of class response defined in the httr package, which is a list containing information about the request, including the URL.\nAs can be seen by executing browseURL(res$url), the results can also be read directly in a browser.\nOne way of extracting the contents of the request is as follows:Data can be downloaded from WFS services with the GetFeature request and a specific typeName (as illustrated in the code chunk below).Available names differ depending on the accessed web feature service.\nOne can extract them programmatically using web technologies (Nolan and Lang 2014) or scroll manually through the contents of the GetCapabilities output in a browser.In order to keep geometry validity along the data access chain, and since the standards and the underlying open-source server solutions (such as GeoServer) are built on the Simple Features access model, it is important to deactivate the new default behavior introduced in sf and not use the S2 geometry model at data access time.\nThis is done with the code sf::sf_use_s2(FALSE).\nAlso note the use of write_disk() to ensure that the results are written to disk rather than loaded into memory, allowing them to be imported with sf.For many everyday tasks, however, a higher-level interface may be more appropriate, and a number of R packages, and tutorials, have been developed
precisely for this purpose.\nThe package ows4R was developed for working with OWS services.\nIt provides a stable interface to common access services, such as WFS and WCS for data, CSW for metadata, and WPS for processing.\nThe OGC services coverage is described in the README of the package, hosted at github.com/eblondel/ows4R, with new standard protocols under investigation/development.Based on the above example, the code below shows how to perform getCapabilities and getFeatures operations with this package.\nThe ows4R package relies on the principle of clients.\nTo interact with an OWS service (such as WFS), a client is created as follows:Operations are accessible from this client object, e.g., getCapabilities and getFeatures.As explained above, when accessing data with OGC services, handling sf features is done by deactivating the new default behavior introduced in sf, with sf::sf_use_s2(FALSE).\nThis is done by default in ows4R.Additional examples are available in the package vignettes, such as how to access raster data with WCS, or how to access metadata with CSW.","code":"\nlibrary(httr)\nbase_url = \"https://www.fao.org\"\nendpoint = \"/fishery/geoserver/wfs\"\nq = list(request = \"GetCapabilities\")\nres = GET(url = modify_url(base_url, path = endpoint), query = q)\nres$url\n#> [1] \"https://www.fao.org/fishery/geoserver/wfs?request=GetCapabilities\"\ntxt = content(res, \"text\")\nxml = xml2::read_xml(txt)\nxml\n#> {xml_document} ...\n#> [1] \\n  GeoServer WFS...\n#> [2] \\n  UN-FAO Fishe...\n#> ...\nlibrary(sf)\nsf::sf_use_s2(FALSE)\nqf = list(request = \"GetFeature\", typeName = \"fifao:FAO_MAJOR\")\nfile = tempfile(fileext = \".gml\")\nGET(url = base_url, path = endpoint, query = qf, write_disk(file))\nfao_areas = read_sf(file)\nlibrary(ows4R)\nWFS = WFSClient$new(\n  url = \"https://www.fao.org/fishery/geoserver/wfs\",\n  serviceVersion = \"1.0.0\",\n  logger = \"INFO\"\n)\nlibrary(ows4R)\ncaps = WFS$getCapabilities()\nfeatures = WFS$getFeatures(\"fifao:FAO_MAJOR\")"},{"path":"read-write.html","id":"visual-outputs","chapter":"8 Geographic data I/O","heading":"8.9 Visual outputs","text":"\nR supports many different static and interactive graphics formats.\nChapter 9 covers map-making in detail, but it is worth mentioning here the general ways of outputting visualizations.\nThe general method to save a static plot is to open a graphic device, create the plot, and close the device, for example:Other available graphic devices include pdf(), bmp(), jpeg(), and tiff().\nYou can specify several properties of the output plot, including width, height and resolution.\nAdditionally, several graphics packages provide their own functions to save graphical output.\nFor example, the tmap package has the tmap_save() function.\nYou can save a tmap object to different graphic formats or an HTML file by specifying the object name and a file path to a new file.On the other hand, you can save interactive maps created with the mapview package as an HTML file or an image using the mapshot2() function:","code":"\npng(filename = \"lifeExp.png\", width = 500, height = 350)\nplot(world[\"lifeExp\"])\ndev.off()\nlibrary(tmap)\ntmap_obj = tm_shape(world) + tm_polygons(col = \"lifeExp\")\ntmap_save(tmap_obj, filename = \"lifeExp_tmap.png\")\nlibrary(mapview)\nmapview_obj = mapview(world, zcol = \"lifeExp\", legend = TRUE)\nmapshot2(mapview_obj, url = \"my_interactive_map.html\")"},{"path":"read-write.html","id":"exercises-6","chapter":"8 Geographic data I/O","heading":"8.10 Exercises","text":"E1. List and describe three types of vector, raster, and geodatabase formats.E2. Name at least two differences between the sf functions read_sf() and st_read().E3. Read the cycle_hire_xy.csv file from the spData package as a spatial object (Hint: it is located in the misc folder).\nWhat is the geometry type of the loaded object?E4. Download the borders of Germany using rnaturalearth, and create a new object called germany_borders.\nWrite this new object to a file of the GeoPackage format.E5. Download the global monthly minimum temperature with a spatial resolution of 5 minutes using the geodata package.\nExtract the June values, and save them to a file named tmin_june.tif (hint: use terra::subset()).E6. Create a static map of Germany’s borders, and save it to a PNG file.E7.
Create an interactive map using data from the cycle_hire_xy.csv file.\nExport this map to a file called cycle_hire.html.","code":""},{"path":"adv-map.html","id":"adv-map","chapter":"9 Making maps with R","heading":"9 Making maps with R","text":"","code":""},{"path":"adv-map.html","id":"prerequisites-7","chapter":"9 Making maps with R","heading":"Prerequisites","text":"This chapter requires the following packages that we have already been using:The main package used in this chapter is tmap.\nWe recommend installing the development version from the r-universe repository, which is updated more frequently than the CRAN version:It also uses the following visualization packages (also install shiny if you want to develop interactive mapping applications):You also need to read in a couple of datasets as follows from Section 4.3:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)\ninstall.packages(\"tmap\", repos = c(\"https://r-tmap.r-universe.dev\",\n \"https://cloud.r-project.org\"))\nlibrary(tmap)    # for static and interactive maps\nlibrary(leaflet) # for interactive maps\nlibrary(ggplot2) # tidyverse data visualization package\nnz_elev = rast(system.file(\"raster/nz_elev.tif\", package = \"spDataLarge\"))"},{"path":"adv-map.html","id":"introduction-5","chapter":"9 Making maps with R","heading":"9.1 Introduction","text":"A satisfying and important aspect of geographic research is communicating the results.\nMap-making — the art of cartography — is an ancient skill involving communication, attention to detail, and an element of creativity.\nStatic mapping in R is straightforward with the plot() function, as we saw in Section 2.2.3.\nIt is possible to create advanced maps using base R methods (Murrell 2016).\nThe focus of this chapter, however, is cartography with dedicated map-making packages.\nWhen learning a new skill, it makes sense to gain depth-of-knowledge in one area before branching out.\nMap-making is no exception, hence this chapter’s coverage of one package (tmap) in depth rather than many superficially.In addition to being fun and creative, cartography also has important practical applications.\nA carefully crafted map can be the best way of communicating the results of your work, but poorly designed maps can leave a bad impression.\nCommon design issues include poor placement, size and readability of text and careless selection of colors, as outlined in the style guide of the Journal of Maps.\nFurthermore, poor map-making can hinder the communication of results (Brewer 2015):Amateur-looking maps can undermine your audience’s ability to understand important information and weaken the presentation of a professional data investigation.\nMaps have been used for several thousand years for a wide variety of purposes.\nHistoric examples include maps of buildings and land ownership from the Old Babylonian dynasty more than 3000 years ago and Ptolemy’s world map in his masterpiece Geography nearly 2000 years ago (Talbert 2014).Map-making has historically been an activity undertaken only by, or on behalf of, the elite.\nThis has changed with the emergence of open source mapping software such as the R package tmap and the ‘print layout’ in QGIS, which enable anyone to make high-quality maps, enabling ‘citizen science’.\nMaps are also often the best way to present the findings of geocomputational research in a way that is accessible.\nMap-making is therefore a critical part of geocomputation and its emphasis on describing, but also changing, the world.This chapter shows how to make a wide range of maps.\nThe next section covers a range of static maps, including aesthetic considerations, facets and inset maps.\nSections 9.3 to 9.5 cover animated and interactive maps (including web maps and mapping applications).\nFinally, Section 9.6 covers a range of alternative map-making packages including ggplot2 and cartogram.","code":""},{"path":"adv-map.html","id":"static-maps","chapter":"9 Making maps with R","heading":"9.2 Static maps","text":"\nStatic maps are the most common type of visual output from geocomputation.\nThey are usually stored in standard formats including .png and .pdf for graphical raster and vector outputs, respectively.\nInitially, static maps were the only type of maps that R could produce.\nThings advanced with the release of sp (see Pebesma and Bivand 2005), and many map-making techniques, functions, and packages have been developed since then.\nHowever, despite the innovation of interactive mapping, static plotting was still the emphasis of geographic data visualization in R a decade later (Cheshire and Lovelace 2015).The generic plot() function is often the fastest way to create static maps from vector and raster spatial objects (see Sections 2.2.3 and 2.3.3).\nSometimes, when
simplicity and speed are priorities, especially during the development phase of a project, plot() excels.\nThe base R approach is also extensible, with plot() offering dozens of arguments.\nAnother approach is the grid package, which allows low-level control of static maps, as illustrated in chapter 14 of Murrell (2016).\nThis part of the book focuses on tmap and emphasizes the essential aesthetic and layout options.\ntmap is a powerful and flexible map-making package with sensible defaults.\nIt has a concise syntax that allows for the creation of attractive maps with minimal code, which will be familiar to ggplot2 users.\nIt also has the unique capability to generate static and interactive maps using the same code via tmap_mode().\nFinally, it accepts a wider range of spatial classes (including sf and terra objects) than alternatives such as ggplot2.","code":""},{"path":"adv-map.html","id":"tmap-basics","chapter":"9 Making maps with R","heading":"9.2.1 tmap basics","text":"\nLike ggplot2, tmap is based on the idea of a ‘grammar of graphics’ (Wilkinson and Wills 2005).\nThis involves a separation between the input data and the aesthetics (how data are visualized): each input dataset can be ‘mapped’ in a range of different ways including location on the map (defined by the data’s geometry), color, and other visual variables.\nThe basic building block is tm_shape() (which defines the input data: a vector or raster object), followed by one or more layer elements such as tm_fill() and tm_dots().\nThis layering is demonstrated in the chunk below, which generates the maps presented in Figure 9.1:\nFIGURE 9.1: New Zealand’s shape plotted with fill (left), border (middle) and fill and border (right) layers added using tmap functions.\nThe object passed to tm_shape() in this case is nz, an sf object representing the regions of New Zealand (see Section 2.2.1 for more on sf objects).\nLayers are added to represent nz visually, with tm_fill() and tm_borders() creating shaded areas (left panel) and border outlines (middle panel) in Figure 9.1, respectively.\nThis is an intuitive approach to map-making:\nthe common task of adding new layers is undertaken by the addition operator +, followed by tm_*().\nThe asterisk (*) refers to a wide range of layer types with self-explanatory names including:tm_fill(): shaded areas for (multi)polygonstm_borders(): border outlines for (multi)polygonstm_polygons(): both, shaded areas and border outlines for (multi)polygonstm_lines(): lines for (multi)linestringstm_symbols(): symbols for (multi)points, (multi)linestrings, and (multi)polygonstm_raster(): colored cells of raster data (there is also tm_rgb() for rasters with three layers)tm_text(): text information for (multi)points, (multi)linestrings, and (multi)polygonsThis layering is illustrated in the right panel of Figure 9.1, the result of adding a border on top of the fill layer.","code":"\n# Add fill layer to nz shape\ntm_shape(nz) +\n  tm_fill() \n# Add border layer to nz shape\ntm_shape(nz) +\n  tm_borders() \n# Add fill and border layers to nz shape\ntm_shape(nz) +\n  tm_fill() +\n  tm_borders() "},{"path":"adv-map.html","id":"map-obj","chapter":"9 Making maps with R","heading":"9.2.2 Map objects","text":"A useful feature of tmap is its ability to store objects representing maps.\nThe code chunk below demonstrates this by saving the last plot in Figure 9.1 as an object of class tmap (note the use of tm_polygons(), which condenses tm_fill() + tm_borders() into a single function):map_nz can be plotted later, for example by adding additional layers (as shown below) or simply by running map_nz in the console, which is equivalent to print(map_nz).New shapes can be added with + tm_shape(new_obj).\nIn this case, new_obj represents a new spatial object to be plotted on top of the preceding layers.\nWhen a new shape is added in this way, all subsequent aesthetic functions refer to it, until another new shape is added.\nThis syntax allows the creation of maps with multiple shapes and layers, as illustrated in the next code chunk, which uses the function tm_raster() to plot a raster layer (with col_alpha set to make the layer semi-transparent):Building on the previously created map_nz object, the preceding code creates a new map object, map_nz1, which contains another shape (nz_elev) representing average elevation across New Zealand (see Figure 9.2, left).\nMore shapes and layers can be added, as illustrated in the code chunk below, which creates nz_water, representing New Zealand’s territorial waters, and adds the resulting lines to an existing map object.There is no limit to the number of layers or shapes that can be added to tmap objects, and the same shape can even be used multiple times.\nThe final map illustrated in Figure 9.2 is created by adding a layer representing high points (stored in the object nz_height) onto the previously created map_nz2 object with tm_symbols() (see ?tm_symbols for details on tmap’s point plotting
functions).\nThe resulting map, which has four layers, is illustrated in the right-hand panel of Figure 9.2:\nA useful and little known feature of tmap is that multiple map objects can be arranged in a single ‘metaplot’ with tmap_arrange().\nThis is demonstrated in the code chunk below, which plots map_nz1 to map_nz3, resulting in Figure 9.2.\nFIGURE 9.2: Maps with additional layers added to the final map of Figure 9.1.\nAdditional elements can also be added with the + operator.\nAesthetic settings, however, are controlled by arguments to layer functions.","code":"\nmap_nz = tm_shape(nz) + tm_polygons()\nclass(map_nz)\n#> [1] \"tmap\"\nmap_nz1 = map_nz +\n  tm_shape(nz_elev) + tm_raster(col_alpha = 0.7)\nnz_water = st_union(nz) |>\n  st_buffer(22200) |> \n  st_cast(to = \"LINESTRING\")\nmap_nz2 = map_nz1 +\n  tm_shape(nz_water) + tm_lines()\nmap_nz3 = map_nz2 +\n  tm_shape(nz_height) + tm_symbols()\ntmap_arrange(map_nz1, map_nz2, map_nz3)"},{"path":"adv-map.html","id":"visual-variables","chapter":"9 Making maps with R","heading":"9.2.3 Visual variables","text":"\nThe plots in the previous section demonstrate tmap’s default aesthetic settings.\nGray shades are used for the tm_fill() and tm_symbols() layers, and a continuous black line is used to represent lines created with tm_lines().\nOf course, these default values and other aesthetics can be overridden.\nThe purpose of this section is to show how.There are two main types of map aesthetics: those that change with the data and those that are constant.\nUnlike ggplot2, which uses the helper function aes() to represent variable aesthetics, tmap accepts a few aesthetic arguments, depending on the selected layer type:fill: fill color of a polygoncol: color of a polygon border, line, point, or rasterlwd: line widthlty: line typesize: size of a symbolshape: shape of a symbolAdditionally, you may customize the fill and border color transparency using fill_alpha and col_alpha.To map a variable to an aesthetic, pass its column name to the corresponding argument, and to set a fixed aesthetic, pass the desired value instead.48\nThe impact of setting these aesthetics to fixed values is illustrated in Figure 9.3.\nFIGURE 9.3: Impact of changing commonly used fill and border aesthetics to fixed values.\nLike base R plots, arguments defining aesthetics can also receive values that vary.\nUnlike the base R code below (which generates the left panel in Figure 9.4), tmap aesthetic arguments do not accept a numeric vector:Instead, fill (and other aesthetics that can vary, such as lwd for line layers and size for point layers) requires a character string naming an attribute associated with the geometry to be plotted.\nThus, one can achieve the desired result as follows (Figure 9.4, right panel):\nFIGURE 9.4: Comparison of base (left) and tmap (right) handling of a numeric color field.\nEach visual variable has three related additional arguments, with the suffixes .scale, .legend, and .free.\nFor example, the tm_fill() function has the arguments fill, fill.scale, fill.legend, and fill.free.\nThe .scale argument determines how the provided values are represented on the map and in the legend (Section 9.2.4), while the .legend argument is used to customize legend settings, such as its title, orientation, or position (Section 9.2.5).\nThe .free argument is relevant only for maps with many facets, to determine whether each facet has a different scale and legend.","code":"\nma1 = tm_shape(nz) + tm_polygons(fill = \"red\")\nma2 = tm_shape(nz) + tm_polygons(fill = \"red\", fill_alpha = 0.3)\nma3 = tm_shape(nz) + tm_polygons(col = \"blue\")\nma4 = tm_shape(nz) + tm_polygons(lwd = 3)\nma5 = tm_shape(nz) + tm_polygons(lty = 2)\nma6 = tm_shape(nz) + tm_polygons(fill = \"red\", fill_alpha = 0.3,\n                                 col = \"blue\", lwd = 3, lty = 2)\ntmap_arrange(ma1, ma2, ma3, ma4, ma5, ma6)\nplot(st_geometry(nz), col = nz$Land_area) # works\ntm_shape(nz) + tm_fill(fill = nz$Land_area) # fails\n#> Error: palette should be a character value\ntm_shape(nz) + tm_fill(fill = \"Land_area\")"},{"path":"adv-map.html","id":"scales","chapter":"9 Making maps with R","heading":"9.2.4 Scales","text":"\nScales control how the values are represented on the map and in the legend, and they largely depend on the selected visual variable.\nFor example, for the visual variable col, col.scale controls how the colors of spatial objects relate to the provided values; and for the visual variable size, size.scale controls how the sizes represent the provided values.\nBy default, the scale used is tm_scale(), which selects the visual settings automatically given the input data type (factor, numeric, or integer).\nLet’s see how the scales work by customizing the polygons’ fill colors.\nColor settings are an important part of map design – they can have a major impact on how spatial variability is portrayed
as illustrated in Figure 9.5.\nThis figure shows four ways of coloring the regions of New Zealand depending on median income, from left to right (as demonstrated in the code chunk below):the default setting uses ‘pretty’ breaks, described in the next paragraphbreaks allows you to manually set the breaksn sets the number of bins into which numeric variables are categorizedvalues defines the color scheme, for example, BuGn\nFIGURE 9.5: Illustration of color settings. The results show (from left to right): default settings, manual breaks, n breaks, and the impact of changing the palette.\n\nWe are also able to customize scales using a family of functions that start with the tm_scale_ prefix.\nThe most important ones are tm_scale_intervals(), tm_scale_continuous(), and tm_scale_categorical().\nThe tm_scale_intervals() function splits the input data values into a set of intervals.\nIn addition to manually setting breaks, tmap allows users to specify algorithms that create breaks automatically with the style argument.\nThe default is tm_scale_intervals(style = \"pretty\"), which rounds breaks into whole numbers where possible and spaces them evenly.\nOther options are listed below and presented in Figure 9.6.style = \"equal\": divides input values into bins of equal range and is appropriate for variables with a uniform distribution (it is not recommended for variables with a skewed distribution, as the resulting map may end up having little color diversity)style = \"quantile\": ensures the same number of observations fall into each category (with the potential downside that bin ranges can vary widely)style = \"jenks\": identifies groups of similar values in the data and maximizes the differences between categoriesstyle = \"log10_pretty\": a common logarithmic (the logarithm to base 10) version of the regular pretty style, used for variables with a right-skewed distribution\nFIGURE 9.6: Different interval scale methods set using the style argument in tmap.\n\nThe tm_scale_continuous() function presents a continuous color field and is particularly suited for continuous rasters (Figure 9.7, left panel).\nIn the case of variables with a skewed distribution, you can also use its variants – tm_scale_continuous_log() and tm_scale_continuous_log1p().\nFinally, tm_scale_categorical() was designed to represent categorical values and ensures that each category receives a unique color (Figure 9.7, right panel).\nFIGURE 9.7: Continuous and categorical scales in tmap.\n\nPalettes define the color ranges associated with the bins, as determined by the tm_scale_*() functions and their breaks and n arguments described above.\nThe values argument expects a vector of colors or a color palette name, which can be found interactively with cols4all::c4a_gui().\nYou can also add a - as a color palette name prefix to reverse the palette order.There are three main groups of color palettes: categorical, sequential and diverging (Figure 9.8), and each of them serves a different purpose.49\nCategorical palettes consist of easily distinguishable colors and are most appropriate for categorical data without any particular order, such as state names or land cover classes.\nColors should be intuitive: rivers should be blue, for example, and pastures green.\nAvoid too many categories: maps with large legends and many colors can be uninterpretable.50The second group is sequential palettes.\nThese follow a gradient, for example from light to dark colors (light colors often tend to represent lower values), and are appropriate for continuous (numeric) variables.\nSequential palettes can be single-hue (greens goes from light to dark green, for example) or multi-color/hue (yl_gn_bu is a gradient from light yellow to blue via green, for example), as demonstrated in the code chunk below — the output is not shown, run the code yourself to see the results!The third group, diverging palettes, typically range between three distinct colors (purple-white-green in Figure 9.8) and are usually created by joining two single-color sequential palettes with the darker colors at each end.\nTheir main purpose is to visualize the difference from an important reference point, e.g., a certain temperature, the median household income or the mean probability of a drought event.\nThe reference point’s value can be adjusted in tmap using the midpoint argument.\nFIGURE 9.8: Examples of categorical, sequential and diverging palettes.\nThere are two important principles to consider when working with colors: perceptibility and accessibility.\nFirstly, colors on maps should match our perception.\nThis means that certain colors are viewed through our experience and also through cultural lenses.\nFor example, green colors usually represent vegetation or lowlands, and blue is connected with water or coolness.\nColor palettes should also be easy to understand to effectively convey information.\nIt should be clear which values are lower and which are higher, and colors should change gradually.\nSecondly, changes in colors should be accessible to the largest number of people.\nTherefore, it is important to use colorblind
friendly palettes as often as possible.51","code":"\ntm_shape(nz) + tm_polygons(fill = \"Median_income\")\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n        fill.scale = tm_scale(breaks = c(0, 30000, 40000, 50000)))\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n        fill.scale = tm_scale(n = 10))\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n        fill.scale = tm_scale(values = \"BuGn\"))\ntm_shape(nz) + \n  tm_polygons(\"Median_income\", fill.scale = tm_scale(values = \"greens\"))\ntm_shape(nz) + \n  tm_polygons(\"Median_income\", fill.scale = tm_scale(values = \"yl_gn_bu\"))\ntm_shape(nz) + \n  tm_polygons(\"Median_income\",\n              fill.scale = tm_scale_continuous(values = \"pu_gn_div\", \n                                               midpoint = 28000))"},{"path":"adv-map.html","id":"legends","chapter":"9 Making maps with R","heading":"9.2.5 Legends","text":"\nHaving decided on the visual variable and its properties, we can move our attention toward the related map legend style.\nUsing the tm_legend() function, we may change its title, position, orientation, or even disable it.\nThe most important argument in this function is title, which sets the title of the associated legend.\nIn general, a map legend title should provide two pieces of information: what the legend represents and what the units of the presented variable are.\nThe following code chunk demonstrates this functionality by providing a more attractive name than the variable name Land_area (note the use of expression() to create superscript text):The default legend orientation in tmap is \"portrait\"; however, an alternative legend orientation, \"landscape\", is also possible.\nOther than that, we can also customize the location of the legend using the position argument.The legend position (and also the position of several other map elements in tmap) can be customized using one of a few functions.\nThe two most important are:tm_pos_out(): the default, adds the legend outside of the map frame area.\nWe can customize its location with two values that represent the horizontal position (\"left\", \"center\", or \"right\") and the vertical position (\"bottom\", \"center\", or \"top\")tm_pos_in(): puts the legend inside of the map frame area.\nWe may decide on its position using two arguments, where the first one can be \"left\", \"center\", or \"right\", and the second one can be \"bottom\", \"center\", or \"top\".Alternatively, we may just provide a vector of two values (or two numbers between 0 and 1) – in this case, the legend will be put inside the map frame.","code":"\nlegend_title = expression(\"Area (km\"^2*\")\")\ntm_shape(nz) +\n  tm_polygons(fill = \"Land_area\", fill.legend = tm_legend(title = legend_title))\ntm_shape(nz) +\n  tm_polygons(fill = \"Land_area\",\n              fill.legend = tm_legend(title = legend_title,\n                                      orientation = \"landscape\",\n                                      position = tm_pos_out(\"center\", \"bottom\")))"},{"path":"adv-map.html","id":"layouts","chapter":"9 Making maps with R","heading":"9.2.6 Layouts","text":"\nThe map layout refers to the combination of all map elements into a cohesive map.\nMap elements include, among others, the objects to be mapped, the map grid, the scale bar, the title, and the margins, while the color settings covered in the previous section relate to the palette and breakpoints used to affect how the map looks.\nBoth may result in subtle changes that can have an equally large impact on the impression left by your maps.Additional map elements such as graticules, north arrows, scale bars and map titles have their own functions: tm_graticules(), tm_compass(), tm_scalebar(), and tm_title() (Figure 9.9).52\nFIGURE 9.9: Map with additional elements: a north arrow and a scale bar.\ntmap also allows a wide variety of layout settings to be changed, some of which, produced using the following code (see args(tm_layout) or ?tm_layout for a full list), are illustrated in Figure 9.10.\nFIGURE 9.10: Layout options specified by (from left to right) the scale, bg.color, and frame arguments.\nThe other arguments in tm_layout() provide control over many more aspects of the map in relation to the canvas on which it is placed.\nHere are some useful layout settings (illustrated in Figure 9.11):Margin settings including inner.margin and outer.marginFont settings controlled by fontface and fontfamilyLegend settings including options such as legend.show (whether or not to show the legend), legend.orientation, legend.position, and legend.frameFrame width (frame.lwd) and the option to allow double lines (frame.double.line)Color settings controlling color.sepia.intensity (how yellowy the map looks) and color.saturation (color-grayscale)\nFIGURE 9.11: Selected layout options.\n","code":"\nmap_nz + \n  tm_graticules() +\n  tm_compass(type = \"8star\", position = c(\"left\", \"top\")) +\n 
tm_scalebar(breaks = c(0, 100, 200), text.size = 1, position = c(\"left\", \"top\")) +\n  tm_title(\"New Zealand\")\nmap_nz + tm_layout(scale = 4)\nmap_nz + tm_layout(bg.color = \"lightblue\")\nmap_nz + tm_layout(frame = FALSE)"},{"path":"adv-map.html","id":"faceted-maps","chapter":"9 Making maps with R","heading":"9.2.7 Faceted maps","text":"\nFaceted maps, also referred to as ‘small multiples’, are composed of many maps arranged side-by-side, and sometimes stacked vertically (Meulemans et al. 2017).\nFacets enable the visualization of how spatial relationships change with respect to another variable, such as time.\nThe changing populations of settlements, for example, can be represented in a faceted map with each panel representing the population at a particular moment in time.\nThe time dimension could be represented via another visual variable such as color.\nHowever, this risks cluttering the map because it would involve multiple overlapping points (cities do not tend to move over time!).Typically, all individual facets in a faceted map contain the same geometry data repeated multiple times, once for each column in the attribute data (this is the default plotting method for sf objects, see Chapter 2).\nHowever, facets can also represent shifting geometries, such as the evolution of a point pattern over time.\nThis use case of a faceted plot is illustrated in Figure 9.12.\nFIGURE 9.12: Faceted map showing the top 30 largest urban agglomerations from 1970 to 2030 based on population projections by the United Nations.\nThe preceding code chunk demonstrates key features of faceted maps created using the tm_facets_wrap() function:Shapes that do not have a facet variable are repeated (the countries in world in this case)The by argument which varies depending on a variable (\"year\" in this case)The nrow/ncol setting specifying the number of rows and columns that facets should be arranged intoAlternatively, it is possible to use the tm_facets_grid() function, which allows facets based on up to three different variables: one for rows, one for columns, and possibly one for pages.In addition to their utility for showing changing spatial relationships, faceted maps are also useful as the foundation for animated maps (see Section 9.3).","code":"\nurb_1970_2030 = urban_agglomerations |> \n  filter(year %in% c(1970, 1990, 2010, 2030))\n\ntm_shape(world) +\n  tm_polygons() +\n  tm_shape(urb_1970_2030) +\n  tm_symbols(fill = \"black\", col = \"white\", size = \"population_millions\") +\n  tm_facets_wrap(by = \"year\", nrow = 2)"},{"path":"adv-map.html","id":"inset-maps","chapter":"9 Making maps with R","heading":"9.2.8 Inset maps","text":"\nAn inset map is a smaller map rendered within or next to the main map.\nIt can serve many different purposes, including providing context (Figure 9.13) or bringing some non-contiguous regions closer to ease their comparison (Figure 9.14).\nIt can also be used to focus on a smaller area in more detail or to cover the same area as the map, but representing a different topic.In the example below, we create a map of the central part of New Zealand’s Southern Alps.\nOur inset map will show where the main map is in relation to the whole of New Zealand.\nThe first step is to define the area of interest, which can be done by creating a new spatial object, nz_region.In the second step, we create a base-map showing New Zealand’s Southern Alps area.\nThis is the place where the most important message is stated.The third step consists of the inset map creation.\nIt gives context and helps to locate the area of interest.\nImportantly, this map needs to clearly indicate the location of the main map, for example by stating its borders.One of the main differences between regular charts (e.g., scatterplots) and maps is that the input data determine the aspect ratio of maps.\nThus, in this case, we need to calculate the aspect ratios of our two main datasets, nz_region and nz.\nThe following function, norm_dim(), returns the normalized width (\"w\") and height (\"h\") of the object (as \"snpc\" units understood by the graphic device).Next, knowing the aspect ratios, we need to specify the sizes and locations of our two maps – the main map and the inset map – using the viewport() function.\nA viewport is part of a graphics device used to draw the graphical elements at a given moment.\nThe viewport of our main map is just a representation of its aspect ratio.On the other hand, the viewport of the inset map needs to specify its size and location.\nHere, we make the inset map twice as small as the main one by multiplying the width and height by 0.5, and we locate it 0.5 cm from the bottom right of the main map frame.Finally, we combine the two maps by creating a new, blank canvas, printing out the main map, and placing the inset map inside the main map viewport.\nFIGURE 9.13: Inset map providing context – location of the central part of the Southern Alps in New Zealand.\nInset maps can be saved to file either by using a graphic device (see Section 8.9) or the tmap_save()
function arguments: insets_tm insets_vp.Inset maps also used create one map non-contiguous areas.\nProbably, often used example map United States, consists contiguous United States, Hawaii Alaska.\nimportant find best projection individual inset types cases (see Chapter 7 learn ).\ncan use US National Atlas Equal Area map contiguous United States putting EPSG code crs argument tm_shape().rest objects, hawaii alaska, already proper projections; therefore, just need create two separate maps:final map created combining, resizing arranging three maps:\nFIGURE 9.14: Map United States.\ncode presented compact can used basis inset maps, results, Figure 9.14, provide poor representation locations sizes Hawaii Alaska.\n-depth approach, see us-map vignette geocompkg.","code":"\nnz_region = st_bbox(c(xmin = 1340000, xmax = 1450000,\n ymin = 5130000, ymax = 5210000),\n crs = st_crs(nz_height)) |> \n st_as_sfc()\nnz_height_map = tm_shape(nz_elev, bbox = nz_region) +\n tm_raster(col.scale = tm_scale_continuous(values = \"YlGn\"),\n col.legend = tm_legend(position = c(\"left\", \"top\"))) +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 1) +\n tm_scalebar(position = c(\"left\", \"bottom\"))\nnz_map = tm_shape(nz) + tm_polygons() +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 0.1) + \n tm_shape(nz_region) + tm_borders(lwd = 3) +\n tm_layout(bg.color = \"lightblue\")\nlibrary(grid)\nnorm_dim = function(obj){\n bbox = st_bbox(obj)\n width = bbox[[\"xmax\"]] - bbox[[\"xmin\"]]\n height = bbox[[\"ymax\"]] - bbox[[\"ymin\"]]\n w = width / max(width, height)\n h = height / max(width, height)\n return(unit(c(w, h), \"snpc\"))\n}\nmain_dim = norm_dim(nz_region)\nins_dim = norm_dim(nz)\nmain_vp = viewport(width = main_dim[1], height = main_dim[2])\nins_vp = viewport(width = ins_dim[1] * 0.5, height = ins_dim[2] * 0.5,\n x = unit(1, \"npc\") - unit(0.5, \"cm\"), y = unit(0.5, \"cm\"),\n just = c(\"right\", 
\"bottom\"))\ngrid.newpage()\nprint(nz_height_map, vp = main_vp)\npushViewport(main_vp)\nprint(nz_map, vp = ins_vp)\nus_states_map = tm_shape(us_states, crs = \"EPSG:9311\") + \n tm_polygons() + \n tm_layout(frame = FALSE)\nhawaii_map = tm_shape(hawaii) +\n tm_polygons() + \n tm_title(\"Hawaii\") +\n tm_layout(frame = FALSE, bg.color = NA, \n title.position = c(\"LEFT\", \"BOTTOM\"))\nalaska_map = tm_shape(alaska) +\n tm_polygons() + \n tm_title(\"Alaska\") +\n tm_layout(frame = FALSE, bg.color = NA)\nus_states_map\nprint(hawaii_map, vp = grid::viewport(0.35, 0.1, width = 0.2, height = 0.1))\nprint(alaska_map, vp = grid::viewport(0.15, 0.15, width = 0.3, height = 0.3))"},{"path":"adv-map.html","id":"animated-maps","chapter":"9 Making maps with R","heading":"9.3 Animated maps","text":"\nFaceted maps, described Section 9.2.7, can show spatial distributions variables change (e.g., time), approach disadvantages.\nFacets become tiny many .\nFurthermore, fact facet physically separated screen page means subtle differences facets can hard detect.Animated maps solve issues.\nAlthough depend digital publication, becoming less issue content moves online.\nAnimated maps can still enhance paper reports: can always link readers webpage containing animated (interactive) version printed map help make come alive.\nseveral ways generate animations R, including animation packages gganimate, builds ggplot2 (see Section 9.6).\nsection focuses creating animated maps tmap syntax familiar previous sections flexibility approach.Figure 9.15 simple example animated map.\nUnlike faceted plot, does not squeeze multiple maps single screen allows reader see spatial distribution world’s populous agglomerations evolve time (see book’s website animated version).\nFIGURE 9.15: Animated map showing top 30 largest urban agglomerations 1950 2030 based population projections United Nations.
Animated version available online : r.geocompx.org.\nanimated map illustrated Figure 9.15 can created using tmap techniques generate faceted maps, demonstrated Section 9.2.7.\ntwo differences, however, related arguments tm_facets_wrap():nrow = 1, ncol = 1 added keep one moment time one layerfree.coords = FALSE, maintains map extent map iterationThese additional arguments demonstrated subsequent code chunk53:resulting urb_anim represents set separate maps year.\nfinal stage combine save result .gif file tmap_animation().\nfollowing command creates animation illustrated Figure 9.15, elements missing, add exercises:Another illustration power animated maps provided Figure 9.16.\nshows development states United States, first formed east incrementally west finally interior.\nCode reproduce map can found script code/09-usboundaries.R book GitHub repository.\nFIGURE 9.16: Animated map showing population growth, state formation boundary changes United States, 1790-2010. Animated version available online r.geocompx.org.\n","code":"\nurb_anim = tm_shape(world) + tm_polygons() + \n tm_shape(urban_agglomerations) + tm_symbols(size = \"population_millions\") +\n tm_facets_wrap(by = \"year\", nrow = 1, ncol = 1, free.coords = FALSE)\ntmap_animation(urb_anim, filename = \"urb_anim.gif\", delay = 25)"},{"path":"adv-map.html","id":"interactive-maps","chapter":"9 Making maps with R","heading":"9.4 Interactive maps","text":"\nstatic animated maps can enliven geographic datasets, interactive maps can take new level.\nInteractivity can take many forms, common useful ability pan around zoom part geographic dataset overlaid ‘web map’ show context.\nLess advanced interactivity levels include pop-ups appear click different features, kind interactive label.\nadvanced levels interactivity include ability tilt rotate maps, demonstrated mapdeck example , provision “dynamically linked” sub-plots automatically update user pans zooms (Pezanowski et al. 
2018).important type interactivity, however, display geographic data interactive ‘slippy’ web maps.\nrelease leaflet package 2015 (uses leaflet JavaScript library) revolutionized interactive web map creation within R, number packages built foundations adding new features (e.g., leaflet.extras2) making creation web maps simple creating static maps (e.g., mapview tmap).\nsection illustrates approach opposite order.\nexplore make slippy maps tmap (syntax already learned), mapview, mapdeck finally leaflet (provides low-level control interactive maps).unique feature tmap mentioned Section 9.2 ability create static interactive maps using code.\nMaps can viewed interactively point switching view mode, using command tmap_mode(\"view\").\ndemonstrated code , creates interactive map New Zealand based tmap object map_nz, created Section 9.2.2, illustrated Figure 9.17:\nFIGURE 9.17: Interactive map New Zealand created tmap view mode. Interactive version available online : r.geocompx.org.\nNow interactive mode ‘turned ’, maps produced tmap launch (another way create interactive maps tmap_leaflet() function).\nNotable features interactive mode include ability specify basemap tm_basemap() (tmap_options()) demonstrated (result shown):impressive little-known feature tmap’s view mode also works faceted plots.\nargument sync tm_facets_wrap() can used case produce multiple maps synchronized zoom pan settings, illustrated Figure 9.18, produced following code:\nFIGURE 9.18: Faceted interactive maps global coffee production 2016 2017 sync, demonstrating tmap’s view mode action.\nSwitch tmap back plotting mode function:proficient tmap, quickest way create interactive maps R may mapview.\nfollowing ‘one liner’ reliable way interactively explore wide range geographic data formats:\nFIGURE 9.19: Illustration mapview action.\nmapview concise syntax, yet, powerful.\ndefault, standard GIS functionality mouse position information, attribute queries (via pop-ups), scale bar, zoom--layer 
buttons.\nalso offers advanced controls including ability ‘burst’ datasets multiple layers addition multiple layers + followed name geographic object.\nAdditionally, provides automatic coloring attributes via zcol argument.\nessence, can considered data-driven leaflet API (see information leaflet).\nGiven mapview always expects spatial object (including sf SpatRaster) first argument, works well end piped expressions.\nConsider following example sf used intersect lines polygons visualized mapview (Figure 9.20).\nFIGURE 9.20: Using mapview end sf-based pipe expression.\nOne important thing keep mind mapview layers added via + operator (similar ggplot2 tmap).\ndefault, mapview uses leaflet JavaScript library render output maps, user-friendly lot features.\nHowever, alternative rendering libraries performant (work smoothly larger datasets).\nmapview allows set alternative rendering libraries (\"leafgl\" \"mapdeck\") mapviewOptions().54\ninformation mapview, see package’s website : r-spatial.github.io/mapview/.ways create interactive maps R.\ngoogleway package, example, provides interactive mapping interface flexible extensible\n(see googleway-vignette details).\nAnother approach author mapdeck, provides access Uber’s Deck.gl framework.\nuse WebGL enables interactively visualize large datasets millions points.\npackage uses Mapbox access tokens, must register using package.unique feature mapdeck provision interactive 2.5D perspectives, illustrated Figure 9.21.\nmeans can pan, zoom rotate around maps, view data ‘extruded’ map.\nFigure 9.21, generated following code chunk, visualizes road traffic crashes UK, bar height representing casualties per area.\nFIGURE 9.21: Map generated mapdeck, representing road traffic casualties across UK.
Height 1-km cells represents number crashes.\ncan zoom drag map browser, addition rotating tilting pressing Cmd/Ctrl.\nMultiple layers can added pipe operator, demonstrated mapdeck vignettes.\nmapdeck also supports sf objects, can seen replacing add_grid() function call preceding code chunk add_polygon(data = lnd, layer_id = \"polygon_layer\"), add polygons representing London interactive tilted map.Last leaflet mature widely used interactive mapping package R.\nleaflet provides relatively low-level interface Leaflet JavaScript library many arguments can understood reading documentation original JavaScript library (see leafletjs.com).Leaflet maps created leaflet(), result leaflet map object can piped leaflet functions.\nallows multiple map layers control settings added interactively, demonstrated code generates Figure 9.22 (see rstudio.github.io/leaflet/ details).\nFIGURE 9.22: leaflet package action, showing cycle hire points London. See interactive version online.\n","code":"\ntmap_mode(\"view\")\nmap_nz\nmap_nz + tm_basemap(server = \"OpenTopoMap\")\nworld_coffee = left_join(world, coffee_data, by = \"name_long\")\nfacets = c(\"coffee_production_2016\", \"coffee_production_2017\")\ntm_shape(world_coffee) + tm_polygons(facets) + \n tm_facets_wrap(nrow = 1, sync = TRUE)\ntmap_mode(\"plot\")\n#> ℹ tmap mode set to \"plot\".\nmapview::mapview(nz)\nlibrary(mapview)\noberfranken = subset(franconia, district == \"Oberfranken\")\ntrails |>\n st_transform(st_crs(oberfranken)) |>\n st_intersection(oberfranken) |>\n st_collection_extract(\"LINESTRING\") |>\n mapview(color = \"red\", lwd = 3, layer.name = \"trails\") +\n mapview(franconia, zcol = \"district\") +\n breweries\nlibrary(mapdeck)\nset_token(Sys.getenv(\"MAPBOX\"))\ncrash_data = read.csv(\"https://git.io/geocompr-mapdeck\")\ncrash_data = na.omit(crash_data)\nms = mapdeck_style(\"dark\")\nmapdeck(style = ms, pitch = 45, location = c(0, 52), zoom = 4) |>\n add_grid(data = crash_data, lat = \"lat\", lon = \"lng\", 
cell_size = 1000,\n elevation_scale = 50, colour_range = hcl.colors(6, \"plasma\"))\npal = colorNumeric(\"RdYlBu\", domain = cycle_hire$nbikes)\nleaflet(data = cycle_hire) |> \n addProviderTiles(providers$CartoDB.Positron) |>\n addCircles(col = ~pal(nbikes), opacity = 0.9) |> \n addPolygons(data = lnd, fill = FALSE) |> \n addLegend(pal = pal, values = ~nbikes) |> \n setView(lng = -0.1, 51.5, zoom = 12) |> \n addMiniMap()"},{"path":"adv-map.html","id":"mapping-applications","chapter":"9 Making maps with R","heading":"9.5 Mapping applications","text":"\ninteractive web maps demonstrated Section 9.4 can go far.\nCareful selection layers display, basemaps pop-ups can used communicate main results many projects involving geocomputation.\nweb-mapping approach interactivity limitations:Although map interactive terms panning, zooming clicking, code static, meaning user interface fixedAll map content generally static web map, meaning web maps scale handle large datasets easilyAdditional layers interactivity, graphs showing relationships variables ‘dashboards’, difficult create using web-mapping-approachOvercoming limitations involves going beyond static web mapping toward geospatial frameworks map servers.\nProducts field include GeoDjango (extends Django web framework written Python), MapServer (framework developing web applications, largely written C C++) GeoServer (mature powerful map server written Java).\nscalable, enabling maps served thousands people daily, assuming sufficient public interest maps!\nbad news server-side solutions require much skilled developer time set maintain, often involving teams people roles dedicated geospatial database administrator (DBA).Fortunately R programmers, web-mapping applications can now rapidly created shiny.\ndescribed open source book Mastering Shiny, shiny R package framework converting R code interactive web applications (Wickham 2021).\ncan embed interactive maps shiny apps thanks functions leaflet::renderLeaflet().\nsection 
gives context, teaches basics shiny web-mapping perspective, culminates full-screen mapping application less 100 lines code.shiny well documented shiny.posit.co, highlights two components every shiny app: ‘front end’ (bit user sees) ‘back end’ code.\nshiny apps, elements typically created objects named ui server within R script named app.R, lives ‘app folder’.\nallows web-mapping applications represented single file, CycleHireApp/app.R file book’s GitHub repo.considering large apps, worth seeing minimal example, named ‘lifeApp’, action.55\ncode defines launches — command shinyApp() — lifeApp, provides interactive slider allowing users make countries appear progressively lower levels life expectancy (see Figure 9.23):\nFIGURE 9.23: Screenshot showing minimal example web-mapping application created shiny.\nuser interface (ui) lifeApp created fluidPage().\ncontains input output ‘widgets’ — case, sliderInput() (many *Input() functions available) leafletOutput().\narranged row-wise default, explaining slider interface placed directly map Figure 9.23 (see ?column adding content column-wise).server side (server) function input output arguments.\noutput list objects containing elements generated render*() function — renderLeaflet() example generates output$map.\nInput elements input$life referred server must relate elements exist ui — defined inputId = \"life\" code .\nfunction shinyApp() combines ui server elements serves results interactively via new R process.\nmove slider map shown Figure 9.23, actually causing R code re-run, although hidden view user interface.Building basic example knowing find help (see ?shiny), best way forward now may stop reading start programming!\nrecommended next step open previously mentioned CycleHireApp/app.R script integrated development environment (IDE) choice, modify re-run repeatedly.\nexample contains components web mapping application implemented shiny ‘shine’ light behave.CycleHireApp/app.R script contains shiny functions go beyond 
demonstrated simple ‘lifeApp’ example (Figure 9.24).\ninclude reactive() observe() (creating outputs respond user interface — see ?reactive) leafletProxy() (modifying leaflet object already created).\nelements critical creation web-mapping applications implemented shiny.\nrange ‘events’ can programmed including advanced functionality drawing new layers subsetting data, described shiny section RStudio’s leaflet website.Experimenting apps CycleHireApp build knowledge web-mapping applications R, also practical skills.\nChanging contents setView(), example, change starting bounding box user sees app initiated.\nexperimentation done random, reference relevant documentation, starting ?shiny, motivated desire solve problems posed exercises.shiny used way can make prototyping mapping applications faster accessible ever (deploying shiny apps, https://shiny.posit.co/deploy/, separate topic beyond scope chapter).\nEven applications eventually deployed using different technologies, shiny undoubtedly allows web-mapping applications developed relatively lines code (86 case CycleHireApp).\nstop shiny apps getting rather large.\nPropensity Cycle Tool (PCT) hosted pct.bike, example, national mapping tool funded UK’s Department Transport.\nPCT used dozens people day multiple interactive elements based 1000 lines code (Lovelace et al. 2017).apps undoubtedly take time effort develop, shiny provides framework reproducible prototyping aid development process.\nOne potential problem ease developing prototypes shiny temptation start programming early, purpose mapping application envisioned detail.\nreason, despite advocating shiny, recommend starting longer established technology pen paper first stage interactive mapping projects.\nway prototype web applications limited technical considerations, motivations imagination.\nFIGURE 9.24: CycleHireApp, simple web-mapping application finding closest cycle hiring station based location requirement cycles. 
Interactive version available online : r.geocompx.org.\n","code":"\nlibrary(shiny) # for shiny apps\nlibrary(leaflet) # renderLeaflet function\nlibrary(spData) # loads the world dataset \nui = fluidPage(\n sliderInput(inputId = \"life\", \"Life expectancy\", 49, 84, value = 80),\n leafletOutput(outputId = \"map\")\n )\nserver = function(input, output) {\n output$map = renderLeaflet({\n leaflet() |> \n # addProviderTiles(\"OpenStreetMap.BlackAndWhite\") |>\n addPolygons(data = world[world$lifeExp < input$life, ])})\n}\nshinyApp(ui, server)"},{"path":"adv-map.html","id":"other-mapping-packages","chapter":"9 Making maps with R","heading":"9.6 Other mapping packages","text":"tmap provides powerful interface creating wide range static maps (Section 9.2) also supports interactive maps (Section 9.4).\nmany options creating maps R.\naim section provide taste pointers additional resources: map-making surprisingly active area R package development, learn can covered .mature option use plot() methods provided core spatial packages sf terra, covered Sections 2.2.3 2.3.3, respectively.\nmentioned sections plot methods vector raster objects can combined results draw onto plot area (elements keys sf plots multi-band rasters interfere ).\nbehavior illustrated subsequent code chunk generates Figure 9.25.\nplot() many options can explored following links ?plot help page fifth sf vignette sf5.\nFIGURE 9.25: Map New Zealand created plot(). 
legend right refers elevation (1000 m sea level).\ntidyverse plotting package ggplot2 also supports sf objects geom_sf().\nsyntax similar used tmap:\ninitial ggplot() call followed one layers, added + geom_*(), * represents layer type geom_sf() (sf objects) geom_points() (points).ggplot2 plots graticules default.\ndefault settings graticules can overridden using scale_x_continuous(), scale_y_continuous() coord_sf(datum = NA).\nnotable features include use unquoted variable names encapsulated aes() indicate aesthetics vary switching data sources using data argument, demonstrated code chunk creates Figure 9.26:Another benefit maps based ggplot2 can easily given level interactivity printed using function ggplotly() plotly package.\nTry plotly::ggplotly(g1), example, compare result plotly mapping functions described : blog.cpsievert..advantage ggplot2 strong user community many add-packages.\nincludes ggspatial, enhances ggplot2’s mapping capabilities providing options add north arrow (annotation_north_arrow()) scale bar (annotation_scale()), add background tiles (annotation_map_tile()).\nalso accepts various spatial data classes layer_spatial().\nThus, able plot SpatRaster objects terra using function seen Figure 9.26.\nFIGURE 9.26: Comparison map New Zealand created ggplot2 alone (left) ggplot2 ggspatial (right).\ntime, ggplot2 drawbacks, example geom_sf() function always able create desired legend use spatial data.\nGood additional ggplot2 resources can found open source ggplot2 book (Wickham 2016) descriptions multitude ‘ggpackages’ ggrepel tidygraph.covered mapping sf, terra ggplot2 first packages highly flexible, allowing creation wide range static maps.\ncover mapping packages plotting specific type map (next paragraph), worth considering alternatives packages already covered general-purpose mapping (Table 9.1).\nTABLE 9.1: TABLE 9.2: Selected general-purpose mapping packages.\nTable 9.1 shows range mapping packages available, many others listed table.\nnote 
mapsf, can generate range geographic visualizations including choropleth, ‘proportional symbol’ ‘flow’ maps.\ndocumented mapsf vignette.Several packages focus specific map types, illustrated Table 9.3.\npackages create cartograms distort geographical space, create line maps, transform polygons regular hexagonal grids, visualize complex data grids representing geographic topologies, create 3D visualizations.TABLE 9.3: Selected specific-purpose mapping packages, associated metrics.aforementioned packages, however, different approaches data preparation map creation.\nnext paragraph, focus solely cartogram package (Jeworutzki 2023).\nTherefore, suggest read geogrid, geofacet, linemap, tanaka, rayshader documentations learn .cartogram map geometry proportionately distorted represent mapping variable.\nCreation type map possible R cartogram, allows creating contiguous non-contiguous area cartograms.\nmapping package per se, allows construction distorted spatial objects plotted using generic mapping package.cartogram_cont() function creates contiguous area cartograms.\naccepts sf object name variable (column) inputs.\nAdditionally, possible modify itermax argument – maximum number iterations cartogram transformation.\nexample, represent median income New Zealand’s regions contiguous cartogram (Figure 9.27, right panel) follows:\nFIGURE 9.27: Comparison standard map (left) contiguous area cartogram (right).\ncartogram also offers creation non-contiguous area cartograms using cartogram_ncont() Dorling cartograms using cartogram_dorling().\nNon-contiguous area cartograms created scaling region based provided weighting variable.\nDorling cartograms consist circles area proportional weighting variable.\ncode chunk demonstrates creation non-contiguous area Dorling cartograms US states’ population (Figure 9.28):\nFIGURE 9.28: Comparison non-contiguous area cartogram (left) Dorling cartogram (right).\n","code":"\ng = st_graticule(nz, lon = c(170, 175), lat = c(-45, -40,
-35))\nplot(nz_water, graticule = g, axes = TRUE, col = \"blue\")\nterra::plot(nz_elev / 1000, add = TRUE, axes = FALSE)\nplot(st_geometry(nz), add = TRUE)\nlibrary(ggplot2)\ng1 = ggplot() + geom_sf(data = nz, aes(fill = Median_income)) +\n geom_sf(data = nz_height) +\n scale_x_continuous(breaks = c(170, 175))\ng1\nlibrary(ggspatial)\nggplot() + \n layer_spatial(nz_elev) +\n geom_sf(data = nz, fill = NA) +\n annotation_scale() +\n scale_x_continuous(breaks = c(170, 175)) +\n scale_fill_continuous(na.value = NA)\nlibrary(cartogram)\nnz_carto = cartogram_cont(nz, \"Median_income\", itermax = 5)\ntm_shape(nz_carto) + tm_polygons(\"Median_income\")\nus_states9311 = st_transform(us_states, \"EPSG:9311\")\nus_states9311_ncont = cartogram_ncont(us_states9311, \"total_pop_15\")\nus_states9311_dorling = cartogram_dorling(us_states9311, \"total_pop_15\")"},{"path":"adv-map.html","id":"exercises-7","chapter":"9 Making maps with R","heading":"9.7 Exercises","text":"exercises rely new object, africa.\nCreate using world worldbank_df datasets spData package follows:also use zion nlcd datasets spDataLarge:E1. Create map showing geographic distribution Human Development Index (HDI) across Africa base graphics (hint: use plot()) tmap packages (hint: use tm_shape(africa) + ...).Name two advantages based experience.Name three mapping packages advantage .Bonus: create three maps Africa using three packages.E2. Extend tmap created previous exercise legend three bins: “High” (HDI 0.7), “Medium” (HDI 0.55 0.7) “Low” (HDI 0.55).\nBonus: improve map aesthetics, example changing legend title, class labels color palette.E3. Represent africa’s subregions map.\nChange default color palette legend title.\nNext, combine map map created previous exercise single plot.E4. 
Create land cover map Zion National Park.Change default colors match perception land cover categoriesAdd scale bar north arrow change position improve map’s aesthetic appealBonus: Add inset map Zion National Park’s location context state Utah. (Hint: object representing Utah can subset us_states dataset.)E5. Create facet maps countries Eastern Africa:one facet showing HDI representing population growth (hint: using variables HDI pop_growth, respectively)‘small multiple’ per countryE6. Building previous facet map examples, create animated maps East Africa:Showing country orderShowing country order legend showing HDIE7. Create interactive map HDI Africa:tmapWith mapviewWith leafletBonus: approach, add legend (automatically provided) scale barE8. Sketch paper ideas web-mapping application used make transport land-use policies evidence-based:city live, couple users per dayIn country live, dozens users per dayWorldwide hundreds users per day large data serving requirementsE9. Update code coffeeApp/app.R instead centering Brazil user can select country focus :Using textInput()Using selectInput()E10. Reproduce Figure 9.1 Figure 9.7 closely possible using ggplot2 package.E11. Join us_states us_states_df together calculate poverty rate state using new dataset.\nNext, construct continuous area cartogram based total population.\nFinally, create compare two maps poverty rate: (1) standard choropleth map (2) map using created cartogram boundaries.\ninformation provided first second map?\ndiffer ?E12. 
Visualize population growth Africa.\nNext, compare maps hexagonal regular grid created using geogrid package.","code":"\nlibrary(spData)\nafrica = world |> \n filter(continent == \"Africa\", !is.na(iso_a2)) |> \n left_join(worldbank_df, by = \"iso_a2\") |> \n select(name, subregion, gdpPercap, HDI, pop_growth) |> \n st_transform(\"ESRI:102022\") |> \n st_make_valid() |> \n st_collection_extract(\"POLYGON\")\nzion = read_sf((system.file(\"vector/zion.gpkg\", package = \"spDataLarge\")))\nnlcd = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))"},{"path":"gis.html","id":"gis","chapter":"10 Bridges to GIS software","heading":"10 Bridges to GIS software","text":"","code":""},{"path":"gis.html","id":"prerequisites-8","chapter":"10 Bridges to GIS software","heading":"Prerequisites","text":"chapter requires QGIS, SAGA GRASS GIS installed following packages attached:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(qgisprocess)\nlibrary(Rsagacmd)\nlibrary(rgrass)\nlibrary(rstac)\nlibrary(gdalcubes)"},{"path":"gis.html","id":"introduction-6","chapter":"10 Bridges to GIS software","heading":"10.1 Introduction","text":"defining feature interpreted languages interactive console — technically read-eval-print loop (REPL) — R way interact :\nrather relying pointing clicking different parts screen, type commands console execute Enter key.\ncommon effective workflow using interactive development environments RStudio VS Code type code source files source editor control interactive execution code shortcut Ctrl+Enter.Command line interfaces (CLIs) not unique R: early computing environments relied command line ‘shell’ invention widespread adoption computer mouse 1990s graphical user interfaces (GUIs) became common.\nGRASS GIS longest-standing continuously developed open source GIS software, example, relied CLI gained GUI (Landa 2008).\npopular GIS software projects GUI-driven.\ncan interact QGIS, SAGA, GRASS GIS gvSIG system terminals embedded CLIs, design encourages people
interact ‘pointing clicking’.\nunintended consequence GIS users miss advantages CLI-driven scriptable approaches.\nAccording creator popular QGIS software (Sherman 2008):advent ‘modern’ GIS software, people want point click way life. ’s good, tremendous amount flexibility power waiting command line. Many times can something command line fraction time can GUI.‘CLI vs GUI’ debate does not have to be adversarial: ways working advantages, depending range factors including task (drawing new features well suited GUIs), level reproducibility desired, user’s skillset.\nGRASS GIS good example GIS software primarily based CLI also prominent GUI.\nLikewise, R focused CLI, IDEs RStudio provide GUI improving accessibility.\nSoftware cannot be neatly categorized CLI GUI-based.\nHowever, interactive command-line interfaces several important advantages terms :Automating repetitive tasksEnabling transparency reproducibilityEncouraging software development providing tools modify existing functions implement new onesDeveloping future-proof efficient programming skills high demandImproving touch typing, key skill digital ageOn hand, good GUIs also advantages, including:‘Shallow’ learning curves meaning geographic data can explored visualized without hours learning new languageSupport ‘digitizing’ (creating new vector datasets), including trace, snap topological tools56Enables georeferencing (matching raster images existing maps) ground control points orthorectificationSupports stereoscopic mapping (e.g., LiDAR structure motion)Another advantage dedicated GIS software projects provide access hundreds ‘geoalgorithms’ via ‘GIS bridges’ (Neteler Mitasova 2008).\nbridges computational recipes enhancing R’s capabilities solving geographic data problems topic chapter.R natural choice people wanting build bridges reproducible data analysis workflows GIS originated interface language.\nkey feature R (predecessor S) provides access statistical algorithms languages (particularly FORTRAN C), powerful high level
functional language intuitive REPL environment, C FORTRAN lacked (Chambers 2016).\nR continues tradition interfaces numerous languages, notably C++.Although R designed command line GIS, ability interface dedicated GISs gives astonishing geospatial capabilities.\nGIS bridges, R can replicate diverse workflows, additional reproducibility, scalability productivity benefits controlling programming environment consistent CLI.\nFurthermore, R outperforms GISs areas geocomputation, including interactive/animated map-making (see Chapter 9) spatial statistical modeling (see Chapter 12).chapter focuses ‘bridges’ three mature open source GIS products, summarized Table 10.1:QGIS, via package qgisprocess [Dunnington et al. (2024); Section 10.2]SAGA, via Rsagacmd [Pawley (2023); Section 10.3]GRASS GIS, via rgrass [Bivand (2023); Section 10.4]also major developments enabling open source GIS software write execute R scripts inside QGIS (see docs.qgis.org) GRASS GIS (see grasswiki.osgeo.org).TABLE 10.1: Comparison three open-source GIS.
Hybrid refers support vector raster operations.addition three R-GIS bridges mentioned , chapter also provides brief introduction R interfaces spatial libraries (Section 10.6), spatial databases (Section 10.7), cloud-based processing Earth observation data (Section 10.8).","code":""},{"path":"gis.html","id":"rqgis","chapter":"10 Bridges to GIS software","heading":"10.2 qgisprocess: a bridge to QGIS and beyond","text":"QGIS popular open-source GIS (Table 10.1; Graser Olaya (2015)).\nQGIS provides unified interface QGIS’s native geoalgorithms, GDAL, — installed — providers GRASS GIS, SAGA (Graser Olaya 2015).\nSince version 3.14 (released summer 2020), QGIS ships qgis_process command line utility accessing bounty functionality geocomputation.\nqgis_process provides access 300+ geoalgorithms standard QGIS installation 1,000+ via plugins external providers GRASS GIS SAGA.qgisprocess package provides access qgis_process R.\npackage requires QGIS — relevant plugins GRASS GIS SAGA, used chapter — installed available system.\ninstallation instructions, see qgisprocess’s documentation.quick way get up-and-running qgisprocess Docker installed via qgis image developed part project.\nAssuming Docker installed sufficient computational resources, can run R session qgisprocess relevant plugins following command (see geocompx/docker repository details):docker run -e DISABLE_AUTH=true -p 8786:8787 ghcr.io/geocompx/docker:qgisThis package automatically tries detect QGIS installation complains find .57\npossible solutions configuration fails: can set options(qgisprocess.path = \"path/to/your_qgis_process\"), set R_QGISPROCESS_PATH environment variable.\napproaches can also used one QGIS installation want decide one use.\ndetails, please refer qgisprocess ‘getting started’ vignette.\nNext, can find plugins (meaning different software) available computer:tells us GRASS GIS (grassprovider) SAGA (processing_saga_nextgen) plugins available system yet enabled.\nSince need later chapter, let’s
enable .Please note aside installing SAGA system also need install QGIS Python plugin Processing Saga NextGen.\ncan within QGIS Plugin Manager programmatically help Python package qgis-plugin-manager (least Linux).qgis_providers() lists name software corresponding count available geoalgorithms.output table affirms can use QGIS geoalgorithms (native, qgis, 3d, pdal) external ones third-party providers GDAL, SAGA GRASS GIS QGIS interface.Now, ready geocomputation QGIS friends, within R!\nLet’s try two example case studies.\nfirst one shows unite two polygonal datasets different borders (Section 10.2.1).\nsecond one focuses deriving new information digital elevation model represented raster (Section 10.2.2).","code":"\nlibrary(qgisprocess)\n#> Attempting to load the cache ... Success!\n#> QGIS version: 3.30.3-'s-Hertogenbosch\n#> ...\nqgis_plugins()\n#> # A tibble: 4 × 2\n#> name enabled\n#> \n#> 1 grassprovider FALSE\n#> 2 otbprovider FALSE\n#> 3 processing TRUE\n#> 4 processing_saga_nextgen FALSE\nqgis_enable_plugins(c(\"grassprovider\", \"processing_saga_nextgen\"), \n quiet = TRUE)\nqgis_providers()\n#> # A tibble: 7 × 3\n#> provider provider_title algorithm_count\n#> \n#> 1 gdal GDAL 56\n#> 2 grass GRASS 306\n#> 3 qgis QGIS 50\n#> 4 3d QGIS (3D) 1\n#> 5 native QGIS (native c++) 243\n#> 6 pdal QGIS (PDAL) 17\n#> 7 sagang SAGA Next Gen 509"},{"path":"gis.html","id":"qgis-vector","chapter":"10 Bridges to GIS software","heading":"10.2.1 Vector data","text":"Consider situation two polygon objects different spatial units (e.g., regions, administrative units).\ngoal merge two objects one, containing boundary lines related attributes.\nuse incongruent polygons already encountered Section 4.2.8 (Figure 10.1).\npolygon datasets available spData package, like use geographic CRS (see also Chapter 7).\nFIGURE 10.1: Illustration two areal units: incongruent (black lines) aggregating zones (red borders).\nfirst step find algorithm can merge two vector objects.\nlist available 
To list the available algorithms, we can use the `qgis_algorithms()` function.
It returns a data frame containing the available providers and the algorithms they contain.
To find a specific algorithm, we can use the `qgis_search_algorithms()` function.
Assuming that the short description of the function contains the word "union", we can run this function with "union" as the keyword (see the code at the end of this section); one of the algorithms on the returned list, `"native:union"`, sounds promising.

The next step is to find out what the algorithm can do and how we can use it.
This is the role of `qgis_show_help()`, which returns a short summary of the algorithm, its arguments, and its outputs.
This also makes its output rather long.
Alternatively, `qgis_get_argument_specs()` returns a data frame with each row representing an argument required by `"native:union"`, and columns with the name, description, type, default value, available values, and acceptable values associated with it.
The arguments, contained in `union_arguments$name`, are INPUT, OVERLAY, OVERLAY_FIELDS_PREFIX, and OUTPUT, while `union_arguments$acceptable_values` contains a list with the possible input values for each argument.

Many functions require inputs representing paths to a vector layer; **qgisprocess** functions accept `sf` objects for such arguments.
Objects from the **terra** and **stars** packages can be used where a "path to a raster layer" is expected.
While this can be very convenient, we recommend providing the path to your spatial data on disk when you only read it into R to submit it to a **qgisprocess** algorithm: the first thing **qgisprocess** does when executing a geoalgorithm is to export the spatial data living in your R session back to disk in a format known to QGIS, such as .gpkg or .tif files.
This can increase algorithm runtimes.

The main function of **qgisprocess** is `qgis_run_algorithm()`, which sends inputs to QGIS and returns the outputs.
It accepts the algorithm name and a set of named arguments shown in the help list, and performs the expected calculations.
In our case, three arguments seem important: INPUT, OVERLAY, and OUTPUT.
The first one, INPUT, is our main vector object `incongr_wgs`, while the second one, OVERLAY, is `aggzone_wgs`.
The last argument, OUTPUT, is an output file name, which **qgisprocess** will automatically choose and create in `tempdir()` if none is provided.

Running `qgis_run_algorithm()` saves our two input objects to temporary .gpkg files, runs the selected algorithm on them, and returns a temporary .gpkg file as the output.
The **qgisprocess** package stores the `qgis_run_algorithm()` result as a list containing, in our case, the path to the output file.
We can either read this file back into R with `read_sf()`
(e.g., `union_sf = read_sf(union[[1]])`) or directly with `st_as_sf()`.

Note that the QGIS union operation merges the two input layers into one layer by using the intersection and the symmetrical difference of the two input layers (which, by the way, is also the default behavior of a union operation in GRASS GIS and SAGA).
This is not the same as `st_union(incongr_wgs, aggzone_wgs)` (see Exercises)!

The result, `union_sf`, is a multipolygon with a larger number of features than the two input objects.
Notice, however, that many of these polygons are small and do not represent real areas, but are rather a result of our two datasets having a different level of detail.
These artifacts of error are called sliver polygons (see the red-colored polygons in the left panel of Figure 10.2).
One way to identify slivers is to find polygons with comparatively very small areas, here, e.g., below 25,000 m², and next remove them.
Let's search for an appropriate algorithm.

This time the algorithm we found, v.clean, is not included in QGIS, but comes from GRASS GIS.
GRASS GIS's v.clean is a powerful tool for cleaning the topology of spatial vector data.
Importantly, we can use it through **qgisprocess**.

Similarly to the previous step, we should start by looking at this algorithm's help.
We have omitted the output here, because the help text is quite long and contains a lot of arguments.
v.clean is a multi-tool: it can clean different types of geometries and solve different types of topological problems.
For this example, let's focus on just a few of the arguments; however, we encourage you to visit the algorithm's documentation to learn more about v.clean's capabilities.

The main argument of this algorithm is input, our vector object.
Next, we need to select a tool, i.e., a cleaning method.
About a dozen tools exist in v.clean, allowing one to remove duplicate geometries, remove small angles between lines, or remove small areas, among others.
In this case, we are interested in the latter tool, rmarea.
Several tools, rmarea included, expect an additional argument threshold, whose behavior depends on the selected tool.
In our case, the rmarea tool removes all areas smaller than or equal to the provided threshold.
Note that the threshold must be specified in square meters regardless of the coordinate reference system of the input layer.

Let's run this algorithm and convert its output into a new sf object `clean_sf`.
The result, the right panel of Figure 10.2, looks as expected: the sliver polygons are now removed.

FIGURE 10.2: Sliver polygons colored red (left panel).
Cleaned polygons (right panel).

The code for the steps described in this section:

```r
data("incongruent", "aggregating_zones", package = "spData")
incongr_wgs = st_transform(incongruent, "EPSG:4326")
aggzone_wgs = st_transform(aggregating_zones, "EPSG:4326")
# output not shown
qgis_algorithms()
qgis_search_algorithms("union")
#> # A tibble: 2 × 5
#>   provider provider_title    group          algorithm         algorithm_title 
#> 1 native   QGIS (native c++) Vector overlay native:multiunion Union (multiple)
#> 2 native   QGIS (native c++) Vector overlay native:union      Union 
alg = "native:union"
union_arguments = qgis_get_argument_specs(alg)
union_arguments
#> # A tibble: 5 × 6
#>   name    description qgis_type default_value available_values acceptable_...
#> 1 INPUT   Input layer source
#> 2 OVERLAY Overlay la… source
#> 3 OVERLA… Overlay fi… string
#> 4 OUTPUT  Union       sink
#> 5 GRID_S… Grid size   number

#> [[1]]
#> [1] "A numeric value" 
#> [2] "field:FIELD_NAME to use a data defined value taken from the FIELD_NAME
#>     field" 
#> [3] "expression:SOME EXPRESSION to use a data defined value calculated using
#>     a custom QGIS expression"
union = qgis_run_algorithm(alg,
  INPUT = incongr_wgs, OVERLAY = aggzone_wgs
)
union
#> $ OUTPUT: 'qgis_outputVector' chr "/tmp/...gpkg"
union_sf = st_as_sf(union)
qgis_search_algorithms("clean")
#> # A tibble: 1 × 5
#>   provider provider_title group        algorithm     algorithm_title
#> 1 grass    GRASS          Vector (v.*) grass:v.clean v.clean
qgis_show_help("grass:v.clean")
qgis_get_argument_specs("grass:v.clean") |>
  select(name, description) |>
  slice_head(n = 4)
#> # A tibble: 4 × 2
#>   name      description
#> 1 input     Layer to clean
#> 2 type      Input feature type
#> 3 tool      Cleaning tool
#> 4 threshold Threshold (comma separated for each tool)
clean = qgis_run_algorithm("grass:v.clean",
  input = union_sf, 
  tool = "rmarea", threshold = 25000
)
clean_sf = st_as_sf(clean)
```
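As noted above, passing file paths instead of in-memory objects avoids the initial export-to-disk step. A hypothetical sketch (the .gpkg file names below are placeholders for data you have already written to disk, not files from this chapter):

```r
# Sketch: file paths (hypothetical) are handed straight to QGIS,
# skipping the sf-to-disk export that in-memory objects require
library(qgisprocess)
union = qgis_run_algorithm(
  "native:union",
  INPUT   = "incongr_wgs.gpkg",
  OVERLAY = "aggzone_wgs.gpkg"
)
```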
### Raster data {#qgis-raster}

Digital elevation models (DEMs) contain elevation information for each raster cell.
They are used for many purposes, including satellite navigation, water flow models, surface analysis, and visualization.
Here, we are interested in deriving new information from a DEM raster that could be used as predictors in statistical learning.
Various terrain parameters can be helpful, for example, for the prediction of landslides (see Chapter 12).

For this section, we will use dem.tif, a digital elevation model of the Mongón study area (downloaded from the Land Process Distributed Active Archive Center, see also `?dem.tif`).
It has a resolution of about 30 by 30 meters and uses a projected CRS.

The **terra** package's `terrain()` command already allows the calculation of several fundamental topographic characteristics such as slope, aspect, TPI (Topographic Position Index), TRI (Topographic Ruggedness Index), roughness, and flow directions.
However, GIS programs offer many more terrain characteristics, some of which can be more suitable in certain contexts.
For example, the topographic wetness index (TWI) was found useful in studying hydrological and biological processes (Sørensen, Zinko, and Seibert 2006).
Let's search the algorithm list for this index using "wetness" as the keyword.

The output of this search suggests that the desired algorithm exists in the SAGA software.
Though SAGA is a hybrid GIS, its main focus has been on raster processing, and here particularly on digital elevation models (soil properties, terrain attributes, climate parameters).
Hence, SAGA is especially good at the fast processing of large (high-resolution) raster datasets (Conrad et al. 2015).
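For reference, the classic (unmodified) topographic wetness index that the SAGA algorithm builds on is the standard Beven-Kirkby definition (general background, not stated in this chapter):

$$
\mathrm{TWI} = \ln\!\left(\frac{a}{\tan \beta}\right)
$$

where $a$ is the local upslope contributing area per unit contour length and $\beta$ is the local slope angle.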
The `"sagang:sagawetnessindex"` algorithm is actually a modified TWI, which results in a more realistic soil moisture potential for cells located in valley floors (Böhner and Selige 2006).
Here, we stick with the default values for all arguments, and therefore only specify one argument, the input DEM.
Of course, when applying this algorithm you should make sure that the parameter values are in correspondence with your study aim.

Before running the SAGA algorithm from within QGIS, we change the default raster output format from .tif to SAGA's native raster format .sdat.
Hence, all output rasters that we do not specify otherwise will from now on be written in the .sdat format.
Depending on the software versions (SAGA, GDAL) you are using, this might not be necessary, but often enough it will save you the trouble of trying to read output rasters created with SAGA.

`"sagang:sagawetnessindex"` returns not one but four rasters: catchment area, catchment slope, modified catchment area, and topographic wetness index.
We can read a selected output by providing an output name to the `qgis_as_terra()` function.
And since we are done with the SAGA processing within QGIS, we change the raster output format back to .tif.

You can see the TWI map in the left panel of Figure 10.3.
The topographic wetness index is unitless: its low values represent areas that will not accumulate water, while higher values show areas that will accumulate water at increasing levels.

Information from digital elevation models can also be categorized, for example, into geomorphons, geomorphological phenotypes consisting of 10 classes that represent terrain forms, such as slopes, ridges, or valleys (Jasiewicz and Stepinski 2013).
These phenotypes are used in many studies, including landslide susceptibility, ecosystem services, human mobility, and digital soil mapping.

The original implementation of the geomorphons' algorithm was created in GRASS GIS, and we can find it in the **qgisprocess** list as `"grass:r.geomorphon"`.
Calculation of geomorphons requires an input DEM (elevation) and can be customized with a set of optional arguments.
This includes search, the length for which the line-of-sight is calculated, and `-m`, a flag specifying that the search value will be provided in meters (and not as a number of cells).
More information about the additional arguments can be found in the original paper and the GRASS GIS documentation.

The output, `dem_geomorph$forms`, contains a raster file with 10 categories, each one representing a terrain form.
We can read it into R with `qgis_as_terra()`.
We can then visualize the result (Figure 10.3, right panel) or use it in subsequent calculations.

Interestingly, there are connections between some geomorphons and the TWI values, as shown in Figure 10.3.
The largest TWI values mostly occur in valleys and hollows, while the lowest values are seen, as expected, on ridges.

FIGURE 10.3: Topographic wetness index (TWI, left panel) and geomorphons (right panel) derived for the Mongón study area.

The code for this section:

```r
library(qgisprocess)
library(terra)
dem = system.file("raster/dem.tif", package = "spDataLarge")
qgis_search_algorithms("wetness") |>
  dplyr::select(provider_title, algorithm) |>
  head(2)
#> # A tibble: 2 × 2
#>   provider_title algorithm
#> 1 SAGA Next Gen  sagang:sagawetnessindex
#> 2 SAGA Next Gen  sagang:topographicwetnessindexonestep
qgis_show_help("sagang:sagawetnessindex")
options(qgisprocess.tmp_raster_ext = ".sdat")
dem_wetness = qgis_run_algorithm("sagang:sagawetnessindex",
  DEM = dem
)
dem_wetness_twi = qgis_as_terra(dem_wetness$TWI)
# plot(dem_wetness_twi)
options(qgisprocess.tmp_raster_ext = ".tif")
qgis_search_algorithms("geomorphon")
#> [1] "grass:r.geomorphon"  "sagang:geomorphons" 
qgis_show_help("grass:r.geomorphon")
# output not shown
dem_geomorph = qgis_run_algorithm("grass:r.geomorphon",
  elevation = dem,
  `-m` = TRUE, search = 120
)
dem_geomorph_terra = qgis_as_terra(dem_geomorph$forms)
```

## SAGA {#saga}

The System for Automated Geoscientific Analyses (SAGA; Table 10.1) provides the possibility to execute SAGA modules via a command line interface (saga_cmd.exe on Windows and just saga_cmd on Linux) (see the SAGA wiki on modules).
In addition, there is a Python interface (the SAGA Python API).
**Rsagacmd** uses the former to run SAGA from within R.

We use **Rsagacmd** in this section to delineate areas with similar values of the normalized difference vegetation index (NDVI) of the Mongón study area in Peru from 22nd September 2000 (Figure 10.4, left panel), using a seeded region growing algorithm from SAGA.

To start using **Rsagacmd**, we need to run the `saga_gis()` function.
It serves two main purposes:
- It dynamically creates a new object that contains links to all valid SAGA libraries and tools
- It sets general package options, such as raster_backend (the R package to use for handling raster data), vector_backend (the R package to use for handling vector data), and cores (the maximum number of CPU cores used for processing, default: all)

The saga object contains connections to all of the available SAGA tools.
It is organized as a list of libraries (groups of tools), and inside of a library there is a list of tools.
We can access any tool with the `$` sign (remember to use TAB for autocompletion).

The seeded region growing algorithm works in two main steps (Adams and Bischof 1994; Böhner, Selige, and Ringeler 2006).
First, initial cells ("seeds") are generated by finding the cells with the smallest variance in local windows of a specified size.
Second, the region growing algorithm is used to merge neighboring pixels of the seeds to create homogeneous areas.

In our example, we first point to the imagery_segmentation library and its seed_generation tool.
We also assign it to the sg object, so as not to retype the whole tool code in the next steps.
If we just type sg, we get a quick summary of the tool and a data frame with its parameters, descriptions, and defaults.
You may also use `tidy(sg)` to extract just the parameters' table.
The seed_generation tool takes a raster dataset as its first argument (features); optional arguments include band_width, which specifies the size of the initial polygons.

The output is a list of three objects: variance, a raster map of local variance; seed_grid, a raster map with the generated seeds; and seed_points, a spatial vector object with the generated seeds.

The second SAGA tool we use is seeded_region_growing.
It requires two inputs: our seed_grid calculated in the previous step and the ndvi raster object.
Additionally, we can specify several parameters, such as normalize to standardize the input features, neighbour (4- or 8-neighborhood), and method.
The last parameter can be set to either 0 or 1 (region growing based on the raster cells' values and their positions, or just on their values).
For a detailed description of the method, see Böhner, Selige, and Ringeler (2006).
Here, we only change method to 1, meaning that our output regions will be created based on the similarity of their NDVI values.

The tool returns a list of three objects: segments, similarity, and table.
The similarity object is a raster showing the similarity between the seeds and the other cells, and table is a data frame storing information about the input seeds.
Finally, `ndvi_srg$segments` is a raster with the resulting areas (Figure 10.4, right panel).
We can convert it into polygons with `as.polygons()` and `st_as_sf()` (Section 6.5).

FIGURE 10.4: Normalized difference vegetation index (NDVI, left panel) and NDVI-based segments derived using the seeded region growing algorithm for the Mongón study area (right panel).

The resulting polygons (segments) represent areas with similar values.
They can also be further aggregated into larger polygons using various techniques, such as clustering (e.g., k-means), regionalization (e.g., SKATER), or supervised classification methods.
You can try this in the Exercises.

R also has other tools for achieving the goal of creating polygons with similar values (so-called segments).
This includes the **SegOptim** package (Gonçalves et al. 2019), which allows running several image segmentation algorithms, and **supercells** (Nowosad and Stepinski 2022), which implements the SLIC superpixels algorithm to work with geospatial data.

The code for this section:

```r
ndvi = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))
library(Rsagacmd)
saga = saga_gis(raster_backend = "terra", vector_backend = "sf")
sg = saga$imagery_segmentation$seed_generation
ndvi_seeds = sg(ndvi, band_width = 2)
# plot(ndvi_seeds$seed_grid)
srg = saga$imagery_segmentation$seeded_region_growing
ndvi_srg = srg(ndvi_seeds$seed_grid, ndvi, method = 1)
plot(ndvi_srg$segments)
ndvi_segments = ndvi_srg$segments |>
  as.polygons() |>
  st_as_sf()
```

## GRASS GIS {#grass}
The U.S. Army - Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS GIS) (Table 10.1; Neteler and Mitasova 2008) from 1982 to 1995.
Academia has continued this work since 1997.
Similar to SAGA, GRASS GIS focused on raster processing in the beginning, only later, since GRASS GIS 6.0, adding advanced vector functionality (Bivand, Pebesma, and Gómez-Rubio 2013).

GRASS GIS stores the input data in an internal database.
With regard to vector data, GRASS GIS is by default a topological GIS, i.e., it stores the geometry of adjacent features only once.
SQLite is the default database driver for vector attribute management, and attributes are linked to the geometry, i.e., to the GRASS GIS database, via keys (GRASS GIS vector management).

Before one can use GRASS GIS, one has to set up a GRASS GIS database (also from within R), and users might find this process a bit intimidating in the beginning.
First of all, a GRASS GIS database requires a directory, which, in turn, contains a location (see the GRASS GIS Database help pages at grass.osgeo.org for more information).
The location stores the geodata for one project or one area.
Within one location, several mapsets can exist that typically refer to different users or different tasks.
Each location also has a PERMANENT mapset, a mandatory mapset that is created automatically.
In order to share geographic data with all users of a project, the database owner can add spatial data to the PERMANENT mapset.
In addition, the PERMANENT mapset stores the projection, the spatial extent, and the default resolution for raster data.
So, to sum up: the GRASS GIS database may contain many locations (all data in one location have the same CRS), and each location can store many mapsets (groups of datasets).
Please refer to Neteler and Mitasova (2008) and the GRASS GIS quick start for more information on the GRASS GIS spatial database system.

To quickly use GRASS GIS from within R, we will use the **link2GI** package; however, one can also set up the GRASS GIS database step-by-step.
See GRASS within R for how to do so.
Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS GIS for the first time, but by running the code line-by-line and examining the intermediate results, the reasoning behind it should become even clearer.

Here, we introduce **rgrass** with one of the most interesting problems in GIScience: the traveling salesman problem.
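The step-by-step alternative to `linkGRASS()` mentioned above can be sketched with `rgrass::initGRASS()`; all paths and names below are hypothetical and must point at an actual GRASS GIS installation on your machine:

```r
# Manual GRASS GIS session setup (a sketch; paths are hypothetical)
library(rgrass)
initGRASS(
  gisBase  = "/usr/lib/grass83",  # GRASS GIS installation directory
  gisDbase = tempdir(),           # directory holding the GRASS GIS database
  location = "london",            # location to create or use
  mapset   = "PERMANENT",         # the mandatory default mapset
  override = TRUE
)
```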
problem.\nSuppose traveling salesman like visit 24 customers.\nAdditionally, salesman like start finish journey home makes total 25 locations covering shortest distance possible.\nsingle best solution problem; however, check possible solutions (mostly) impossible modern computers (Longley 2015).\ncase, number possible solutions correspond (25 - 1)! / 2, .e., factorial 24 divided 2 (since differentiate forward backward direction).\nEven one iteration can done nanosecond, still corresponds 9837145 years.\nLuckily, clever, almost optimal solutions run tiny fraction inconceivable amount time.\nGRASS GIS provides one solutions (details, see v.net.salesman).\nuse case, like find shortest path first 25 bicycle stations (instead customers) London’s streets (simply assume first bike station corresponds home traveling salesman).Aside cycle hire points data, need street network area.\ncan download OpenStreetMap help osmdata package (see also Section 8.5).\n, constrain query street network (OSM language called “highway”) bounding box points, attach corresponding data sf-object.\nosmdata_sf() returns list several spatial objects (points, lines, polygons, etc.), , keep line objects related ids.68Now data, can go initiate GRASS GIS session.\nLuckily, linkGRASS() link2GI packages lets one set GRASS GIS environment just one line code.\nthing need provide spatial object determines projection extent spatial database.\nFirst, linkGRASS() finds GRASS GIS installations computer.\nSince set ver_select TRUE, can interactively choose one found GRASS GIS-installations.\njust one installation, linkGRASS() automatically chooses .\nSecond, linkGRASS() establishes connection GRASS GIS.can use GRASS GIS geoalgorithms, also need add data GRASS GIS’s spatial database.\nLuckily, convenience function write_VECT() us.\n(Use write_RAST() raster data.)\ncase, add street cycle hire point data using first attribute column, name london_streets points GRASS GIS.rgrass package expects inputs gives outputs 
The **rgrass** package expects its inputs and gives its outputs as **terra** objects.
Therefore, we need to convert our sf spatial vectors to **terra**'s SpatVectors using the `vect()` function to be able to use `write_VECT()`.

Now, both datasets exist in the GRASS GIS database.
To perform our network analysis, we need a topologically clean street network.
GRASS GIS's "v.clean" takes care of the removal of duplicates, small angles, and dangles, among others.
Here, we break the lines at each intersection to ensure that the subsequent routing algorithm can actually turn right or left at an intersection, and save the output in a GRASS GIS object named streets_clean.

It is likely that a few of our cycling station points will not lie exactly on a street segment.
However, to find the shortest route between them, we need to connect them to the nearest street segment.
"v.net"'s connect-operator does exactly this.
We save its output in streets_points_con.

The resulting clean dataset serves as input for the "v.net.salesman" algorithm, which finally finds the shortest route between all cycle hire stations.
One of its arguments is center_cats, which requires a numeric range as input.
This range represents the points for which a shortest route should be calculated.
Since we would like to calculate the route for all cycle stations, we set it to 1-25.
To access the GRASS GIS help page of the traveling salesman algorithm, run `execGRASS("g.manual", entry = "v.net.salesman")`.

To see our result, we read the result into R, convert it to an sf-object keeping only the geometry, and visualize it with the help of the **mapview** package (Figure 10.5 and Section 9.4).

FIGURE 10.5: Shortest route (blue line) between 24 cycle hire stations (blue dots) on the OSM street network of London.

There are a few important considerations to note in this process:

- We used GRASS GIS's spatial database, which allows faster processing. This means that we only exported geographic data at the beginning; then, we created new objects but only imported the final result back into R. To find out which datasets are currently available, run `execGRASS("g.list", type = "vector,raster", flags = "p")`.
- We could have also accessed an already existing GRASS GIS spatial database from within R. Prior to importing data into R, you might want to perform some (spatial) subsetting. Use "v.select" and "v.extract" for vector data. "db.select" lets you select subsets of the attribute table of a vector layer without returning the corresponding geometry.
- You can also start R from within a running GRASS GIS session (for more information please refer to Bivand, Pebesma, and Gómez-Rubio 2013).
- Refer to the excellent GRASS GIS online help or `execGRASS("g.manual", flags = "i")` for more information on each available GRASS GIS geoalgorithm.

The code for this section:

```r
data("cycle_hire", package = "spData")
points = cycle_hire[1:25, ]
library(osmdata)
b_box = st_bbox(points)
london_streets = opq(b_box) |>
  add_osm_feature(key = "highway") |>
  osmdata_sf()
london_streets = london_streets[["osm_lines"]]
london_streets = select(london_streets, osm_id)
library(rgrass)
link2GI::linkGRASS(london_streets, ver_select = TRUE)
write_VECT(terra::vect(london_streets), vname = "london_streets")
write_VECT(terra::vect(points[, 1]), vname = "points")
execGRASS(
  cmd = "v.clean", input = "london_streets", output = "streets_clean",
  tool = "break", flags = "overwrite"
)
execGRASS(
  cmd = "v.net", input = "streets_clean", output = "streets_points_con",
  points = "points", operation = "connect", threshold = 0.001,
  flags = c("overwrite", "c")
)
execGRASS(
  cmd = "v.net.salesman", input = "streets_points_con",
  output = "shortest_route", center_cats = paste0("1-", nrow(points)),
  flags = "overwrite"
)
route = read_VECT("shortest_route") |>
  st_as_sf() |>
  st_geometry()
mapview::mapview(route) + points
```

## When to use what? {#when-to-use-what}

It is hard to recommend a single R-GIS interface since its usage depends on personal preferences, the tasks at hand, and your familiarity with different GIS software packages, which in turn probably depends on your domain.
As mentioned previously, SAGA is especially good at the fast processing of large (high-resolution) raster datasets and is frequently used by hydrologists, climatologists, and soil scientists (Conrad et al. 2015).
GRASS GIS, on the other hand, is the only GIS presented here that supports a topologically based spatial database, which is especially useful for network analyses but also for simulation studies.
QGIS is much more user-friendly compared to GRASS GIS and SAGA, especially for first-time GIS users, and is probably the most popular open-source GIS.
Therefore, **qgisprocess** is an appropriate choice for most use cases.
Its main advantages are:

- Unified access to several GIS, and therefore the provision of >1,000 geoalgorithms (Table 10.1), including duplicated functionality, e.g., you can perform overlay-operations using QGIS-, SAGA- or GRASS GIS-geoalgorithms
- Automatic data format conversions (SAGA uses .sdat grid files and GRASS GIS uses a database format, but QGIS handles the corresponding conversions)
- Automatic passing of geographic R objects to QGIS geoalgorithms and back into R
- Convenience functions to support named arguments and automatic default value retrieval (as inspired by **rgrass**)

By all means, there are use cases when you should certainly use one of the other R-GIS bridges.
Though QGIS is the only GIS providing a unified interface to several GIS software packages, it only provides access to a subset of the corresponding third-party geoalgorithms (for more information please refer to Muenchow, Schratz, and Brenning (2017)).
Therefore, to use the complete set of SAGA and GRASS GIS functions, stick with **Rsagacmd** and **rgrass**.
In addition, if you would like to run simulations with the help of a geodatabase (Krug, Roura-Pascual, and Richardson 2010), use **rgrass** directly, since **qgisprocess** always starts a new GRASS GIS session for each call.
Finally, if you need topologically correct data and/or spatial database management functionality such as multi-user access, we recommend the usage of GRASS GIS.

Please note that there are a number of GIS software packages that have a scripting interface but no dedicated R package that accesses these: gvSIG, OpenJump, and the Orfeo Toolbox.

## Bridges to GDAL {#gdal}

As discussed in Chapter 8, GDAL is a low-level library that supports many geographic data formats.
GDAL is so effective that most GIS programs use GDAL in the background for importing and exporting geographic data, rather than re-inventing the wheel and using bespoke read-write code.
But GDAL offers more than data I/O: it also provides geoprocessing tools for vector and raster data.
Further functionality includes the creation of tiles for serving raster data online, and rapid rasterization of vector data.
Since GDAL is a command line tool, all of its commands can be accessed from within R via the `system()` command.

The code chunk below demonstrates this functionality: `linkGDAL()` searches the computer for a working GDAL installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called (this is usually only needed under Windows).
Now we can use the `system()` function to call any of the GDAL tools.
For example, ogrinfo provides metadata of a vector dataset.
Here, we call this tool with two additional flags: -al to list all features of all layers and -so to get a summary only (and not a complete geometry list):

```r
link2GI::linkGDAL()
our_filepath = system.file("shapes/world.gpkg", package = "spData")
cmd = paste("ogrinfo -al -so", our_filepath)
system(cmd)
#> INFO: Open of `.../spData/shapes/world.gpkg'
#>       using driver `GPKG' successful.
#>
#> Layer name: world
#> Geometry: Multi Polygon
#> Feature Count: 177
#> Extent: (-180.000000, -89.900000) - (179.999990, 83.645130)
#> Layer SRS WKT:
#> ...
```

Other commonly used GDAL tools include:

- gdalinfo: provides metadata of a raster dataset
- gdal_translate: converts between different raster file formats
- ogr2ogr: converts between different vector file formats
- gdalwarp: reprojects, transforms, and clips raster datasets
- gdaltransform: transforms coordinates

Visit https://gdal.org/programs/ to see the complete list of GDAL tools and to read their help files.

The 'link' to GDAL provided by **link2GI** could be used as a foundation for doing more advanced GDAL work from the R or system CLI.
TauDEM (https://hydrology.usu.edu/taudem/) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command line interface; the above example shows how to access such libraries from the system command line via R.
This in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages.

Before diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that `system()` calls may not be platform-independent (they may fail on some computers).
On the other hand, **sf** and **terra** bring most of the power provided by GDAL, GEOS, and PROJ to R via the R/C++ interface provided by **Rcpp**, which avoids `system()` calls.
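As a minimal illustration of composing GDAL calls as strings before handing them to `system()`, the sketch below builds (but does not run) a gdal_translate command that would convert a GeoTIFF to a cloud-optimized GeoTIFF; the file names are hypothetical:

```r
# Build (but do not execute) a GDAL command; input/output names are hypothetical
input  = "elev.tif"
output = "elev_cog.tif"
cmd = paste("gdal_translate -of COG", input, output)
cmd
#> [1] "gdal_translate -of COG elev.tif elev_cog.tif"
# system(cmd) would run it, assuming GDAL >= 3.1 (COG driver) is on the PATH
```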
## Bridges to spatial databases {#postgis}

Spatial database management systems (spatial DBMS) store spatial and non-spatial data in a structured way.
They can organize large collections of data into related tables (entities) via unique identifiers (primary and foreign keys) and implicitly via space (think for instance of a spatial join).
This is useful because geographic datasets tend to become big and messy quite quickly.
Databases enable storing and querying large datasets efficiently based on spatial and non-spatial fields, and provide multi-user access and topology support.

The most important open source spatial database is PostGIS (Obe and Hsu 2015).
R bridges to spatial DBMSs such as PostGIS are important, as they allow access to huge data stores without loading several gigabytes of geographic data into RAM, which would likely crash the R session.
The remainder of this section shows how PostGIS can be called from R, based on the "Hello real-world" example of PostGIS in Action, Second Edition (Obe and Hsu 2015).

The subsequent code requires a working internet connection, since we are accessing a PostgreSQL/PostGIS database living in the QGIS Cloud (https://qgiscloud.com/).
The first step is to create a connection to the database by providing its name, host name, and user information.
The new object, conn, is just an established link between our R session and the database; it does not store any data.

Often the first question is, 'which tables can be found in the database?'.
This can be answered with `dbListTables()` (see the code at the end of this section).
The answer is five tables.
Here, we are only interested in the restaurants and the highways tables.
The former represents the locations of fast-food restaurants in the US, and the latter are principal US highways.
To find out about the attributes available in a table, we can run `dbListFields`.

Now, as we know the available datasets, we can perform our first query, i.e., ask the database a question.
The query needs to be provided in a language understandable by the database; usually, this is SQL.
The first query will select US Route 1 in the state of Maryland (MD) from the highways table.
Note that `read_sf()` allows us to read geographic data from a database if it is provided with an open connection to a database and a query.
Additionally, `read_sf()` needs to know which column represents the geometry (here: wkb_geometry).
The result is an sf-object named us_route of type MULTILINESTRING.
As mentioned above, it is not only possible to ask non-spatial questions; we can also query datasets based on their spatial properties.
To show this, the next example adds a 35-kilometer (35,000 m) buffer around the selected highway (Figure 10.6).
Note that this is a spatial query using functions (`ST_Union()`, `ST_Buffer()`) you should already be familiar with, since you find them also in the **sf** package, though there they are written in lowercase characters (`st_union()`, `st_buffer()`).
In fact, function names of the **sf** package largely follow the PostGIS naming conventions.

The last query will find all Hardee's restaurants (HDE) within the 35 km buffer zone (Figure 10.6).
Please refer to Obe and Hsu (2015) for a detailed explanation of this spatial SQL query.
Finally, it is good practice to close the database connection.

The code for this section:

```r
library(RPostgreSQL)
conn = dbConnect(
  drv = PostgreSQL(),
  dbname = "rtafdf_zljbqm", host = "db.qgiscloud.com",
  port = "5432", user = "rtafdf_zljbqm", password = "d3290ead"
)
dbListTables(conn)
#> [1] "spatial_ref_sys" "topology"        "layer"           "restaurants"
#> [5] "highways"
dbListFields(conn, "highways")
#> [1] "qc_id"        "wkb_geometry" "gid"          "feature"
#> [5] "name"         "state"
query = paste(
  "SELECT *",
  "FROM highways",
  "WHERE name = 'US Route 1' AND state = 'MD';"
)
us_route = read_sf(conn, query = query, geom = "wkb_geometry")
query = paste(
  "SELECT ST_Union(ST_Buffer(wkb_geometry, 35000))::geometry",
  "FROM highways",
  "WHERE name = 'US Route 1' AND state = 'MD';"
)
buf = read_sf(conn, query = query)
query = paste(
  "SELECT *",
  "FROM restaurants r",
  "WHERE EXISTS (",
  "SELECT gid",
  "FROM highways",
  "WHERE",
  "ST_DWithin(r.wkb_geometry, wkb_geometry, 35000) AND",
  "name = 'US Route 1' AND",
  "state = 'MD' AND",
  "r.franchise = 'HDE');"
)
hardees = read_sf(conn, query = query)
RPostgreSQL::postgresqlCloseConnection(conn)
```

FIGURE 10.6: Visualization of the output of the previous PostGIS commands showing the highway (black line), the buffer (light yellow), and the four restaurants (red points) within the buffer.

Unlike PostGIS, **sf** only supports spatial vector data.
To query and manipulate raster data stored in a PostGIS database, use the **rpostgis** package (Bucklin and Basille 2018) and/or command line tools such as raster2pgsql, which comes as part of the PostGIS installation.

This subsection is only a brief introduction to PostgreSQL/PostGIS.
Nevertheless, we would like to encourage the practice of storing geographic and non-geographic data in a spatial DBMS, while only attaching those subsets to R's global environment which are needed for further (geo-)statistical analysis.
Please refer to Obe and Hsu (2015) for a more detailed description of the SQL queries presented here and a more comprehensive introduction to PostgreSQL/PostGIS in general.
PostgreSQL/PostGIS is a formidable choice as an open-source spatial database, but the same is true for the lightweight SQLite/SpatiaLite database engine and for GRASS GIS, which uses SQLite in the background (see Section 10.4).

If your datasets are too big for PostgreSQL/PostGIS and you require massive spatial data management and query performance, it may be worth exploring large-scale geographic querying on distributed computing systems.
Such systems are outside the scope of this book, but it is worth mentioning that open source software providing this functionality exists.
Prominent projects in this space include GeoMesa and Apache Sedona.
The **apache.sedona** package provides an interface to the latter.

## Bridges to cloud technologies and services {#cloud}

In recent years, cloud technologies have become more and more prominent on the internet.
This also includes their use to store and process spatial data.
Major cloud computing providers (Amazon Web Services, Microsoft Azure / Planetary Computer, Google Cloud Platform, and others) offer vast catalogs of open Earth observation data, such as the complete Sentinel-2 archive, on their platforms.
We can use R to directly connect to and process data from these archives, ideally from a machine in the same cloud region.

Three promising developments that make working with such image archives on cloud platforms easier and more efficient are the SpatioTemporal Asset Catalog (STAC), the cloud-optimized GeoTIFF (COG) image file format, and the concept of data cubes.
Section 10.8.1 introduces these individual developments and briefly describes how they can be used from R.

Besides hosting large data archives, numerous cloud-based services to process Earth observation data have been launched during the last few
years.\nincludes OpenEO initiative – unified interface programming languages (including R) various cloud-based services.\ncan find information OpenEO Section 10.8.2.","code":""},{"path":"gis.html","id":"staccog","chapter":"10 Bridges to GIS software","heading":"10.8.1 STAC, COGs, and data cubes in the cloud","text":"SpatioTemporal Asset Catalog (STAC) general description format spatiotemporal data used describe variety datasets cloud platforms including imagery, synthetic aperture radar (SAR) data, point clouds.\nBesides simple static catalog descriptions, STAC-API presents web service query items (e.g., images) catalogs space, time, properties.\nR, rstac package (Simoes, Souza, et al. 2021) allows connect STAC-API endpoints search items.\nexample , request images Sentinel-2 Cloud-Optimized GeoTIFF (COG) dataset Amazon Web Services intersect predefined area time interest.\nresult contains found images metadata (e.g., cloud cover) URLs pointing actual files AWS.Cloud storage differs local hard disks traditional image file formats perform well cloud-based geoprocessing.\nCloud-optimized GeoTIFF makes reading rectangular subsets image reading images lower resolution much efficient.\nR user, install anything work COGs GDAL (package using ) can already work COGs.\nHowever, keep mind availability COGs big plus browsing catalogs data providers.larger areas interest, requested images still relatively difficult work : may use different map projections, may spatially overlap, spatial resolution often depends spectral band.\ngdalcubes package (Appel Pebesma 2019) can used abstract individual images create process image collections four-dimensional data cubes.code shows minimal example create lower resolution (250m) maximum NDVI composite Sentinel-2 images returned previous STAC-API search.filter images cloud cover, provide property filter function applied STAC result item creating image collection.\nfunction receives available metadata image input list returns single logical 
value images function yields TRUE considered.\ncase, ignore images 10% cloud cover.\ndetails, please refer tutorial presented OpenGeoHub summer school 2021.77The combination STAC, COGs, data cubes forms cloud-native workflow analyze (large) collections satellite imagery cloud.\ntools already form backbone, example, sits package, allows land use land cover classification big Earth observation data R.\npackage builds EO data cubes image collections available cloud services performs land classification data cubes using various machine deep learning algorithms.\ninformation sits visit https://e-sensing.github.io/sitsbook/ read related article (Simoes, Camara, et al. 2021).","code":"\nlibrary(rstac)\n# Connect to the STAC-API endpoint for Sentinel-2 data\n# and search for images intersecting our AOI\ns = stac(\"https://earth-search.aws.element84.com/v0\")\nitems = s |>\n stac_search(collections = \"sentinel-s2-l2a-cogs\",\n bbox = c(7.1, 51.8, 7.2, 52.8),\n datetime = \"2020-01-01/2020-12-31\") |>\n post_request() |>\n items_fetch()\nlibrary(gdalcubes)\n# Filter images by cloud cover and create an image collection object\ncloud_filter = function(x) {\n x[[\"eo:cloud_cover\"]] < 10\n}\ncollection = stac_image_collection(items$features, \n property_filter = cloud_filter)\n# Define extent, resolution (250m, daily) and CRS of the target data cube\nv = cube_view(srs = \"EPSG:3857\", extent = collection, dx = 250, dy = 250,\n dt = \"P1D\") # \"P1D\" is an ISO 8601 duration string\n# Create and process the data cube\ncube = raster_cube(collection, v) |>\n select_bands(c(\"B04\", \"B08\")) |>\n apply_pixel(\"(B08-B04)/(B08+B04)\", \"NDVI\") |>\n reduce_time(\"max(NDVI)\")\n# gdalcubes_options(parallel = 8)\n# plot(cube, zlim = c(0, 1))"},{"path":"gis.html","id":"openeo","chapter":"10 Bridges to GIS software","heading":"10.8.2 openEO","text":"OpenEO (Schramm et al. 
2021) is an initiative to support interoperability among cloud services by defining a common language for processing the data. The initial idea, described in an r-spatial.org blog post, is to make it possible for users to switch between cloud services easily, with as few code changes as possible. The standardized processes use a multidimensional data cube model as an interface to the data. Implementations are available for eight different backends (see https://hub.openeo.org), to which users can connect with R, Python, JavaScript, QGIS, or a web editor, and define (and chain) processes on collections. Since the functionality and data availability differ among the backends, the openeo R package (Lahn 2021) dynamically loads the available processes and collections from the connected backend. Afterwards, users can load image collections, apply and chain processes, submit jobs, and explore and plot results.\nThe following code connects to the openEO platform backend, requests available datasets, processes, and output formats, defines a process graph to compute a maximum NDVI image from Sentinel-2 data, and finally executes the graph after logging in to the backend. The openEO platform backend includes a free tier, and registration is possible with existing institutional or internet platform accounts.","code":"\nlibrary(openeo)\ncon = connect(host = \"https://openeo.cloud\")\np = processes() # load available processes\ncollections = list_collections() # load available collections\nformats = list_file_formats() # load available output formats\n# Load Sentinel-2 collection\ns2 = p$load_collection(id = \"SENTINEL2_L2A\",\n spatial_extent = list(west = 7.5, east = 8.5,\n north = 51.1, south = 50.1),\n temporal_extent = list(\"2021-01-01\", \"2021-01-31\"),\n bands = list(\"B04\", \"B08\"))\n# Compute NDVI vegetation index\ncompute_ndvi = p$reduce_dimension(data = s2, dimension = \"bands\",\n reducer = function(data, context) {\n (data[2] - data[1]) / (data[2] + data[1])\n })\n# Compute maximum over time\nreduce_max = p$reduce_dimension(data = compute_ndvi, dimension = \"t\",\n reducer = function(x, y) {\n max(x)\n })\n# Export as GeoTIFF\nresult = p$save_result(reduce_max, formats$output$GTiff)\n# Login, see https://docs.openeo.cloud/getting-started/r/#authentication\nlogin(login_type = \"oidc\", provider = \"egi\", \n config = list(client_id = \"...\", secret = \"...\"))\n# Execute processes\ncompute_result(graph = result, output_file = tempfile(fileext = \".tif\"))"},{"path":"gis.html","id":"exercises-8","chapter":"10 Bridges to GIS software","heading":"10.9 Exercises","text":"E1. Compute global solar irradiation for an area of system.file(\"raster/dem.tif\", package = \"spDataLarge\") for March 21 at 11:00 using the r.sun GRASS GIS module via qgisprocess.\nE2. Compute the catchment area and catchment slope of system.file(\"raster/dem.tif\", package = \"spDataLarge\") using Rsagacmd.\nE3. Continue working on the ndvi_segments object created in the SAGA section. Extract average NDVI values from the ndvi raster and group them into six clusters using kmeans(). Visualize the results.\nE4. Attach data(random_points, package = \"spDataLarge\") and read system.file(\"raster/dem.tif\", package = \"spDataLarge\") into R. Select a point randomly from random_points and find all dem pixels that can be seen from this point (hint: a viewshed can be calculated using GRASS GIS). Visualize your result: for example, plot a hillshade, the digital elevation model, your viewshed output, and the point. Additionally, give mapview a try.\nE5. Use gdalinfo via a system call on a raster file stored on disk of your choice. What kind of information can you find there?\nE6. Use gdalwarp to decrease the resolution of your raster file (for example, if the resolution is 0.5, change it to 1). Note: the -tr and -r flags will be used in this exercise.\nE7. Query all Californian highways from the PostgreSQL/PostGIS database living in the QGIS Cloud introduced in this chapter.\nE8. 
The ndvi.tif raster (system.file(\"raster/ndvi.tif\", package = \"spDataLarge\")) contains NDVI calculated for the Mongón study area based on Landsat data from September 22nd, 2000. Use rstac, gdalcubes, and terra to download Sentinel-2 images for the same area from 2020-08-01 to 2020-10-31, calculate NDVI, and compare the results with ndvi.tif.","code":""},{"path":"algorithms.html","id":"algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11 Scripts, algorithms and functions","text":"","code":""},{"path":"algorithms.html","id":"prerequisites-9","chapter":"11 Scripts, algorithms and functions","heading":"Prerequisites","text":"This chapter has minimal software prerequisites and primarily uses base R. Only the sf package is used, to check the results of an algorithm we will develop for calculating the area of polygons. In terms of prior knowledge, the chapter assumes an understanding of the geographic classes introduced in Chapter 2 and how they can be used to represent a wide range of input file formats (see Chapter 8).","code":""},{"path":"algorithms.html","id":"intro-algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11.1 Introduction","text":"Chapter 1 established that geocomputation is not only about using existing tools but also about developing new ones, 'in the form of shareable R scripts and functions'. This chapter teaches these building blocks of reproducible code. It also introduces low-level geometric algorithms, of the type used in Chapter 10. Reading it should help you to understand how such algorithms work and to write code that can be used many times, by many people, on multiple datasets. The chapter cannot, by itself, make you a skilled programmer. Programming is hard and requires plenty of practice (Abelson, Sussman, and Sussman 1996): 'To appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs – many of them.'\nThere are strong reasons for learning to program. Although this chapter does not teach programming itself – see resources such as Wickham (2019), Gillespie and Lovelace (2016), and Xiao (2016), which teach programming in R and other languages – it provides starting points, focused on geometry data, that could form a good foundation for developing programming skills.\nThe chapter also demonstrates and highlights the importance of reproducibility. The advantages of reproducibility go beyond allowing others to replicate your work: reproducible code is often better in every way than code written to be run only once, including in terms of computational efficiency, 'scalability' (the capability of code to run on large datasets), and ease of adapting and maintaining it.\nScripts are the basis of reproducible R code, a topic covered in Section 11.2. Algorithms are recipes for modifying inputs using a series of steps, resulting in an output, as described in Section 11.3. To ease sharing and reproducibility, algorithms can be placed into functions, the topic of Section 11.4. The example of finding the centroid of a polygon is used to tie these concepts together. Chapter 5 already introduced a centroid function, st_centroid(), but this example highlights how seemingly simple operations are the result of comparatively complex code, affirming the following observation (Wise 2001): 'One of the most intriguing things about spatial data problems is that things which appear to be trivially easy to a human can be surprisingly difficult on a computer.' The example also reflects a secondary aim of the chapter which, following Xiao (2016), is 'not to duplicate what is available out there, but to show how things out there work'.","code":""},{"path":"algorithms.html","id":"scripts","chapter":"11 Scripts, algorithms and functions","heading":"11.2 Scripts","text":"If functions distributed in packages are the building blocks of R code, scripts are the glue that holds them together. Scripts should be stored and executed in a logical order to create reproducible workflows, either manually or with workflow automation tools such as targets (Landau 2021). If you are new to programming, scripts may seem intimidating at first, but they are simply plain text files. Scripts are usually saved with a file extension representing the language they contain, such as .py for scripts written in Python or .rs for scripts written in Rust. R scripts should be saved with the .R extension and named to reflect what they do. An example is 11-hello.R, a script file stored in the code folder of the book's repository. 11-hello.R is a simple script containing two lines of code, one of which is a comment:\nThe contents of this script are not particularly exciting, but they demonstrate the point: scripts do not need to be complicated. Saved scripts can be called and executed in their entirety from the R command line with the source() function, as demonstrated below. The output shows that the comment is ignored but the print() command is executed:\nYou can also call R scripts from system command line shells such as bash and PowerShell, assuming the Rscript executable is configured and available, for example as follows:\nThere are no strict rules on what can and cannot go into script files, and nothing prevents you from saving broken, non-reproducible code. Lines of code that do not contain valid R should be commented out, by adding a # to the start of the line, to prevent errors, as shown in line 1 of the 11-hello.R script. There are, however, some conventions worth following:\n- Write the script in order: just like the script of a film, scripts should have a clear order such as 'setup', 'data processing', and 'save results' (roughly equivalent to the 'beginning', 'middle', and 'end' of a film).\n- Add comments to the script so other people (and your future self) can understand it. At a minimum, a comment should state the purpose of the script (see Figure 11.1) and, for long scripts, divide it into sections. This can be done in RStudio, for example, with the shortcut Ctrl+Shift+R, which creates 'foldable' code section headings.\n- Above all, scripts should be reproducible: self-contained scripts that work on any computer are more useful than scripts that only run on your computer, on a good day. This involves attaching required packages at the beginning, reading in data from persistent sources (such as a reliable website), and ensuring that previous steps have been taken.\nIt is hard to enforce reproducibility in R scripts, but there are tools that can help. By default, RStudio 'code-checks' R scripts and underlines faulty code with a red wavy line, as illustrated below:\nFIGURE 11.1: Code checking in RStudio. 
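The ordering and commenting conventions listed above can be illustrated with a minimal script skeleton. This is a sketch, not a file from the book's code folder; the file name and analysis are hypothetical, and the only dataset used is the nc.shp file shipped with sf so the script runs anywhere:

```r
# Aim: demonstrate a clearly ordered, commented, reproducible script

# Setup ----
library(sf)

# Data processing ----
# Read from a persistent source so the script works on any computer:
nc = read_sf(system.file("shape/nc.shp", package = "sf"))
nc_areas = st_area(nc)

# Save results ----
saveRDS(nc_areas, "nc-areas.rds")
```

The `# Heading ----` comments follow RStudio's Ctrl+Shift+R convention, so each stage becomes a 'foldable' section in the source editor.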
The figure's example, from the script 11-centroid-alg.R, highlights an unclosed curly bracket on line 19.\nThe contents of this section apply to any type of R script. A particular consideration with scripts for geocomputation is that they tend to have external dependencies, such as the GDAL dependency needed by the core R packages for working with geographic data, which was made heavy use of in Chapter 8 on data import and export. GIS software dependencies may be needed to run specialist geoalgorithms, as outlined in Chapter 10. Scripts for working with geographic data also often require input datasets to be available in specific formats. Such dependencies should be mentioned as comments in the script, or elsewhere in the project of which it is a part, as described with reference to dependency-management tools such as the renv package and Docker.\n'Defensive' programming techniques and good error messages can save time by checking dependencies and communicating with users when requirements are not met. If statements, implemented with if () in R, can be used to send messages or to run lines of code if, and only if, certain conditions are met. The following lines of code, for example, send a message to users if a certain file is missing:\nThe work undertaken by the 11-centroid-alg.R script is demonstrated in a reproducible example below, which creates a prerequisite object named poly_mat, representing a square with sides 9 units in length. The example shows that source() works with URLs, assuming you have an internet connection. If you do not, the same script can be called with source(\"code/11-centroid-alg.R\"), assuming you have previously downloaded the github.com/geocompx/geocompr repository and are running R from the geocompr folder.","code":"\n# Aim: provide a minimal R script\nprint(\"Hello geocompr\")\nsource(\"code/11-hello.R\")\n#> [1] \"Hello geocompr\"\nRscript code/11-hello.R\nif (!file.exists(\"required_geo_data.gpkg\")) {\n message(\"No file, required_geo_data.gpkg is missing!\")\n} \n#> No file, required_geo_data.gpkg is missing!\npoly_mat = cbind(\n x = c(0, 9, 9, 0, 0),\n y = c(0, 0, 9, 9, 0)\n)\n# Short URL to code/11-centroid-alg.R in the geocompr repo\nsource(\"https://t.ly/0nzj\")\n#> [1] \"The area is: 81\"\n#> [1] \"The coordinates of the centroid are: 4.5, 4.5\""},{"path":"algorithms.html","id":"geometric-algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11.3 Geometric algorithms","text":"Algorithms can be understood as the computing equivalent of a baking recipe: a complete set of instructions which, when undertaken on the inputs, result in useful (or tasty) outcomes. Inputs are ingredients such as flour and sugar in the case of baking, and data and input parameters in the case of algorithms. While tasty cakes may result from a baking recipe, successful algorithms should have computational outcomes with environmental, social, or other benefits. Before diving into a reproducible example, a brief history shows how algorithms relate to scripts (covered in Section 11.2) and functions (which can be used to generalize algorithms and make them portable and easy-to-use, as we'll see in Section 11.4).\nThe word 'algorithm' originated in 9th century Baghdad with the publication of Hisab al-jabr w'al-muqabala, an early math textbook. The book was translated into Latin and became so popular that the author's last name, al-Khwārizmī, 'was immortalized as a scientific term: Al-Khwarizmi became Alchoarismi, Algorismi and, eventually, algorithm' (Bellos 2011). In the computing age, algorithm refers to a series of steps that solves a problem, resulting in a pre-defined output. Inputs must be formally defined in a suitable data structure (Wise 2001). Algorithms often start as flow charts or pseudocode showing the aim of the process before being implemented in code. To ease usability, common algorithms are often packaged inside functions, which may hide some or all of the steps taken (unless you look at the function's source code, see Section 11.4).\nGeoalgorithms, such as those we encountered in Chapter 10, are algorithms that take geographic data in and, generally, return geographic results (alternative terms for the same thing include GIS algorithms and geometric algorithms). That may sound simple, but it is a deep subject with an entire academic field, Computational Geometry, dedicated to its study (Berg et al. 2008) and numerous books on the subject. O'Rourke (1998), for example, introduces the subject with a range of progressively harder geometric algorithms using reproducible and freely available C code.\nAn example of a geometric algorithm is one that finds the centroid of a polygon. There are many approaches to centroid calculation, some of which work only on specific types of spatial data. For the purposes of this section, we choose an approach that is easy to visualize: breaking the polygon into many triangles and finding the centroid of each, an approach discussed by Kaiser and Morin (1993) alongside other centroid algorithms (and mentioned briefly in O'Rourke 1998). It helps to further break down this approach into discrete tasks before writing any code (subsequently referred to as steps 1 to 4, which could also be presented as a schematic diagram or pseudocode):\n1. Divide the polygon into contiguous triangles.\n2. Find the centroid of each triangle.\n3. Find the area of each triangle.\n4. Find the area-weighted mean of the triangle centroids.\nThese steps may sound straightforward, but converting words into working code requires some work and plenty of trial-and-error, even when the inputs are constrained: the algorithm only works for convex polygons, which contain no internal angles greater than 180°, so no star shapes allowed (the packages decido and sfdct can triangulate non-convex polygons using external libraries, as shown in the algorithm vignette hosted at geocompx.org).\nThe simplest data structure representing a polygon is a matrix of x and y coordinates in which each row represents a vertex tracing the polygon's border in order, and in which the first and last rows are identical (Wise 2001). In this case, we create a polygon with five vertices in base R, building on an example from GIS Algorithms (Xiao 2016, see github.com/gisalgs for Python code), as illustrated in Figure 11.2.\nNow that we have an example dataset, we are ready to undertake step 1. The code below shows how this can be done by creating a single triangle (T1), which demonstrates the method; it also demonstrates step 2 by calculating the centroid based on the formula \\(1/3(a + b + c)\\), where \\(a\\) to \\(c\\) are the coordinates representing the triangle's vertices:\nFIGURE 11.2: Illustration of the polygon centroid calculation problem.\nStep 3 is to find the area of each triangle, so that a weighted mean accounting for the disproportionate impact of large triangles can be calculated. The formula to calculate the area of a triangle is as follows (Kaiser and Morin 1993):\n\\[\n\\frac{A_x ( B_y − C_y ) + B_x ( C_y − A_y ) + C_x ( A_y − B_y )}{ 2 }\n\\]\nwhere \\(A\\) to \\(C\\) are the triangle's three points and \\(x\\) and \\(y\\) refer to the x and y dimensions. A translation of this formula into R code that works with the matrix representation of triangle T1 is shown below (the function abs() ensures a positive result).\nThe code chunk outputs the correct result. The problem with the code is that it is clunky and must be re-typed if we want to run it on another triangle matrix. To make the code more generalizable, we will see how it can be converted into a function in Section 11.4.\nStep 4 requires steps 2 and 3 to be undertaken not just on one triangle (as demonstrated above) but on all triangles. This requires iteration to create all the triangles representing the polygon, illustrated in Figure 11.3. lapply() and vapply() are used to iterate over each triangle because they provide a concise solution in base R.\nFIGURE 11.3: Illustration of the iterative centroid algorithm with triangles. The X represents the area-weighted centroid in iterations 2 and 3.\nWe are now in a position to complete step 4 and calculate the total area with sum(A) and the centroid coordinates of the polygon with weighted.mean(C[, 1], A) and weighted.mean(C[, 2], A) (exercise for alert readers: verify that these commands work). To demonstrate the link between algorithms and scripts, the contents of this section have been condensed into 11-centroid-alg.R. We saw at the end of Section 11.2 how this script can calculate the centroid of a square. The great thing about scripting the algorithm is that it works on the new poly_mat object too (see the exercises below to verify these results with reference to st_centroid()):\nThis example shows how low-level geographic operations can be developed from first principles with base R. It also shows that, if a tried-and-tested solution already exists, it may not be worth re-inventing the wheel: if we only aimed to find the centroid of a polygon, it would have been quicker to represent poly_mat as an sf object and use the pre-existing sf::st_centroid() function instead. However, the great benefit of writing algorithms from first principles is that you will understand every step of the process, something that cannot be guaranteed when using other people's code. A further consideration is performance: R may be slow compared with low-level languages such as C++ for number crunching (see Section 1.4), and optimization is difficult. If the aim is to develop new methods, computational efficiency should not be prioritized. This is captured in the saying 'premature optimization is the root of all evil (or at least most of it) in programming' (Knuth 1974).\nAlgorithm development is hard. This should be apparent from the amount of work that has gone into developing a centroid algorithm in base R that is just one, rather inefficient, approach to a 
problem with limited real-world applications (convex polygons are not commonly encountered in practice). The experience should lead to an appreciation of low-level geographic libraries, such as GEOS and CGAL (the Computational Geometry Algorithms Library), which not only run fast but also work on a wide range of input geometry types. A great advantage of the open source nature of such libraries is that their source code is readily available for study, comprehension and (for those with the skills and confidence) modification.","code":"\n# generate a simple matrix representation of a polygon:\nx_coords = c(10, 20, 12, 0, 0, 10)\ny_coords = c(0, 15, 20, 10, 0, 0)\npoly_mat = cbind(x_coords, y_coords)\n# create a point representing the origin:\nOrigin = poly_mat[1, ]\n# create 'triangle matrix':\nT1 = rbind(Origin, poly_mat[2:3, ], Origin) \nC1 = (T1[1,] + T1[2,] + T1[3,]) / 3\n# calculate the area of the triangle represented by matrix T1:\nabs(T1[1, 1] * (T1[2, 2] - T1[3, 2]) +\n T1[2, 1] * (T1[3, 2] - T1[1, 2]) +\n T1[3, 1] * (T1[1, 2] - T1[2, 2])) / 2\n#> [1] 85\ni = 2:(nrow(poly_mat) - 2)\nT_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n})\n\nC_list = lapply(T_all, function(x) (x[1, ] + x[2, ] + x[3, ]) / 3)\nC = do.call(rbind, C_list)\n\nA = vapply(T_all, function(x) {\n abs(x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2]) ) / 2\n }, FUN.VALUE = double(1))\nsource(\"code/11-centroid-alg.R\")\n#> [1] \"The area is: 245\"\n#> [1] \"The coordinates of the centroid are: 8.83, 9.22\""},{"path":"algorithms.html","id":"functions","chapter":"11 Scripts, algorithms and functions","heading":"11.4 Functions","text":"Like algorithms, functions take an input and return an output. Functions, however, refer to the implementation in a particular programming language, rather than the 'recipe' itself. In R, functions are objects in their own right, which can be created and joined together in a modular fashion. We can, for example, create a function that undertakes step 2 of the centroid generation algorithm as follows:\nThe example demonstrates two key components of functions: 1) the function body, the code inside the curly brackets that defines what the function does with its inputs; and 2) the arguments, the list of arguments the function works with – x in this case (the third key component, the environment, is beyond the scope of this section). By default, functions return the last object that was calculated (the coordinates of the centroid, in the case of t_centroid()).\nThe function now works on any inputs you pass to it, as illustrated in the command below, which calculates the centroid of the first triangle from the example polygon in the previous section (see Figure 11.3).\nWe can also create a function to calculate a triangle's area, which we name t_area(). After the function's creation, a triangle's area can be calculated in a single line of code, avoiding duplication of verbose code: functions are a mechanism for generalizing code. The newly created function t_area() takes any object x, assumed to have the dimensions of the 'triangle matrix' data structure we've been using, and returns its area, as illustrated on T1 below. We can test the generalizability of the function by using it to find the area of a new triangle matrix, which has a height of 1 and a base of 3.\nA useful feature of functions is that they are modular. Provided that you know what the output will be, one function can be used as the building block of another. Thus, the functions t_centroid() and t_area() can be used as sub-components of a larger function to do the work of the script 11-centroid-alg.R: calculate the area of any convex polygon. The code chunk below creates the function poly_centroid() to mimic the behavior of sf::st_centroid() for convex polygons.\nFunctions such as poly_centroid() can be extended to provide different types of output. To return the result as an object of class sfg, for example, a 'wrapper' function can be used to modify the output of poly_centroid() before returning the result. We can verify that the output is identical to the output of sf::st_centroid() as follows:","code":"\nt_centroid = function(x) {\n (x[1, ] + x[2, ] + x[3, ]) / 3\n}\nt_centroid(T1)\n#> x_coords y_coords \n#> 14.0 11.7\nt_area = function(x) {\n abs(\n x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2])\n ) / 2\n}\nt_area(T1)\n#> [1] 85\nt_new = cbind(x = c(0, 3, 3, 0),\n y = c(0, 0, 1, 0))\nt_area(t_new)\n#> x \n#> 1.5\npoly_centroid = function(poly_mat) {\n Origin = poly_mat[1, ] # create a point representing the origin\n i = 2:(nrow(poly_mat) - 2)\n T_all = lapply(i, function(x) {rbind(Origin, poly_mat[x:(x + 1), ], Origin)})\n C_list = lapply(T_all, t_centroid)\n C = do.call(rbind, C_list)\n A = vapply(T_all, t_area, FUN.VALUE = double(1))\n c(weighted.mean(C[, 1], A), weighted.mean(C[, 2], A))\n}\npoly_centroid(poly_mat)\n#> [1] 8.83 9.22\npoly_centroid_sfg = function(x) {\n centroid_coords = poly_centroid(x)\n sf::st_point(centroid_coords)\n}\npoly_sfc = sf::st_polygon(list(poly_mat))\nidentical(poly_centroid_sfg(poly_mat), sf::st_centroid(poly_sfc))\n#> [1] TRUE"},{"path":"algorithms.html","id":"programming","chapter":"11 Scripts, algorithms and functions","heading":"11.5 Programming","text":"This chapter has moved quickly, from scripts to functions via the tricky topic of algorithms. Not only have we discussed them in the abstract, we have also created working examples to solve a specific problem:\n- The script 11-centroid-alg.R was introduced and demonstrated on a 'polygon matrix'.\n- The individual steps that allowed the script to work were described as an algorithm, a computational recipe.\n- To generalize the algorithm, it was converted into modular functions, which were eventually combined to create the function poly_centroid() in the previous section.\nEach of these steps may seem straightforward. However, skillful programming is complex and involves combining each element – scripts, algorithms and functions – into a system, with efficiency and style. The outcome should be robust and user-friendly tools that other people can use. If you are new to programming, as we expect most people reading this book will be, being able to follow and reproduce the results in the preceding sections is a major achievement. Programming takes many hours of dedicated study and practice before you become proficient.\nThe challenge facing developers aiming to implement new algorithms in an efficient way is put in perspective by considering the amount of work that has gone into creating a simple function that is not intended for use in production: in its current state, poly_centroid() fails on most (non-convex) polygons! This raises the question: how would one generalize the function? Two options are (1) to find ways of triangulating non-convex polygons (a topic covered in the online Algorithms Extended article hosted at geocompx.github.io/geocompkg/articles/) and (2) to explore other centroid algorithms that do not rely on triangular meshes.\nA wider question is: is it worth programming your own solution when high performance algorithms have already been implemented and packaged in functions such as st_centroid()? The reductionist answer in this specific case is 'no'. In the wider context, and considering the benefits of learning to program, the answer is 'it depends'. With programming, it's easy to waste hours trying to implement a method, only to find that someone has already done the hard work. So rather than only a stepping stone towards geometric algorithm programming wizardry, this chapter can also be seen as a lesson in when to try to program a generalized solution, and when to use solutions that already exist. There will surely be occasions when writing new functions is the best way forward, but there will also be times when using functions that already exist is the best way forward.\n'Don't reinvent the wheel' applies as much, if not more, to programming as to other walks of life. A bit of research and thinking at the outset of a project can help decide where programming time is best spent. Three principles can also help maximize the use of your effort when writing code, whether it's a simple script or a package composed of hundreds of functions:\n- DRY (don't repeat yourself): minimize repetition of code and aim to use fewer lines of code to solve a particular problem. This principle is explained with reference to the use of functions to reduce code repetition in the Functions chapter of R for Data Science (Grolemund and Wickham 2016).\n- KISS (keep it simple stupid): this principle suggests that simple solutions should be tried first and preferred over complex solutions, using dependencies only where needed and aiming to keep scripts concise. It has a computing analogy in the quote 'things should be made as simple as possible, but no simpler'.\n- Modularity: your code will be easier to maintain if it's divided into well-defined pieces. A function should do one thing, but do it really well. If a function is becoming too long, think about splitting it into multiple small functions, each of which could be re-used for other purposes, supporting the DRY and KISS principles.\nWe cannot guarantee that this chapter will instantly enable you to create perfectly formed functions for your work. We are, however, confident that its contents will help you decide when an appropriate time to try is (when no existing functions solve the problem, when the programming task is within your capabilities, and when the benefits of the solution are likely to outweigh the time costs of developing it). Using these principles, in combination with practical experience of working through the examples, will build your scripting, package-writing and programming skills. First steps towards programming can be slow (the exercises below should not be rushed), but the long-term rewards can be large.","code":""},{"path":"algorithms.html","id":"ex-algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11.6 Exercises","text":"E1. Read the script 11-centroid-alg.R in the code folder of the book's GitHub repository.\n- Which of the best practices covered in Section 11.2 does it follow?\n- Create a version of the script on your computer in an IDE such as RStudio (preferably by typing out the script line-by-line, in your own coding style and with your own comments, rather than copy-pasting, which will help you learn how to type scripts). Using the example of a square polygon (e.g., created with poly_mat = cbind(x = c(0, 9, 9, 0, 0), y = c(0, 0, 9, 9, 0))), execute the script line-by-line.\n- What changes could be made to the script to make it more reproducible?\n- How could the documentation be improved?\nE2. In the geometric algorithms section we calculated that the area and geographic centroid of the polygon represented by poly_mat were 245 and 8.8, 9.2, respectively.\n- Reproduce the results on your own computer with reference to the script 11-centroid-alg.R, an implementation of this algorithm (bonus: type out the commands – try to avoid copy-pasting).\n- Are the results correct? Verify them by converting poly_mat into an sfc object (named poly_sfc) with st_polygon() (hint: this function takes objects of class list()) and then using st_area() and st_centroid().\nE3. It was stated that the algorithm we created only works for convex hulls. Define convex hulls (see the geometry operations chapter) and test the algorithm on a polygon that is not a convex hull.\n- Bonus 1: Think about why the method only works for convex hulls and note the changes that would need to be made to make the algorithm work for other types of polygon.\n- Bonus 2: Building on the contents of 11-centroid-alg.R, write an algorithm, using only base R functions, that can find the total length of linestrings represented in matrix form.\nE4. 
functions section created different versions poly_centroid() function generated outputs class sfg (poly_centroid_sfg()) type-stable matrix outputs (poly_centroid_type_stable()).\nextend function creating version (e.g., called poly_centroid_sf()) type stable (accepts inputs class sf) returns sf objects (hint: may need convert object x matrix command sf::st_coordinates(x)).Verify works running poly_centroid_sf(sf::st_sf(sf::st_sfc(poly_sfc)))error message get try run poly_centroid_sf(poly_mat)?","code":""},{"path":"spatial-cv.html","id":"spatial-cv","chapter":"12 Statistical learning","heading":"12 Statistical learning","text":"","code":""},{"path":"spatial-cv.html","id":"prerequisites-10","chapter":"12 Statistical learning","heading":"Prerequisites","text":"chapter assumes proficiency geographic data analysis, example gained studying contents working-exercises Chapters 2 7.\nfamiliarity generalized linear models (GLM) machine learning highly recommended (example . Zuur et al. 2009; James et al. 
2013).chapter uses following packages:84Required data attached due course.","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(future) # parallel processing\nlibrary(lgr) # logging framework for R\nlibrary(mlr3) # unified interface to machine learning algorithms\nlibrary(mlr3learners) # most important machine learning algorithms\nlibrary(mlr3extralearners) # access to even more learning algorithms\nlibrary(mlr3proba) # make probabilistic predictions, here only needed for mlr3extralearners::list_learners()\nlibrary(mlr3spatiotempcv) # spatio-temporal resampling strategies\nlibrary(mlr3tuning) # hyperparameter tuning\nlibrary(mlr3viz) # plotting functions for mlr3 objects\nlibrary(progressr) # report progress updates\nlibrary(pROC) # compute roc values"},{"path":"spatial-cv.html","id":"intro-cv1","chapter":"12 Statistical learning","heading":"12.1 Introduction","text":"Statistical learning concerned use statistical computational models identifying patterns data predicting patterns.\nDue origins, statistical learning one R’s great strengths (see Section 1.4).85\nStatistical learning combines methods statistics machine learning can categorized supervised unsupervised techniques.\nincreasingly used disciplines ranging physics, biology ecology geography economics (James et al. 
2013).chapter focuses supervised techniques training dataset, opposed unsupervised techniques clustering.\nResponse variables can binary (landslide occurrence), categorical (land use), integer (species richness count) numeric (soil acidity measured pH).\nSupervised techniques model relationship responses — known sample observations — one predictors.primary aim much machine learning research make good predictions.\nMachine learning thrives age ‘big data’ methods make assumptions input variables can handle huge datasets.\nMachine learning conducive tasks prediction future customer behavior, recommendation services (music, movies, buy next), face recognition, autonomous driving, text classification predictive maintenance (infrastructure, industry).chapter based case study: modeling occurrence landslides.\napplication links applied nature geocomputation, defined Chapter 1, illustrates machine learning borrows field statistics sole aim prediction.\nTherefore, chapter first introduces modeling cross-validation concepts help Generalized Linear Model (. Zuur et al. 
2009).
Building on this, the chapter implements a typical machine learning algorithm, namely a Support Vector Machine (SVM). The models' predictive performance will be assessed using spatial cross-validation (CV), which accounts for the fact that geographic data is special.

CV determines a model's ability to generalize to new data by splitting a dataset (repeatedly) into training and test sets. It uses the training data to fit the model, and checks its performance when predicting against the test data. CV helps to detect overfitting, since models that predict the training data too closely (noise) tend to perform poorly on the test data.

Randomly splitting spatial data can lead to training points that are neighbors in space of test points. Due to spatial autocorrelation, test and training datasets would not be independent in this scenario, with the consequence that CV fails to detect possible overfitting. Spatial CV alleviates this problem and is the central theme of this chapter. Hence, we emphasize again that this chapter focuses on the predictive performance of models; it does not teach predictive mapping, which is the topic of Chapter 15.

## 12.2 Case study: Landslide susceptibility

This case study is based on a dataset of landslide locations in Southern Ecuador, illustrated in Figure 12.1 and described in detail in Muenchow, Brenning, and Richter (2012). A subset of the dataset used in that paper is provided in the spDataLarge package, which can be loaded as follows:

```r
data("lsl", "study_mask", package = "spDataLarge")
ta = terra::rast(system.file("raster/ta.tif", package = "spDataLarge"))
```

The above code loads three objects: a data.frame named lsl, an sf object named study_mask, and a SpatRaster (see Section 2.3.4) named ta containing terrain attribute rasters. lsl contains a factor column lslpts where TRUE corresponds to an observed landslide 'initiation point', with the coordinates stored in columns x and y. There are 175 landslide and 175 non-landslide points, as shown by summary(lsl$lslpts). The 175 non-landslide points were sampled randomly from the study area, with the restriction that they must fall outside a small buffer around the landslide polygons.

FIGURE 12.1: Landslide initiation points (red) and points unaffected by landsliding (blue) in Southern Ecuador.

The first three rows of lsl, rounded to two significant digits, can be found in Table 12.1.

TABLE 12.1: Structure of the lsl dataset.

To model landslide susceptibility, we need predictors.
Since terrain attributes are frequently associated with landsliding (Muenchow, Brenning, and Richter 2012), the following terrain attributes have already been extracted from ta and joined to lsl:

- slope: slope angle (°)
- cplan: plan curvature (rad m⁻¹) expressing the convergence or divergence of a slope and thus water flow
- cprof: profile curvature (rad m⁻¹) as a measure of flow acceleration, also known as downslope change in slope angle
- elev: elevation (m a.s.l.) as the representation of different altitudinal zones of vegetation and precipitation in the study area
- log10_carea: the decadic logarithm of the catchment area (log10 m²) representing the amount of water flowing towards a location

It might be a worthwhile exercise to compute these terrain attributes with the help of R-GIS bridges (see Chapter 10) and extract them to the landslide points (see the Exercise section at the end of this chapter).

## 12.3 Conventional modeling approach in R

Before introducing the mlr3 package, an umbrella package providing a unified interface to dozens of learning algorithms (Section 12.5), it is worth taking a look at the conventional modeling interface in R. This introduction to supervised statistical learning provides the basis for doing spatial CV, and contributes to a better grasp of the mlr3 approach presented subsequently.

Supervised learning involves predicting a response variable as a function of predictors (Section 12.4). In R, modeling functions are usually specified using formulas (see ?formula for details on R formulas). The following command specifies and runs a generalized linear model:

```r
fit = glm(lslpts ~ slope + cplan + cprof + elev + log10_carea,
          family = binomial(),
          data = lsl)
```

It is worth understanding each of the three input arguments:

- A formula, which specifies landslide occurrence (lslpts) as a function of the predictors
- A family, which specifies the type of model, in this case binomial because the response is binary (see ?family)
- A data frame which contains the response and the predictors (as columns)

The results of the model can be printed as follows (summary(fit) provides a more detailed account of the results):

```r
class(fit)
#> [1] "glm" "lm"
fit
#> 
#> Call:  glm(formula = lslpts ~ slope + cplan + cprof + elev + log10_carea, 
#>     family = binomial(), data = lsl)
#> 
#> Coefficients:
#> (Intercept)        slope        cplan        cprof         elev  log10_carea  
#>    2.51e+00     7.90e-02    -2.89e+01    -1.76e+01     1.79e-04    -2.27e+00  
#> 
#> Degrees of Freedom: 349 Total (i.e. Null);  344 Residual
#> Null Deviance:     485 
#> Residual Deviance: 373   AIC: 385
```

The model object fit, of class glm, contains the coefficients defining the fitted relationship between the response and the predictors.
The model object can also be used for prediction. This is done with the generic predict() method, which in this case calls the function predict.glm(). Setting type to response returns the predicted probabilities (of landslide occurrence) for each observation in lsl, as illustrated below (see ?predict.glm):

```r
pred_glm = predict(object = fit, type = "response")
head(pred_glm)
#>      1      2      3      4      5      6 
#> 0.1901 0.1172 0.0952 0.2503 0.3382 0.1575
```

Spatial distribution maps can be made by applying the coefficients to the predictor rasters. This can be done manually or with terra::predict(). In addition to a model object (fit), the latter function also expects a SpatRaster with the predictors (raster layers) named as in the model's input data frame (Figure 12.2):

```r
# making the prediction
pred = terra::predict(ta, model = fit, type = "response")
```

FIGURE 12.2: Spatial distribution mapping of landslide susceptibility using a GLM.

Here, when making predictions we neglect spatial autocorrelation, since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures. However, it is possible to include spatial autocorrelation structures into models as well as into predictions. Though this is beyond the scope of this book, we give the interested reader some pointers on where to look:

- The predictions of regression kriging combine the predictions of a regression with the kriging of the regression's residuals (Goovaerts 1997; Hengl 2007; Bivand, Pebesma, and Gómez-Rubio 2013).
- One can also add a spatial correlation (dependency) structure to a generalized least squares model (nlme::gls(); Zuur et al. 2009; Zuur et al. 2017).
- One can also use mixed-effect modeling approaches. Basically, a random effect imposes a dependency structure on the response variable which in turn allows observations of one class to be more similar to each other than to those of another class (Zuur et al. 2009). Classes can be, for example, bee hives, owl nests, vegetation transects or an altitudinal stratification. This mixed modeling approach assumes normally and independently distributed random intercepts. This can even be extended by using a random intercept that is normally and spatially dependent. For this, however, you will most likely have to resort to Bayesian modeling approaches, since frequentist software tools are rather limited in this respect, especially for more complex models (Blangiardo and Cameletti 2015; Zuur et al.
2017).

Spatial distribution mapping is one very important outcome of a model (Figure 12.2). Even more important is how good the underlying model is at making these predictions, since a prediction map is useless if the model's predictive performance is bad. One of the most popular measures to assess the predictive performance of a binomial model is the Area Under the Receiver Operator Characteristic Curve (AUROC). This is a value between 0.5 and 1.0, with 0.5 indicating a model that is no better than random and 1.0 indicating perfect prediction of the two classes. Thus, the higher the AUROC, the better the model's predictive power. The following code chunk computes the AUROC value of the model with roc(), which takes the response and the predicted values as inputs; auc() returns the area under the curve:

```r
pROC::auc(pROC::roc(lsl$lslpts, fitted(fit)))
#> Area under the curve: 0.8216
```

An AUROC value of 0.82 represents a good fit. However, this is an overoptimistic estimation, since it has been computed on the complete dataset. To derive a bias-reduced assessment, we have to use cross-validation and, in the case of spatial data, spatial CV.
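To build intuition for what auc() computes, the AUROC can also be obtained directly from ranks: it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case (the Mann-Whitney U interpretation). Below is a minimal base-R sketch of this equivalence on small synthetic scores; the function name auc_by_rank and the toy data are ours, not part of pROC or the landslide dataset:

```r
# AUROC via the rank (Mann-Whitney U) formulation:
# P(score of random positive > score of random negative), ties averaged.
auc_by_rank = function(response, scores) {
  r = rank(scores)                       # ranks of all predicted scores
  n_pos = sum(response)
  n_neg = sum(!response)
  (sum(r[response]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}
# toy data: two positives, three negatives
response = c(TRUE, TRUE, FALSE, FALSE, FALSE)
scores   = c(0.9, 0.6, 0.7, 0.3, 0.1)
round(auc_by_rank(response, scores), 3)
#> [1] 0.833
```

Here 5 of the 6 positive-negative pairs are ranked correctly, hence 5/6 ≈ 0.833. For real work, pROC::auc() handles ties, direction and confidence intervals for you.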
## 12.4 Introduction to (spatial) cross-validation

Cross-validation belongs to the family of resampling methods (James et al. 2013). The basic idea is to split (repeatedly) a dataset into training and test sets, whereby the training data is used to fit a model which is then applied to the test set. Comparing the predicted values with the known response values from the test set (using a performance measure such as the AUROC in the binomial case) gives a bias-reduced assessment of the model's capability to generalize the learned relationship to independent data. For example, a 100-repeated 5-fold cross-validation means to randomly split the data into five partitions (folds), with each fold being used once as a test set (see the upper row of Figure 12.3). This guarantees that each observation is used exactly once in one of the test sets, and requires the fitting of five models. Subsequently, the procedure is repeated 100 times. Of course, the data splitting will differ in each repetition. Overall, this sums up to 500 models, and the mean performance measure (AUROC) of all models is the model's overall predictive power.

However, geographic data is special. As we will see in Chapter 13, the 'first law' of geography states that points close to each other are, generally, more similar than points further away (Miller 2004). This means these points are not statistically independent, because training and test points in conventional CV are often too close to each other (see the first row of Figure 12.3). 'Training' observations near the 'test' observations can provide a kind of 'sneak preview': information that should be unavailable to the training dataset. To alleviate this problem, 'spatial partitioning' is used to split the observations into spatially disjoint subsets (using the observations' coordinates in a k-means clustering; Brenning 2012b; second row of Figure 12.3). This partitioning strategy is the only difference between spatial and conventional CV. As a result, spatial CV leads to a bias-reduced assessment of a model's predictive performance, and hence helps to avoid overfitting.

FIGURE 12.3: Spatial visualization of selected test and training observations for cross-validation of one repetition. Random (upper row) and spatial partitioning (lower row).
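The contrast between random and coordinate-based partitioning can be sketched in a few lines of base R. This is an illustrative sketch on synthetic coordinates, not the exact algorithm used by mlr3spatiotempcv; it only shows the idea that k-means on the coordinates yields spatially compact folds, while random assignment scatters each fold across the whole study area:

```r
set.seed(42)
# synthetic point coordinates standing in for observation locations
coords = data.frame(x = runif(100, 0, 10), y = runif(100, 0, 10))

# conventional CV: fold membership assigned at random, ignoring location
random_fold = sample(rep(1:5, length.out = nrow(coords)))

# spatial CV: fold membership from k-means clustering of the coordinates,
# so each fold is a spatially compact, disjoint group of points
spatial_fold = kmeans(coords[, c("x", "y")], centers = 5)$cluster

table(spatial_fold)  # five spatial partitions, typically of unequal size
```

Note one practical consequence visible in the last line: unlike random folds, spatial folds generally differ in size, because they follow the spatial arrangement of the observations.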
## 12.5 Spatial CV with mlr3

There are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view. Getting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process. Comparing model results from different packages can be even more laborious. The mlr3 package and ecosystem were developed to address these issues. mlr3 acts as a 'meta-package', providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression, survival analysis and clustering (Lang et al. 2019; Becker et al. 2022). The standardized mlr3 interface is based on eight 'building blocks'. As illustrated in Figure 12.4, these have a clear order.

FIGURE 12.4: Basic building blocks of the mlr3 package. Source: Becker et al. (2022). (Permission to reuse this figure was kindly granted.)

The mlr3 modeling process consists of three main stages. First, a task specifies the data (including response and predictor variables) and the model type (such as regression or classification). Second, a learner defines the specific learning algorithm that is applied to the created task. Third, the resampling approach assesses the predictive performance of the model, i.e., its ability to generalize to new data (see also Section 12.4).

### 12.5.1 Generalized linear model

To use a GLM in mlr3, we must create a task containing the landslide data. Since the response is binary (a two-category variable) and has a spatial dimension, we create a classification task with as_task_classif_st() of the mlr3spatiotempcv package (Schratz et al.
2021; for non-spatial tasks, use mlr3::as_task_classif() or, for regression tasks, mlr3::as_task_regr(); see ?Task for all task types).

The first essential argument of these as_task_ functions is x, which expects the input data and should include the response and predictor variables. The target argument indicates the name of the response variable (in our case lslpts) and positive determines which of the two factor levels of the response variable indicates the landslide initiation point (in our case TRUE). All other variables of the lsl dataset serve as predictors. For spatial CV, we need to provide a few extra arguments. The coordinate_names argument expects the names of the coordinate columns (see Section 12.4 and Figure 12.3). Additionally, we should indicate the used CRS (crs) and decide if we want to use the coordinates as predictors in the modeling (coords_as_features):

```r
# 1. create task
task = mlr3spatiotempcv::as_task_classif_st(
  mlr3::as_data_backend(lsl), 
  target = "lslpts", 
  id = "ecuador_lsl",
  positive = "TRUE",
  coordinate_names = c("x", "y"),
  crs = "EPSG:32717",
  coords_as_features = FALSE
  )
```

Note that mlr3spatiotempcv::as_task_classif_st() also accepts an sf-object as input for its backend parameter. In this case, you might only want to additionally specify the coords_as_features argument. We did not convert lsl into an sf-object because as_task_classif_st() would just turn it back into a non-spatial data.table object in the background.

For a short data exploration, the autoplot() function of the mlr3viz package might come in handy, since it plots the response against all predictors and all predictors against all predictors (not shown):

```r
# plot response against each predictor
mlr3viz::autoplot(task, type = "duo")
# plot all variables against each other
mlr3viz::autoplot(task, type = "pairs")
```

Having created a task, we need to choose a learner that determines the statistical learning method to use. All classification learners start with classif. and regression learners with regr.
(see ?Learner for details). mlr3extralearners::list_mlr3learners() lists all available learners of all packages that mlr3 imports (Table 12.3). To find out about learners that are able to model a binary response variable, we can run:

```r
mlr3extralearners::list_mlr3learners(
  filter = list(class = "classif", properties = "twoclass"), 
  select = c("id", "mlr3_package", "required_packages")) |>
  head()
```

TABLE 12.3: Sample of available learners for binomial tasks in the mlr3 package.

This yields all learners able to model two-class problems (landslide yes or no). We opt for the binomial classification method used in Section 12.3 and implemented as classif.log_reg in mlr3learners. Additionally, we need to specify the predict_type which determines the type of the prediction, with prob resulting in the predicted probability of landslide occurrence between 0 and 1 (this corresponds to type = "response" in predict.glm()):

```r
# 2. specify learner
learner = mlr3::lrn("classif.log_reg", predict_type = "prob")
```

To access the help page of the learner and find out from which package it was taken, we can run:

```r
learner$help()
```

The set-up steps for modeling with mlr3 may seem tedious. But remember, this single interface provides access to the 130+ learners shown by mlr3extralearners::list_mlr3learners(); it would be far more tedious to learn the interface of each learner! Further advantages are simple parallelization of resampling techniques and the ability to tune machine learning hyperparameters (see Section 12.5.2). Most importantly, (spatial) resampling in mlr3spatiotempcv (Schratz et al. 2021) is straightforward, requiring only two more steps: specifying a resampling method and running it.
We will use a 100-repeated 5-fold spatial CV: five partitions will be chosen based on the provided coordinates in our task and the partitioning will be repeated 100 times:

```r
# 3. specify resampling
resampling = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
```

To execute the spatial resampling, we run resample() using the previously specified task, learner, and resampling strategy. This takes some time (around 15 seconds on a modern laptop) because it computes 500 resampling partitions and 500 models. As performance measure, we again choose the AUROC. To retrieve it, we use the score() method of the resampling result output object (score_spcv_glm). This returns a data.table object with 500 rows, one for each model:

```r
# reduce verbosity
lgr::get_logger("mlr3")$set_threshold("warn")
# run spatial cross-validation and save it to resample result glm (rr_glm)
rr_spcv_glm = mlr3::resample(task = task,
                             learner = learner,
                             resampling = resampling)
# compute the AUROC as a data.table
score_spcv_glm = rr_spcv_glm$score(measure = mlr3::msr("classif.auc"))
# keep only the columns you need
score_spcv_glm = dplyr::select(score_spcv_glm, task_id, learner_id, 
                               resampling_id, classif.auc)
```

The output of the preceding code chunk is a bias-reduced assessment of the model's predictive performance. We have saved it as extdata/12-bmr_score.rds in the book's GitHub repository. If required, you can read it in as follows:

```r
score = readRDS("extdata/12-bmr_score.rds")
score_spcv_glm = dplyr::filter(score, learner_id == "classif.log_reg", 
                               resampling_id == "repeated_spcv_coords")
```

To compute the mean AUROC over all 500 models, we run:

```r
mean(score_spcv_glm$classif.auc) |>
  round(2)
#> [1] 0.77
```

To put these results in perspective, let us compare them with AUROC values from a 100-repeated 5-fold non-spatial cross-validation (Figure 12.5; the code for the non-spatial cross-validation is not shown here but will be explored in the Exercise section). As expected (see Section 12.4), the spatially cross-validated result yields lower AUROC values on average than the conventional cross-validation approach, underlining the over-optimistic predictive performance of the latter due to spatial autocorrelation.

FIGURE 12.5: Boxplot showing the difference in GLM AUROC values between spatial and conventional 100-repeated 5-fold cross-validation.

### 12.5.2 Spatial tuning of machine-learning hyperparameters

Section 12.4 introduced machine learning as part of statistical learning. To recap, we adhere to the following definition of machine learning by Jason Brownlee:

> Machine learning, more specifically the field of predictive modeling, is primarily concerned with minimizing the error of a model or making the most accurate predictions possible, at the expense of explainability. In applied machine learning we will borrow, reuse and steal algorithms from many different fields, including statistics and use them towards these ends.

In Section 12.5.1 a GLM was used to predict landslide susceptibility. This section introduces support vector machines (SVM) for the same purpose. Random forest models might be more popular than SVMs; however, the positive effect of tuning hyperparameters on model performance is much more pronounced in the case of SVMs (Probst, Wright, and Boulesteix 2018). Since (spatial) hyperparameter tuning is the major aim of this section, we will use an SVM. For those wishing to apply a random forest model, we recommend reading this chapter and then proceeding to Chapter 15, in which we will apply the currently covered concepts and techniques to make spatial distribution maps based on a
random forest model.

SVMs search for the best possible 'hyperplanes' to separate classes (in a classification case) and estimate 'kernels' with specific hyperparameters to create non-linear boundaries between classes (James et al. 2013). Machine learning algorithms often feature both hyperparameters and parameters. Parameters can be estimated from the data, while hyperparameters are set before the learning begins (see also the machine mastery blog and the hyperparameter optimization chapter of the mlr3 book). The optimal hyperparameter configuration is usually found within a specific search space and determined with the help of cross-validation methods. This is called hyperparameter tuning and is the main topic of this section.

Some SVM implementations, such as that provided by kernlab, allow hyperparameters to be tuned automatically, usually based on random sampling (see the upper row of Figure 12.3). This works for non-spatial data but is of less use for spatial data, where 'spatial tuning' should be undertaken.

Before defining spatial tuning, we will set up the mlr3 building blocks, introduced in Section 12.5.1, for the SVM. The classification task remains the same, hence we can simply reuse the task object created in Section 12.5.1. Learners implementing SVMs can be found using the list_mlr3learners() command of mlr3extralearners:

```r
mlr3_learners = mlr3extralearners::list_mlr3learners()
#> This will take a few seconds.
mlr3_learners |>
  dplyr::filter(class == "classif" & grepl("svm", id)) |>
  dplyr::select(id, class, mlr3_package, required_packages)
#>              id   class      mlr3_package              required_packages
#> 1:  classif.ksvm classif mlr3extralearners mlr3,mlr3extralearners,kernlab
#> 2: classif.lssvm classif mlr3extralearners mlr3,mlr3extralearners,kernlab
#> 3:   classif.svm classif      mlr3learners        mlr3,mlr3learners,e1071
```

Of these options, we will use ksvm() from the kernlab package (Karatzoglou et al. 2004).
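The parameters-versus-hyperparameters distinction above can be made concrete with a toy example in base R. The synthetic data below is ours for illustration only: the GLM coefficients are parameters learned from the data during fitting, while the number of clusters k in k-means (like C and sigma in kernlab::ksvm()) is a hyperparameter fixed by the analyst, or by a tuning procedure, before fitting:

```r
set.seed(1)
d = data.frame(x = rnorm(50))
d$y = rbinom(50, 1, plogis(2 * d$x))

# parameters: estimated *from* the data during model fitting
fit_toy = glm(y ~ x, family = binomial(), data = d)
coef(fit_toy)   # intercept and slope are learned, not chosen

# hyperparameters: fixed *before* fitting; here k in k-means,
# analogous to C and sigma in an SVM
k = 3           # chosen by the analyst (or by tuning), never estimated
km = kmeans(d["x"], centers = k)
length(unique(km$cluster))
#> [1] 3
```

Hyperparameter tuning, the topic of the rest of this section, is simply a systematic search over such pre-set values, scored by cross-validated performance.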
To allow for non-linear relationships, we use the popular radial basis function (or Gaussian) kernel ("rbfdot"), which is also the default of ksvm(). Setting the type argument to "C-svc" makes sure that ksvm() is solving a classification task. To make sure that the tuning does not stop because of one failing model, we additionally define a fallback learner (for more information please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback):

```r
lrn_ksvm = mlr3::lrn("classif.ksvm", predict_type = "prob", kernel = "rbfdot",
                     type = "C-svc")
lrn_ksvm$encapsulate(method = "try", 
                     fallback = lrn("classif.featureless", 
                                    predict_type = "prob"))
```

The next stage is to specify a resampling strategy. Again we will use a 100-repeated 5-fold spatial CV:

```r
# performance estimation level
perf_level = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
```

Note that this is the exact same code as used for the resampling of the GLM in Section 12.5.1; we have simply repeated it here as a reminder.

So far, the process has been identical to that described in Section 12.5.1. The next step is new, however: to tune the hyperparameters. Using the same data for the performance assessment and the tuning would potentially lead to overoptimistic results (Cawley and Talbot 2010). This can be avoided using nested spatial CV.

FIGURE 12.6: Schematic of hyperparameter tuning and performance estimation levels in CV. (Figure was taken from Schratz et al. (2019). Permission to reuse it was kindly granted.)

This means that we split each fold again into five spatially disjoint subfolds which are used to determine the optimal hyperparameters (tune_level object in the code chunk below; see Figure 12.6 for a visual representation). The random selection of values for C and Sigma is additionally restricted to a predefined tuning space (search_space object). The range of the tuning space was chosen with values recommended in the literature (Schratz et al. 2019):

```r
# five spatially disjoint partitions
tune_level = mlr3::rsmp("spcv_coords", folds = 5)
# define the outer limits of the randomly selected hyperparameters
search_space = paradox::ps(
  C = paradox::p_dbl(lower = -12, upper = 15, trafo = function(x) 2^x),
  sigma = paradox::p_dbl(lower = -15, upper = 6, trafo = function(x) 2^x)
)
```
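The trafo = function(x) 2^x arguments mean that the random search samples uniformly on a log2 scale and then transforms the draws into the actual C and sigma ranges; this spreads candidates evenly across orders of magnitude rather than crowding them near the upper bound. A base-R sketch of this mechanism (independent of paradox, for illustration only):

```r
set.seed(42)
n_evals = 50
# sample uniformly on the log2 scale, mirroring trafo = function(x) 2^x
C_raw     = runif(n_evals, min = -12, max = 15)
sigma_raw = runif(n_evals, min = -15, max = 6)
candidates = data.frame(C = 2^C_raw, sigma = 2^sigma_raw)

range(candidates$C)      # all values fall within [2^-12, 2^15]
range(candidates$sigma)  # all values fall within [2^-15, 2^6]
```

Sampling 50 values of `runif(n_evals, 2^-12, 2^15)` directly would instead place almost all candidates above 2^10, which is why the log-scale transformation is the recommended practice for SVM cost and kernel-width parameters.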
To find the optimal hyperparameter combination, we fit 50 models (terminator object in the code chunk below) in each of these subfolds with randomly selected values for the hyperparameters C and Sigma:

```r
# use 50 randomly selected hyperparameters
terminator = mlr3tuning::trm("evals", n_evals = 50)
tuner = mlr3tuning::tnr("random_search")
```

The next stage is to modify the learner lrn_ksvm in accordance with all the characteristics defining the hyperparameter tuning with auto_tuner():

```r
at_ksvm = mlr3tuning::auto_tuner(
  learner = lrn_ksvm,
  resampling = tune_level,
  measure = mlr3::msr("classif.auc"),
  search_space = search_space,
  terminator = terminator,
  tuner = tuner
)
```

The tuning is now set up to fit 250 models to determine the optimal hyperparameters for one fold. Repeating this for each fold, we end up with 1,250 (250 * 5) models for each repetition. Repeated 100 times means fitting a total of 125,000 models to identify the optimal hyperparameters (Figure 12.3). These are used in the performance estimation, which requires the fitting of another 500 models (5 folds * 100 repetitions; see Figure 12.3). To make the performance estimation processing chain even clearer, let us write down the commands we have given to the computer:

1. Performance level (upper left part of Figure 12.6): split the dataset into five spatially disjoint (outer) subfolds.
2. Tuning level (lower left part of Figure 12.6): use the first fold of the performance level and split it again spatially into five (inner) subfolds for the hyperparameter tuning. Use the 50 randomly selected hyperparameters in each of these inner subfolds, i.e., fit 250 models.
3. Performance estimation: use the best hyperparameter combination from the previous step (tuning level) and apply it to the first outer fold in the performance level to estimate the performance (AUROC).
4. Repeat steps 2 and 3 for the remaining four outer folds.
5. Repeat steps 2 to 4, 100 times.

The process of hyperparameter tuning and performance estimation is computationally intensive. To decrease model runtime, mlr3 offers the possibility to use parallelization with the help of the future package. Since we are about to run a nested cross-validation, we can decide if we would like to parallelize the inner or the outer loop (see the lower left part of Figure 12.6). Since the former will run 125,000 models, whereas the latter only runs 500, it is quite obvious that we should parallelize the inner loop. To set up the parallelization of the inner loop, we run:

```r
library(future)
# execute the outer loop sequentially and parallelize the inner loop
future::plan(list("sequential", "multisession"), 
             workers = floor(availableCores() / 2))
```

Additionally, we instructed future to only use half of the available cores instead of all of them (the default); this setting allows other possible users to work on the same high performance computing cluster in case one is used.

Now we are set up for computing the nested spatial CV. Specifying the resample() parameters follows the exact procedure presented when using a GLM, the only difference being the store_models and encapsulate arguments.
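The model counts stated in the steps above (250, 1,250, 125,000 and 500) can be checked with simple arithmetic:

```r
n_folds   = 5     # outer and inner folds
n_repeats = 100   # repetitions of the outer CV
n_evals   = 50    # random-search iterations per inner CV

# tuning: 50 models in each of 5 inner subfolds = 250 per outer fold
tuning_per_fold = n_evals * n_folds
# 1,250 tuning models per repetition; 125,000 over 100 repetitions
tuning_total = tuning_per_fold * n_folds * n_repeats
# performance estimation adds one model per outer fold and repetition
performance_total = n_folds * n_repeats

c(tuning_per_fold, tuning_total, performance_total)
#> [1]    250 125000    500
```

The grand total of 125,500 models (125,000 + 500) is the figure quoted later when warning about the runtime of the nested CV.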
Setting the former to TRUE would allow the extraction of the hyperparameter tuning results, which is important if you plan follow-up analyses on the tuning. The latter ensures that the processing continues even if one of the models throws an error. This avoids the process stopping just because of one failed model, which is desirable on large model runs. Once the processing is completed, one can have a look at the failed models. After the processing, it is good practice to explicitly stop the parallelization with future:::ClusterRegistry("stop"). Finally, we save the output object (result) to disk in case we would like to use it in another R session. Before running the subsequent code, be aware that it is time-consuming since it will run the spatial cross-validation with 125,500 models. It can easily run for half a day on a modern laptop. Note that runtime depends on many aspects: CPU speed, the selected algorithm, the selected number of cores and the dataset:

```r
progressr::with_progress(expr = {
  rr_spcv_svm = mlr3::resample(task = task,
                               learner = at_ksvm, 
                               # outer resampling (performance level)
                               resampling = perf_level,
                               store_models = FALSE,
                               encapsulate = "evaluate")
})
# stop parallelization
future:::ClusterRegistry("stop")
# compute the AUROC values
score_spcv_svm = rr_spcv_svm$score(measure = mlr3::msr("classif.auc")) 
# keep only the columns you need
score_spcv_svm = dplyr::select(score_spcv_svm, task_id, learner_id, 
                               resampling_id, classif.auc)
```

In case you do not want to run the code locally, we have saved score_svm in the book's GitHub repository. It can be loaded as follows:

```r
score = readRDS("extdata/12-bmr_score.rds")
score_spcv_svm = dplyr::filter(score, learner_id == "classif.ksvm.tuned", 
                               resampling_id == "repeated_spcv_coords")
```

Let us have a look at the final AUROC: the model's ability to discriminate the two classes:

```r
# final mean AUROC
round(mean(score_spcv_svm$classif.auc), 2)
#> [1] 0.74
```

It appears that the GLM (aggregated AUROC of 0.77) is slightly better than the SVM in this specific case. To guarantee an absolutely fair comparison, one should also make sure that the two models use the exact same partitions, something we have not shown here but have silently used in the background (see code/12_cv.R in the book's GitHub repository for more information). To do so, mlr3 offers the functions benchmark_grid() and benchmark() (see also https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-benchmarking; Becker et al. 2022). We will explore these functions in more detail in the Exercises. Please note also that using more than 50 iterations in the random search of the SVM would probably yield hyperparameters that result in models with a better AUROC (Schratz et al.
2019). On the other hand, increasing the number of random search iterations would also increase the total number of models and thus the runtime.

So far spatial CV has been used to assess the ability of learning algorithms to generalize to unseen data. For predictive mapping purposes, one would tune the hyperparameters on the complete dataset. This will be covered in Chapter 15.

## 12.6 Conclusions

Resampling methods are an important part of a data scientist's toolbox (James et al. 2013). This chapter used cross-validation to assess the predictive performance of various models. As described in Section 12.4, observations with spatial coordinates may not be statistically independent due to spatial autocorrelation, violating a fundamental assumption of cross-validation. Spatial CV addresses this issue by reducing the bias introduced by spatial autocorrelation.

The mlr3 package facilitates (spatial) resampling techniques in combination with the most popular statistical learning techniques, including linear regression, semi-parametric models such as generalized additive models, and machine learning techniques such as random forests, SVMs, and boosted regression trees (Bischl et al. 2016; Schratz et al.
2019).\nMachine learning algorithms often require hyperparameter inputs, optimal ‘tuning’ can require thousands model runs require large computational resources, consuming much time, RAM /cores.\nmlr3 tackles issue enabling parallelization.Machine learning overall, use understand spatial data, large field chapter provided basics, learn.\nrecommend following resources direction:mlr3 book (Becker et al. (2022); https://mlr3book.mlr-org.com/) especially chapter handling spatiotemporal dataAn academic paper hyperparameter tuning (Schratz et al. 2019)academic paper use mlr3spatiotempcv (Schratz et al. 2021)case spatiotemporal data, one account spatial temporal autocorrelation CV (Meyer et al. 2018)","code":""},{"path":"spatial-cv.html","id":"exercises-9","chapter":"12 Statistical learning","heading":"12.7 Exercises","text":"E1. Compute following terrain attributes elev dataset loaded terra::rast(system.file(\"raster/ta.tif\", package = \"spDataLarge\"))$elev help R-GIS bridges (see bridges GIS software chapter):SlopePlan curvatureProfile curvatureCatchment areaE2. Extract values corresponding output rasters lsl data frame (data(\"lsl\", package = \"spDataLarge\") adding new variables called slope, cplan, cprof, elev log_carea.E3. Use derived terrain attribute rasters combination GLM make spatial prediction map similar shown Figure 12.2.\nRunning data(\"study_mask\", package = \"spDataLarge\") attaches mask study area.E4. 
Compute a 100-repeated 5-fold non-spatial cross-validation and spatial CV based on the GLM learner and compare the AUROC values from both resampling strategies with the help of boxplots. Hint: you need to specify a non-spatial resampling strategy. Another hint: you might want to solve Exercises 4 to 6 in one go with the help of mlr3::benchmark() and mlr3::benchmark_grid() (for more information, please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback). When doing so, keep in mind that the computation can take very long, probably several days. This, of course, depends on your system. Computation time will be shorter the more RAM and cores you have at your disposal.

E5. Model landslide susceptibility using a quadratic discriminant analysis (QDA). Assess the predictive performance of the QDA. What is the difference between the spatially cross-validated mean AUROC value of the QDA and the GLM?

E6. Run the SVM without tuning the hyperparameters. Use the rbfdot kernel with $\sigma$ = 1 and C = 1. Leaving the hyperparameters unspecified in kernlab's ksvm() would otherwise initialize an automatic non-spatial hyperparameter tuning.

# 13 Transportation

## Prerequisites

This chapter uses the following packages:

```r
library(sf)
library(dplyr)
library(spDataLarge)
library(stplanr)      # for processing geographic transport data
library(tmap)         # map-making (see Chapter 9)
library(ggplot2)      # data visualization package
library(sfnetworks)   # spatial network classes and functions 
```

## 13.1 Introduction

In few other sectors is geographic space more tangible than in transportation. The effort of moving (overcoming distance) is central to the 'first law' of geography, defined by Waldo Tobler in 1970 as follows (Waldo R.
Tobler 1970):

> Everything is related to everything else, but near things are more related than distant things.

This 'law' is the basis of spatial autocorrelation and other key geographic concepts. It applies to phenomena as diverse as friendship networks and ecological diversity and can be explained by the costs of transport, in terms of time, energy and money, which constitute the 'friction of distance'. From this perspective, transport technologies are disruptive, changing spatial relationships between geographic entities including mobile humans and goods: "the purpose of transportation is to overcome space" (Rodrigue, Comtois, and Slack 2013).

Transport is an inherently spatial activity, involving movement from an origin point 'A' to a destination point 'B', through infinite localities in between. It is therefore unsurprising that transport researchers have long turned to geographic and computational methods to understand movement patterns, and how interventions can improve their performance (Lovelace 2021).

This chapter introduces the geographic analysis of transport systems at different geographic levels:

- Areal units: transport patterns can be understood with reference to zonal aggregates, such as the main mode of travel (by car, bike or foot, for example) and the average distance of trips made by people living in a particular zone, covered in Section 13.3
- Desire lines: straight lines that represent 'origin-destination' data that records how many people travel (or could travel) between places (points or zones) in geographic space, the topic of Section 13.4
- Nodes: points in the transport system that can represent common origins and destinations and public transport stations such as bus stops and rail stations, the topic of Section 13.5
- Routes: lines representing a path along the route network along the desire lines and between nodes. Routes (which can be represented as single linestrings or multiple short segments) are generated by routing engines, covered in Section 13.6
- Route networks: these represent the system of roads, paths and other linear features in an area and are covered in Section 13.7. They can be represented as geographic features (typically short segments of road that add up to create a full network) or structured as an interconnected graph, with the level of traffic on different segments referred to as 'flow' by transport modelers (Hollander 2016)

Another key level is agents, mobile entities like vehicles that enable us to move, such as bikes and buses. These can be represented computationally
in software such as MATSim and A/B Street, which represent the dynamics of transport systems using an agent-based modeling (ABM) framework, usually at high levels of spatial and temporal resolution (Horni, Nagel, and Axhausen 2016). ABM is a powerful approach to transport research with great potential for integration with R's spatial classes (Thiele 2014; Lovelace and Dumont 2016), but is outside the scope of this chapter.

Beyond geographic levels and agents, the basic unit of analysis in many transport models is the trip, a single purpose journey from an origin 'A' to a destination 'B' (Hollander 2016). Trips join up the different levels of transport systems and can be represented simplistically as geographic desire lines connecting zone centroids (nodes) or as routes that follow the transport route network. In this context, agents are usually point entities that move within the transport network.

Transport systems are dynamic (Xie and Levinson 2011). While the focus of this chapter is the geographic analysis of transport systems, it provides insights into how the approach can be used to simulate scenarios of change, in Section 13.8. The purpose of geographic transport modeling can be interpreted as simplifying the complexity of these spatiotemporal systems in ways that capture their essence. Selecting appropriate levels of geographic analysis can help simplify this complexity without losing its most important features and variables, enabling better decision-making and more effective interventions (Hollander 2016).

Typically, models are designed to tackle a particular problem, such as how to improve the safety or environmental performance of transport systems. For this reason, this chapter is based around a policy scenario, introduced in the next section, that asks: how can we increase cycling in the city of Bristol? Chapter 14 demonstrates a related application of geocomputation: prioritizing the location of new bike shops. There is a link between the chapters: new and effectively located cycling infrastructure can get people cycling, boosting demand for bike shops and local economic activity. This highlights an important feature of transport systems: they are closely linked to broader phenomena and land-use patterns.

## 13.2 A case study of Bristol

The case study used in this chapter is located in Bristol, a city in the west of England, around 30 km
An overview of the region's transport network is illustrated in Figure 13.1, which shows the diversity of transport infrastructure, for cycling, public transport, and private motor vehicles.

FIGURE 13.1: Bristol's transport network represented by colored lines for active (green), public (railways, blue) and private motor (red) modes of travel. Black border lines represent the inner city boundary (highlighted in yellow) and the larger Travel To Work Area (TTWA).

Bristol is the 10th largest city council in England, with a population of half a million people, although its travel catchment area is larger (see Section 13.3). It has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities. Bristol shows a high average income per person but also contains areas of severe deprivation (Bristol City Council 2015).

In terms of transport, Bristol is well served by rail and road links, and has a relatively high level of active travel. 19% of its citizens cycle and 88% walk at least once per month according to the Active People Survey (the national average is 15% and 81%, respectively). 8% of the population said they cycled to work in the 2011 census, compared with only 3% nationwide.

Like many cities, Bristol has major congestion, air quality and physical inactivity problems. Cycling can tackle all of these issues efficiently: it has a greater potential to replace car trips than walking, with typical speeds of 15-20 km/h vs 4-6 km/h for walking. For this reason Bristol's Transport Strategy has ambitious plans for cycling.

To highlight the importance of policy considerations in transportation research, this chapter is guided by the need to provide evidence for people (transport planners, politicians and other stakeholders) tasked with getting people out of cars and onto more sustainable modes — walking and cycling in particular. The broader aim is to demonstrate how geocomputation can support evidence-based transport planning. In this chapter you will learn how to:

- Describe the geographical patterns of transport behavior in cities
- Identify key public transport nodes supporting multi-modal trips
- Analyze travel 'desire lines' to find where many people drive short distances
- Identify cycle route locations that will encourage less car driving and more cycling

To get the wheels rolling on the practical aspects of this chapter, the next section begins by loading zonal data on travel patterns. These zone-level datasets are small but often vital for gaining a basic understanding of a settlement's overall transport system.

## 13.3 Transport zones

Although transport systems are primarily based on linear features and nodes — including pathways and stations — it often makes sense to start with areal data, to break continuous space into tangible units (Hollander 2016). In addition to the boundary defining the study area (Bristol in this case), two zone types are of particular interest to transport researchers: origin and destination zones. Often, the same geographic units are used for origins and destinations. However, different zoning systems, such as 'Workplace Zones', may be appropriate to represent the increased density of trip destinations in areas with many 'trip attractors' such as schools and shops (Office for National Statistics 2014).

The simplest way to define a study area is often the first matching boundary returned by OpenStreetMap. This can be done with the command `osmdata::getbb("Bristol", format_out = "sf_polygon", limit = 1)`. This returns an `sf` object (or a list of `sf` objects if `limit = 1` is not specified) representing the bounds of the largest matching city region, either a rectangular polygon of the bounding box or a detailed polygonal boundary. For Bristol, a detailed polygon is returned, as represented by the `bristol_region` object in the **spDataLarge** package. See the inner blue boundary in Figure 13.1. There are a couple of issues with this approach:

- The first boundary returned by OSM may not be the official boundary used by local authorities
- Even if OSM returns the official boundary, this may be inappropriate for transport research because it bears little relation to where people travel

Travel To Work Areas (TTWAs) address these issues by creating a zoning system analogous to hydrological watersheds. TTWAs were first defined as contiguous zones within which 75% of the population travels to work (Coombes, Green, and Openshaw 1986), and this is the definition used in this chapter. Because Bristol is a major employer attracting travel from surrounding towns, its TTWA is substantially larger than the city bounds (see Figure 13.1). The polygon representing this transport-orientated boundary is stored in the object `bristol_ttwa`, provided by the **spDataLarge** package loaded at the beginning of this chapter.
The origin and destination zones used in this chapter are the same: officially defined zones of intermediate geographic resolution (their official name is Middle layer Super Output Areas, or MSOAs). Each houses around 8,000 people. Such administrative zones can provide vital context to transport analysis, such as the type of people who might benefit most from particular interventions (e.g., Moreno-Monroy, Lovelace, and Ramos 2017).

The geographic resolution of these zones is important: small zones with high geographic resolution are usually preferable, but their high number in large regions can have consequences for processing (especially for origin-destination analysis, in which the number of possibilities increases as a non-linear function of the number of zones) (Hollander 2016). Another issue with small zones is related to anonymity rules. To make it impossible to infer the identity of individuals in zones, detailed socio-demographic variables are often only available at low geographic resolution. Breakdowns of travel mode by age and sex, for example, are available at the Local Authority level in the UK, but not at the much higher Output Area level, each of which contains around 100 households. For further details, see www.ons.gov.uk/methodology/geography.

The 102 zones used in this chapter are stored in `bristol_zones`, as illustrated in Figure 13.2. Note that the zones get smaller in densely populated areas: each houses a similar number of people. `bristol_zones` contains no attribute data on transport, however, only the name and code of each zone.

To add travel data, we will perform an attribute join, a common task described in Section 3.2.4. We will use travel data from the UK's 2011 census question on travel to work, data stored in `bristol_od`, which was provided by the ons.gov.uk data portal. `bristol_od` is an origin-destination (OD) dataset on travel to work between zones from the UK's 2011 Census (see Section 13.4). The first column is the ID of the zone of origin and the second column is the zone of destination. `bristol_od` has more rows than `bristol_zones`, representing travel between zones rather than the zones themselves.

The results of the previous code chunk show that there are more than 10 OD pairs for every zone, meaning we will need to aggregate the origin-destination data before it is joined with `bristol_zones`, as illustrated below (origin-destination data is described in Section 13.4). In the preceding chunk we:

- Grouped the data by zone of origin (contained in the column `o`)
- Aggregated the variables in the `bristol_od` dataset if they were numeric, to find the total number of people living in each zone by mode of transport
- Renamed the grouping variable `o` so it matches the ID column `geo_code` in the `bristol_zones` object

The resulting object `zones_attr` is a data frame with rows representing zones and an ID variable. We can verify that the IDs match those in the zones dataset using the `%in%` operator as follows. The results show that all 102 zones are present in the new object and that `zones_attr` is in a form that can be joined onto the zones. This is done using the joining function `left_join()` (note that `inner_join()` would produce the same result). The result is `zones_joined`, which contains new columns representing the total number of trips originating in each zone in the study area (almost a quarter of a million) and their mode of travel (by bicycle, foot, car and train). The geographic distribution of trip origins is illustrated in the left-hand map in Figure 13.2. This shows that most zones have between 0 and 4,000 trips originating from them in the study area. More trips are made by people living near the center of Bristol and fewer on the outskirts. Why is this? Remember that we are only dealing with trips within the study region: low trip numbers in the outskirts of the region can be explained by the fact that many people in these peripheral zones will travel to other regions outside of the study area. Trips outside the study region can be included in regional models by a special destination ID covering any trips that go to a zone not represented in the model (Hollander 2016). The data in `bristol_od`, however, simply ignores such trips: it is an 'intra-zonal' model.

In the same way that OD datasets can be aggregated by zone of origin, they can also be aggregated to provide information about destination zones. People tend to gravitate towards central places. This explains why the spatial distribution represented in the right panel in Figure 13.2 is relatively uneven, with the most common destination zones concentrated in Bristol city center. The result is `zones_od`, which contains a new column reporting the number of trip destinations by any mode, and is created as follows. A simplified version of Figure 13.2 is created with the code below (see `13-zones.R` in the code folder of the book's GitHub repository to reproduce the figure, and Section 9.2.7 for details on faceted maps with **tmap**):

FIGURE 13.2: Number of trips (commuters) living and working in the region. The left map shows the zone of origin of commute trips; the right map shows the zone of destination (generated by the script 13-zones.R).
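The grouping and summing steps described above can be sketched with toy data in base R. The zone IDs and trip counts below are invented; only the column structure mirrors `bristol_od`:

```r
# Toy OD data with hypothetical zone IDs; columns mirror the structure
# of bristol_od (origin, destination, total trips, trips by bicycle)
od_toy = data.frame(
  o       = c("z1", "z1", "z2", "z2"),
  d       = c("z1", "z2", "z1", "z2"),
  all     = c(10, 20, 30, 40),
  bicycle = c(1, 2, 3, 4)
)
# Aggregate the numeric columns by zone of origin, then rename the
# grouping column so it could be joined onto a zones table by its ID
zones_attr_toy = aggregate(cbind(all, bicycle) ~ o, data = od_toy, FUN = sum)
names(zones_attr_toy)[1] = "geo_code"
```

The result has one row per origin zone (`z1` with 30 trips, `z2` with 70), ready for a join by `geo_code`, which is what the **dplyr** pipeline in the code below does on the real data.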
```r
names(bristol_zones)
#> [1] "geo_code" "name"     "geometry"
nrow(bristol_od)
#> [1] 2910
nrow(bristol_zones)
#> [1] 102
zones_attr = bristol_od |> 
  group_by(o) |> 
  summarize(across(where(is.numeric), sum)) |> 
  dplyr::rename(geo_code = o)
summary(zones_attr$geo_code %in% bristol_zones$geo_code)
#>    Mode    TRUE 
#> logical     102
zones_joined = left_join(bristol_zones, zones_attr, by = "geo_code")
sum(zones_joined$all)
#> [1] 238805
names(zones_joined)
#> [1] "geo_code"   "name"       "all"        "bicycle"    "foot"      
#> [6] "car_driver" "train"      "geometry"
zones_destinations = bristol_od |> 
  group_by(d) |> 
  summarize(across(where(is.numeric), sum)) |> 
  select(geo_code = d, all_dest = all)
zones_od = inner_join(zones_joined, zones_destinations, by = "geo_code")
qtm(zones_od, c("all", "all_dest")) +
  tm_layout(panel.labels = c("Origin", "Destination"))
```

## 13.4 Desire lines

Desire lines connect origins and destinations, representing where people desire to go, typically between zones. They represent the quickest 'bee line' or 'crow flies' route between A and B that would be taken if it were not for obstacles such as buildings and windy roads getting in the way (we will see how to convert desire lines into routes in the next section). Typically, desire lines are represented geographically as starting and ending in the geographic (or population weighted) centroid of each zone. This is the type of desire line that we will create and use in this section, although it is worth being aware of 'jittering' techniques that enable multiple start and end points to increase the spatial coverage and accuracy of analyses building on OD data (Lovelace, Félix, and Carlino 2022).

We have already loaded data representing desire lines in the dataset `bristol_od`. This origin-destination (OD) data frame object represents the number of people traveling between the zones represented by `o` and `d`, as illustrated in Table 13.1. To arrange the OD data by all trips and then filter out only the top 5, type (please refer to Chapter 3 for a detailed description of non-spatial attribute operations):

TABLE 13.1: Sample of the top 5 origin-destination pairs in the Bristol OD data frame, representing travel desire lines between zones in the study area.

The resulting table provides a snapshot of Bristolian travel patterns in terms of commuting (travel to work). It demonstrates that walking is the most popular mode of transport among the top 5 origin-destination pairs, that zone E02003043 is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that the intrazonal trips, from one part of zone E02003043 to another (the first row of Table 13.1), constitute the most traveled OD pair in the dataset. From a policy perspective, the raw data presented in Table 13.1 is of limited use: aside from the fact that it contains only a tiny portion of the 2,910 OD pairs, it tells us little about where policy measures are needed, or what proportion of trips are made by walking and cycling. The following command calculates the percentage of each desire line that is made by these active modes.

There are two main types of OD pair: interzonal and intrazonal. Interzonal OD pairs represent travel between zones in which the destination is different from the origin. Intrazonal OD pairs represent travel within the same zone (see the top row of Table 13.1). The following code chunk splits `od_bristol` into these two types. The next step is to convert the interzonal OD pairs into an `sf` object representing desire lines that can be plotted on a map with the **stplanr** function `od2line()`.

An illustration of the results is presented in Figure 13.3, a simplified version of which is created with the following command (see the code in `13-desire.R` to reproduce the figure exactly, and Chapter 9 for details on visualization with **tmap**):

FIGURE 13.3: Desire lines representing trip patterns in Bristol, with width representing the number of trips and color representing the percentage of trips made by active modes (walking and cycling). The four black lines represent the interzonal OD pairs in Table 13.1.
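The two calculations described in this section, the active-mode percentage and the interzonal/intrazonal split, can be checked on toy numbers in base R (all values invented):

```r
# Invented OD pairs; structure mirrors bristol_od
od_toy = data.frame(
  o       = c("z1", "z1", "z2"),
  d       = c("z1", "z2", "z2"),
  all     = c(100, 50, 40),
  bicycle = c(10, 5, 4),
  foot    = c(40, 5, 16)
)
# Percentage of each OD pair made by active modes (walking and cycling)
od_toy$Active = (od_toy$bicycle + od_toy$foot) / od_toy$all * 100
# Intrazonal pairs start and end in the same zone; interzonal pairs do not
od_intra_toy = od_toy[od_toy$o == od_toy$d, ]
od_inter_toy = od_toy[od_toy$o != od_toy$d, ]
```

The first pair is 50% active ((10 + 40) / 100), and two of the three pairs are intrazonal, exactly the logic applied to the real data with `filter()` below.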
The map shows that the city center dominates transport patterns in the region, suggesting policies should be prioritized there, although a number of peripheral sub-centers can also be seen. Desire lines are important generalized components of transport systems. More concrete components include nodes, which have specific destinations (rather than the hypothetical straight lines represented by desire lines). Nodes are covered in the next section.

```r
od_top5 = bristol_od |> 
  slice_max(all, n = 5)
bristol_od$Active = (bristol_od$bicycle + bristol_od$foot) /
  bristol_od$all * 100
od_intra = filter(bristol_od, o == d)
od_inter = filter(bristol_od, o != d)
desire_lines = od2line(od_inter, zones_od)
#> Creating centroids representing desire line start and end points.
qtm(desire_lines, lines.lwd = "all")
```

## 13.5 Nodes

Nodes in geographic transport datasets are points among the predominantly linear features that comprise transport networks. Broadly there are two main types of transport nodes:

- Nodes not directly on the network, such as zone centroids or individual origins and destinations such as houses and workplaces
- Nodes that are a part of transport networks. Technically, a node can be located at any point on a transport network, but in practice they are often special kinds of vertex such as intersections between pathways (junctions) and points for entering or exiting the transport network such as bus stops and train stations

Transport networks can be represented as graphs, in which each segment is connected (via edges representing geographic lines) to one or more other edges in the network. Nodes outside the network can be added with "centroid connectors", new route segments joining nodes to nearby nodes on the network (Hollander 2016). Every node in the network is then connected by one or more 'edges' that represent individual segments of the network. We will see how transport networks can be represented as graphs in Section 13.7.

Public transport stops are particularly important nodes that can be represented as either type of node: a bus stop that is part of a road, or a large rail station that is represented by its pedestrian entry point hundreds of meters from railway tracks. We will use railway stations to illustrate public transport nodes, in relation to the research question of increasing cycling in Bristol. These stations are provided by **spDataLarge** in `bristol_stations`.

A common barrier preventing people from switching away from cars for commuting to work is that the distance from home to work is too far to walk or cycle. Public transport can reduce this barrier by providing a fast and high-volume option for common routes into cities. From an active travel perspective, public transport 'legs' of longer journeys divide trips into three:

- The origin leg, typically from residential areas to public transport stations
- The public transport leg, which typically goes from the station nearest a trip's origin to the station nearest its destination
- The destination leg, from the station of alighting to the destination

Building on the analysis conducted in Section 13.4, public transport nodes can be used to construct three-part desire lines for trips that can be taken by bus and (the mode used in this example) rail. The first stage is to identify the desire lines with most public transport travel, which in our case is easy because our previously created dataset `desire_lines` already contains a variable describing the number of trips by train (the public transport potential could also be estimated using public transport routing services such as OpenTripPlanner). To make the approach easier to follow, we will select only the top three desire lines in terms of rail use.

The challenge now is to 'break up' each of these lines into three pieces, representing travel via public transport nodes. This can be done by converting a desire line into a multilinestring object consisting of three line geometries representing the origin, public transport and destination legs of the trip. This operation can be divided into three stages: matrix creation (of origins, destinations and the 'via' points representing rail stations), identification of nearest neighbors and conversion to multilinestrings. These are undertaken by `line_via()`. This **stplanr** function takes input lines and points and returns a copy of the desire lines — see `?line_via()` for details on how this works. The output is the same as the input line, except it has new geometry columns representing the journey via public transport nodes, as demonstrated below.

As illustrated in Figure 13.4, the initial `desire_rail` lines now have three additional geometry list columns representing travel from home to the origin station, from there to the destination, and finally from the destination station to the destination. In this case, the destination leg is very short (walking distance), but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples' journeys to work in the residential areas surrounding the three origin stations in Figure 13.4.
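The 'identification of nearest neighbors' stage that `line_via()` performs can be illustrated in base R with invented coordinates, picking the station closest to a trip origin by Euclidean distance:

```r
# Invented coordinates: one trip origin and three candidate stations
origin   = c(x = 0, y = 0)
stations = rbind(
  A = c(x = 1, y = 1),
  B = c(x = 0, y = 2),
  C = c(x = 3, y = 0)
)
# Euclidean distance from the origin to each station
dists = sqrt((stations[, "x"] - origin["x"])^2 +
             (stations[, "y"] - origin["y"])^2)
# The nearest station becomes the 'via' point of the three-leg trip
nearest = rownames(stations)[which.min(dists)]
```

Here station `A` (distance sqrt(2)) would be chosen as the origin-leg endpoint; `line_via()` does this for both ends of every desire line, on projected coordinates.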
FIGURE 13.4: Station nodes (red dots) used as intermediary points to convert straight desire lines with high rail usage (thin green lines) into three legs: to the origin station (orange), via public transport (blue), and to the destination (pink, barely visible because it is so short).

```r
desire_rail = top_n(desire_lines, n = 3, wt = train)
ncol(desire_rail)
#> [1] 9
desire_rail = line_via(desire_rail, bristol_stations)
ncol(desire_rail)
#> [1] 12
```

## 13.6 Routes

From a geographical perspective, routes are desire lines that are no longer straight: the origin and destination points are the same as in the desire line representation of travel, but the pathway to get from A to B is more complex. The geometries of routes are typically (but not always) determined by the transport network.

While desire lines contain only two vertices (their beginning and end points), routes can contain any number of vertices, representing points between A and B joined by straight lines: the definition of a linestring geometry. Routes covering large distances or following intricate networks can have many hundreds of vertices; routes on grid-based or simplified road networks tend to have fewer.

Routes are generated from desire lines or, more commonly, from matrices containing coordinate pairs representing desire lines. This routing process is done by a range of broadly-defined routing engines: software and web services that return geometries and attributes describing how to get from origins to destinations. Routing engines can be classified based on where they run relative to R:

- In-memory routing using R packages that enable route calculation (described in Section 13.6.2)
- Locally hosted routing engines external to R that can be called from R (Section 13.6.3)
- Remotely hosted routing engines run by external entities that provide a web API that can be called from R (Section 13.6.4)

Before describing each of these, it is worth outlining other ways of categorizing routing engines. Routing engines can be multi-modal, meaning they can calculate trips composed of more than one mode of transport, or not. Multi-modal routing engines can return results consisting of multiple legs, each one made by a different mode of transport. The optimal route from a residential area to a commercial area could involve 1) walking to the nearest bus stop, 2) catching the bus to the nearest node to the destination, and 3) walking to the destination, given a set of input parameters. The transition points between these three legs are commonly referred to as 'ingress' and 'egress', meaning getting on or off a public transport vehicle. Multi-modal routing engines such as R5 are more sophisticated and have larger input data requirements than 'uni-modal' routing engines such as OSRM (described in Section 13.6.3).

A major strength of multi-modal engines is their ability to represent 'transit' (public transport) trips by trains, buses etc. Multi-modal routing engines require input datasets representing public transport networks, typically General Transit Feed Specification (GTFS) files, which can be processed with functions from the **tidytransit** and **gtfstools** packages (other packages and tools for working with GTFS files are available). Single mode routing engines may be sufficient for projects focused on specific (non public) modes of transport. Another way of classifying routing engines (or their settings) is by the geographic level of their outputs: routes, legs and segments.

### 13.6.1 Routes, legs and segments

Routing engines can generate outputs at three geographic levels of routes, legs and segments:

- Route level outputs contain a single feature (typically a multilinestring and associated row in the data frame representation) per origin-destination pair, meaning a single row of data per trip
- Leg level outputs contain a single feature and associated attributes for each mode within each origin-destination pair, as described in Section 13.5. For trips involving only one mode (for example driving from home to work, ignoring the short walk to the car) the leg is the same as the route: the car journey. For trips involving public transport, legs provide key information. The **r5r** function `detailed_itineraries()` returns legs which, confusingly, are sometimes referred to as 'segments'
- Segment level outputs provide the most detailed information about routes, with records for each small section of the transport network.
Typically, segments are similar in length, or identical, to ways in OpenStreetMap. The **cyclestreets** function `journey()` returns data at the segment level, which can be aggregated to the origin-destination level, matching the data returned by the `route()` function in **stplanr**.

Most routing engines return route level results by default, although multi-modal engines generally provide outputs at the leg level (one feature per continuous movement by a single mode of transport). Segment level outputs have the advantage of providing more detail. The **cyclestreets** package returns multiple 'quietness' levels per route, enabling identification of the 'weakest link' in cycle networks. Disadvantages of segment level outputs include increased file sizes and complexities associated with the extra detail.

Route level results can be converted into segment level results using the function `stplanr::overline()` (Morgan and Lovelace 2020). When working with segment or leg-level data, route-level statistics can be returned by grouping by columns representing trip start and end points and summarizing/aggregating the columns containing segment-level data.

### 13.6.2 In-memory routing with R

Routing engines in R enable route networks stored as R objects in memory to be used as the basis of route calculation. Options include the **sfnetworks**, **dodgr** and **cppRouting** packages, each of which provides a class system to represent route networks, the topic of the next section.

While fast and flexible, native R routing options are generally harder to set up than dedicated routing engines for realistic route calculation. Routing is a hard problem, and many hundreds of hours have been put into the open source routing engines that can be downloaded and hosted locally. On the other hand, R-based routing engines may be well suited to model experiments and the statistical analysis of the impacts of changes on the network. Changing route network characteristics (or the weights associated with different route segment types), re-calculating routes, and analyzing the results under many scenarios in a single language has benefits for research applications.

### 13.6.3 Locally hosted dedicated routing engines

Locally hosted routing engines include OpenTripPlanner, Valhalla and R5 (which are multi-modal), and the OpenStreetMap Routing Machine (OSRM) (which is 'uni-modal'). These can be accessed from R with the packages **opentripplanner**, **valhallr**, **r5r** and **osrm** (Morgan et al. 2019; Pereira et al. 2021). Locally hosted routing engines run on the user's computer but in a process separate from R. They benefit from speed of execution and control over the weighting profile for different modes of transport. Disadvantages include the difficulty of representing complex networks locally; temporal dynamics (primarily due to traffic); and the need for specialized external software.

### 13.6.4 Remotely hosted dedicated routing engines

Remotely hosted routing engines use a web API to which queries about origins and destinations are sent and from which results are returned. Routing services based on open source routing engines, such as OSRM's publicly available service, work the same when called from R as locally hosted instances, simply requiring the arguments specifying 'base URLs' to be updated. However, the fact that external routing services are hosted on a dedicated machine (usually funded by a commercial company with incentives to generate accurate routes) can give them advantages, including:

- Provision of routing services worldwide (or usually at least over a large region)
- Established routing services are usually updated regularly and can often respond to traffic levels
- Routing services usually run on dedicated hardware and software, including systems such as load balancers to ensure consistent performance

Disadvantages of remote routing services include speed when batch jobs are not possible (they often rely on data transfer over the internet on a route-by-route basis), price (the Google routing API, for example, limits the number of free queries) and licensing issues. The **googleway** and **mapbox** packages demonstrate this approach by providing access to routing services from Google and Mapbox, respectively. Free (but rate limited) routing services include OSRM and openrouteservice.org, which can be accessed from R with the **osrm** and **openrouteservice** packages, the latter of which is not on CRAN. There are also more specific routing services such as that provided by CycleStreets.net, a cycle journey planner and not-for-profit transport technology company "for cyclists, by cyclists".
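The 'base URL' idea can be made concrete: a query to an OSRM-style web API is just a URL built from coordinates. A minimal sketch, in which the server name is a placeholder (not a real endpoint) and the `bike` profile name simply mirrors the `osrm.profile` argument used later in this chapter:

```r
# Hypothetical base URL for a locally or remotely hosted OSRM-style server
base_url = "https://routing.example.com"
# Origin and destination as lon/lat pairs (OSRM-style APIs expect lon,lat)
origin = c(-2.60, 51.45)
dest   = c(-2.58, 51.46)
# Build the request URL; switching between a remote service and a local
# instance amounts to changing base_url
query = sprintf(
  "%s/route/v1/bike/%f,%f;%f,%f?overview=full",
  base_url, origin[1], origin[2], dest[1], dest[2]
)
```

The resulting string could then be fetched with a generic HTTP client; packages such as **osrm** and **stplanr** construct and send such requests for you.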
R users can access CycleStreets routes via the package **cyclestreets**, but many routing services lack R interfaces, representing a substantial opportunity for package development: building an R package to provide an interface to a web API can be a rewarding experience.

### 13.6.5 Contraction hierarchies and traffic assignment

Contraction hierarchies and traffic assignment are advanced but important topics in transport modeling that are worth being aware of, especially if you want your code to scale to large networks. Calculating many routes is computationally resource intensive and can take hours, leading to the development of several algorithms to speed up routing calculations. Contraction hierarchies are a well-known algorithm that can lead to substantial (1000x+ in some cases) speed-ups in routing tasks, depending on network size. Contraction hierarchies are used behind the scenes in the routing engines mentioned in the previous sections.

Traffic assignment is a problem closely related to routing: in practice, the shortest path between two points is not always the fastest, especially if there is congestion. The process takes OD datasets, of the kind described in Section 13.4, and assigns traffic to each segment of the network, generating route networks of the kind described in Section 13.7. An established solution is Wardrop's principle of user equilibrium, which shows that, to be realistic, congestion should be considered when estimating flows on a network, with reference to a mathematically defined relationship between cost and flow (Wardrop 1952). This optimization problem can be solved with iterative algorithms implemented in the **cppRouting** package, which also implements contraction hierarchies for fast routing.

### 13.6.6 Routing: A worked example

Instead of routing all the desire lines generated in Section 13.4, we will focus on a subset that is highly policy relevant. Running a computationally intensive operation on a subset before trying to process the whole dataset is often sensible, and this applies to routing. Routing can be time and memory-consuming, resulting in large objects, due to the detailed geometries and extra attributes of route objects. We will therefore filter the desire lines before calculating routes in this section.

Cycling is most beneficial when it replaces car trips. Short trips (around 5 km, which can be cycled in 15 minutes at a speed of 20 km/hr) have a relatively high probability of being cycled, and the maximum distance increases when trips are made by electric bike (Lovelace et al. 2017). These considerations inform the following code chunk, which filters the desire lines and returns the object `desire_lines_short` representing OD pairs between which many (100+) short (2.5 to 5 km Euclidean distance) trips are driven.

In the code below, `st_length()` calculates the length of each desire line, as described in Section 4.2.3. The `filter()` function from **dplyr** filters the `desire_lines` dataset based on the criteria outlined above, as described in Section 3.2.1. The next stage is to convert these desire lines into routes. This is done using the publicly available OSRM service with the **stplanr** functions `route()` and `route_osrm()`.

The output is `routes_short`, an `sf` object representing routes on the transport network that are suitable for cycling (according to the OSRM routing engine, at least), one for each desire line. Note: calls to external routing engines such as in the command below only work with an internet connection (and sometimes with an API key stored in an environment variable, although not in this case). In addition to the columns contained in the `desire_lines` object, the new route dataset contains `distance` (referring to route distance this time) and `duration` columns (in seconds), which provide potentially useful extra information on the nature of each route. We will plot the desire lines along which many short car journeys take place alongside cycling routes. Making the width of the routes proportional to the number of car journeys that could potentially be replaced provides an effective way to prioritize interventions on the road network (Lovelace et al. 2017).

Figure 13.5 shows the routes along which people drive short distances (see github.com/geocompx for the source code).

FIGURE 13.5: Routes along which many (100+) short (<5 km Euclidean distance) car journeys are made (red) overlaying desire lines representing the same trips (black) and zone centroids (dots).

Visualizing the results on an interactive map shows that many short car trips take place in and around Bradley Stoke, around 10 km north of central Bristol. It is easy to find explanations for the area's high level of car dependency: according to Wikipedia, Bradley Stoke is "Europe's largest new town built with private investment", suggesting limited public transport provision. Furthermore, the town is surrounded by large (and cycling unfriendly) road structures, including the M4 and M5 motorways (Tallon 2007).

There are many benefits of converting travel desire lines into routes. It is important to remember that we cannot be sure how many (if any) trips will follow the exact routes calculated by routing engines. However, route and street/way/segment level results can be highly policy relevant. Route segment results can enable the prioritization of investment where it is most needed, according to available data (Lovelace et al. 2017).
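To make the routing problem itself concrete, here is a minimal Dijkstra shortest-path sketch in base R on an invented five-edge graph. Production engines solve the same problem on millions of edges, which is why speed-up techniques such as the contraction hierarchies mentioned above matter:

```r
# A tiny weighted graph as an edge list; weights could be lengths in meters.
# All values are invented for illustration.
edges = data.frame(
  from = c("A", "A", "B", "B", "C"),
  to   = c("B", "C", "C", "D", "D"),
  w    = c(1, 4, 2, 6, 3)
)
# Make the graph undirected by adding reversed edges
edges = rbind(edges, data.frame(from = edges$to, to = edges$from, w = edges$w))

# Textbook Dijkstra: repeatedly visit the closest unvisited node and
# relax the distances of its neighbors
dijkstra = function(edges, source) {
  nodes = union(edges$from, edges$to)
  dist = setNames(rep(Inf, length(nodes)), nodes)
  dist[source] = 0
  visited = character(0)
  while (length(visited) < length(nodes)) {
    unvisited = setdiff(nodes, visited)
    u = unvisited[which.min(dist[unvisited])]
    visited = c(visited, u)
    out = edges[edges$from == u, ]
    relax = dist[u] + out$w < dist[out$to]
    dist[out$to[relax]] = dist[u] + out$w[relax]
  }
  dist
}
# Shortest distances from A: A = 0, B = 1, C = 3 (via B), D = 6 (via B, C)
dijkstra(edges, "A")
```

Note that the shortest path from A to D goes via B and C, not the direct-looking A-C-D: exactly the kind of result a routing engine returns when asked for a route on a weighted network.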
```r
desire_lines$distance_km = as.numeric(st_length(desire_lines)) / 1000
desire_lines_short = desire_lines |> 
  filter(car_driver >= 100, distance_km <= 5, distance_km >= 2.5)
routes_short = route(l = desire_lines_short, route_fun = route_osrm,
                     osrm.profile = "bike")
```

## 13.7 Route networks

While routes generally contain data on travel behavior, at the same geographic level as desire lines and OD pairs, route network datasets usually represent the physical transport network. Each segment in a route network roughly corresponds to a continuous section of street between junctions and appears only once, although the average length of segments depends on the data source (segments in the OSM-derived `bristol_ways` dataset used in this section have an average length of just 200 m, with a standard deviation of nearly 500 m). Variability in segment lengths can be explained by the fact that in rural locations junctions are far apart, while in dense urban areas there are crossings and segment breaks every few meters.

Route networks can be an input into, or an output of, transport data analysis projects, or both. Any transport research that involves route calculation requires a route network dataset to be used by internal or external routing engines (in the latter case the route network data is not necessarily imported into R). However, route networks are also important outputs in many transport research projects: summarizing data such as the potential number of trips made on particular segments, represented as a route network, can help prioritize investment where it is most needed.

To demonstrate how to create route networks as an output derived from route level data, imagine a simple scenario of mode shift. Imagine that 50% of car trips between 0 to 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional km of route distance, such that 20% of car trips of 6 km are replaced by cycling and no car trips that are 8 km or longer are replaced by cycling. This is of course an unrealistic scenario (Lovelace et al. 2017), but it is a useful starting point. In this case, we can model the mode shift from cars to bikes as follows.

Having created the scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated modeled cycling activity will take place. For this, we will use the function `overline()` from the **stplanr** package. The function breaks linestrings at junctions (where two or more linestring geometries meet), and calculates aggregate statistics for each unique route segment (Morgan and Lovelace 2020), taking an object containing routes and the names of the attributes to summarize as its first and second arguments. The outputs of the two preceding code chunks are summarized in Figure 13.6 below.

FIGURE 13.6: Illustration of the percentage of car trips switching to cycling as a function of distance (left) and the route network level results of this function (right).

Transport networks with records at the segment level, typically with attributes such as road type and width, constitute a common type of route network. Such route network datasets are available worldwide from OpenStreetMap, and can be downloaded with packages such as **osmdata** and **osmextract**. To save time downloading and preparing OSM data, we will use the `bristol_ways` object from the **spDataLarge** package, an `sf` object with LINESTRING geometries and attributes representing a sample of the transport network in the case study region (see `?bristol_ways` for details), as shown in the output below.

The output shows that `bristol_ways` represents just over 6 thousand segments on the transport network. This and other geographic networks can be represented as mathematical graphs, with nodes on the network connected by edges. A number of R packages have been developed for dealing with graphs, notably **igraph**. One can manually convert a route network into an `igraph` object, but the geographic attributes will be lost. To overcome this limitation of **igraph**, the **sfnetworks** package (van der Meer et al. 2023), which can represent route networks simultaneously as graphs and as geographic lines, was developed. We will demonstrate **sfnetworks** functionality on the `bristol_ways` object.

The output of the previous code chunk (with the final output shortened to contain only the most important 8 lines due to space considerations) shows that `ways_sfn` is a composite object, containing both nodes and edges in graph and spatial form. `ways_sfn` is of class `sfnetwork`, which builds on the `igraph` class from the **igraph** package. In the example below, the 'edge betweenness', meaning the number of shortest paths passing through each edge, is calculated (see `?igraph::betweenness` for details). The output of the edge betweenness calculation is shown in Figure 13.7, which has the cycle route network dataset calculated with the `overline()` function as an overlay for comparison. The results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the highest betweenness values, whereas the segments closer to central Bristol have higher cycling potential, based on these simplistic datasets.

FIGURE 13.7: Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.

One can also find the shortest route between origins and destinations using this graph representation of the route network with the **sfnetworks** package.

The methods presented in this section are relatively simple compared with what is possible. The dual graph/spatial capabilities of **sfnetworks** enable many new and powerful techniques that cannot be fully covered in this section. The section does, however, provide a strong starting point for exploration and research into this area. A final point is that the example dataset used above is relatively small. It may also be worth considering how the work could adapt to larger networks: testing methods on a subset of the data, and ensuring you have enough RAM, will help, although it's also worth exploring other tools that can do transport network analysis optimized for large networks, such as R5 (Alessandretti et al. 2022).
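The aggregation idea behind `overline()` can be mimicked in base R with invented data: routes share network segments, and the flow on each segment is the sum over the routes that traverse it:

```r
# Invented data: each route is a set of segment IDs it traverses,
# with a number of cycling trips assigned to the whole route
route_segments = data.frame(
  route   = c("r1", "r1", "r2", "r2", "r2"),
  segment = c("s1", "s2", "s2", "s3", "s4"),
  bicycle = c(5, 5, 7, 7, 7)
)
# Sum trips over routes sharing each segment: the shared segment s2
# accumulates the flow of both routes (5 + 7 = 12)
segment_flows = aggregate(bicycle ~ segment, data = route_segments, FUN = sum)
```

`overline()` additionally handles the geometric part of this task, breaking linestrings at junctions so that 'segments' are defined by the network itself rather than pre-assigned IDs as in this toy.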
2022).","code":"\nuptake = function(x) {\n case_when(\n x <= 3 ~ 0.5,\n x >= 8 ~ 0,\n TRUE ~ (8 - x) / (8 - 3) * 0.5\n )\n}\nroutes_short_scenario = routes_short |> \n mutate(uptake = uptake(distance / 1000)) |> \n mutate(bicycle = bicycle + car_driver * uptake,\n car_driver = car_driver * (1 - uptake))\nsum(routes_short_scenario$bicycle) - sum(routes_short$bicycle)\n#> [1] 692\nroute_network_scenario = overline(routes_short_scenario, attrib = \"bicycle\")#> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling.\nsummary(bristol_ways)\n#> highway maxspeed ref geometry \n#> cycleway:1721 Length:6160 Length:6160 LINESTRING :6160 \n#> rail :1017 Class :character Class :character epsg:4326 : 0 \n#> road :3422 Mode :character Mode :character +proj=long...: 0\nbristol_ways$lengths = st_length(bristol_ways)\nways_sfn = as_sfnetwork(bristol_ways)\nclass(ways_sfn)\n#> [1] \"sfnetwork\" \"tbl_graph\" \"igraph\"\nways_sfn\n#> # A sfnetwork with 5728 nodes and 4915 edges\n#> # A directed multigraph with 1013 components with spatially explicit edges\n#> # Node Data: 5,728 × 1 (active)\n#> # Edge Data: 4,915 × 7\n#> from to highway maxspeed ref geometry lengths\n#> [m]\n#> 1 1 2 road B3130 (-2.61 51.4, -2.61 51.4, -2.61 51.… 218.\n#> # … \nways_centrality = ways_sfn |> \n activate(\"edges\") |> \n mutate(betweenness = tidygraph::centrality_edge_betweenness(lengths)) #> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. 
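The shortest-path capability mentioned above can be sketched as follows, assuming the `ways_sfn` object created in the preceding chunk and the `st_network_paths()` interface from sfnetworks; the node indices 1 and 100 are arbitrary choices for illustration, not values from the book:

```r
library(sfnetworks)
library(tidygraph)
# Hedged sketch: shortest path between two arbitrary nodes, weighted by
# segment length (the 'lengths' column created above)
paths = st_network_paths(ways_sfn, from = 1, to = 100, weights = "lengths")
# Pull out the edges of the first path as an sf object, e.g., for plotting
path_edges = ways_sfn |>
  activate("edges") |>
  st_as_sf() |>
  dplyr::slice(paths$edge_paths[[1]])
```

The returned tibble contains list columns of node and edge indices per path, which can be used to subset the network as shown.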
## 13.8 Prioritizing new infrastructure

This section demonstrates how geocomputation can create policy-relevant outcomes in the field of transport planning.
We will identify promising locations for investment in sustainable transport infrastructure, using a simple approach for educational purposes.
An advantage of the data-driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and can feed into wider analyses.
The steps that got us to this stage included identifying short car-dependent commuting routes (generated from desire lines) in Section 13.6 and the analysis of route network characteristics with the sfnetworks package in Section 13.7.
The final code chunk of this chapter combines these strands of analysis, overlaying estimates of cycling potential from the previous section on top of a new dataset representing areas within a short distance of cycling infrastructure.
This new dataset is created in the code chunk below, which: 1) filters out the cycleway entities from the `bristol_ways` object representing the transport network; 2) 'unions' the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering); and 3) creates a 100 m buffer around them to create a polygon.
The next stage is to create a dataset representing points on the network with high cycling potential but little provision for cycling.
The results of the preceding code chunks are shown in Figure 13.8, which shows routes with high levels of car dependency and high cycling potential, but no cycleways.

FIGURE 13.8: Potential routes along which to prioritise cycle infrastructure in Bristol, to reduce car dependency. The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot of the interactive map generated with the `qtm()` function highlights Whiteladies Road as somewhere that could benefit from a new cycleway (right).

The method has limitations: in reality, people do not travel to zone centroids or always use the shortest route algorithm for a particular mode.
However, the results demonstrate that geographic data analysis can be used to highlight places where new investment in cycleways could be particularly beneficial, despite the simplicity of the approach.
The analysis would need to be substantially expanded — including with larger input datasets — to inform transport planning and design in practice.

```r
existing_cycleways_buffer = bristol_ways |> 
  filter(highway == "cycleway") |>  # 1) filter out cycleways
  st_union() |>                     # 2) unite geometries
  st_buffer(dist = 100)             # 3) create buffer
# note: st_difference() takes two geometry arguments (x, y)
route_network_no_infra = st_difference(
  route_network_scenario |> st_set_crs(st_crs(existing_cycleways_buffer)),
  existing_cycleways_buffer
)
tmap_mode("view")
qtm(route_network_no_infra, basemaps = leaflet::providers$Esri.WorldTopoMap,
    lines.lwd = 5)
```

## 13.9 Future directions of travel

This chapter provided a taste of the possibilities of using geocomputation for transport research, and explored some of the key geographic elements that make up a city's transport system with open data and reproducible code.
The results could help plan where investment is needed.

Transport systems operate at multiple interacting levels, meaning that geocomputational methods have great potential to generate insights into how they work and the likely impacts of different interventions.
There is much more that could be done in this area: it is possible to build on the foundations presented in this chapter in many directions.
Transport is the fastest growing source of greenhouse gas emissions in many countries, and is set to become "the largest GHG emitting sector, especially in developed countries" (see EURACTIV.com).
Transport-related emissions are unequally distributed across society but (unlike food and heating) are not essential for well-being.
There is great potential for the sector to rapidly decarbonize through demand reduction, electrification of the vehicle fleet and the uptake of active travel modes such as walking and cycling.
New technologies can reduce car dependency by enabling more car sharing.
'Micro-mobility' systems such as dockless bike and e-scooter schemes are also emerging, creating valuable datasets in the General Bikeshare Feed Specification (GBFS) format, which can be imported and processed with the gbfs package.
These changes will have large impacts on accessibility, the ability of people to reach the employment and service locations they need, something that can be quantified currently and under scenarios of change with packages such as accessibility.
The exploration of such 'transport futures' at local, regional and national levels could yield important new insights.

Methodologically, the foundations presented in this chapter could be extended by including more variables in the analysis.
Characteristics of the route such as speed limits, busyness and the provision of protected cycling and walking paths could be linked to 'mode-split' (the proportion of trips made by different modes of transport).
By aggregating OpenStreetMap data using buffers and the geographic data methods presented in Chapters 3 and 4, for example, it would be possible to detect the presence of green space in close proximity to transport routes.
Using R's statistical modeling capabilities, this could then be used to predict current and future levels of cycling, for example.

This type of analysis underlies the Propensity to Cycle Tool (PCT), a publicly accessible (see www.pct.bike) mapping tool developed in R that is being used to prioritize investment in cycling across England (Lovelace et al. 2017).
Similar tools could be used to encourage evidence-based transport policies related to other topics such as air pollution and public transport access around the world.

## 13.10 Exercises

E1. Much of the analysis presented in this chapter focused on active modes, but what about driving trips?

- What proportion of trips in the `desire_lines` object are made by driving?
- What proportion of `desire_lines` have a straight line length of 5 km or more in distance?
- What proportion of trips in desire lines longer than 5 km in length are made by driving?
- Plot the desire lines that are less than 5 km in length and along which more than 50% of trips are made by car.
- What do you notice about the location of these car-dependent yet short desire lines?

E2. What additional length of cycleways would result if all the routes presented in the last Figure, on sections beyond 100 m from existing cycleways, were constructed?
E3. What proportion of trips represented in the `desire_lines` are accounted for in the `routes_short_scenario` object?

- Bonus: what proportion of trips happen on desire lines that cross `routes_short_scenario`?

E4. The analysis presented in this chapter is designed for teaching how geocomputation methods can be applied to transport research. If you were doing this for real, in government or for a transport consultancy, what top 3 things would you do differently?

E5. Clearly, the routes identified in the last Figure only provide part of the picture. How would you extend the analysis?

E6. Imagine that you want to extend the scenario by creating key areas (not routes) for investment in place-based cycling policies such as car-free zones, cycle parking points and a reduced car parking strategy. How could raster datasets assist with this work?

- Bonus: develop a raster layer that divides the Bristol region into 100 cells (10 by 10) and estimate the average speed limit of roads in each one from the `bristol_ways` dataset (see Chapter 14).

# 14 Geomarketing

## Prerequisites

This chapter requires the following packages (tmaptools must also be installed):

```r
library(sf)
library(dplyr)
library(purrr)
library(terra)
library(osmdata)
library(spDataLarge)
```

Required data will be downloaded in due course.
As a convenience to the reader and to ensure easy reproducibility, we have made the downloaded data available in the spDataLarge package.

## 14.1 Introduction

This chapter demonstrates how the skills learned in Parts I and II can be applied to a particular domain: geomarketing (sometimes also referred to as location analysis or location intelligence).
This is a broad field of research and commercial application.
A typical example of geomarketing is deciding where to locate a new shop.
The aim here is to attract most visitors and, ultimately, make the most profit.
There are also many non-commercial applications that can use the technique for public benefit, for example deciding where to locate new health services (Tomintz, Clarke, and Rigby 2008).

People are fundamental to location analysis, in particular where they are likely to spend their time and other resources.
Interestingly, ecological concepts and models are quite similar to those used for store location analysis.
Animals and plants can best meet their needs in certain 'optimal' locations, based on variables that change over space (Muenchow et al. (2018); see also Chapter 15).
This is one of the great strengths of geocomputation and GIScience in general: concepts and methods are transferable to other fields.
Polar bears, for example, prefer northern latitudes where temperatures are lower and food (seals and sea lions) is plentiful.
Similarly, humans tend to congregate in certain places, creating economic niches (and high land prices) analogous to the ecological niche of the Arctic.
The main task of location analysis is to find out, based on the available data, where such 'optimal locations' are for specific services.
Typical research questions include:

- Where do target groups live and which areas do they frequent?
- Where are competing stores or services located?
- How many people can easily reach specific stores?
- Do existing services over- or under-utilize the market potential?
- What is the market share of a company in a specific area?

This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study based on real data.

## 14.2 Case study: bike shops in Germany

Imagine you are starting a chain of bike shops in Germany.
The stores should be placed in urban areas with as many potential customers as possible.
Additionally, a hypothetical survey (invented for this chapter, not for commercial use!) suggests that single young males (aged 20 to 40) are most likely to buy your products: this is the target audience.
You are in the lucky position to have sufficient capital to open a number of shops.
But where should they be placed?
Consulting companies (employing geomarketing analysts) would happily charge high rates to answer such questions.
Luckily, we can do so ourselves with the help of open data and open source software.
The following sections will demonstrate how the techniques learned during the first chapters of the book can be applied to undertake common steps in service location analysis:

- Tidy the input data from the German census (Section 14.3)
- Convert the tabulated census data into raster objects (Section 14.4)
- Identify metropolitan areas with high population densities (Section 14.5)
- Download detailed geographic data (from OpenStreetMap, with osmdata) for these areas (Section 14.6)
- Create rasters for scoring the relative desirability of different locations using map algebra (Section 14.7)

Although we have applied these steps to a specific case study, they could be generalized to many scenarios of store location or public service provision.

## 14.3 Tidy the input data

The German government provides gridded census data at either 1 km or 100 m resolution.
The following code chunk downloads, unzips and reads in the 1 km data.

```r
download.file("https://tinyurl.com/ybtpkwxz", 
              destfile = "census.zip", mode = "wb")
unzip("census.zip") # unzip the files
census_de = readr::read_csv2(list.files(pattern = "Gitter.csv"))
```

Please note that `census_de` is also available from the spDataLarge package:

```r
data("census_de", package = "spDataLarge")
```

The `census_de` object is a data frame containing 13 variables for more than 360,000 grid cells across Germany.
For our work, we only need a subset of these: Easting (`x`) and Northing (`y`), the number of inhabitants (population; `pop`), mean average age (`mean_age`), the proportion of women (`women`) and average household size (`hh_size`).
These variables are selected and renamed from German into English in the code chunk below and summarized in Table 14.1.
Further, `mutate()` is used to convert the values -1 and -9 (meaning "unknown") to NA.

TABLE 14.1: Categories for each variable in the census data from Datensatzbeschreibung…xlsx, located in the downloaded file census.zip (see Figure 14.1 for their spatial distribution).
```r
# pop = population, hh_size = household size
input = select(census_de, x = x_mp_1km, y = y_mp_1km, pop = Einwohner,
               women = Frauen_A, mean_age = Alter_D, hh_size = HHGroesse_D)
# set -1 and -9 to NA
input_tidy = mutate(input, across(.cols = c(pop, women, mean_age, hh_size), 
                                  .fns = ~ifelse(.x %in% c(-1, -9), NA, .x)))
```

## 14.4 Create census rasters

After the preprocessing, the data can be converted into a SpatRaster object (see Sections 2.3.4 and 3.3.1) with the help of the `rast()` function.
When setting its `type` argument to `xyz`, the `x` and `y` columns of the input data frame correspond to coordinates on a regular grid.
All the remaining columns (here: `pop`, `women`, `mean_age`, `hh_size`) serve as values of the raster layers (Figure 14.1; see also code/14-location-figures.R in the book's GitHub repository).

FIGURE 14.1: Gridded German census data of 2011 (see Table 14.1 for a description of the classes).

The next stage is to reclassify the values of the rasters stored in `input_ras` in accordance with the survey mentioned in Section 14.2, using the terra function `classify()`, which was introduced in Section 4.3.3.
In the case of the population data, we convert the classes into a numeric data type using class means.
Raster cells are assumed to have a population of 127 if they have the value 1 (cells in 'class 1' contain 3 to 250 inhabitants) and 375 if they have the value 2 (containing 250 to 500 inhabitants), and so on (see Table 14.1).
A cell value of 8000 inhabitants was chosen for 'class 6' because these cells contain more than 8000 people.
Of course, these are approximations of the true population, not precise values.
However, this level of detail is sufficient to delineate metropolitan areas (see the next section).

In contrast to the `pop` variable, representing absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with the weights used in the survey.
Class 1 in the variable `women`, for instance, represents areas in which 0 to 40% of the population is female; these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male.
Similarly, the classes containing the youngest people and the highest proportion of single households are reclassified with high weights.

Note that we made sure that the order of the reclassification matrices in the list is the same as for the elements of `input_ras`.
For instance, the first element corresponds in both cases to the population.
Subsequently, the for-loop applies the reclassification matrix to the corresponding raster layer.
Finally, the code chunk below ensures the `reclass` layers have the same name as the layers of `input_ras`.

```r
input_ras = rast(input_tidy, type = "xyz", crs = "EPSG:3035")
input_ras
#> class       : SpatRaster 
#> dimensions  : 868, 642, 4  (nrow, ncol, nlyr)
#> resolution  : 1000, 1000  (x, y)
#> extent      : 4031000, 4673000, 2684000, 3552000  (xmin, xmax, ymin, ymax)
#> coord. ref. : ETRS89-extended / LAEA Europe (EPSG:3035) 
#> source(s)   : memory
#> names       : pop, women, mean_age, hh_size 
#> min values  :   1,     1,        1,       1 
#> max values  :   6,     5,        5,       5
rcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250, 
                   4, 4, 3000, 5, 5, 6000, 6, 6, 8000), 
                 ncol = 3, byrow = TRUE)
rcl_women = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0), 
                   ncol = 3, byrow = TRUE)
rcl_age = matrix(c(1, 1, 3, 2, 2, 0, 3, 5, 0),
                 ncol = 3, byrow = TRUE)
rcl_hh = rcl_women
rcl = list(rcl_pop, rcl_women, rcl_age, rcl_hh)
reclass = input_ras
for (i in seq_len(nlyr(reclass))) {
  reclass[[i]] = classify(x = reclass[[i]], rcl = rcl[[i]], right = NA)
}
names(reclass) = names(input_ras)
reclass # full output not shown
#> ... 
#> names       :  pop, women, mean_age, hh_size 
#> min values  :  127,     0,        0,       0 
#> max values  : 8000,     3,        3,       3
```

## 14.5 Define metropolitan areas

We deliberately define metropolitan areas as pixels of 20 km² inhabited by more than 500,000 people.
Pixels at this coarse resolution can be rapidly created using `aggregate()`, as introduced in Section 5.3.3.
The command below uses the argument `fact = 20` to reduce the resolution of the result twenty-fold (recall that the original raster resolution was 1 km²).
The next stage is to keep only the cells with more than half a million people.

Plotting this reveals eight metropolitan regions (Figure 14.2).
Each region consists of one or more raster cells.
It would be nice if we could join all cells belonging to one region, and terra's `patches()` command does exactly that.
Subsequently, `as.polygons()` converts the raster object into spatial polygons, and `st_as_sf()` converts it into an sf object.

FIGURE 14.2: The aggregated population raster (resolution: 20 km) with the identified metropolitan areas (golden polygons) and the corresponding names.

The resulting eight metropolitan areas suitable for bike shops (Figure 14.2; see also code/14-location-figures.R for creating the figure) are still missing names.
A reverse geocoding approach can settle this problem: given a coordinate, it finds the corresponding address.
Consequently, extracting the centroid coordinate of each metropolitan area can serve as input for a reverse geocoding API.
This is exactly what the `rev_geocode_OSM()` function of the tmaptools package expects.
Additionally setting `as.data.frame` to `TRUE` will give back a `data.frame` with several columns referring to the location, including the street name, house number and city.
However, here we are only interested in the name of the city.
To make sure the reader uses the exact same results, we have put them into the spDataLarge object `metro_names`.

TABLE 14.2: Result of the reverse geocoding.

Overall, we are satisfied with the `city` column serving as metropolitan names (Table 14.2), apart from one exception, namely Velbert, which belongs to the greater region of Düsseldorf.
Hence, we replace Velbert with Düsseldorf (Figure 14.2).
Umlauts like ü might lead to trouble further on, for example when determining the bounding box of a metropolitan area with `opq()` (see below), which is why we avoid them.
```r
pop_agg = aggregate(reclass$pop, fact = 20, fun = sum, na.rm = TRUE)
summary(pop_agg)
#>       pop         
#>  Min.   :    127  
#>  1st Qu.:  39886  
#>  Median :  66008  
#>  Mean   :  99503  
#>  3rd Qu.: 105696  
#>  Max.   :1204870  
#>  NA's   :447
pop_agg = pop_agg[pop_agg > 500000, drop = FALSE] 
metros = pop_agg |> 
  patches(directions = 8) |>
  as.polygons() |>
  st_as_sf()
metro_names = sf::st_centroid(metros, of_largest_polygon = TRUE) |>
  tmaptools::rev_geocode_OSM(as.data.frame = TRUE) |>
  select(city, town, state)
# smaller cities are returned in column town. To have all names in one column,
# we move the town name to the city column in case it is NA
metro_names = dplyr::mutate(metro_names, city = ifelse(is.na(city), town, city))
metro_names = metro_names$city |> 
  as.character() |>
  {\(x) ifelse(x == "Velbert", "Düsseldorf", x)}() |>
  {\(x) gsub("ü", "ue", x)}()
```

## 14.6 Points of interest

The osmdata package provides easy-to-use access to OSM data (see also Section 8.5).
Instead of downloading shops for the whole of Germany, we restrict the query to the defined metropolitan areas, reducing the computational load and providing shop locations only in areas of interest.
The subsequent code chunk does this using a number of functions including:

- `map()` (the tidyverse equivalent of `lapply()`), which iterates over the eight metropolitan names that subsequently define the bounding box in the OSM query function `opq()` (see Section 8.5)
- `add_osm_feature()` to specify OSM elements with a key value of shop (see wiki.openstreetmap.org for a list of common key:value pairs)
- `osmdata_sf()`, which converts the OSM data into spatial objects (of class sf)
- a `while()` loop, which tries up to two more times to download the data if the download failed the first time

Before running this code, please consider that it will download almost 2 GB of data.
To save time and resources, we have put the output, named `shops`, into spDataLarge.
To make it available in your environment, run `data("shops", package = "spDataLarge")`.

It is highly unlikely that there are no shops in any of the defined metropolitan areas.
The following condition simply checks if there is at least one shop in each region.
If not, we recommend trying to download the shops again for this/these specific region/s.

To make sure that each list element (an sf data frame) comes with the same columns, we only keep the `osm_id` and `shop` columns with the help of a `map_dfr` loop, which additionally combines all shops into one large sf object.

Note: the `shops` object provided by spDataLarge can be accessed as follows.

The only thing left to do is to convert the spatial point object into a raster (see Section 6.4).
The sf object, `shops`, is converted into a raster with the same parameters (dimensions, resolution, CRS) as the `reclass` object.
Importantly, the `length()` function is used here to count the number of shops in each cell.
The result of the subsequent code chunk is therefore an estimate of shop density (shops/km²).
`st_transform()` is used before `rasterize()` to ensure that the CRS of both inputs match.

As with the other raster layers (population, women, mean age, household size), the `poi` raster is reclassified into four classes (see Section 14.4).
Defining class intervals is an arbitrary undertaking to a certain degree.
One can use equal breaks, quantile breaks, fixed values or others.
Here, we choose the Fisher-Jenks natural breaks approach, which minimizes within-class variance, the result of which provides the input for the reclassification matrix.

```r
shops = purrr::map(metro_names, function(x) {
  message("Downloading shops of: ", x, "\n")
  # give the server a bit of time
  Sys.sleep(sample(seq(5, 10, 0.1), 1))
  query = osmdata::opq(x) |>
    osmdata::add_osm_feature(key = "shop")
  points = osmdata::osmdata_sf(query)
  # request the same data again if nothing has been downloaded
  iter = 2
  while (nrow(points$osm_points) == 0 && iter > 0) {
    points = osmdata_sf(query)
    iter = iter - 1
  }
  # return only the point features
  points$osm_points
})
# checking if we have downloaded shops for each metropolitan area
ind = purrr::map_dbl(shops, nrow) == 0
if (any(ind)) {
  message("There are/is still (a) metropolitan area/s without any features:\n",
          paste(metro_names[ind], collapse = ", "), "\nPlease fix it!")
}
# select only specific columns
shops = purrr::map_dfr(shops, select, osm_id, shop)
data("shops", package = "spDataLarge")
shops = sf::st_transform(shops, st_crs(reclass))
# create poi raster
poi = rasterize(x = shops, y = reclass, field = "osm_id", fun = "length")
# construct reclassification matrix
int = classInt::classIntervals(values(poi), n = 4, style = "fisher")
int = round(int$brks)
rcl_poi = matrix(c(int[1], rep(int[-c(1, length(int))], each = 2), 
                   int[length(int)] + 1), ncol = 2, byrow = TRUE)
rcl_poi = cbind(rcl_poi, 0:3) 
# reclassify
poi = classify(poi, rcl = rcl_poi, right = NA) 
names(poi) = "poi"
```

## 14.7 Identifying suitable locations

The only steps that remain are combining all the layers: we add `poi` to the `reclass` raster stack and remove the population layer from it.
The reasoning for the latter is twofold.
First, we have already delineated metropolitan areas, that is, areas where the population density is above average compared to the rest of Germany.
Second, though it is advantageous to have many potential customers within a specific catchment area, the sheer number alone might not actually represent the desired target group.
For instance, residential tower blocks are areas with a high population density but not necessarily with a high purchasing power for expensive cycle components.

As is common in data science projects, data retrieval and 'tidying' have consumed much of the overall workload so far.
With clean data, the final step — calculating a final score by summing all raster layers — can be accomplished in a single line of code.
For instance, a score greater than 9 might be a suitable threshold indicating raster cells where a bike shop could be placed (Figure 14.3; see also code/14-location-figures.R).

FIGURE 14.3: Suitable areas (i.e., raster cells with a score > 9) in accordance with our hypothetical survey for bike stores in Berlin.

```r
# remove population raster and add poi raster
reclass = reclass[[names(reclass) != "pop"]] |>
  c(poi)
# calculate the total score
result = sum(reclass)
```
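The sum above weights all layers equallyly spread; a hedged sketch of an unequally weighted alternative is shown below. The weights are hypothetical illustrations, not values from the survey, and the layer arithmetic assumes the `reclass` object after `pop` has been removed and `poi` added:

```r
# Hypothetical weights emphasizing shop density over the demographic layers;
# these values are illustrative assumptions, not part of the survey
result_weighted = 3 * reclass$poi + 2 * reclass$women +
  1 * reclass$mean_age + 1 * reclass$hh_size
```

Because SpatRaster layers support ordinary arithmetic, any linear combination of the reclassified layers can be scored this way before choosing a threshold.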
## 14.8 Discussion and next steps

The presented approach is a typical example of the normative usage of a GIS (Longley 2015).
We combined survey data with expert-based knowledge and assumptions (the definition of metropolitan areas, the defined class intervals, the definition of a final score threshold).
This approach is less suitable for scientific research than for applied analysis, but it provides an evidence-based indication of areas suitable for bike shops that should be compared with other sources of information.
A number of changes to the approach could improve the analysis:

- We used equal weights when calculating the final scores, but other factors, such as the household size, could be as important as the portion of women or the mean age
- We used all points of interest, while only those related to bike shops, such as do-it-yourself, hardware, bicycle, fishing, hunting, motorcycle, and outdoor sports shops (see the range of shop values available on the OSM Wiki), may have yielded more refined results
- Data at a higher resolution may improve the output (see exercises)
- We used only a limited set of variables; data from other sources, including the INSPIRE geoportal and data on cycle paths from OpenStreetMap, may enrich the analysis (see also Section 8.5)
- Interactions remained unconsidered, such as a possible relationship between the portion of men and single households

In short, the analysis could be extended in multiple directions.
Nevertheless, it should have given you a first impression and understanding of how to obtain and deal with spatial data in R within a geomarketing context.

Finally, we have to point out that the presented analysis would be merely the first step of finding suitable locations.
So far, we have identified areas, 1 by 1 km in size, representing potentially suitable locations for a bike shop in accordance with our survey.
Subsequent steps in the analysis could be taken:

- Find an optimal location based on the number of inhabitants within a specific catchment area. For example, the shop should be reachable for as many people as possible within 15 minutes of traveling bike distance (catchment area routing). Thereby, we should account for the fact that the further away people are from the shop, the more unlikely it becomes that they actually visit it (distance decay function)
- It would also be a good idea to take account of competitors. That is, if there is already a bike shop in the vicinity of the chosen location, possible customers (or sales potential) should be distributed between the competitors (Huff 1963; Wieland 2017)
- We need to find suitable and affordable real estate, e.g., in terms of accessibility, availability of parking spots, desired frequency of passers-by, having big windows, etc.

## 14.9 Exercises
E1. Download the csv file containing inhabitant information for a 100 m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).
Please note that the unzipped file has a size of 1.23 GB.
To read it into R, you can use `readr::read_csv`.
This takes 30 seconds on a machine with 16 GB RAM.
`data.table::fread()` might be even faster, and returns an object of class `data.table()`.
Use `dplyr::as_tibble()` to convert it into a tibble.
Build an inhabitant raster, aggregate it to a cell resolution of 1 km, and compare the difference with the inhabitant raster (`inh`) we have created using class mean values.

E2. Suppose our bike shop predominantly sold electric bikes to older people.
Change the age raster accordingly, repeat the remaining analyses and compare the changes with our original result.

# 15 Ecology

## Prerequisites

This chapter assumes you have a strong grasp of geographic data analysis and processing, covered in Chapters 2 to 5.
The chapter makes use of bridges to GIS software, and spatial cross-validation, covered in Chapters 10 and 12 respectively.

The chapter uses the following packages:

```r
library(sf)
library(terra)
library(dplyr)
library(data.table)        # fast data frame manipulation (used by mlr3)
library(mlr3)              # machine learning (see Chapter 12)
library(mlr3spatiotempcv)  # spatiotemporal resampling 
library(mlr3tuning)        # hyperparameter tuning package
library(mlr3learners)      # interface to most important machine learning pkgs
library(paradox)           # defining hyperparameter spaces
library(ranger)            # random forest package
library(qgisprocess)       # bridge to QGIS (Chapter 10)
library(tree)              # decision tree package
library(vegan)             # community ecology package
```

## 15.1 Introduction

In this chapter we will model the floristic gradient of fog oases to reveal distinctive vegetation belts that are clearly controlled by water availability.
The case study provides an opportunity to bring together and extend concepts presented in previous chapters to enhance your skills at using R for geocomputation.

Fog oases, locally called lomas, are vegetation formations found on mountains along the coastal deserts of Peru and Chile.
Similar ecosystems can be found elsewhere, including in the deserts of Namibia and along the coasts of Yemen and Oman (Galletti, Turner, and Myint 2016).
Despite the arid conditions and low levels of precipitation of around 30-50 mm per year on average, fog deposition increases the amount of water available to plants during the austral winter, resulting in green southern-facing mountain slopes along the coastal strip of Peru.
The fog, which develops below the temperature inversion caused by the cold Humboldt current in the austral winter, provides the name for this habitat.
Every few years, the El Niño phenomenon brings torrential rainfall to this sun-baked environment, providing tree seedlings a chance to develop roots long enough to survive the following arid conditions (Dillon, Nakazawa, and Leiva 2003).

Unfortunately, fog oases are heavily endangered, primarily due to agriculture and anthropogenic climate change.
Evidence on the composition and spatial distribution of the native flora can support efforts to protect the remaining fragments of fog oases (Muenchow, Bräuning, et al. 2013; Muenchow, Hauenstein, et al. 2013).

In this chapter you will analyze the composition and the spatial distribution of vascular plants (here referring mostly to flowering plants) on the southern slope of Mt. Mongón, a lomas mountain near Casma on the central northern coast of Peru (Figure 15.1).
During a field study to Mt. Mongón, all vascular plants living in 100 randomly sampled 4x4 m² plots were recorded in the austral winter of 2011 (Muenchow, Bräuning, et al. 2013).
The sampling coincided with a strong La Niña event that year, as shown in data published by the National Oceanic and Atmospheric Administration (NOAA).
This led to even higher levels of aridity than usual in the coastal desert, and increased fog activity on the southern slopes of Peruvian lomas mountains.

FIGURE 15.1: The Mt. Mongón study area, from Muenchow, Schratz, and Brenning (2017).

This chapter also demonstrates how to apply techniques covered in previous chapters to an important applied field: ecology.
Specifically, we will:

- Load in needed data and compute environmental predictors (Section 15.2)
- Extract the main floristic gradient from our species composition matrix with the help of a dimension-reducing technique (ordinations; Section 15.3)
- Model the first ordination axis, i.e., the floristic gradient, as a function of environmental predictors such as altitude, slope, catchment area and NDVI (Section 15.4). For this, we will make use of a random forest model — a very popular machine learning algorithm (Breiman 2001). To guarantee an optimal prediction, it is advisable to tune the hyperparameters beforehand with the help of spatial cross-validation (see Section 12.5.2)
- Make a spatial distribution map of the floristic composition anywhere in the study area (Section 15.4.2)

## 15.2 Data and data preparation

All the data needed for the subsequent analyses are available via the spDataLarge package.

`study_area` is a polygon representing the outline of the study area, and `random_points` is an sf object containing the 100 randomly chosen sites.
`comm` is a community matrix of the wide data format (Wickham 2014) where the rows represent the sites visited in the field and the columns the observed species.
The values represent species cover per site, recorded as the area covered by a species in proportion to the site area (%; please note that one site can have >100% due to overlapping cover between individual plants).
The rownames of `comm` correspond to the `id` column of `random_points`.
`dem` is the digital elevation model (DEM) for the study area, and `ndvi` is the Normalized Difference Vegetation Index (NDVI) computed from the red and near-infrared channels of a Landsat scene (see Section 4.3.3 and `?spDataLarge::ndvi.tif`).
Visualizing the data helps to get familiar with it, as shown in Figure 15.2, where the `dem` is overplotted by the `random_points` and the `study_area`.

FIGURE 15.2: Study mask (polygon), location of the sampling sites (black points) and DEM in the background.

The next step is to compute variables which we will need not only for the modeling and predictive mapping (see Section 15.4.2) but also for aligning the non-metric multidimensional scaling (NMDS) axes with the main gradients in the study area, altitude and humidity, respectively (see Section 15.3).

Specifically, we will compute catchment slope and catchment area from a digital elevation model using R-GIS bridges (see Chapter 10).
Curvatures might also represent valuable predictors, and in the exercise section you can find out how they would impact the modeling result.

To compute catchment area and catchment slope, we can make use of the `sagang:sagawetnessindex` function.
`qgis_show_help()` returns all function parameters and default values of a specific geoalgorithm.
Here, we present only a selection of the complete output.
Subsequently, we can specify the needed parameters using R named arguments (see Section 10.2).
Remember that we can use either a path to a file on disk or a SpatRaster living in R's global environment to specify the input raster DEM (see Section 10.2).
Specifying 1 as the `SLOPE_TYPE` makes sure that the algorithm will return the catchment slope.
The resulting rasters are saved to temporary files with an .sdat extension, a native SAGA raster format.
The call returns a list named `ep` containing the paths to the computed output rasters.
Let us read in the catchment area as well as the catchment slope into a multilayer SpatRaster object (see Section 2.3.4).
Additionally, we will add two more raster objects to it, namely `dem` and `ndvi`.
The catchment area values are highly skewed to the right (`hist(ep$carea)`); a log10-transformation makes the distribution more normal.
For the convenience of the reader, we have added `ep` to spDataLarge.
Finally, we can extract the terrain attributes to our field observations (see also Section 6.3).

```r
data("study_area", "random_points", "comm", package = "spDataLarge")
dem = rast(system.file("raster/dem.tif", package = "spDataLarge"))
ndvi = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))
# sites 35 to 40 and corresponding occurrences of the first five species in the
# community matrix
comm[35:40, 1:5]
#>    Alon_meri Alst_line Alte_hali Alte_porr Anth_eccr
#> 35         0         0         0       0.0     1.000
#> 36         0         0         1       0.0     0.500
#> 37         0         0         0       0.0     0.125
#> 38         0         0         0       0.0     3.000
#> 39         0         0         0       0.0     2.000
#> 40         0         0         0       0.2     0.125
# if not already done, enable the saga next generation plugin
qgisprocess::qgis_enable_plugins("processing_saga_nextgen")
```
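Before bridging to SAGA, note that simple local terrain attributes can also be derived directly in R. A minimal sketch with terra's `terrain()` function follows; slope and aspect here are local metrics, complementing rather than replacing the catchment metrics computed with SAGA:

```r
# Local slope and aspect from the DEM, no GIS bridge needed; these are
# per-cell metrics, not SAGA's catchment slope/area
local_attrs = terra::terrain(dem, v = c("slope", "aspect"), unit = "degrees")
```

This can be a quick sanity check on the DEM before running the heavier SAGA geoalgorithm.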
help\nqgisprocess::qgis_show_help(\"sagang:sagawetnessindex\")\n#> Saga wetness index (sagang:sagawetnessindex)\n#> ...\n#> ----------------\n#> Arguments\n#> ----------------\n#> \n#> DEM: Elevation\n#> Argument type: raster\n#> Acceptable values:\n#> - Path to a raster layer\n#> ...\n#> SLOPE_TYPE: Type of Slope\n#> Argument type: enum\n#> Available values:\n#> - 0: [0] local slope\n#> - 1: [1] catchment slope\n#> ...\n#> AREA: Catchment area\n#> Argument type: rasterDestination\n#> Acceptable values:\n#> - Path for new raster layer\n#>... \n#> ----------------\n#> Outputs\n#> ----------------\n#> \n#> AREA: \n#> Catchment area\n#> SLOPE: \n#> Catchment slope\n#> ...\n# environmental predictors: catchment slope and catchment area\nep = qgisprocess::qgis_run_algorithm(\n alg = \"sagang:sagawetnessindex\",\n DEM = dem,\n SLOPE_TYPE = 1, \n SLOPE = tempfile(fileext = \".sdat\"),\n AREA = tempfile(fileext = \".sdat\"),\n .quiet = TRUE)\n# read in catchment area and catchment slope\nep = ep[c(\"AREA\", \"SLOPE\")] |>\n unlist() |>\n rast()\nnames(ep) = c(\"carea\", \"cslope\") # assign better names \norigin(ep) = origin(dem) # make sure rasters have the same origin\nep = c(dem, ndvi, ep) # add dem and ndvi to the multilayer SpatRaster object\nep$carea = log10(ep$carea)\nep = rast(system.file(\"raster/ep.tif\", package = \"spDataLarge\"))\n# terra::extract adds automatically a for our purposes unnecessary ID column\nep_rp = terra::extract(ep, random_points, ID = FALSE)\nrandom_points = cbind(random_points, ep_rp)"},{"path":"eco.html","id":"nmds","chapter":"15 Ecology","heading":"15.3 Reducing dimensionality","text":"Ordinations popular tool vegetation science extract main information, frequently corresponding ecological gradients, large species-plot matrices mostly filled 0s.\nHowever, also used remote sensing, soil sciences, geomarketing many fields.\nunfamiliar ordination techniques need refresher, look Michael W. 
Palmer’s web page for a short introduction to popular ordination techniques in ecology, and at Borcard, Gillet, and Legendre (2011) for a deeper look at how to apply these techniques in R.\nvegan’s package documentation is also a helpful resource (vignette(package = \"vegan\")).\nPrincipal component analysis (PCA) is probably the most famous ordination technique.\nIt is a great tool to reduce dimensionality if one can expect linear relationships between variables, and if the joint absence of a variable in two plots (observations) can be considered a similarity.\nThis is barely the case with vegetation data.\nFor one, the presence of a plant often follows a unimodal, i.e. non-linear, relationship along a gradient (e.g., humidity, temperature or salinity) with a peak at the most favorable conditions and declining ends towards the unfavorable conditions.\nSecondly, the joint absence of a species in two plots is hardly an indication of similarity.\nSuppose a plant species was absent from the driest (e.g., an extreme desert) and the most moist locations (e.g., a tree savanna) of our sampling.\nThen we really should refrain from counting this as a similarity because it is very likely that the only thing these two completely different environmental settings have in common in terms of floristic composition is the shared absence of species (except for rare ubiquitous species).\nNon-metric multidimensional scaling (NMDS) is one popular dimension-reducing technique used in ecology (von Wehrden et al. 2009).\nNMDS reduces the rank-based differences between the distances between objects in the original matrix and the distances between the ordinated objects.\nThe difference is expressed as stress.\nThe lower the stress value, the better the ordination, i.e., the low-dimensional representation of the original matrix.\nStress values lower than 10 represent an excellent fit, stress values of around 15 are still good, and values greater than 20 represent a poor fit (McCune, Grace, and Urban 2002).\nIn R, metaMDS() of the vegan package can execute a NMDS.\nAs input, it expects a community matrix with the sites as rows and the species as columns.\nOften ordinations using presence-absence data yield better results (in terms of explained variance), though the prize is, of course, a less informative input matrix (see also Exercises).\ndecostand() converts numerical observations into presences and absences, with 1 indicating the occurrence of a species and 0 the absence of a species.\nOrdination techniques such as NMDS require at least one observation per site.\nHence, we need to dismiss all sites in which no species were found.\nThe resulting matrix serves as input for the NMDS.\nk specifies the number of output axes, here, set to 4.\nNMDS is an iterative procedure trying to make the ordinated space more similar to the input matrix in each step.\nTo make sure that the algorithm converges, we set the number of steps to 500 using the try parameter.\nA stress value of 9 represents a very good result, which means that the reduced ordination space represents the large majority of the variance of the input matrix.\nOverall, NMDS puts objects that are more similar (in terms of species composition) closer together in the ordination space.\nHowever, as opposed to most other ordination techniques, the axes are arbitrary and not necessarily ordered by importance (Borcard, Gillet, and Legendre 2011).\nHowever, we already know that humidity represents the main gradient in the study area (Muenchow, Bräuning, et al. 2013; Muenchow, Schratz, and Brenning 2017).\nSince humidity is highly correlated with elevation, we rotate the NMDS axes in accordance with elevation (see also ?MDSrotate for more details on rotating NMDS axes).\nPlotting the result reveals that the first axis is, as intended, clearly associated with altitude (Figure 15.3).\nFIGURE 15.3: Plotting the first NMDS axis against altitude.\nThe scores of the first NMDS axis represent the different vegetation formations, i.e., the floristic gradient, appearing along the slope of Mt. Mongón.\nTo spatially visualize them, we can model the NMDS scores with the previously created predictors (Section 15.2), and use the resulting model for predictive mapping (see next section).","code":"\n# presence-absence matrix\npa = vegan::decostand(comm, \"pa\") # 100 rows (sites), 69 columns (species)\n# keep only sites in which at least one species was found\npa = pa[rowSums(pa) != 0, ] # 84 rows, 69 columns\nset.seed(25072018)\nnmds = vegan::metaMDS(comm = pa, k = 4, try = 500)\nnmds$stress\n#> ...\n#> Run 498 stress 0.08834745 \n#> ... Procrustes: rmse 0.004100446 max resid 0.03041186 \n#> Run 499 stress 0.08874805 \n#> ... Procrustes: rmse 0.01822361 max resid 0.08054538 \n#> Run 500 stress 0.08863627 \n#> ... Procrustes: rmse 0.01421176 max resid 0.04985418 \n#> *** Solution reached\n#> 0.08831395\nelev = dplyr::filter(random_points, id %in% rownames(pa)) |> \n dplyr::pull(dem)\n# rotating NMDS in accordance with altitude (proxy for humidity)\nrotnmds = vegan::MDSrotate(nmds, elev)\n# extracting the first two axes\nsc = vegan::scores(rotnmds, choices = 1:2, display = \"sites\")\n# plotting the first axis against altitude\nplot(y = sc[, 1], x = elev, xlab = \"elevation in m\", \n ylab = \"First NMDS axis\", cex.lab = 0.8, cex.axis = 0.8)"},{"path":"eco.html","id":"modeling-the-floristic-gradient","chapter":"15 Ecology","heading":"15.4 Modeling the floristic gradient","text":"To predict the floristic gradient spatially, we use a random forest model.\nRandom forest models are frequently applied in environmental and ecological modeling, and often provide the best results in terms of predictive performance (Hengl et al. 2018; Schratz et al. 2019).\nHere, we shortly introduce decision trees and bagging, since they form the basis of random forests.\nWe refer the reader to James et al.
(2013) for a more detailed description of random forests and related techniques.\nTo introduce decision trees by example, we first construct a response-predictor matrix by joining the rotated NMDS scores to the field observations (random_points).\nWe will also use the resulting data frame for the mlr3 modeling later on.\nDecision trees split the predictor space into a number of regions.\nTo illustrate this, we apply a decision tree to our data using the scores of the first NMDS axis as the response (sc) and altitude (dem) as the only predictor.\nFIGURE 15.4: Simple example of a decision tree with three internal nodes and four terminal nodes.\nThe resulting tree consists of three internal nodes and four terminal nodes (Figure 15.4).\nThe first internal node at the top of the tree assigns all observations which are below 328.5 m to the left and all other observations to the right branch.\nThe observations falling into the left branch have a mean NMDS score of -1.198.\nOverall, we can interpret the tree as follows: the higher the elevation, the higher the NMDS score becomes.\nThis means that the simple decision tree has already revealed four distinct floristic assemblages.\nFor a more in-depth interpretation please refer to the 15.4.2 section.\nDecision trees have a tendency to overfit, that is, they mirror too closely the input data including its noise which in turn leads to bad predictive performances (Section 12.4; James et al. (2013)).\nBootstrap aggregation (bagging) is an ensemble technique that can help to overcome this problem.\nEnsemble techniques simply combine the predictions of multiple models.\nThus, bagging takes repeated samples of the same input data and averages the predictions.\nThis reduces the variance and overfitting, with the result of a much better predictive accuracy compared to decision trees.\nFinally, random forests extend and improve bagging by decorrelating trees, which is desirable since averaging the predictions of highly correlated trees shows a higher variance and thus lower reliability than averaging predictions of decorrelated trees (James et al. 2013).\nTo achieve this, random forests use bagging, but in contrast to traditional bagging where each tree is allowed to use all available predictors, random forests only use a random sample of all available predictors.","code":"\n# construct response-predictor matrix\n# id- and response variable\nrp = data.frame(id = as.numeric(rownames(sc)), sc = sc[, 1])\n# join the predictors (dem, ndvi and terrain attributes)\nrp = inner_join(random_points, rp, by = \"id\")\ntree_mo = tree::tree(sc ~ dem, data = rp)\nplot(tree_mo)\ntext(tree_mo, pretty = 0)"},{"path":"eco.html","id":"mlr3-building-blocks","chapter":"15 Ecology","heading":"15.4.1 mlr3 building blocks","text":"The code in this section largely follows the steps we have introduced in Section 12.5.2.\nThe differences are the following:\nThe response variable is numeric, hence a regression task will replace the classification task of Section 12.5.2\nInstead of the AUROC which can only be used for categorical response variables, we will use the root mean squared error (RMSE) as performance measure\nWe use a random forest model instead of a support vector machine, which naturally goes along with different hyperparameters\nWe are leaving the assessment of a bias-reduced performance measure as an exercise to the reader (see Exercises).\nInstead we show how to tune hyperparameters for (spatial) predictions\nRemember that 125,500 models were necessary to retrieve bias-reduced performance estimates when using 100-repeated 5-fold spatial cross-validation and a random search of 50 iterations in Section 12.5.2.\nIn the hyperparameter tuning level, we found the best hyperparameter combination which in turn was used in the outer performance level for predicting the test data of a specific spatial partition (see also Figure 12.6).\nThis was done for five spatial partitions, and repeated 100 times, yielding in total 500 optimal hyperparameter combinations.\nWhich one should we use for making spatial distribution maps?\nThe answer is simple: none at all.\nRemember, the tuning was done to retrieve a bias-reduced performance estimate, not to do the best possible spatial prediction.\nFor the latter, one estimates the best hyperparameter combination from the complete dataset.\nThis means, the inner hyperparameter tuning level is no longer needed, which makes perfect sense since we are applying our model to new data (unvisited field observations) for which the true outcomes are
unavailable, hence testing is impossible in any case.\nTherefore, we tune the hyperparameters for a good spatial prediction on the complete dataset via a 5-fold spatial CV with one repetition.\nHaving already constructed the input variables (rp), we are all set for specifying the mlr3 building blocks (task, learner, and resampling).\nFor specifying a spatial task, we use again the mlr3spatiotempcv package (Schratz et al. 2021 & Section 12.5), and since our response (sc) is numeric, we use a regression task.\nUsing an sf object as the backend automatically provides the geometry information needed for the spatial partitioning later on.\nAdditionally, we got rid of the columns id and spri since these variables should not be used as predictors in the modeling.\nNext, we go on to construct a random forest learner from the ranger package (Wright and Ziegler 2017).\nAs opposed to, for example, support vector machines (see Section 12.5.2), random forests often already show good performances when used with the default values of their hyperparameters (which may be one reason for their popularity).\nStill, tuning often moderately improves model results, and is thus worth the effort (Probst, Wright, and Boulesteix 2018).\nIn random forests, the hyperparameters mtry, min.node.size and sample.fraction determine the degree of randomness, and should be tuned (Probst, Wright, and Boulesteix 2018).\nmtry indicates how many predictor variables should be used in each tree.\nIf all predictors are used, this in fact corresponds to bagging (see the beginning of Section 15.4).\nThe sample.fraction parameter specifies the fraction of observations to be used in each tree.\nSmaller fractions lead to greater diversity, and thus less correlated trees, which is often desirable (see above).\nThe min.node.size parameter indicates the number of observations a terminal node should at least have (see also Figure 15.4).\nNaturally, trees and computing time become larger, the lower the min.node.size.\nHyperparameter combinations will be selected randomly but should fall inside specific tuning limits (created with paradox::ps()).\nmtry should range between 1 and the number of predictors (4), sample.fraction should range between 0.2 and 0.9 and min.node.size should range between 1 and 10 (Probst, Wright, and Boulesteix 2018).\nHaving defined the search space, we are all set for specifying our tuning via the AutoTuner() function.\nSince we deal with geographic data, we will again make use of spatial cross-validation to tune the hyperparameters (see Sections 12.4 and 12.5).\nSpecifically, we will use a five-fold spatial partitioning with only one repetition (rsmp()).\nIn each of these spatial partitions, we run 50 models (trm()) while using randomly selected hyperparameter configurations (tnr()) within predefined limits (search_space) to find the optimal hyperparameter combination (see also Section 12.5.2 and https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-autotuner, Becker et al. 2022).\nThe performance measure is the root mean squared error (RMSE).\nCalling the train()-method of the AutoTuner-object finally runs the hyperparameter tuning, and will find the optimal hyperparameter combination for the specified parameters.","code":"\n# create task\ntask = mlr3spatiotempcv::as_task_regr_st(\n select(rp, -id, -spri),\n target = \"sc\",\n id = \"mongon\"\n)\nlrn_rf = lrn(\"regr.ranger\", predict_type = \"response\")\n# specifying the search space\nsearch_space = paradox::ps(\n mtry = paradox::p_int(lower = 1, upper = ncol(task$data()) - 1),\n sample.fraction = paradox::p_dbl(lower = 0.2, upper = 0.9),\n min.node.size = paradox::p_int(lower = 1, upper = 10)\n)\nautotuner_rf = mlr3tuning::auto_tuner(\n learner = lrn_rf,\n resampling = mlr3::rsmp(\"spcv_coords\", folds = 5), # spatial partitioning\n measure = mlr3::msr(\"regr.rmse\"), # performance measure\n terminator = mlr3tuning::trm(\"evals\", n_evals = 50), # specify 50 iterations\n search_space = search_space, # predefined hyperparameter search space\n tuner = mlr3tuning::tnr(\"random_search\") # specify random search\n)\n# hyperparameter tuning\nset.seed(24092024)\nautotuner_rf$train(task)\nautotuner_rf$tuning_result\n#> mtry sample.fraction min.node.size learner_param_vals x_domain regr.rmse\n#> \n#> 1: 4 0.784 10 0.382"},{"path":"eco.html","id":"predictive-mapping","chapter":"15 Ecology","heading":"15.4.2 Predictive mapping","text":"The tuned hyperparameters can now be used for the prediction.\nTo do so, we only need to run the predict method of our fitted AutoTuner object.\nThe predict method will apply the model to all observations used in the modeling.\nGiven a multilayer SpatRaster containing rasters named as the predictors used in the modeling, terra::predict() will
also make spatial distribution maps, i.e., predict to new data.\nFIGURE 15.5: Predictive mapping of the floristic gradient clearly revealing distinct vegetation belts.\nIn case terra::predict() does not support a model algorithm, you can still make the predictions manually.\nThe predictive mapping clearly reveals distinct vegetation belts (Figure 15.5).\nPlease refer to Muenchow, Hauenstein, et al. (2013) for a detailed description of vegetation belts on lomas mountains.\nThe blue color tones represent the so-called Tillandsia-belt.\nTillandsia is a highly adapted genus especially found in high quantities at the sandy and quite desertic foot of lomas mountains.\nThe yellow color tones refer to a herbaceous vegetation belt with a much higher plant cover compared to the Tillandsia-belt.\nThe orange colors represent the bromeliad belt, which features the highest species richness and plant cover.\nIt can be found directly beneath the temperature inversion (ca. 750-850 m asl) where humidity due to fog is highest.\nWater availability naturally decreases above the temperature inversion, and the landscape becomes desertic again with only a few succulent species (succulent belt; red colors).\nInterestingly, the spatial prediction clearly reveals that the bromeliad belt is interrupted, a very interesting finding we would have not detected without the predictive mapping.","code":"\n# predicting using the best hyperparameter combination\nautotuner_rf$predict(task)\n#> for 84 observations:\n#> row_ids truth response\n#> 1 -1.084 -1.176\n#> 2 -0.975 -1.176\n#> 3 -0.912 -1.168\n#> --- --- ---\n#> 82 0.814 0.594\n#> 83 0.814 0.746\n#> 84 0.808 0.807\npred = terra::predict(ep, model = autotuner_rf, fun = predict)\nnewdata = as.data.frame(as.matrix(ep))\ncolSums(is.na(newdata)) # 0 NAs\n# but assuming there were 0s results in a more generic approach\nind = rowSums(is.na(newdata)) == 0\ntmp = autotuner_rf$predict_newdata(newdata = newdata[ind, ], task = task)\nnewdata[ind, \"pred\"] = data.table::as.data.table(tmp)[[\"response\"]]\npred_2 = ep$dem\n# now fill the raster with the predicted values\npred_2[] = newdata$pred\n# check if terra and our manual prediction is the same\nall(values(pred - pred_2) == 0)"},{"path":"eco.html","id":"conclusions-1","chapter":"15 Ecology","heading":"15.5 Conclusions","text":"In this chapter we have ordinated the community matrix of the lomas Mt. Mongón with the help of a NMDS (Section 15.3).\nThe first axis, representing the main floristic gradient in the study area, was modeled as a function of environmental predictors which were partly derived through R-GIS bridges (Section 15.2).\nThe mlr3 package provided the building blocks to spatially tune the hyperparameters mtry, sample.fraction and min.node.size (Section 15.4.1).\nThe tuned hyperparameters served as input for the final model which in turn was applied to the environmental predictors for a spatial representation of the floristic gradient (Section 15.4.2).\nThe result demonstrates spatially the astounding biodiversity in the middle of the desert.\nSince lomas mountains are heavily endangered, the prediction map can serve as a basis for informed decision-making on delineating protection zones, and making the local population aware of the uniqueness found in their immediate neighborhood.\nIn terms of methodology, a few additional points could be addressed:\nIt would be interesting to also model the second ordination axis, and to subsequently find an innovative way of visualizing jointly the modeled scores of the two axes in one prediction map\nIf we were interested in interpreting the model in an ecologically meaningful way, we should probably use (semi-)parametric models (Muenchow, Bräuning, et al. 2013; A. Zuur et al. 2009; A. F. Zuur et al.
2017).\nHowever, there are at least some approaches that help to interpret machine learning models such as random forests (see, e.g., https://mlr-org.com/posts/2018-04-30-interpretable-machine-learning-iml--mlr/)\nA sequential model-based optimization (SMBO) might be preferable to the random search for hyperparameter optimization used in this chapter (Probst, Wright, and Boulesteix 2018)\nFinally, please note that random forest and other machine learning models are frequently used in a setting with lots of observations and many predictors, much more than used in this chapter, and where it is unclear which variables and variable interactions contribute to explaining the response.\nAdditionally, the relationships might be highly non-linear.\nIn our use case, the relationship between response and predictors is pretty clear, there is only a slight amount of non-linearity and the number of observations and predictors is low.\nHence, it might be worth trying a linear model.\nA linear model is much easier to explain and understand than a random forest model, and therefore to be preferred (law of parsimony); additionally it is computationally less demanding (see Exercises).\nIf the linear model cannot cope with the degree of non-linearity present in the data, one could also try a generalized additive model (GAM).\nThe point here is that the toolbox of a data scientist consists of more than one tool, and it is your responsibility to select the tool best suited for the task and purpose at hand.\nHere, we wanted to introduce the reader to random forest modeling and how to use the corresponding results for predictive mapping purposes.\nFor this purpose, a well-studied dataset with known relationships between response and predictors is appropriate.\nHowever, this does not imply that the random forest model has returned the best result in terms of predictive performance.","code":""},{"path":"eco.html","id":"exercises-11","chapter":"15 Ecology","heading":"15.6 Exercises","text":"The solutions assume the following packages are attached (other packages will be attached when needed):\nE1. Run a NMDS using the percentage data of the community matrix.\nReport the stress value and compare it to the stress value as retrieved from the NMDS using presence-absence data.\nWhat might explain the observed difference?\nE2. Compute all the predictor rasters we have used in the chapter (catchment slope, catchment area), and put them into a SpatRaster-object.\nAdd dem and ndvi to it.\nNext, compute profile and tangential curvature and add them as additional predictor rasters (hint: grass7:r.slope.aspect).\nFinally, construct a response-predictor matrix.\nThe scores of the first NMDS axis (which were the result when using the presence-absence community matrix) rotated in accordance with elevation represent the response variable, and should be joined to random_points (use an inner join).\nTo complete the response-predictor matrix, extract the values of the environmental predictor raster object to random_points.\nE3. Retrieve the bias-reduced RMSE of a random forest and a linear model using spatial cross-validation.\nThe random forest modeling should include the estimation of optimal hyperparameter combinations (random search with 50 iterations) in an inner tuning loop.\nParallelize the tuning level.\nReport the mean RMSE and use a boxplot to visualize all retrieved RMSEs.\nPlease note that this exercise is best solved using the mlr3 functions benchmark_grid() and benchmark() (see https://mlr3book.mlr-org.com/perf-eval-cmp.html#benchmarking for more information).","code":""},{"path":"conclusion.html","id":"conclusion","chapter":"16 Conclusion","heading":"16 Conclusion","text":"","code":""},{"path":"conclusion.html","id":"introduction-10","chapter":"16 Conclusion","heading":"16.1 Introduction","text":"Like the introduction, this concluding chapter contains few code chunks.\nIts aim is to synthesize the contents of the book, with reference to recurring themes/concepts, and to inspire future directions of application and development.\nThis chapter has no prerequisites.\nHowever, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), tried more advanced approaches in Part II (Extensions), and considered how geocomputation can help you solve work or research problems, with reference to the chapters in Part III (Applications).\nThe chapter is organized as follows.\nSection 16.2 discusses the wide range of options for handling geographic data in R.\nChoice is a key feature of open source software; the section provides guidance on choosing between the various options.\nSection 16.3 describes gaps in the book’s contents and explains why some areas of research were deliberately omitted, while others were emphasized.\nNext, Section 16.4 provides
advice on how to ask good questions when you get stuck, and where to search for solutions online.\nSection 16.5 answers the following question: having read this book, where to go next?\nSection 16.6 returns to the wider issues raised in Chapter 1.\nIn it we consider geocomputation as part of a wider ‘open source approach’ that ensures methods are publicly accessible, reproducible and supported by collaborative communities.\nThis final section of the book also provides some pointers on how to get involved.","code":""},{"path":"conclusion.html","id":"package-choice","chapter":"16 Conclusion","heading":"16.2 Package choice","text":"A feature of R, and open source software in general, is that there are often multiple ways to achieve the same result.\nThe code chunk below illustrates this by using three functions, covered in Chapters 3 and 5, to combine the 16 regions of New Zealand into a single geometry:\nAlthough the classes, attributes and column names of the resulting objects nz_u1 to nz_u3 differ, their geometries are identical, as verified using the base R function identical().\nWhich to use?\nIt depends: while the former processes only the geometry data contained in nz so is faster, the other options performed attribute operations, which may be useful for subsequent steps.\nWhether to use the base R function aggregate() or the dplyr function summarise() is a matter of preference, with the latter being more readable for many.\nThe wider point is that there are often multiple options to choose from when working with geographic data in R, even within a single package.\nThe range of options grows further when more R packages are considered: you could achieve the same result using the older sp package, for example.\nHowever, based on our goal of providing good advice, we recommend using the more recent, more performant and future-proof sf package.\nThe same applies to all packages showcased in this book, although it can be helpful (when not distracting) to be aware of alternatives and being able to justify your choice of software.\nA common choice, for which there is no simple answer, is between tidyverse and base R for geocomputation.\nThe following code chunk, for example, shows tidyverse and base R ways to extract the Name column from the nz object, as described in Chapter 3:\nThis raises the question: which to use?\nThe answer is: it depends.\nEach approach has advantages: base R tends to be stable, well-known, and has minimal dependencies, which is why it is often preferred for software (package) development.\nThe tidyverse approach, on the other hand, is often preferred for interactive programming.\nChoosing between the two approaches is therefore a matter of preference and application.\nThe book covers commonly needed functions — the base R [ subsetting operator and the dplyr function select() demonstrated in the code chunk above — but there are many other functions for working with geographic data, in other packages, that have not been mentioned.\nChapter 1 mentions 20+ influential packages for working with geographic data, and only a handful of these are covered in the book.\nHundreds more packages are available for working with geographic data in R, and many more are developed each year.\nAs of 2024, there are more than 160 packages mentioned in the Spatial Task View and countless functions for geographic data analysis are developed each year.\nThe rate of evolution in R’s spatial ecosystem may be fast, but there are strategies to deal with the wide range of options.\nOur advice is to start by learning one approach in depth but to have a general understanding of the breadth of available options.\nThis advice applies equally to solving geographic problems with R as it does to other fields of knowledge and application.\nSection 16.5 covers developments in other languages.\nOf course, some packages perform better than others for the same task, in which case it’s important to know which to use.\nIn the book we have aimed to focus on packages that are future-proof (they will work long into the future), high performance (relative to other R packages), well maintained (with user and developer communities surrounding them) and complementary.\nBut there are still overlaps in the packages we have used, as illustrated by the diversity of packages for making maps, as highlighted in Chapter 9, for example.\nOverlapping functionality can be good.\nA new package with similar (but not identical) functionality compared to an existing package can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, all key benefits of doing geocomputation with open source software.\nIn this context, deciding which combination of sf, tidyverse, terra and other packages to use should be made with knowledge of the alternatives.\nThe sp ecosystem that sf superseded, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.\nAt the time of writing in 2024, 463 packages Depend on or Import sp, up slightly from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.\nThe equivalent numbers for sf are 69 in 2018 and 431 in 2024, highlighting that the package is future-proof and has a growing user base and developer community (Bivand 2021).\nAlthough best known for point pattern analysis, the spatstat package also supports raster and other vector geometries and provides powerful functionality for spatial statistics (Baddeley and Turner 2005).\nIt may
also be worth researching new alternatives that are under development if your needs are not met by established packages.","code":"\nlibrary(spData)\nnz_u1 = sf::st_union(nz)\nnz_u2 = aggregate(nz[\"Population\"], list(rep(1, nrow(nz))), sum)\nnz_u3 = dplyr::summarise(nz, t = sum(Population))\nidentical(nz_u1, nz_u2$geometry)\n#> [1] TRUE\nidentical(nz_u1, nz_u3$geom)\n#> [1] TRUE\nlibrary(dplyr) # attach a tidyverse package\nnz_name1 = nz[\"Name\"] # base R approach\nnz_name2 = nz |> # tidyverse approach\n select(Name)\nidentical(nz_name1$Name, nz_name2$Name) # check results\n#> [1] TRUE"},{"path":"conclusion.html","id":"gaps","chapter":"16 Conclusion","heading":"16.3 Gaps and overlaps","text":"Geocomputation is a big area, and there are inevitably gaps in this book.\nWe have been selective, deliberately highlighting certain topics, techniques and packages, while omitting others.\nWe have tried to emphasize topics that are most commonly needed in real-world applications such as geographic data operations, the basics of coordinate reference systems, read/write data operations and visualization techniques.\nThese topics and themes appear repeatedly, with the aim of building essential skills for geocomputation, and showing you how to go further, into more advanced topics and specific applications.\nWe have deliberately omitted some topics that are covered in-depth elsewhere.\nStatistical modeling of spatial data such as point pattern analysis, spatial interpolation (e.g., kriging) and spatial regression, for example, are mentioned in the context of machine learning in Chapter 12 but not covered in detail.\nThere are already excellent resources on these methods, including statistically orientated chapters in Pebesma and Bivand (2023c) and books on point pattern analysis (Baddeley, Rubak, and Turner 2015), Bayesian techniques applied to spatial data (Gómez-Rubio 2020; Moraga 2023), and books focused on particular applications such as health (Moraga 2019) and wildfire severity analysis (Wimberly 2023).\nOther topics which received limited attention were remote sensing and using R alongside (rather than as a bridge to) dedicated GIS software.\nThere are many resources on these topics, including a discussion of remote sensing in R, Wegmann, Leutner, and Dech (2016) and the GIS-related teaching materials available from Marburg University.\nWe focused on machine learning rather than spatial statistical inference in Chapters 12 and 15 because of the abundance of quality resources on the topic.\nThese resources include A. Zuur et al. (2009), A. F. Zuur et al. (2017) which focus on ecological use cases, and the freely available teaching material and code on Geostatistics & Open-source Statistical Computing hosted at css.cornell.edu/faculty/dgr2.\nR for Geographic Data Science provides an introduction to R for geographic data science and modeling.\nWe have also largely omitted geocomputation on ‘big data’, by which we mean datasets that do not fit on a high-spec laptop.\nThis decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications do fit on consumer hardware, with large high-resolution remote sensing datasets being a notable exception (see Section 10.8).\nIt is possible to get more RAM in your computer or to temporarily ‘rent’ compute power available on platforms such as GitHub Codespaces, which can be used to run the code in this book.\nFurthermore, learning to solve problems with small datasets is a prerequisite to solving problems with huge datasets; the emphasis in this book is on getting started, and the skills you learn here will be useful when you move on to bigger datasets.\nAnalysis of ‘big data’ often involves extracting a small amount of data from a database for a specific statistical analysis.\nSpatial databases, covered in Chapter 10, can help with the analysis of datasets that do not fit in memory.\n‘Earth observation cloud back-ends’ can be accessed from R with the openeo package (Section 10.8.2).\nIf you need to work with big geographic datasets, we also recommend exploring projects such as Apache Sedona and emerging file formats such as GeoParquet.","code":""},{"path":"conclusion.html","id":"questions","chapter":"16 Conclusion","heading":"16.4 Getting help","text":"Geocomputation is a large and challenging field, making issues and temporary blockers to work near inevitable.\nIn many cases you may just ‘get stuck’ at a particular point in your data analysis workflow, facing cryptic error messages that are hard to debug.\nOr you may get unexpected results with few clues about what is going on.\nThis section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches do not solve the problem, through the art of asking good questions.\nWhen you get stuck at a particular point, it is worth first taking a step back and working out which approach is most likely to solve the issue.\nTrying
the following steps — skipping steps already taken — provides a structured approach to problem-solving:\nDefine exactly what you are trying to achieve, starting from first principles (and often a sketch, as outlined below)\nDiagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)\nRead the documentation of the function that has been diagnosed as the ‘point of failure’ in the previous step. Simply understanding the required inputs to functions, and running the examples that are often provided at the bottom of help pages, can help to solve a surprisingly large proportion of issues (run the command ?terra::rast and scroll down to the examples that are worth reproducing when getting started with the function, for example)\nIf reading R’s built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you’re seeing. See a list of places to search for help below\nIf all the previous steps fail, and you cannot find a solution from your online searches, it may be time to compose a question with a reproducible example and post it in an appropriate place\nSteps 1 to 3 outlined above are fairly self-explanatory but, due to the vastness of the internet and the multitude of search options, it is worth considering effective search strategies before deciding to compose a question.","code":""},{"path":"conclusion.html","id":"searching-for-solutions-online","chapter":"16 Conclusion","heading":"16.4.1 Searching for solutions online","text":"Search engines are a logical place to start for many issues.\n‘Googling it’ can in some cases result in the discovery of blog posts, forum messages and other online content about the precise issue you’re having.\nSimply typing in a clear description of the problem/question is a valid approach here, but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset-specific).\nYou can also make online searches more effective by including additional detail:\nUse quotation marks to maximize the chances that ‘hits’ relate to the exact issue you’re having by reducing the number of results returned. For example, if you try and fail to save a GeoJSON file in a location that already exists, you could get an error containing the message “GDAL Error 6: DeleteLayer() not supported by this dataset”. The more specific search query \"GDAL Error 6\" sf is more likely to yield a solution than searching for GDAL Error 6 without quotation marks\nSet time restraints; for example, only returning content created within the last year can be useful when searching for help with an evolving package\nMake use of additional search engine features, for example restricting searches to content hosted on CRAN with site:r-project.org","code":""},{"path":"conclusion.html","id":"help","chapter":"16 Conclusion","heading":"16.4.2 Places to search for (and ask) for help","text":"In cases where online searches do not yield a solution, it is worth asking for help.\nThere are many forums where you can do this, including:\nR’s Special Interest Group on Geographic data email list (R-SIG-GEO)\nThe GIS Stackexchange website at gis.stackexchange.com\nThe large and general purpose programming Q&A site stackoverflow.com\nOnline forums associated with a particular entity, such as the Posit Community, the rOpenSci Discuss web forum and forums associated with particular software tools such as the Stan forum\nSoftware development platforms such as GitHub, which hosts issue trackers for the majority of R-spatial packages and also, increasingly, built-in discussion pages such as that created to encourage discussion (not just bug reporting) around the sfnetworks package (see luukvdmeer/sfnetworks/discussions)\nOnline chat rooms and forums associated with communities such as rOpenSci and the geocompx community (which has a Discord server where you can ask questions), of which this book is a part","code":""},{"path":"conclusion.html","id":"reprex","chapter":"16 Conclusion","heading":"16.4.3 Reproducible examples with reprex","text":"In terms of asking a good question, a clearly stated question supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).\nIt is also helpful, after showing the code that does not ‘work’ from the user’s perspective, to explain what you would like to see.\nA very useful tool for creating reproducible examples is the reprex package.\nTo highlight unexpected behavior, you can write completely reproducible code that demonstrates the issue and use the reprex() function to create a copy of your code that can be pasted into a forum or other online space.\nImagine you are trying to create a map of the world with blue sea and green land.\nYou could simply ask how to do this in one of the places outlined in the previous section.\nHowever, you will likely get a better response if you provide a reproducible
example tried far.\nfollowing code creates map world blue sea green land, land filled :post code forum, likely get specific useful response.\nexample, someone might respond following code, demonstrably solves problem, illustrated Figure 16.1:\nFIGURE 16.1: map world green land, illustrating question reproducible example (left) solution (right).\nExercise reader: copy code, run command reprex::reprex() (paste command reprex() function call) paste output forum online space.strength open source collaborative approaches geocomputation generate vast ever evolving body knowledge, book part.\nDemonstrating efforts solve problem, providing reproducible example problem, way contributing body knowledge.","code":"\nlibrary(sf)\nlibrary(spData)\nplot(st_geometry(world), col = \"green\")\nlibrary(sf)\nlibrary(spData)\n# use the bg argument to fill in the land\nplot(st_geometry(world), col = \"green\", bg = \"lightblue\")"},{"path":"conclusion.html","id":"defining-and-sketching-the-problem","chapter":"16 Conclusion","heading":"16.4.4 Defining and sketching the problem","text":"cases, may able find solution problem online, may able formulate question can answered search engine.\nbest starting point cases, developing new geocomputational methodology, may pen paper (equivalent digital sketching tools Excalidraw tldraw allow collaborative sketching rapid sharing ideas).\ncreative early stages methodological development work, software kind can slow thoughts direct away important abstract thoughts.\nFraming question mathematics also highly recommended, reference minimal example can sketch ‘’ versions numerically.\nskills problem warrants , describing approach algebraically can cases help develop effective implementations.","code":""},{"path":"conclusion.html","id":"next","chapter":"16 Conclusion","heading":"16.5 Where to go next?","text":"indicated Section 16.3, book covered fraction R’s geographic ecosystem, much discover.\nprogressed quickly, geographic data models Chapter 2, 
advanced applications Chapter 15.\nConsolidation skills learned, discovery new packages approaches handling geographic data, application methods new datasets domains suggested future directions.\nsection expands general advice suggesting specific ‘next steps’, highlighted bold .addition learning geographic methods applications R, example reference work cited previous section, deepening understanding R logical next step.\nR’s fundamental classes data.frame matrix foundation sf terra classes, studying improve understanding geographic data.\ncan done reference documents part R, can found command help.start() additional resources subject Wickham (2019) Chambers (2016).Another software-related direction future learning discovering geocomputation languages.\ngood reasons learning R language geocomputation, described Chapter 1, option.104\npossible study Geocomputation : Python, C++, JavaScript, Scala Rust equal depth.\nevolving geospatial capabilities.\nrasterio, example, Python package similar functionality terra package used book.\nSee Geocomputation Python, introduction geocomputation Python.Dozens geospatial libraries developed C++, including well-known libraries GDAL GEOS, less well-known libraries Orfeo Toolbox processing remote sensing (raster) data.\nTurf.js example potential geocomputation JavaScript.\nGeoTrellis provides functions working raster vector data Java-based language Scala.\nWhiteBoxTools provides example rapidly evolving command line GIS implemented Rust.\npackages/libraries/languages advantages geocomputation many discover, documented curated list open source geospatial resources Awesome-Geospatial.geocomputation software, however.\ncan recommend exploring learning new research topics methods academic theoretical perspectives.\nMany methods written yet implemented.\nLearning geographic methods potential applications can therefore rewarding, writing code.\nexample geographic methods increasingly implemented R sampling strategies scientific 
applications.\nnext step case read-relevant articles area Brus (2018), accompanied reproducible code tutorial content hosted github.com/DickBrus/TutorialSampling4DSM.","code":""},{"path":"conclusion.html","id":"benefit","chapter":"16 Conclusion","heading":"16.6 The open source approach","text":"technical book, makes sense next steps, outlined previous section, also technical.\nHowever, wider issues worth considering final section, returns definition geocomputation.\nOne elements term introduced Chapter 1 geographic methods positive impact.\ncourse, define measure ‘positive’ subjective, philosophical question beyond scope book.\nRegardless worldview, consideration impacts geocomputational work useful exercise:\npotential positive impacts can provide powerful motivation future learning , conversely, new methods can open-many possible fields application.\nconsiderations lead conclusion geocomputation part wider ‘open source approach’.Section 1.1 presented terms mean roughly thing geocomputation, including geographic data science (GDS) ‘GIScience’.\ncapture essence working geographic data, geocomputation advantages: concisely captures ‘computational’ way working geographic data advocated book — implemented code therefore encouraging reproducibility — builds desirable ingredients early definition (Openshaw Abrahart 2000):creative use geographic dataApplication real-world problemsBuilding ‘scientific’ toolsReproducibilityWe added final ingredient: reproducibility barely mentioned early work geocomputation, yet strong case can made vital component first two ingredients.Reproducibility:Encourages creativity shifting focus away basics (readily available shared code) toward applicationsDiscourages people ‘reinventing wheel’: need redo others done methods can used othersMakes research conducive real-world applications, enabling anyone sector apply one’s methods new areasIf reproducibility defining asset geocomputation (command line GIS), worth considering makes 
reproducible.\nbrings us ‘open source approach’, three main components:command line interface (CLI), encouraging scripts recording geographic work shared reproducedOpen source software, can inspected potentially improved anyone worldAn active user developer community, collaborates self-organizes build complementary modular toolsLike term geocomputation, open source approach technical entity.\ncommunity composed people interacting daily shared aims: produce high-performance tools, free commercial legal restrictions, accessible anyone use.\nopen source approach working geographic data advantages transcend technicalities software works, encouraging learning, collaboration efficient division labor.many ways engage community, especially emergence code hosting sites, GitHub, encourage communication collaboration.\ngood place start simply browsing source code, ‘issues’ ‘commits’ geographic package interest.\nquick glance r-spatial/sf GitHub repository, hosts code underlying sf package, shows 100+ people contributed codebase documentation.\nDozens people contributed asking questions contributing ‘upstream’ packages sf uses.\n1,500 issues closed issue tracker, representing huge amount work make sf faster, stable user-friendly.\nexample, just one package dozens, shows scale intellectual operation underway make R highly effective continuously evolving language geocomputation.instructive watch incessant development activity happen public fora GitHub, even rewarding become active participant.\none greatest features open source approach: encourages people get involved.\nbook result open source approach:\nmotivated amazing developments R’s geographic capabilities last two decades, made practically possible dialogue code-sharing platforms collaboration.\nhope addition disseminating useful methods working geographic data, book inspires take open source 
approach.","code":""},{"path":"references.html","id":"references","chapter":"References","heading":"References","text":"","code":""}] +[{"path":"index.html","id":"welcome","chapter":"Welcome","heading":"Welcome","text":"online home Geocomputation R, book geographic data analysis, visualization modeling.Note: first edition book published CRC Press R Series.\ncan buy book CRC Press, Amazon, see archived First Edition hosted bookdown.org.Inspired Free Open Source Software Geospatial (FOSS4G) movement, code prose underlying book open, ensuring content reproducible, transparent, accessible.\nHosting source code GitHub allows anyone interact project opening issues contributing new content typo fixes benefit everyone.\nonline version book hosted r.geocompx.org kept --date GitHub Actions.\ncurrent ‘build status’ follows:version book built GH Actions 2024-09-25.book licensed Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.code samples book licensed Creative Commons CC0 1.0 Universal (CC0 1.0).","code":""},{"path":"index.html","id":"how-to-contribute","chapter":"Welcome","heading":"How to contribute?","text":"bookdown makes editing book easy editing wiki, provided GitHub account (sign-github.com).\nlogged-GitHub, click ‘Edit page’ icon right panel book website.\ntake editable version source R Markdown file generated page ’re .raise issue book’s content (e.g., code running) make feature request, check-issue tracker.Maintainers contributors must follow repository’s CODE CONDUCT.","code":""},{"path":"index.html","id":"reproducibility","chapter":"Welcome","heading":"Reproducibility","text":"quickest way reproduce contents book ’re new geographic data R may web browser, thanks Binder.\nClicking link open new window containing RStudio Server web browser, enabling open chapter files running code chunks test code reproducible.see something like image , congratulations, worked!\ncan start exploring Geocomputation R cloud-based environment, noting 
mybinder.org user guidelines):\nFIGURE 0.1: Screenshot reproducible code contained Geocomputation R running RStudio Server browser served Binder\nreproduce code book computer, need recent version R --date packages.\ncan installed using remotes package.installing book’s dependencies, can rebuild book testing educational purposes.\ndownload unzip clone book’s source code.\nopening geocompr.Rproj project RStudio (opening folder another IDE VS Code), able reproduce contents following command:See project’s GitHub repo full details reproducing book.","code":"\ninstall.packages(\"remotes\")\ninstall.packages('geocompkg', repos = c('https://geocompr.r-universe.dev', 'https://cloud.r-project.org'), dependencies = TRUE, force = TRUE)\nbookdown::serve_book(\".\")"},{"path":"index.html","id":"getting-involved","chapter":"Welcome","heading":"Getting involved","text":"find project use interest, can get involved many ways, :Telling people ‘Starring’ geocompr GitHub repositoryCommunicating book online, via #geocompr hashtag Mastodon (see Guestbook geocompx.org) letting us know courses using bookCiting linking-itBuying copyReviewing , Amazon, Goodreads elsewhereAsking questions content making suggestion GitHub, Mastodon DiscordAnswering questions, least responding people asking clarification reproducible examples demonstrate questionHelping people get started open source software reproducible research general, working geographic data R particular (can excellent way consolidate build skills)Supporting community translations\nSpanish version: https://r.geocompx.org/es/\nFrench version: https://r.geocompx.org/fr/\nJapanese version: http://babayoshihiko.ddns.net/geo/\ndetails can found github.com/geocompx/geocompr.globe icon used book created Jean-Marc Viglino licensed CC-4.0 International.\nbook website hosted
Netlify.","code":""},{"path":"foreword-1st-edition.html","id":"foreword-1st-edition","chapter":"Foreword (1st Edition)","heading":"Foreword (1st Edition)","text":"‘spatial’ R always broad, seeking provide integrate tools geography, geoinformatics, geocomputation spatial statistics anyone interested joining : joining asking interesting questions, contributing fruitful research questions, writing improving code.\n, ‘spatial’ R always included open source code, open data reproducibility.‘spatial’ R also sought open interaction many branches applied spatial data analysis, also implement new advances data representation methods analysis expose cross-disciplinary scrutiny.\nbook demonstrates, often alternative workflows similar data similar results, may learn comparisons others create understand workflows.\nincludes learning similar communities around Open Source GIS complementary languages Python, Java .R’s wide range spatial capabilities never evolved without people willing share creating adapting.\nmight include teaching materials, software, research practices (reproducible research, open data), combinations .\nR users also benefitted greatly ‘upstream’ open source geo libraries GDAL, GEOS PROJ.book clear example , curious willing join , can find things need match aptitudes.\nadvances data representation workflow alternatives, ever increasing numbers new users often without applied quantitative command line exposure, book kind really needed.\nDespite effort involved, authors supported pressing forward publication., fresh book ready go; authors tried many tutorials workshops, readers instructors able benefit knowing contents continue tried people like .\nEngage authors wider R-spatial community, see value choice building workflows important, enjoy applying learn things care .Roger BivandBergen, September 2018","code":""},{"path":"foreword-2nd-edition.html","id":"foreword-2nd-edition","chapter":"Foreword (2nd Edition)","heading":"Foreword (2nd Edition)","text":"Writing 
books open source data science software constantly changes uncontrolled ways brave undertaking: feels like running race someone else constantly moves finish line. second edition Geocomputation R timely: catches many recent changes, also embraces new R packages, new topical developments computing landscape. now includes chapter raster-vector interactions, discussing package terra replacing package raster raster (vector) data processing. also keeps tmap package creating high quality maps, completing full rewrite cycle.Besides updating contents book, authors also active helping streamline focus changes software extensively testing , helping improve , writing issues pull requests GitHub, sharing benchmark results, helping improve software documentation.first edition book great success. first book popularize spatial analysis sf package tidyverse. enthusiastic tone reached wide audience, helped people various levels experience solving new problems moving next level. available entirely freely online addition printed volume gave large reach, enabled users try presented methodology datasets. addition , authors encouraged readership reach ways GitHub issues, social media posts, discussions discord channel. led 75 people contributing book’s source code one way , including several providing longer reviews contributing full sections, including Cloud-optimized GeoTIFFs, STAC openEO; sfheaders package; OGC APIs metadata; CycleHire shiny app. Discord, led lively spontaneous discussions threads include topics ranging highly technical “look built”.Beyond , authors initiated companion volume Geocomputation Python, stressing geocomputation happens data science languages, means restricted one . 
Geocomputation rise, part fostering growing geocomputation community, writing books like one indispensable.Edzer PebesmaMünster, Germany, May 2024","code":""},{"path":"preface.html","id":"preface","chapter":"Preface","heading":"Preface","text":"","code":""},{"path":"preface.html","id":"who-this-book-is-for","chapter":"Preface","heading":"Who this book is for","text":"book people want analyze, visualize model geographic data open source software.\nbased R, statistical programming language powerful data processing, visualization geospatial capabilities.\nbook covers wide range topics interest wide range people many different backgrounds, especially:People learned spatial analysis skills using desktop Geographic Information System (GIS), QGIS, ArcGIS, GRASS GIS SAGA, want access powerful (geo)statistical visualization programming language benefits command line approach (Sherman 2008):\n\nadvent ‘modern’ GIS software, people want point click way life. ’s good, tremendous amount flexibility power waiting command line.\nGraduate students researchers fields specializing geographic data including Geography, Remote Sensing, Planning, GIS Spatial Data ScienceAcademics post-graduate students working geographic data — fields Geology, Regional Science, Biology Ecology, Agricultural Sciences, Archaeology, Epidemiology, Transport Modeling, broadly defined Data Science — require power flexibility R researchApplied researchers analysts public, private third-sector organizations need reproducibility, speed flexibility command line language R applications dealing spatial data diverse Urban Transport Planning, Logistics, Geo-marketing (store location analysis) Emergency PlanningThe book designed intermediate--advanced R users interested geocomputation R beginners prior experience geographic data.\nnew R geographic data, discouraged: provide links materials describe nature spatial data beginner’s perspective Chapter 2 links provided .","code":""},{"path":"preface.html","id":"how-to-read-this-book","chapter":"Preface","heading":"How to read this book","text":"book divided three parts:Part : Foundations, aimed getting --speed geographic data R.Part II: Advanced techniques, including spatial data visualization, bridges GIS software, programming spatial data, statistical learning.Part III:
Applications real-world problems, including transportation, geomarketing ecological modeling.chapters get harder one part next.\nrecommend reading chapters Part order tackling advanced topics Part II Part III.\nchapters Part II Part III benefit slightly read order, can read independently interested specific topic.\nmajor barrier geographical analysis R steep learning curve.\nchapters Part aim address providing reproducible code simple datasets ease process getting started.important aspect book teaching/learning perspective exercises end chapter.\nCompleting develop skills equip confidence needed tackle range geospatial problems.\nSolutions exercises can found online booklet accompanies Geocomputation R, hosted r.geocompx.org/solutions.\nlearn booklet created, update solutions files _01-ex.Rmd, see blog post Geocomputation R solutions.\nblog posts examples can found geocompx.org.Impatient readers welcome dive straight practical examples, starting Chapter 2.\nHowever, recommend reading wider context Geocomputation R Chapter 1 first.\nnew R, also recommend learning language attempting run code chunks provided chapter (unless ’re reading book understanding concepts).\nFortunately beginners, R supportive community developed wealth resources can help.\nparticularly recommend three tutorials: R Data Science (Grolemund Wickham 2016) Efficient R Programming (Gillespie Lovelace 2016), introduction R (R Core Team 2021).","code":""},{"path":"preface.html","id":"why-r","chapter":"Preface","heading":"Why R?","text":"Although R steep learning curve, command line approach advocated book can quickly pay .\n’ll learn subsequent chapters, R effective tool tackling wide range geographic data challenges.\nexpect , practice, R become program choice geospatial toolbox many applications.\nTyping executing commands command line , many cases, faster pointing--clicking around graphical user interface (GUI) desktop GIS.\napplications Spatial Statistics modeling, R may realistic way get work 
done.outlined Section 1.3, many reasons using R geocomputation:\nR well suited interactive use required many geographic data analysis workflows compared languages.\nR excels rapidly growing fields Data Science (includes data carpentry, statistical learning techniques data visualization) Big Data (via efficient interfaces databases distributed computing systems).\nFurthermore, R enables reproducible workflow: sharing scripts underlying analysis allow others build work.\nensure reproducibility book made source code available github.com/geocompx/geocompr.\nfind script files code/ folder generate figures:\ncode generating figure provided main text book, name script file generated provided caption (see example caption Figure 13.2).languages Python, Java C++ can used geocomputation.\nexcellent resources learning geocomputation without R, discussed Section 1.4.\nNone provide unique combination package ecosystem, statistical capabilities, visualization options offered R community.\nFurthermore, teaching use one language (R) depth, book equip concepts confidence needed geocomputation languages.","code":""},{"path":"preface.html","id":"real-world-impact","chapter":"Preface","heading":"Real-world impact","text":"Geocomputation R equip knowledge skills tackle wide range issues, including scientific, societal environmental implications, manifested geographic data.\ndescribed Section 1.1, geocomputation using computers process geographic data, also real-world impact.\nwider context motivations underlying book covered Chapter 1.","code":""},{"path":"preface.html","id":"acknowledgments","chapter":"Preface","heading":"Acknowledgments","text":"Many thanks everyone contributed directly indirectly via code hosting collaboration site GitHub, including following people contributed direct via pull requests: prosoitos, florisvdh, babayoshihiko, katygregg, tibbles--tribbles, Lvulis, rsbivand, iod-ine, KiranmayiV, cuixueqin, defuneste, zmbc, erstearns, FlorentBedecarratsNM, dcooley, 
darrellcarvalho, marcosci, appelmar, MikeJohnPage, eyesofbambi, krystof236, nickbearman, tylerlittlefield, giocomai, KHwong12, LaurieLBaker, MarHer90, mdsumner, pat-s, sdesabbata, ahmohil, ateucher, annakrystalli, andtheWings, kant, gavinsimpson, Himanshuteli, yutannihilation, howardbaek, jimr1603, jbixon13, olyerickson, yvkschaefer, katiejolly, kwhkim, layik, mpaulacaldas, mtennekes, mvl22, ganes1410, richfitz, VLucet, wdearden, yihui, adambhouston, chihinl, cshancock, e-clin, ec-nebi, gregor-d, jasongrahn, p-kono, pokyah, schuetzingit, tim-salabim, tszberkowitz, vlarmet.\nThanks Marco Sciaini created front cover image first edition Benjamin Nowak created cover image second edition.\nSee code/frontcover.R code/frontcover2.R reproducible code generated visualizations.\nDozens people contributed online, raising commenting issues, providing feedback via social media.\n#geocompr geocompx hashtags live !like thank John Kimmel Lara Spieker CRC Press Taylor & Francis taking ideas early book plan production via four rounds peer review edition.\nreviewers deserve special mention detailed feedback expertise substantially improved book’s structure content.thank Patrick Schratz Alexander Brenning University Jena fruitful discussions contributions Chapters 12 15.\nthank Emmanuel Blondel Food Agriculture Organization United Nations expert contributions section web services;\nMichael Sumner critical contributions many areas book, especially discussion algorithms Chapter 11;\nTim Appelhans, David Cooley Kiranmayi Vadlamudi key contributions visualization chapter (Chapter 9);\nMarius Appel contributions Chapter 10;\nKaty Gregg, proofread every chapter greatly improved readability book.Countless others mentioned contributed myriad ways.\nfinal thank software developers make geocomputation R possible.\nEspecially, Edzer Pebesma (created sf package), Robert Hijmans (created terra) Roger Bivand (laid foundations much R-spatial software) made high performance geographic computing 
possible R.","code":""},{"path":"intro.html","id":"intro","chapter":"1 Introduction","heading":"1 Introduction","text":"book using power computers things geographic data.\nteaches range spatial skills, including: reading, writing manipulating geographic file formats; making static interactive maps; applying geocomputation support evidence-based decision-making related range geographic phenomena, transport systems ecosystems.\ndemonstrating various geographic operations can linked, ‘code chunks’ intersperse prose, book also teaches reproducible, open thus scientific workflows.book just using wealth existing tools geocomputation: ’s also understanding geographic data structures software needed build new tools.\napproach teach throughout, programming techniques covered Chapter 11 particular, can remove constraints creativity imposed software.\nreading book completing exercises, ready tackle real-world problems, communicate work maps code, contribute open source communities developing tools documentation reproducible geocomputation.last decades, free open source software geospatial (FOSS4G) progressed astonishing rate.\nThanks organizations OSGeo, advanced geographic techniques longer preserve expensive hardware software: anyone can now download run high-performance software geocomputation.\nOpen source Geographic Information Systems (GIS), QGIS, made geographic analysis accessible worldwide.\nGIS software products powerful, tend emphasize graphical user interface (GUI) approach command-line interface (CLI) approach advocated book.\n‘GUI focus’ many GIS products unintended consequence disabling many users making work fully reproducible, problem can overcome calling ‘geoalgorithms’ contained GIS software command line, ’ll see Chapter 10.\nsimplistic comparison different approaches illustrated Table 1.1.TABLE 1.1: Differences emphasis software packages (Graphical User Interface (GUI) Geographic Information Systems (GIS) R).R language providing CLI 
geocomputation.\ncommand environments powerful geographic capabilities exist, including Python (covered book Geocomputation Python), Julia, JavaScript.\nHowever, R advantages make good language learning geocomputation many geocomputation tasks, especially statistics, modeling visualization, outlined Section 1.2.book also motivated importance reproducibility scientific research.\naims make reproducible geographic data analysis workflows accessible, demonstrate power open geospatial software available command line.\nR provides ways interface languages (Eddelbuettel Balamuta 2018), enabling numerous spatial software libraries called R, explained Section 1.3 demonstrated Chapter 10.\ngoing details software, however, worth taking step back thinking mean geocomputation.","code":""},{"path":"intro.html","id":"what-is-geocomputation","chapter":"1 Introduction","heading":"1.1 What is geocomputation?","text":"define geocomputation asA field research, software development practical application uses geographic data solve problems, focus reproducibility, flexibility tool development.Geocomputation young term, dating back first conference subject 1996.1\ndistinguished geocomputation (time) commonly used term ‘quantitative geography’ emphasis “creative experimental” applications (Longley et al. 
1998) development new tools methods.\nwords Stan Openshaw, pioneer field advocate (possibly originator) term, “GeoComputation using various different types geodata developing relevant geo-tools within overall context ‘scientific’ approach” (Openshaw Abrahart 2000).\nBuilding early definition, Geocomputation R goes beyond data analysis modeling include development new tools methods work just interesting academically beneficial.approach differs early definitions geocomputation one important way, however: emphasis reproducibility collaboration.\nturn 21st Century, unrealistic expect readers able reproduce code examples, due barriers preventing access necessary hardware, software data.\nFast-forward today things progressed rapidly.\nAnyone access laptop sufficient RAM (least 8 GB recommended) can install run software geocomputation, reproduce contents book.\nFinancial hardware barriers geocomputation existed 1990s early 2000s, high-performance computers expensive people, removed.2\nGeocomputation also accessible publicly accessible datasets widely available ever , see Chapter 8.\nUnlike early works field, work presented book reproducible using code example data supplied alongside book, R packages spData, installation covered Chapter 2.Geocomputation closely related terms including: Geographic Information Science (GIScience); Geomatics; Geoinformatics; Spatial Information Science; Geoinformation Engineering (Longley 2015); Spatial Data Science (SDS).\nterm shares emphasis ‘scientific’ (implying reproducible falsifiable) approach influenced GIS, although origins main fields application differ.\nSDS, example, emphasizes ‘data science’ skills large datasets, Geoinformatics tends focus data structures.\noverlaps terms larger differences use geocomputation rough synonym encapsulating :\nseek use geographic data applied scientific work.\nUnlike early users term, however, seek imply cohesive academic field called ‘Geocomputation’ (‘GeoComputation’ Stan Openshaw called 
).Geocomputation recent term influenced old ideas.\ncan seen part Geography, 2000+ year history (Talbert 2014);\nextension GIS (Neteler Mitasova 2008), emerged 1960s (Coppock Rhind 1991).Geography played important role explaining influencing humanity’s relationship natural world long invention computer.\nfamous explorer, early geographer pioneering polymath Alexander von Humboldt (dozens species, geographic features, places even universities named , influence) illustrates role:\ntravels South America early 1800s resulting observations lay foundations physical geography ecology, also paved way towards policies protect natural world (Wulf 2015).\nbook aims contribute still-evolving ‘Geographic Tradition’ (Livingstone 1992) harnessing power modern computers open source software.book’s links older disciplines reflected suggested titles book: Geography R R GIS.\nadvantages.\nformer conveying applied nature content, something map.\nlatter communicates book using R powerful command-line geographic information system, perform spatial operations geographic data.\nHowever, term GIS connotations fail communicate R’s greatest strengths:\nabilities seamlessly switch geographic non-geographic data processing, modeling visualization tasks enabling reproducibility go far beyond capabilities GIS.\nGeocomputation implies working geographic data reproducible code-driven environment programming new results, methods tools, book .","code":""},{"path":"intro.html","id":"why-open-source","chapter":"1 Introduction","heading":"1.2 Why use open source tools for geocomputation?","text":"Early geographers used variety tools including barometers, compasses sextants advance knowledge world (Wulf 2015).\ninvention marine chronometer 1761 became possible calculate longitude sea, enabling ships take direct routes, example.\nturn century, acute shortage data tools geographic analysis.\n\nNowadays, researchers practitioners limitations cases face opposite problem: much data many tools.\nphones now 
Sensors ranging from satellites and semi-autonomous vehicles to citizen scientists incessantly measure every part of the world. The rate at which data is produced can be overwhelming, with emerging technologies such as autonomous vehicles generating hundreds or even thousands of gigabytes of data daily. Remote sensing datasets from satellites are too large to analyze on a single computer, as outlined in Chapter 10. This 'geodata revolution' drives demand for high-performance computer hardware and efficient, scalable software to handle and extract signal from the noise. Evolving open source tools can import and process subsets of these vast geographic data stores directly, via application programming interfaces (APIs) and via interfaces to databases.

In these rapidly changing hardware, software and data landscapes, it's important to choose tools that are future-proof. A major advantage of open source software is its rate of development and longevity, with thousands of potential contributors. Hundreds of people submit bug reports and suggest new features as well as documentation improvements to open source projects every day, a rate of evolution that proprietary solutions simply cannot keep up with.

A linked advantage is interoperability. While proprietary products tend to be monolithic 'empires' that are difficult to maintain (linked to the previously mentioned advantage), open source software is more like a 'federation' of modular tools that can be combined in different ways. This has allowed open source data science languages such as R to rapidly incorporate new developments such as interfaces to high-performance visualization libraries and file formats, while proprietary solutions struggle to keep up.

Another major advantage is reproducibility. Being able to replicate findings is vital for scientific research, and open source software removes an important barrier to reproducibility by enabling others to check your findings, or apply your methods in new contexts, using the same tools. The combination of using tools that can be accessed by anyone for free with the ability to share code and data means that the results of your work can be checked and built upon by others, a huge advantage if you want your work to be used and cited.

The biggest advantage of open source software combined with sharing reproducible code, for many people, is the community. The community enables you to get support far quicker, and often of higher quality, than is possible from the centralized and budget-limited support team associated with proprietary software. The community can provide feedback and ideas (as discussed in Chapter 16), and can help you develop your own tools and methods.

R is an open source software project, a powerful language, and an ever-evolving community of statisticians and developers (Wickham 2019). R is not the only language enabling reproducible geocomputation with open source software, as outlined in Section 1.4. Many of the reasons for using R also apply to other open source languages for reproducible data science, such as Python and Julia. However, R has key advantages, as outlined in Section 1.3.

## 1.3 Why use R for geocomputation?

R is a multi-platform, open source language and environment for statistical computing and graphics (r-project.org/). With a wide range of packages, R also supports advanced geospatial statistics, modeling and visualization. Integrated development environments (IDEs) such as RStudio have made R more user-friendly for many, easing map-making with a panel dedicated to interactive visualization.

At its core, R is an object-oriented, functional programming language (Wickham 2019) that was specifically designed as an interactive interface to other software (Chambers 2016). The latter also includes many 'bridges' to a treasure trove of GIS software, 'geolibraries' and functions (see Chapter 10). It is thus ideal for quickly creating 'geo-tools', without needing to master lower-level (compared with R) languages such as C, FORTRAN or Java (see Section 1.4). This can feel like breaking free from the metaphorical 'glass ceiling' imposed by GUI-based or proprietary geographic information systems (see Table 1.1 for a definition of GUI). Furthermore, R facilitates access to other languages: the packages Rcpp and reticulate enable access to C++ and Python code, for example. This means R can be used as a 'bridge' to a wide range of geospatial programs (see Section 1.4).

Another example showing R's flexibility and evolving geographic capabilities is interactive map-making. As we'll see in Chapter 9, the statement that R has "limited interactive [plotting] facilities" (Bivand, Pebesma, and Gómez-Rubio 2013) is no longer true. This is demonstrated by the following code chunk, which creates Figure 1.1.
(The functions that generate the plot are covered in Section 9.4.)

```r
library(leaflet)
popup = c("Robin", "Jakub", "Jannes")
leaflet() |>
  addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") |>
  addMarkers(lng = c(-3, 23, 11),
             lat = c(52, 53, 49), 
             popup = popup)
```

FIGURE 1.1: The blue markers indicate the authors' locations. The basemap is a tiled image of the Earth at night provided by NASA. Interact with the online version at r.geocompx.org, for example by zooming in and clicking on the pop-ups.

It would have been difficult to produce Figure 1.1 using R (or any open source language for data science) just a few years ago, let alone as an interactive map. This illustrates R's flexibility and how, thanks to developments such as knitr and leaflet, it can be used as an interface to other software, a theme that will recur throughout this book. The use of R code, therefore, enables teaching geocomputation with reference to reproducible examples representing real-world phenomena, rather than just abstract concepts.

The 'R-spatial stack' is easy to install and has comprehensive, well-maintained and highly interoperable packages. R has 'batteries included', with statistical functions as part of the base installation and hundreds of well-maintained packages implementing many cutting-edge methods. With R, you can dive in and get things working in surprisingly few lines of code, enabling you to focus on the geographic methods and data, rather than debugging and managing package dependencies. A particular strength of R is the ease with which it allows you to create publication-quality interactive maps, thanks to excellent mapping packages, as outlined in Chapter 9.

## 1.4 Software for geocomputation

R is a powerful language for geocomputation, but there are many other options for geographic data analysis providing thousands of geographic functions. Awareness of other languages for geocomputation will help you decide when a different tool may be appropriate for a specific task, and place R in the wider geospatial ecosystem. This section briefly introduces the languages C++, Java and Python for geocomputation, in preparation for Chapter 10.

An important feature of R (shared with Python) is that it is an interpreted language. This is advantageous because it enables interactive programming in a Read-Eval-Print Loop (REPL): code entered into the console is immediately executed and the result printed, rather than waiting for an intermediate stage of compilation.
On the other hand, compiled languages such as C++ and Java tend to run faster once they have been compiled.

C++ provides the basis of many GIS packages such as QGIS, GRASS GIS and SAGA, so it is a sensible starting point. Well-written C++ is very fast, making it a good choice for performance-critical applications such as processing large geographic datasets, but it is harder to learn than Python or R. C++ has become more accessible with the Rcpp package, which provides a good 'way in' to C++ programming for R users. Proficiency with such low-level languages opens the possibility of creating new, high-performance 'geoalgorithms' and a better understanding of how GIS software works (see Chapter 11). However, it is not necessary to learn C++ to use R for geocomputation.

Python is an important language for geocomputation, especially because many Desktop GIS such as GRASS GIS, SAGA and QGIS provide a Python API (see Chapter 10). Like R, Python is a popular language for data science. Both languages are object-oriented and have many areas of overlap, leading to initiatives such as the reticulate package, which facilitates access to Python from R, and the Ursa Labs initiative to support portable libraries to the benefit of the entire open source data science ecosystem.

In practice, both R and Python have strengths, and to some extent which you use is less important than the domain of application and the communication of results. Learning either will provide a head-start in learning the other. However, there are major advantages of R over Python for geocomputation. This includes its much better support for the geographic raster data model in the language itself (see Chapter 2) and corresponding visualization possibilities (see Chapters 2 and 9). Equally important, R has unparalleled support for statistics, including spatial statistics, with hundreds of packages (unmatched by Python) supporting thousands of statistical methods.

The major advantage of Python is that it is a general-purpose programming language. It is used in many domains, including desktop software, computer games, websites and data science. Python is often the only shared language between different (geocomputation) communities and can be seen as the 'glue' that holds many GIS programs together. Many geoalgorithms, including those in QGIS and ArcMap, can be accessed from the Python command line, making it well suited as a starter language for command-line GIS. For spatial statistics and predictive modeling, however, R is second-to-none. This does not mean you must choose either R or Python: Python supports most common statistical techniques (though R tends to support new developments in spatial statistics earlier) and many concepts learned with Python can be applied to the R world. Like R, Python also supports geographic data analysis and manipulation with packages such as shapely, geopandas, rasterio and xarray.

## 1.5 R's spatial ecosystem

There are many ways to handle geographic data in R, with dozens of packages in the area. In this book we endeavor to teach the state of the art in the field whilst ensuring that the methods are future-proof. Like many areas of software development, R's spatial ecosystem is rapidly evolving (Figure 1.2). Because R is open source, these developments can easily build on previous work, by 'standing on the shoulders of giants', as Isaac Newton put it in 1675. This approach is advantageous because it encourages collaboration and avoids 'reinventing the wheel'. The package sf (covered in Chapter 2), for example, builds on its predecessor sp.

A surge in development time (and interest) in 'R-spatial' followed the award of a grant by the R Consortium for the development of support for simple features, an open-source standard and model to store and access vector geometries. This resulted in the sf package (covered in Section 2.2.1). Multiple places reflect the immense interest in sf. This is especially true for the R-sig-Geo Archives, a long-standing open access email list containing much R-spatial wisdom accumulated over the years.

FIGURE 1.2: Downloads of selected R packages for working with geographic data from early 2013 to present.
The y axis shows the average number of daily downloads from the popular cloud.r-project.org CRAN mirror, within a 91-day rolling window (log scale).

It is noteworthy that shifts in the wider R community, as exemplified by the data processing package dplyr (released in 2014), have influenced shifts in R's spatial ecosystem. Alongside other packages that have a shared style and emphasis on 'tidy data' (including, e.g., ggplot2), dplyr was placed in the tidyverse 'metapackage' in late 2016. The tidyverse approach, with its focus on long-form data and fast, intuitively named functions, has become immensely popular. This has led to a demand for 'tidy geographic data', which has been partly met by sf. An obvious feature of the tidyverse is the tendency for packages to work in harmony. There is no equivalent 'geoverse', but the modern R-spatial ecosystem has consolidated around sf, as illustrated by the key packages that depend on it shown in Table 1.2, and terra, both of which are taught in this book. The stack is highly interoperable with other packages and languages, as outlined in Chapter 10.

TABLE 1.2: The top 5 downloaded packages that depend on sf, in terms of average number of downloads per day over the previous month. As of 2023-11-14, there are 526 packages which import sf.

## 1.6 History of R-spatial

There are many benefits of using modern spatial packages such as sf, but there is also value in understanding the history of R's spatial capabilities. Many functions, use cases and teaching materials are contained in older packages, many of which are still useful, provided you know where to look.

R's spatial capabilities originated in early spatial packages in the S language (Bivand and Gebhardt 2000). The 1990s saw the development of numerous S scripts and a handful of packages for spatial statistics. By the year 2000, there were R packages for various spatial methods, including "point pattern analysis, geostatistics, exploratory spatial data analysis and spatial econometrics" (Bivand and Neteler 2000). Some of these, notably spatial, sgeostat and splancs, are still available on CRAN (B. S. Rowlingson and Diggle 1993; B. Rowlingson and Diggle 2017; Venables and Ripley 2002; Majure and Gebhardt 2016).

Key spatial packages were described in Ripley (2001), which outlined R packages for spatial smoothing and interpolation (Akima and Gebhardt 2016; Jr and Diggle 2016) and point pattern analysis (B. Rowlingson and Diggle 2017; Baddeley, Rubak, and Turner 2015). One of these (spatstat) is still actively maintained, more than 20 years after its first release.

A following commentary outlined the future prospects of spatial statistics (Bivand 2001), setting the stage for the development of the popular spdep package (Bivand 2017). Notably, the commentary mentioned the need for standardization of spatial interfaces, efficient mechanisms for exchanging data with GIS, and the handling of spatial metadata such as coordinate reference systems (CRS). These aims have largely been achieved.

maptools (Bivand and Lewin-Koh 2017) was another important package of this time, which provided an interface to the shapelib library for reading the Shapefile file format and which fed into sp. An extended review of spatial packages proposed a class system to support the "data objects offered by GDAL", including the fundamental point, line, polygon and raster types, as well as interfaces to external libraries (Bivand 2003). To a large extent, these ideas were realized in the packages rgdal and sp, providing the foundation for the seminal book Applied Spatial Data Analysis with R (ASDAR) (Bivand, Pebesma, and Gómez-Rubio 2013), first published in 2008. R's spatial capabilities have evolved substantially since then, but they still build on the ideas of these early pioneers. Interfaces to GDAL and PROJ, for example, still power R's high-performance geographic data I/O and CRS transformation capabilities, as outlined in Chapters 7 and 8, respectively.

rgdal, released in 2003, provided GDAL bindings for R which greatly enhanced its ability to import data from previously unavailable geographic data formats. The initial release supported only raster drivers, but subsequent enhancements provided support for coordinate reference systems (via the PROJ library), reprojections and import of vector file formats. Many of these additional capabilities were developed by Barry Rowlingson and released in the rgdal codebase in 2006, as described in B. Rowlingson et al. (2003) and on the R-help email list.
The sp package, released in 2005, was a significant advancement in R's spatial capabilities. It introduced classes and generic methods for handling geographic coordinates, including points, lines, polygons and grids, as well as attribute data. Using the S4 class system, sp stores information such as the bounding box, coordinate reference system (CRS) and attributes in slots within Spatial objects. This allows efficient operations on geographic data. The package also provided generic methods such as summary() and plot() for working with geographic data.

In the following decade, sp classes rapidly became popular for geographic data in R and the number of packages that depended on it increased from around 20 in 2008 to over 100 in 2013 (Bivand, Pebesma, and Gómez-Rubio 2013). By 2019, over 500 packages imported sp. Although the number of packages that depend on sp has decreased since the release of sf, it is still used by prominent R packages, including gstat (for spatial and spatiotemporal geostatistics) and geosphere (for spherical trigonometry) (Pebesma and Graeler 2023; Hijmans 2016).

While rgdal and sp solved many spatial issues, rgeos was developed during a Google Summer of Code project in 2010 (Bivand and Rundel 2023) so that geometry operations could be undertaken on sp objects. Functions such as gIntersection() enabled users to find spatial relationships between geographic objects and to modify their geometries (see Chapter 5 for details on geometric operations with sf).

A limitation of the sp ecosystem was its limited support for raster data. This was overcome by raster, first released in 2010 (Hijmans 2023b). raster's class system and functions enabled a range of raster operations, capabilities now implemented in the terra package, which supersedes raster, as outlined in Section 2.3. An important capability of raster and terra is the ability to work with datasets that are too large to fit into RAM, by supporting on-disk operations. raster and terra also support map algebra, as described in Section 4.3.2.

In parallel with these developments of class systems and methods came the support of R as an interface to dedicated GIS software. GRASS (Bivand 2000) and the follow-on packages spgrass6, rgrass7 and rgrass were prominent examples in this direction (Bivand 2016a, 2016b, 2023). Other examples of bridges between R and GIS include bridges to QGIS via qgisprocess (Dunnington et al. 2024), SAGA via Rsagacmd (Pawley 2023) and RSAGA (Brenning, Bangs, and Becker 2022), and ArcGIS via RPyGeo (Brenning 2012a, first published in 2008) (see Chapter 10).

Visualization was not a focus initially, with the bulk of R-spatial development focused on analysis and geographic operations. sp provided methods for map-making using both the base and lattice plotting systems, but demand was growing for advanced map-making capabilities. RGoogleMaps, first released in 2009, allowed the overlay of R spatial data on top of 'basemap' tiles from online services such as Google Maps or OpenStreetMap (Loecher and Ropkins 2015). It was followed by the ggmap package, which added similar 'basemap' tiles capabilities to ggplot2 (Kahle and Wickham 2013). Though ggmap facilitated map-making with ggplot2, its utility was limited by the need to fortify spatial objects, which means converting them into long data frames. While this works well for points, it is computationally inefficient for lines and polygons, since each coordinate (vertex) is converted into a row, leading to huge data frames to represent complex geometries. Although geographic visualization tended to focus on vector data, raster visualization was supported in raster and received a boost with the release of rasterVis (Lamigueiro 2018). Since then, map-making in R has become a hot topic, with dedicated packages such as tmap, leaflet and mapview all gaining popularity, as highlighted in Chapter 9.

Since 2018, when the First Edition of Geocomputation with R was published, the development of geographic R packages has accelerated. terra, a successor of the raster package, was first released in 2020 (Hijmans 2023c), bringing several benefits to R users working with raster datasets: it is faster and has a more straightforward user interface than its predecessor, as described in Section 2.3. In mid-2021, sf started using the S2 spherical geometry engine for geometry operations on unprojected datasets, as described in Section 2.2.9. Additional ways of representing and working with geographic data in R have been developed since 2018, including the stars and lidR packages (Pebesma 2021; Roussel et al. 2020).

Such developments have been motivated by the emergence of new technologies, standards and software outside of the R environment (Bivand 2021). Major updates to the PROJ library beginning in 2018 forced the replacement of 'proj-string' representations of coordinate reference systems with 'Well Known Text', as described in Section 2.4 and Chapter 7.

Since the publication of the first version of Geocomputation with R in 2018, several packages for spatial data visualization have been developed and improved. The rayshader package, for example, enables the development of striking and easy-to-animate 3D visualizations via raytracing and multiple hill-shading methods (Morgan-Wall 2021). The very popular ggplot2 package gained new spatial capabilities, thanks to work on the ggspatial package, which provides scale bars and north arrows (Dunnington 2021). gganimate enables smooth and customizable spatial animations (Pedersen and Robinson 2020).

Existing visualization packages have also been improved or rewritten. Large raster objects are automatically downscaled in tmap, and high-performance interactive maps are now possible thanks to packages including leafgl and mapdeck. The mapsf package (successor of cartography) was rewritten to reduce dependencies and improve performance (Giraud 2021), and tmap underwent a major update in Version 4, in which internal code was revised.

In late 2021, the planned retirement of rgdal, rgeos and maptools was announced, and in October 2023 they were archived on CRAN. This retirement had a large impact not only on existing workflows applying these packages, but also influenced the packages that depend on them. Modern R packages such as sf and terra, described in Chapter 2, provide a strong and future-proof foundation for geocomputation that we build on in this book.

## 1.7 Exercises

E1. Think about the terms 'GIS', 'GDS' and 'geocomputation' described above. Which (if any) best describes the work you would like to do using geo* methods and software, and why?

E2. Provide three reasons for using a scriptable language such as R for geocomputation instead of using a graphical user interface (GUI) based GIS such as QGIS.

E3. In the year 2000, Stan Openshaw wrote that geocomputation involved "practical work that is beneficial or useful" to others.
Think about a practical problem and possible solutions that could be informed by new evidence derived from the analysis, visualization and modeling of geographic data. With a pen and paper (or its computational equivalent), sketch the inputs and possible outputs illustrating how geocomputation could help.

# 2 Geographic data in R

## Prerequisites

This is the first practical chapter of the book, and therefore it comes with some software requirements. You need access to a computer with a recent version of R installed (R 4.3.2 or a later version). We recommend not only reading the prose but also running the code in each chapter to build your geocomputational skills.

To keep track of your learning journey, it may be worth starting by creating a new folder on your computer to save your R scripts, outputs and other things related to Geocomputation with R as you go. You can also download or clone the source code underlying the book to support your learning. We strongly recommend using R with an integrated development environment (IDE) such as RStudio (quicker to get up and running) or VS Code (which requires additional setup).

If you are new to R, we recommend following introductory R resources such as Hands On Programming with R or An Introduction to R before diving into the Geocomputation with R code. These resources cover in detail how to install R, which simply involves downloading the latest version from the Comprehensive R Archive Network (CRAN). See the note below for more information on installing R for geocomputation on Mac and Linux. Organize your work into projects and give scripts sensible names such as chapter-02.R (or equivalent RMarkdown or Quarto file names) to document the code as you learn.

Once you have a good set-up, it's time to run some code! Unless you already have these packages installed, the first thing to do is to install the foundational R packages used in this chapter, with the following commands:

```r
install.packages("sf")
install.packages("terra")
install.packages("spData")
install.packages("spDataLarge", repos = "https://nowosad.r-universe.dev")
```

The packages needed to reproduce Part 1 of the book can be installed with the following command: `remotes::install_github("geocompx/geocompkg")`. This command uses the function install_github() from the remotes package to install source code hosted on GitHub, a code hosting, version control and collaboration platform. The following command will install all dependencies required to reproduce the entire book (warning: this may take several minutes): `remotes::install_github("geocompx/geocompkg", dependencies = TRUE)`.

The packages needed to run the code presented in this chapter can be 'loaded' (technically they are attached) with the library() function as follows:

```r
library(sf)          # classes and functions for vector data
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
library(terra)       # classes and functions for raster data
library(spData)      # load geographic data
library(spDataLarge) # load larger geographic data
```

The output from library(sf) reports which versions of key geographic libraries such as GEOS the package is using, as outlined in Section 2.2.1. The other packages that were installed contain the data that will be used in the book.

## 2.1 Introduction

This chapter will provide brief explanations of the fundamental geographic data models: vector and raster. We will introduce the theory behind each data model and the disciplines in which they predominate, before demonstrating their implementation in R.

The vector data model represents the world using points, lines and polygons. These have discrete, well-defined borders, meaning that vector datasets usually have a high level of precision (but not necessarily accuracy, as we will see in Section 2.5). The raster data model divides the surface up into cells of constant size. Raster datasets are the basis of background images used in web-mapping and have been a vital source of geographic data since the origins of aerial photography and satellite-based remote sensing devices. Rasters aggregate spatially specific features to a given resolution, meaning that they are consistent over space and scalable (many worldwide raster datasets are available).

Which to use? The answer likely depends on your domain of application:

- Vector data tends to dominate the social sciences because human settlements tend to have discrete borders
- Raster dominates many environmental sciences, partially because of the reliance on remote sensing data

There is much overlap in some fields, and raster and vector datasets can be used together: ecologists and demographers, for example, commonly use both vector and raster data. Furthermore, it is possible to convert between the two forms (see Chapter 6).
Whether your work involves more use of vector or raster datasets, it is worth understanding the underlying data model before using them, as discussed in subsequent chapters. This book uses the sf and terra packages to work with vector data and raster datasets, respectively.

## 2.2 Vector data

The geographic vector data model is based on points located within a coordinate reference system (CRS). Points can represent self-standing features (e.g., the location of a bus stop) or they can be linked together to form more complex geometries such as lines and polygons. Most point geometries contain only two dimensions (much less prominent 3-dimensional geometries contain an additional z value, typically representing height above sea level).

In this system, London, for example, can be represented by the coordinates c(-0.1, 51.5). This means that its location is -0.1 degrees east and 51.5 degrees north of the origin. The origin in this case is 0 degrees longitude (the Prime Meridian) and 0 degrees latitude (the Equator) in a geographic ('lon/lat') CRS (Figure 2.1, left panel). The same point could also be approximated in a projected CRS with 'Easting/Northing' values of c(530000, 180000) in the British National Grid, meaning that London is located 530 km East and 180 km North of the origin of the CRS. This can be verified visually: slightly more than 5 'boxes' (square areas bounded by the gray grid lines, 100 km in width) separate the point representing London from the origin (Figure 2.1, right panel).

The location of the National Grid's origin, in the sea beyond the South West Peninsular, ensures that most locations in the UK have positive Easting and Northing values. There is more to CRSs, as described in Section 2.4 and Chapter 7 but, for the purposes of this section, it is sufficient to know that coordinates consist of two numbers representing distance from an origin, usually in the x and y dimensions.

FIGURE 2.1: Illustration of vector (point) data in which the location of London (the red X) is represented with reference to an origin (the blue circle). The left plot represents a geographic CRS with an origin at 0° longitude and latitude. The right plot represents a projected CRS with an origin located in the sea west of the South West Peninsula.

The sf package provides classes for geographic vector data and a consistent command-line interface to important low-level libraries for geocomputation:

- GDAL, for reading, writing and manipulating a wide range of geographic data formats, covered in Chapter 8
- PROJ, a powerful library for coordinate system transformations, which underlies the content covered in Chapter 7
- GEOS, a planar geometry engine for operations such as calculating buffers and centroids on data with a projected CRS, covered in Chapter 5
- S2, a spherical geometry engine written in C++ developed by Google, via the s2 package, covered in Section 2.2.9 and in Chapter 7

Information about these interfaces is printed by sf the first time the package is loaded: the message that appears below the library(sf) command at the beginning of this chapter tells us the versions of the linked GEOS, GDAL and PROJ libraries (these vary between computers and over time) and whether the S2 interface is turned on. Nowadays we take it for granted, but it is this tight integration with different geographic libraries that makes reproducible geocomputation possible in the first place.

A neat feature of sf is that you can change the default geometry engine used on unprojected data: 'switching off' S2 can be done with the command sf::sf_use_s2(FALSE), meaning that the planar geometry engine GEOS will be used by default for all geometry operations, including geometry operations on unprojected data. As we will see in Section 2.2.9, planar geometry is based on 2-dimensional space. Planar geometry engines such as GEOS assume 'flat' (projected) coordinates, while spherical geometry engines such as S2 assume unprojected (lon/lat) coordinates.

This section introduces sf classes in preparation for subsequent chapters (Chapters 5 and 8 cover the GEOS and GDAL interfaces, respectively).

### 2.2.1 An introduction to simple features

Simple features is an open standard developed and endorsed by the Open Geospatial Consortium (OGC), a not-for-profit organization whose activities we will revisit in a later chapter (Section 8.2). Simple features is a hierarchical data model that represents a wide range of geometry types. Of the 18 geometry types supported by the specification, only 7 are used in the vast majority of geographic research (see Figure 2.2).
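The two representations of London described above can be linked with a short sketch (not one of the book's own code chunks; it assumes the sf package is installed): we create the lon/lat point in the geographic CRS EPSG:4326 and transform it to the British National Grid (EPSG:27700), whose Easting/Northing values should be close to c(530000, 180000).

```r
library(sf)

# London in a geographic ('lon/lat') CRS: EPSG:4326
london_lonlat = st_sfc(st_point(c(-0.1, 51.5)), crs = "EPSG:4326")

# The same point in a projected CRS (British National Grid, EPSG:27700),
# giving Easting/Northing values in meters from the grid's origin
london_bng = st_transform(london_lonlat, "EPSG:27700")
st_coordinates(london_bng)
```

The printed coordinates are within a couple of kilometers of c(530000, 180000), consistent with the approximation in the text.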
These core geometry types are fully supported by the R package sf (Pebesma 2018).

FIGURE 2.2: Simple feature types fully supported by sf.

sf can represent all common vector geometry types (raster data classes are not supported by sf): points, lines, polygons and their respective 'multi' versions (which group together features of the same type into a single feature). sf also supports geometry collections, which can contain multiple geometry types in a single object. sf provides the same functionality (and more) as was previously provided in three packages: sp for data classes (Pebesma and Bivand 2023a), rgdal for data read/write via an interface to GDAL and PROJ (Bivand, Keitt, and Rowlingson 2023) and rgeos for spatial operations via an interface to GEOS (Bivand and Rundel 2023).

To re-iterate the message from Chapter 1, geographic R packages have a long history of interfacing with lower-level libraries, and sf continues this tradition with a unified interface to recent versions of GEOS for geometry operations, the GDAL library for reading and writing geographic data files, and the PROJ library for representing and transforming projected coordinate reference systems. Through s2, an R interface to Google's spherical geometry library of the same name, sf also has access to fast and accurate "measurements and operations on non-planar geometries" (Bivand 2021). Since sf version 1.0.0, launched in June 2021, s2 functionality is used by default on geometries with geographic (longitude/latitude) coordinate systems, a unique feature of sf that differs from spatial libraries that only support the GEOS geometry engine, such as the Python package GeoPandas. We will discuss s2 in subsequent chapters.

sf's ability to integrate multiple powerful libraries for geocomputation into a single framework is a notable achievement that reduces 'barriers to entry' into the world of reproducible geographic data analysis with high-performance libraries. sf's functionality is well documented on its website at r-spatial.github.io/sf/, which contains 7 vignettes. These can be viewed offline as follows:

```r
vignette(package = "sf") # see which vignettes are available
vignette("sf1")          # an introduction to the package
```

As the first vignette explains, simple feature objects in R are stored in a data frame, with geographic data occupying a special column, usually named 'geom' or 'geometry'. We will use the world dataset provided by spData (Bivand, Nowosad, and Lovelace 2023), loaded at the beginning of this chapter, to show what sf objects are and how they work. world is an 'sf data frame' containing spatial and attribute columns, the names of which are returned by the function names() (the last column in this example contains the geographic information):

```r
class(world)
#> [1] "sf"         "tbl_df"     "tbl"        "data.frame"
names(world)
#>  [1] "iso_a2"    "name_long" "continent" "region_un" "subregion" "type"     
#>  [7] "area_km2"  "pop"       "lifeExp"   "gdpPercap" "geom"
```

The contents of this geom column give sf objects their spatial powers: world$geom is a 'list column' that contains all the coordinates of the country polygons. sf objects can be plotted quickly with the function plot(). Although part of R's default installation (base R), plot() is a generic that is extended by other packages. sf contains the non-exported (hidden from users most of the time) plot.sf() function, which is called behind the scenes in the following command, creating Figure 2.3.

```r
plot(world)
```

FIGURE 2.3: A spatial plot of the world using the sf package, with a facet for each attribute.

Note that instead of creating a single map by default for geographic objects, as most GIS programs do, plot()ing sf objects results in a map for each variable in the dataset. This behavior can be useful for exploring the spatial distribution of different variables and is discussed further in Section 2.2.3.

More broadly, treating geographic objects as regular data frames with spatial powers has many advantages, especially if you are already used to working with data frames. The commonly used summary() function, for example, provides a useful overview of the variables within the world object:

```r
summary(world["lifeExp"])
#>     lifeExp               geom    
#>  Min.   :50.6   MULTIPOLYGON :177  
#>  1st Qu.:65.0   epsg:4326    :  0  
#>  Median :72.9   +proj=long...:  0  
#>  Mean   :70.9                      
#>  3rd Qu.:76.8                      
#>  Max.   :83.6                      
#>  NA's   :10
```

Although we have only selected one variable for the summary() command, it also outputs a report on the geometry. This demonstrates the 'sticky' behavior of the geometry columns of sf objects, meaning the geometry is kept unless the user deliberately removes it, as we'll see in Section 3.2. The result provides a quick summary of both the non-spatial and spatial data contained in world: the mean average life expectancy is 71 years (ranging from less than 51 to more than 83 years, with a median of 73 years) across all countries.

It is also worth taking a deeper look at the basic behavior and contents of this simple feature object, which can usefully be thought of as a 'spatial data frame'. sf objects are easy to subset: the code below shows how to return an object containing only the first two rows and the first three columns of the world object. The output shows two major differences compared with a regular data.frame: the inclusion of additional geographic metadata (Geometry type, Dimension, Bounding box and coordinate reference system information), and the presence of a 'geometry column', here named geom:

```r
world_mini = world[1:2, 1:3]
world_mini
#> Simple feature collection with 2 features and 3 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -180 ymin: -18.3 xmax: 180 ymax: -0.95
#> Geodetic CRS:  WGS 84
#>   iso_a2 name_long continent                           geom
#> 1     FJ      Fiji   Oceania MULTIPOLYGON (((-180 -16.6,...
#> 2     TZ  Tanzania    Africa MULTIPOLYGON (((33.9 -0.95,...
```

All of this may seem rather complex, especially for a class system that is supposed to be 'simple'! However, there are good reasons for organizing things this way and using sf to work with vector geographic datasets.

Before describing each geometry type that the sf package supports, it is worth taking a step back to understand the building blocks of sf objects. Section 2.2.5 shows how simple features objects are data frames with special geometry columns. These spatial columns are often called geom or geometry: world$geom refers to the spatial element of the world object described above. These geometry columns are 'list columns' of class sfc (see Section 2.2.7). In turn, sfc objects are composed of one or more objects of class sfg: simple feature geometries that we describe in Section 2.2.6. To understand how the spatial components of simple features work, it is vital to understand simple feature geometries. For this reason, we cover each currently supported simple features geometry type in Section 2.2.4 before moving on to describe how these can be represented in R using sf objects, which are based on sfg and sfc objects.

### 2.2.2 Why simple features?

Simple features is a widely supported data model that underlies data structures in many GIS applications, including QGIS and PostGIS. A major advantage of this is that using the data model ensures your work is cross-transferable to other setups, for example importing from and exporting to spatial databases. A more specific question from an R perspective is "why use the sf package"? There are many reasons (linked to the advantages of the simple features model):

- Fast reading and writing of data
- Enhanced plotting performance
- sf objects can be treated as data frames in most operations
- sf function names are relatively consistent and intuitive (all begin with st_)
- sf functions can be combined with the |> operator and work well with the tidyverse collection of R packages.

sf's support for tidyverse packages is exemplified by read_sf(), a function for importing geographic vector data covered in detail in Section 8.3.1. Unlike the function st_read(), which returns attributes stored in a base R data.frame (and which emits verbose messages, not shown in the code chunk below), read_sf() silently returns data as a tidyverse tibble. This is demonstrated below:

```r
world_dfr = st_read(system.file("shapes/world.shp", package = "spData"))
#> Reading layer `world' from data source 
#>   `/usr/local/lib/R/site-library/spData/shapes/world.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 177 features and 10 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -180 ymin: -89.9 xmax: 180 ymax: 83.6
#> Geodetic CRS:  WGS 84
world_tbl = read_sf(system.file("shapes/world.shp", package = "spData"))
class(world_dfr)
#> [1] "sf"         "data.frame"
class(world_tbl)
#> [1] "sf"         "tbl_df"     "tbl"        "data.frame"
```

As described in Chapter 3, which shows how to manipulate sf objects with tidyverse functions, sf is now the go-to package for the analysis of spatial vector data in R. spatstat, a package ecosystem which provides numerous functions for spatial statistics, and terra both have vector geographic data classes, but neither has the same level of uptake as sf for working with vector data. Many popular packages build on sf, as shown by the rise in its popularity in terms of the number of downloads per day, shown in Section 1.5 of the previous chapter.
`/usr/local/lib/R/site-library/spData/shapes/world.shp' using driver `ESRI Shapefile'\n#> Simple feature collection with 177 features and 10 fields\n#> Geometry type: MULTIPOLYGON\n#> Dimension: XY\n#> Bounding box: xmin: -180 ymin: -89.9 xmax: 180 ymax: 83.6\n#> Geodetic CRS: WGS 84\nworld_tbl = read_sf(system.file(\"shapes/world.shp\", package = \"spData\"))\nclass(world_dfr)\n#> [1] \"sf\" \"data.frame\"\nclass(world_tbl)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"spatial-class.html","id":"basic-map","chapter":"2 Geographic data in R","heading":"2.2.3 Basic map-making","text":"Basic maps created sf plot().\ndefault creates multi-panel plot, one sub-plot variable object, illustrated left-hand panel Figure 2.4.\nlegend ‘key’ continuous color produced object plotted single variable (see right-hand panel).\nColors can also set col =, although create continuous palette legend.\n\nFIGURE 2.4: Plotting sf, multiple variables (left) single variable (right).\nPlots added layers existing images setting add = TRUE.8\ndemonstrate , provide insight contents Chapters 3 4 attribute spatial data operations, subsequent code chunk filters countries Asia combines single feature:can now plot Asian continent map world.\nNote first plot must one facet add = TRUE work.\nfirst plot key, reset = FALSE must used:\nFIGURE 2.5: plot Asia added layer top countries worldwide.\nvarious ways modify maps sf’s plot() method.\nsf extends base R plotting methods, plot()’s arguments work sf objects (see ?graphics::plot ?par information arguments main =).9\nFigure 2.6 illustrates flexibility overlaying circles, whose diameters (set cex =) represent country populations, map world.\nunprojected version figure can created following commands (see exercises end chapter script 02-contplot.R reproduce Figure 2.6):\nFIGURE 2.6: Country continents (represented fill color) 2015 populations (represented circles, area proportional population).\ncode uses function st_centroid() convert one 
geometry type (polygons) into another (points) (see Chapter 5), the aesthetics of which are varied with the `cex` argument.

**sf**'s plot method also has arguments specific to geographic data.
`expandBB`, for example, can be used to plot an `sf` object in context:
it takes a numeric vector of length four that expands the bounding box of the plot relative to zero in the following order: bottom, left, top, right.
This is used to plot India in the context of its giant Asian neighbors, with an emphasis on China to the east, in the following code chunk, which generates Figure 2.7 (see the exercises on adding text to plots):

FIGURE 2.7: India in context, demonstrating the expandBB argument.

Note the use of `lwd` to emphasize India in the plotting code.
See Section 9.2 for other visualization techniques for representing a range of geometry types, the subject of the next section.

```r
plot(world[3:6])
plot(world["pop"])
world_asia = world[world$continent == "Asia", ]
asia = st_union(world_asia)
plot(world["pop"], reset = FALSE)
plot(asia, add = TRUE, col = "red")
plot(world["continent"], reset = FALSE)
cex = sqrt(world$pop) / 10000
world_cents = st_centroid(world, of_largest = TRUE)
plot(st_geometry(world_cents), add = TRUE, cex = cex)
india = world[world$name_long == "India", ]
plot(st_geometry(india), expandBB = c(0, 0.2, 0.1, 1), col = "gray", lwd = 3)
plot(st_geometry(world_asia), add = TRUE)
```

### 2.2.4 Geometry types

Geometries are the basic building blocks of simple features.
Simple features in R can take on one of the 18 geometry types supported by the **sf** package.
This chapter will focus on the seven most commonly used types: `POINT`, `LINESTRING`, `POLYGON`, `MULTIPOINT`, `MULTILINESTRING`, `MULTIPOLYGON` and `GEOMETRYCOLLECTION`.

Generally, well-known binary (WKB) or well-known text (WKT) are the standard encodings for simple feature geometries.
WKB representations are usually hexadecimal strings that are easily readable by computers.
This is why GIS and spatial databases use WKB to transfer and store geometry objects.
WKT, on the other hand, is a human-readable text markup description of simple features.
Both formats are exchangeable, so when we present only one we will naturally choose the WKT representation.
The basis for each
geometry type is the point.
A point is simply a coordinate in 2D, 3D or 4D space (see `vignette("sf1")` for more information) (Figure 2.8, left panel):

- `POINT (5 2)`

A linestring is a sequence of points with a straight line connecting the points, for example (Figure 2.8, middle panel):

- `LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)`

A polygon is a sequence of points that form a closed, non-intersecting ring.
Closed means that the first and the last point of a polygon have the same coordinates (Figure 2.8, right panel).

- Polygon without a hole: `POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))`

FIGURE 2.8: Illustration of point, linestring and polygon geometries.

So far we have created geometries with only one geometric entity per feature.
The simple feature standard also allows multiple geometries of a single type to exist within a single feature, using the "multi" version of each geometry type (Figure 2.9):

- Multipoint: `MULTIPOINT (5 2, 1 3, 3 4, 3 2)`
- Multilinestring: `MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))`
- Multipolygon: `MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5), (0 2, 1 2, 1 3, 0 3, 0 2)))`

FIGURE 2.9: Illustration of multi* geometries.

Finally, a geometry collection can contain any combination of geometries including (multi)points and linestrings (see Figure 2.10):

- Geometry collection: `GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2), LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))`

FIGURE 2.10: Illustration of a geometry collection.

### 2.2.5 The sf class

Simple features consist of two main parts: geometries and non-geographic attributes.
Figure 2.11 shows how an sf object is created – geometries come from an `sfc` object, while attributes are taken from a `data.frame` or `tibble`.

FIGURE 2.11: Building blocks of sf objects.

Non-geographic attributes represent the name of the feature or other attributes such as measured values, groups, and other things.
To illustrate attributes, we will represent a temperature of 25°C in London on June 21st, 2023.
This example contains a geometry (the coordinates), and three attributes with three different classes (place name, temperature and date).
Objects of class `sf` represent such data by combining the attributes (`data.frame`) with the simple feature geometry column
(`sfc`).
They are created with `st_sf()`, as illustrated below, which creates the London example described above:

```r
lnd_point = st_point(c(0.1, 51.5))              # sfg object
lnd_geom = st_sfc(lnd_point, crs = "EPSG:4326") # sfc object
lnd_attrib = data.frame(                        # data.frame object
  name = "London",
  temperature = 25,
  date = as.Date("2023-06-21")
)
lnd_sf = st_sf(lnd_attrib, geometry = lnd_geom) # sf object
lnd_sf
#> Simple feature collection with 1 features and 3 fields
#> ...
#>     name temperature       date         geometry
#> 1 London          25 2023-06-21 POINT (0.1 51.5)
class(lnd_sf)
#> [1] "sf"         "data.frame"
```

What just happened? First, the coordinates were used to create the simple feature geometry (`sfg`).
Second, the geometry was converted into a simple feature geometry column (`sfc`), with a CRS.
Third, the attributes were stored in a `data.frame`, which was combined with the `sfc` object with `st_sf()`.
This results in an `sf` object, as demonstrated above (some output omitted).
The result shows that `sf` objects actually have two classes, `sf` and `data.frame`.
Simple features are simply data frames (square tables), but with spatial attributes stored in a list column, usually called `geometry` or `geom`, as described in Section 2.2.1.
This duality is central to the concept of simple features:
most of the time an `sf` object can be treated as, and behaves like, a `data.frame`.
Simple features are, in essence, data frames with a spatial extension.

### 2.2.6 Simple feature geometries (sfg)

The `sfg` class represents the different simple feature geometry types in R: point, linestring, polygon (and their 'multi' equivalents, such as multipoints) or geometry collection.
Usually you are spared the tedious task of creating geometries on your own, since you can simply import an already existing spatial file.
However, there is a set of functions to create simple feature geometry objects (`sfg`) from scratch, if needed.
The names of these functions are simple and consistent, as they all start with the `st_` prefix and end with the name of the geometry type in lowercase letters:

- A point: `st_point()`
- A linestring: `st_linestring()`
- A polygon: `st_polygon()`
- A multipoint: `st_multipoint()`
- A multilinestring: `st_multilinestring()`
- A multipolygon: `st_multipolygon()`
- A geometry collection: `st_geometrycollection()`

`sfg` objects can be created from three
base R data types:

1. A numeric vector: a single point
2. A matrix: a set of points, where each row represents a point, for a multipoint or linestring
3. A list: a collection of objects such as matrices, for multilinestrings or geometry collections

The function `st_point()` creates single points from numeric vectors:

```r
st_point(c(5, 2))                 # XY point
#> POINT (5 2)
st_point(c(5, 2, 3))              # XYZ point
#> POINT Z (5 2 3)
st_point(c(5, 2, 1), dim = "XYM") # XYM point
#> POINT M (5 2 1)
st_point(c(5, 2, 3, 1))           # XYZM point
#> POINT ZM (5 2 3 1)
```

The results show that the XY (2D coordinates), XYZ (3D coordinates) and XYZM (3D with an additional variable, typically measurement accuracy) point types are created from vectors of length 2, 3 and 4, respectively.
The XYM type must be specified using the `dim` argument (which is short for dimension).
By contrast, use matrices in the case of multipoint (`st_multipoint()`) and linestring (`st_linestring()`) objects:

```r
# the rbind function simplifies the creation of matrices
## MULTIPOINT
multipoint_matrix = rbind(c(5, 2), c(1, 3), c(3, 4), c(3, 2))
st_multipoint(multipoint_matrix)
#> MULTIPOINT ((5 2), (1 3), (3 4), (3 2))
## LINESTRING
linestring_matrix = rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2))
st_linestring(linestring_matrix)
#> LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2)
```

Finally, use lists for the creation of multilinestrings, (multi-)polygons and geometry collections:

```r
## POLYGON
polygon_list = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
st_polygon(polygon_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5))
## POLYGON with a hole
polygon_border = rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))
polygon_hole = rbind(c(2, 4), c(3, 4), c(3, 3), c(2, 3), c(2, 4))
polygon_with_hole_list = list(polygon_border, polygon_hole)
st_polygon(polygon_with_hole_list)
#> POLYGON ((1 5, 2 2, 4 1, 4 4, 1 5), (2 4, 3 4, 3 3, 2 3, 2 4))
## MULTILINESTRING
multilinestring_list = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), 
                            rbind(c(1, 2), c(2, 4)))
st_multilinestring(multilinestring_list)
#> MULTILINESTRING ((1 5, 4 4, 4 1, 2 2, 3 2), (1 2, 2 4))
## MULTIPOLYGON
multipolygon_list = list(list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5))),
                         list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2))))
st_multipolygon(multipolygon_list)
#> MULTIPOLYGON (((1 5, 2 2, 4 1, 4 4, 1 5)), ((0 2, 1 2, 1 3, 0 3, 0 2)))
## GEOMETRYCOLLECTION
geometrycollection_list = list(st_multipoint(multipoint_matrix),
                               st_linestring(linestring_matrix))
st_geometrycollection(geometrycollection_list)
#> GEOMETRYCOLLECTION (MULTIPOINT (5 2, 1 3, 3 4, 3 2),
#>   LINESTRING (1 5, 4 4, 4 1, 2 2, 3 2))
```

### 2.2.7 Simple feature columns (sfc)

One `sfg` object contains only a single simple feature geometry.
A simple feature geometry column (`sfc`) is a list of `sfg` objects, which is additionally able to contain information about the coordinate reference system in use.
For instance, to combine two simple features into one object with two features, we can use the `st_sfc()` function.
This is important since `sfc` represents the geometry column in `sf` data frames:

```r
# sfc POINT
point1 = st_point(c(5, 2))
point2 = st_point(c(1, 3))
points_sfc = st_sfc(point1, point2)
points_sfc
#> Geometry set for 2 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 2 xmax: 5 ymax: 3
#> CRS:           NA
#> POINT (5 2)
#> POINT (1 3)
```

In most cases, an `sfc` object contains objects of the same geometry type.
Therefore, when we convert `sfg` objects of type polygon into a simple feature geometry column, we also end up with an `sfc` object of type polygon, which can be verified with `st_geometry_type()`.
Equally, a geometry column of multilinestrings would result in an `sfc` object of type multilinestring:

```r
# sfc POLYGON
polygon_list1 = list(rbind(c(1, 5), c(2, 2), c(4, 1), c(4, 4), c(1, 5)))
polygon1 = st_polygon(polygon_list1)
polygon_list2 = list(rbind(c(0, 2), c(1, 2), c(1, 3), c(0, 3), c(0, 2)))
polygon2 = st_polygon(polygon_list2)
polygon_sfc = st_sfc(polygon1, polygon2)
st_geometry_type(polygon_sfc)
#> [1] POLYGON POLYGON
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
# sfc MULTILINESTRING
multilinestring_list1 = list(rbind(c(1, 5), c(4, 4), c(4, 1), c(2, 2), c(3, 2)), 
                             rbind(c(1, 2), c(2, 4)))
multilinestring1 = st_multilinestring((multilinestring_list1))
multilinestring_list2 = list(rbind(c(2, 9), c(7, 9), c(5, 6), c(4, 7), c(2, 7)), 
                             rbind(c(1, 7), c(3, 8)))
multilinestring2 = st_multilinestring((multilinestring_list2))
multilinestring_sfc = st_sfc(multilinestring1, multilinestring2)
st_geometry_type(multilinestring_sfc)
#> [1] MULTILINESTRING MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

It is also possible to create an `sfc` object from `sfg` objects with different geometry types:

```r
# sfc GEOMETRY
point_multilinestring_sfc = st_sfc(point1, multilinestring1)
st_geometry_type(point_multilinestring_sfc)
#> [1] POINT           MULTILINESTRING
#> 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE
```

As mentioned before, `sfc` objects can additionally store information on the coordinate reference system (CRS).
The default value is `NA` (Not Available), as can be verified with `st_crs()`:

```r
st_crs(points_sfc)
#> Coordinate Reference System: NA
```

All geometries in an `sfc` object must have the same CRS.
A CRS can be specified with the `crs` argument of `st_sfc()` (or `st_sf()`), which takes a CRS identifier provided as a text string, such as `crs = "EPSG:4326"` (see Section 7.2 for other CRS representations and details on what this means):

```r
# Set the CRS with an identifier referring to an 'EPSG' CRS code:
points_sfc_wgs = st_sfc(point1, point2, crs = "EPSG:4326")
st_crs(points_sfc_wgs) # print CRS (only first 4 lines of output shown)
#> Coordinate Reference System:
#>   User input: EPSG:4326 
#>   wkt:
#> GEOGCRS["WGS 84",
#> ...
```

### 2.2.8 The sfheaders package

**sfheaders** is an R package that speeds up the construction, conversion and manipulation of `sf` objects (Cooley 2020).
It focuses on building `sf` objects from vectors, matrices and data frames, rapidly, and without depending on the **sf** library; and on exposing its underlying C++ code through header files (hence the name, **sfheaders**).
This approach enables others to extend it using compiled and fast-running code.
Every core **sfheaders** function has a corresponding C++ implementation, as described in its Cpp vignette.
For most people, the R functions will be more than sufficient to benefit from the computational speed of the package.
**sfheaders** was developed separately from **sf**, but it aims to be fully compatible, creating valid `sf` objects of the type described in the preceding sections.

The simplest use case for **sfheaders** is demonstrated in the code chunks below, with examples of building `sfg`, `sfc` and `sf` objects showing:

- A vector converted to `sfg_POINT`
- A matrix converted to `sfg_LINESTRING`
- A data frame converted to `sfg_POLYGON`

We will start by creating the simplest possible `sfg` object, a single coordinate pair, assigned to a vector named `v`.
The first example shows how the `sfg` object `v_sfg_sfh` is printed when **sf** is not loaded, demonstrating its underlying structure.
When **sf** is loaded (as is the case here), the result of the command is indistinguishable from `sf` objects.
The next examples show how **sfheaders** creates `sfg` objects from matrices and data frames.
Reusing the objects `v`, `m` and `df`, we can also build simple feature columns (`sfc`) and, similarly, full `sf` objects (outputs not shown).
In these examples the CRS (coordinate reference system) is not defined.
If you plan on doing any calculations or geometric operations with **sf** functions, we encourage you to set the CRS (see Chapter 7 for details).

**sfheaders** is also good at 'deconstructing' and 'reconstructing' `sf` objects, meaning converting geometry
columns into data frames that contain data on the coordinates of each vertex and geometry feature (and multi-feature) IDs.
It is also fast and reliable at 'casting' geometry columns to different types, a topic covered in Chapter 5.
Benchmarks, in the package's documentation and in test code developed for this book, show it is much faster than the **sf** package for these operations.

```r
v = c(1, 1)
v_sfg_sfh = sfheaders::sfg_point(obj = v)
v_sfg_sfh # printing without sf loaded
#>      [,1] [,2]
#> [1,]    1    1
#> attr(,"class")
#> [1] "XY"    "POINT" "sfg" 
v_sfg_sf = st_point(v)
print(v_sfg_sf) == print(v_sfg_sfh)
#> POINT (1 1)
#> POINT (1 1)
#> [1] TRUE
# matrices
m = matrix(1:8, ncol = 2)
sfheaders::sfg_linestring(obj = m)
#> LINESTRING (1 5, 2 6, 3 7, 4 8)
# data frames
df = data.frame(x = 1:4, y = 4:1)
sfheaders::sfg_polygon(obj = df)
#> POLYGON ((1 4, 2 3, 3 2, 4 1, 1 4))
sfheaders::sfc_point(obj = v)
sfheaders::sfc_linestring(obj = m)
sfheaders::sfc_polygon(obj = df)
sfheaders::sf_point(obj = v)
sfheaders::sf_linestring(obj = m)
sfheaders::sf_polygon(obj = df)
df_sf = sfheaders::sf_polygon(obj = df)
st_crs(df_sf) = "EPSG:4326"
```

### 2.2.9 Spherical geometry operations with S2

Spherical geometry engines are based on the fact that the world is round, while simple mathematical procedures for geocomputation, such as calculating a straight line between two points or the area enclosed by a polygon, assume planar (projected) geometries.
Since **sf** version 1.0.0, R supports spherical geometry operations 'out of the box' (and by default), thanks to its interface to Google's S2 spherical geometry engine via the **s2** interface package.
S2 is perhaps best known as an example of a Discrete Global Grid System (DGGS).
Another example is the H3 global hexagonal hierarchical spatial index (Bondaruk, Roberts, and Robertson 2020).

Although potentially useful for describing locations anywhere on Earth using character strings, the main benefit of **sf**'s interface to S2 is its provision of drop-in functions for calculations such as distance, buffer, and area calculations, as described in **sf**'s built-in documentation, which can be opened with the command
`vignette("sf7")`.

**sf** can run in two modes with respect to S2: on and off.
By default the S2 geometry engine is turned on, as can be verified with the following command:

```r
sf_use_s2()
#> [1] TRUE
```

An example of the consequences of turning the geometry engine off is shown below, by creating buffers around the `india` object created earlier in the chapter (note the warnings emitted when S2 is turned off) (Figure 2.12):

```r
india_buffer_with_s2 = st_buffer(india, 1) # 1 meter
sf_use_s2(FALSE)
#> Spherical geometry (s2) switched off
india_buffer_without_s2 = st_buffer(india, 1) # 1 degree
#> Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
#> endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
#> dist is assumed to be in decimal degrees (arc_degrees).
sf_use_s2(TRUE)
#> Spherical geometry (s2) switched on
```

FIGURE 2.12: Example of the consequences of turning the S2 geometry engine on and off. Both representations of a buffer around India were created with the same command, but the purple polygon object was created with S2 switched on, resulting in a buffer of 1 m. The larger light green polygon was created with S2 switched off, resulting in a buffer of 1 degree, which is less accurate.

The incorrect buffer in the right panel of Figure 2.12 shows that a buffer of 1 degree does not return the equivalent of an equal distance around the `india` polygon (for an explanation of this issue, read Section 7.4).

Throughout this book we will assume that S2 is turned on, unless explicitly stated otherwise.
Turn it on again with the `sf_use_s2(TRUE)` command shown above.

## 2.3 Raster data

The spatial raster data model represents the world with a continuous grid of cells (often also called pixels; Figure 2.13:A).
This data model often refers to so-called regular grids, in which each cell has the same, constant size – and we will focus on regular grids in this book.
However, several other types of grids exist, including rotated, sheared, rectilinear, and curvilinear grids (see Chapter 1 of Pebesma and Bivand (2023b) or Chapter 2 of Tennekes and Nowosad (2022)).

The raster data model usually consists of a raster header and a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure 2.13:A).
The raster header defines the coordinate reference system, the extent and the origin.
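As a minimal sketch of how this header information determines cell size, the arithmetic can be reproduced in a few lines of base R (the extent and dimension values below are made up for illustration; no raster package is needed):

```r
# Made-up header values for a hypothetical raster (illustration only)
xmin = -113.24; xmax = -112.85  # extent in x
ymin = 37.13;   ymax = 37.51    # extent in y
n_col = 465;    n_row = 457     # number of columns and rows

# The resolution (cell size) follows from the extent and the cell counts
xres = (xmax - xmin) / n_col
yres = (ymax - ymin) / n_row
c(xres, yres)  # cell width and height in CRS units (here, degrees)
```

The same relationship, run in reverse, is what allows a raster file format to store only the header and the matrix of values, rather than coordinates for every cell.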
The coordinate of the lower left corner of the matrix frequently serves as the origin (the **terra** package, however, uses the upper left corner by default (Figure 2.13:B)).
The header defines the extent via the number of columns, the number of rows and the cell size resolution.
The resolution can be calculated as follows:

\[
\text{resolution} = \frac{\text{xmax} - \text{xmin}}{\text{ncol}}, \frac{\text{ymax} - \text{ymin}}{\text{nrow}}
\]

Starting from the origin, we can easily access and modify each single cell, either by using the ID of a cell (Figure 2.13:B) or by explicitly specifying the rows and columns.
This matrix representation avoids explicitly storing the coordinates of the four corner points of each cell, as would be the case for rectangular vector polygons (in fact it only stores one coordinate, namely the origin).
This, and map algebra (Section 4.3.2), makes raster processing much more efficient and faster than vector data processing.
In contrast to vector data, the cell of one raster layer can only hold a single value.
The value might be continuous or categorical (Figure 2.13:C).

FIGURE 2.13: Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.

Raster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data.
Discrete features such as soil or land-cover classes can also be represented in the raster data model.
Both uses of raster datasets are illustrated in Figure 2.14, which shows how the borders of discrete features may become blurred in raster datasets.
Depending on the nature of the application, vector representations of discrete features may be more suitable.

FIGURE 2.14: Examples of continuous and categorical rasters.

### 2.3.1 R packages for working with raster data

Over the last two decades, several packages for reading and processing raster datasets have been developed.
As outlined in Section 1.6, chief among them was **raster**, which led to a step change in R's raster capabilities when it was launched in 2010 and was the premier package in this space until the development of **terra** and **stars**.
Both of these more recently developed packages provide powerful and performant functions for working with raster datasets, and there is substantial overlap between their possible use cases.
This book focuses on **terra**, which replaces the older and (in most cases)
slower **raster**.
Before learning how **terra**'s class system works, this section describes similarities and differences between **terra** and **stars**; this knowledge will help decide which is most appropriate in different situations.

First, **terra** focuses on the most common raster data model (regular grids), while **stars** also allows storing less popular models (including regular, rotated, sheared, rectilinear, and curvilinear grids).
While **terra** usually handles one or multilayered rasters, the **stars** package provides ways to store raster data cubes – raster objects with many layers (e.g., bands), for many moments in time (e.g., months), or with many attributes (e.g., sensor type A and sensor type B).
Importantly, in both packages, all layers or elements of a data cube must have the same spatial dimensions and extent.
Second, both packages allow either reading all of the raster data into memory or just reading its metadata – this is usually done automatically based on the input file size.
However, they store raster values very differently.
**terra** is based on C++ code and mostly uses C++ pointers.
**stars** stores values as lists of arrays for smaller rasters, or just a file path for larger ones.
Third, **stars** functions are closely related to the vector objects and functions in **sf**, while **terra** uses its own class of objects for vector data, namely `SpatVector`, although it also accepts `sf` ones.
Fourth, the two packages take different approaches to how their functions work on their objects.
The **terra** package mostly relies on a large number of built-in functions, where each function has a specific purpose (e.g., resampling or cropping).
On the other hand, **stars** uses some built-in functions (usually with names starting with `st_`), some existing **dplyr** functions (e.g., `filter()` or `slice()`), and its own methods for existing R functions (e.g., `split()` or `aggregate()`).

Importantly, it is straightforward to convert objects from **terra** to **stars** (using `st_as_stars()`) and the other way round (using `rast()`).
We also encourage you to read Pebesma and Bivand (2023b) for a comprehensive introduction to the **stars** package.

### 2.3.2 An introduction to terra

The **terra** package supports raster objects in R.
It provides an extensive set of functions to create, read, export, manipulate and process raster datasets.
**terra**'s functionality is largely the same as that of the more mature **raster**
package, but there are some differences: **terra** functions are usually more computationally efficient than their **raster** equivalents.
On the other hand, the **raster** class system is popular and used by many other packages.
You can seamlessly translate between the two types of object to ensure backwards compatibility with older scripts and packages, for example, with the functions `raster()`, `stack()`, and `brick()` from the **raster** package (see the previous chapter for the evolution of R packages for working with geographic data).

In addition to functions for raster data manipulation, **terra** provides many low-level functions which can form a foundation for developing new tools for working with raster datasets.
**terra** also lets you work on large raster datasets that are too large to fit into the main memory.
In this case, **terra** provides the possibility to divide the raster into smaller chunks, and processes these iteratively instead of loading the whole raster file into RAM.

For the illustration of **terra** concepts, we will use datasets from **spDataLarge** (Nowosad and Lovelace 2023).
It consists of a few raster objects and one vector object covering an area of Zion National Park (Utah, USA).
For example, `srtm.tif` is a digital elevation model of this area (for more details, see its documentation `?srtm`).
First, let's create a `SpatRaster` object named `my_rast`:

```r
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
my_rast = rast(raster_filepath)
class(my_rast)
#> [1] "SpatRaster"
#> attr(,"package")
#> [1] "terra"
```

Typing the name of the raster into the console will print out the raster header (dimensions, resolution, extent, CRS) and some additional information (class, data source, summary of the raster values):

```r
my_rast
#> class       : SpatRaster 
#> dimensions  : 457, 465, 1  (nrow, ncol, nlyr)
#> resolution  : 0.000833, 0.000833  (x, y)
#> extent      : -113, -113, 37.1, 37.5  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source      : srtm.tif 
#> name        : srtm 
#> min value   : 1024 
#> max value   : 2892
```

Dedicated functions report each component: `dim()` returns the number of rows, columns and layers; `ncell()` the number of cells (pixels); `res()` the spatial resolution; `ext()` its spatial extent; and `crs()` its coordinate reference system (raster reprojection is covered in Section 7.8).
`inMemory()` reports whether the raster data is stored in memory or on disk, and `sources` specifies the file location.

### 2.3.3 Basic map-making

Similar to the **sf** package, **terra** also provides `plot()` methods for its classes:

```r
plot(my_rast)
```

FIGURE 2.15: Basic raster plot.

There are several other approaches for plotting raster data in R that are outside the scope of this section, including:

- The `plotRGB()` function from the **terra** package, to create a plot based on three layers in a `SpatRaster` object
- Packages such as **tmap**, to create static and interactive maps of raster and vector objects (see Chapter 9)
- Functions, for example `levelplot()` from the **rasterVis** package, to create facets, a common technique for visualizing change over time

### 2.3.4 Raster classes

The `SpatRaster` class represents rasters in **terra**.
The easiest way to create a raster object in R is to read in a raster file from disk or from a server (Section 8.3.2).
The **terra** package supports numerous drivers with the help of the GDAL library.
Rasters from files are usually not read entirely into RAM, with the exception of their header and a pointer to the file itself.

Rasters can also be created from scratch, using the `rast()` function.
This is illustrated in the subsequent code chunk, which results in a new `SpatRaster` object.
The resulting raster consists of 36 cells (6 columns and 6 rows specified by `nrows` and `ncols`) centered around the Prime Meridian and the Equator (see the `xmin`, `xmax`, `ymin` and `ymax` parameters).
Values (`vals`) are assigned to each cell: 1 to cell 1, 2 to cell 2, and so on.
Remember: `rast()` fills cells row-wise (unlike `matrix()`), starting at the upper left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, etc.
For other ways of creating raster objects, see `?rast`.
Given the number of rows and columns, as well as the extent (`xmin`, `xmax`, `ymin`, `ymax`), the resolution has to be 0.5.
The unit of the resolution is that of the underlying CRS.
Here, it is degrees, because the default CRS of raster objects is WGS84.
However, one can specify any other CRS with the `crs` argument.
The `SpatRaster` class also handles multiple layers, which typically correspond to a single multispectral satellite file or a time-series of rasters.
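Since cells are numbered row-wise from the top-left corner, the mapping between a cell ID and its row and column can be sketched in base R (a minimal illustration; `cell_id()` and `cell_rowcol()` are helpers invented for this example, not **terra** functions):

```r
# Row-wise (top-left origin) cell numbering for an n_row x n_col grid.
# cell_id() and cell_rowcol() are illustrative helpers, not terra functions.
cell_id = function(row, col, ncol) (row - 1) * ncol + col
cell_rowcol = function(id, ncol) {
  row = (id - 1) %/% ncol + 1
  col = (id - 1) %% ncol + 1
  c(row = row, col = col)
}

# In a 6 x 6 raster filled with 1:36, the top row holds 1..6,
# so the first cell of the second row has ID 7:
cell_id(2, 1, ncol = 6)
#> [1] 7
cell_rowcol(7, ncol = 6)
```

This is the same convention that the `rast(nrows = 6, ncols = 6, vals = 1:36)` example below relies on.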
The `nlyr()` function retrieves the number of layers stored in a `SpatRaster` object.
The code below reads a single-layer raster, creates a raster from scratch, and then loads a multilayer raster:

```r
single_raster_file = system.file("raster/srtm.tif", package = "spDataLarge")
single_rast = rast(single_raster_file)
new_raster = rast(nrows = 6, ncols = 6, 
                  xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,
                  vals = 1:36)
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
multi_rast = rast(multi_raster_file)
multi_rast
#> class       : SpatRaster 
#> dimensions  : 1428, 1128, 4  (nrow, ncol, nlyr)
#> resolution  : 30, 30  (x, y)
#> extent      : 301905, 335745, 4111245, 4154085  (xmin, xmax, ymin, ymax)
#> coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612) 
#> source      : landsat.tif 
#> names       : landsat_1, landsat_2, landsat_3, landsat_4 
#> min values  :      7550,      6404,      5678,      5252 
#> max values  :     19071,     22051,     25780,     31961
nlyr(multi_rast)
#> [1] 4
```

For multilayer raster objects, layers can be selected with the `[[` and `$` operators, for example with commands such as `multi_rast[["landsat_1"]]` and `multi_rast$landsat_1`.
`terra::subset()` can also be used to select layers.
It accepts a layer number or its name as the second argument:

```r
multi_rast3 = subset(multi_rast, 3)
multi_rast4 = subset(multi_rast, "landsat_4")
```

The opposite operation, combining several `SpatRaster` objects into one, can be done with the `c` function:

```r
multi_rast34 = c(multi_rast3, multi_rast4)
```

## 2.4 Coordinate Reference Systems

Vector and raster spatial data types share concepts intrinsic to spatial data.
Perhaps the most fundamental of these is the Coordinate Reference System (CRS), which defines how the spatial elements of the data relate to the surface of the Earth (or other bodies).
CRSs are either geographic or projected, as introduced at the beginning of this chapter (see Figure 2.1).
This section explains each type, laying the foundations for Chapter 7, which provides a deep dive into setting, transforming and querying CRSs.

### 2.4.1 Geographic coordinate reference systems

Geographic coordinate reference systems
identify any location on the Earth's surface using two values — longitude and latitude (Figure 2.17, left panel).
Longitude is the location in the East-West direction, in angular distance from the Prime Meridian plane.
Latitude is the angular distance North or South of the equatorial plane.
Distances in geographic CRSs are therefore not measured in meters.
This has important consequences, as demonstrated in Chapter 7.

The surface of the Earth in geographic coordinate reference systems is represented by a spherical or ellipsoidal surface.
Spherical models assume that the Earth is a perfect sphere of a given radius – they have the advantage of simplicity but, at the same time, they are inaccurate, as the Earth is not exactly a sphere.
Ellipsoidal models are slightly more accurate, and are defined by two parameters: the equatorial radius and the polar radius.
These are suitable because the Earth is compressed: the equatorial radius is around 11.5 km longer than the polar radius (Maling 1992).

Ellipsoids are part of a wider component of CRSs: the datum.
This contains information on which ellipsoid to use and the precise relationship between coordinates and locations on the Earth's surface.
There are two types of datum — geocentric (such as WGS84) and local (such as NAD83).
You can see examples of these two types of datums in Figure 2.16.
Black lines represent a geocentric datum, whose center is located at the Earth's center of gravity and which is not optimized for a specific location.
In a local datum, shown as a purple dashed line, the ellipsoidal surface is shifted to align with the surface at a particular location.
This allows local variations in the Earth's surface, for example due to large mountain ranges, to be accounted for in a local CRS.
This can be seen in Figure 2.16, where the local datum is fitted to the area of the Philippines, but is misaligned with most of the rest of the planet's surface.
Both datums in Figure 2.16 are put on top of a geoid - a model of global mean sea level.

FIGURE 2.16: Geocentric and local geodetic datums shown on top of a geoid (in false color and with vertical exaggeration by a 10,000 scale factor). The image of the geoid is adapted from the work of Ince et al. (2019).

### 2.4.2 Projected coordinate reference systems

All projected CRSs are based on a geographic CRS, described in the previous section, and rely on map projections to convert the three-dimensional surface of the Earth into Easting and Northing (x and y) values in a projected CRS.
Projected CRSs are based on Cartesian coordinates on an implicitly flat surface (Figure 2.17, right panel).
They have an origin, x and y axes, and a linear unit of measurement such as meters.

This transition cannot be done without adding some deformations.
Therefore, some properties of the Earth's surface are distorted in this process, such as area, direction, distance, or shape.
A projected coordinate reference system can preserve only one or two of those properties.
Projections are often named based on a property they preserve: equal-area projections preserve area, azimuthal preserve direction, equidistant preserve distance, and conformal preserve local shape.

There are three main groups of projection types - conic, cylindrical, and planar (azimuthal).
In a conic projection, the Earth's surface is projected onto a cone along a single line of tangency or two lines of tangency.
Distortions are minimized along the tangency lines and rise with distance from those lines in this projection.
Therefore, it is best suited for maps of mid-latitude areas.
A cylindrical projection maps the surface onto a cylinder.
This projection can also be created by touching the Earth's surface along a single line of tangency or two lines of tangency.
Cylindrical projections are used most often when mapping the entire world.
A planar projection projects data onto a flat surface touching the globe at a point or along a line of tangency.
It is typically used in mapping polar regions.
`sf_proj_info(type = "proj")` gives a list of the available projections supported by the PROJ library.

A quick summary of different projections, their types, properties, and suitability can be found at www.geo-projections.com.
We will expand on CRSs and explain how to project from one CRS to another in Chapter 7.
For now, it is sufficient to know:

- That coordinate systems are a key component of geographic objects
- That knowing which CRS your data is in, and whether it is in geographic (lon/lat) or projected (typically meters) coordinates, has important consequences for how R handles spatial and geometry operations
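To see why the geographic/projected distinction matters in practice, a back-of-the-envelope calculation in plain R (assuming a spherical Earth with a mean radius of 6,371 km, a simplification of the ellipsoidal models described above) shows roughly how much ground distance one degree represents, and how it shrinks with latitude for longitude:

```r
r_earth = 6371000                       # mean Earth radius in meters (spherical assumption)
m_per_deg_lat = 2 * pi * r_earth / 360  # ground distance of 1 degree of latitude
m_per_deg_lon_60 = m_per_deg_lat * cos(60 * pi / 180)  # 1 degree of longitude at 60 degrees N
round(m_per_deg_lat)     # roughly 111 km
round(m_per_deg_lon_60)  # roughly half that at 60 degrees N
```

A buffer or distance expressed in degrees therefore means very different ground distances at different latitudes, which is one reason spherical engines such as S2 (Section 2.2.9) and explicit CRS handling matter.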
sf objects can queried function st_crs(), CRSs terra objects can queried function crs()\nFIGURE 2.17: Examples geographic (WGS 84; left) projected (NAD83 / UTM zone 12N; right) coordinate systems vector data type.\n","code":""},{"path":"spatial-class.html","id":"units","chapter":"2 Geographic data in R","heading":"2.5 Units","text":"important feature CRSs contain information spatial units.\nClearly, vital know whether house’s measurements feet meters, applies maps.\ngood cartographic practice add scale bar distance indicator onto maps demonstrate relationship distances page screen distances ground.\nLikewise, important formally specify units geometry data cells measured provide context, ensure subsequent calculations done context.novel feature geometry data sf objects native support units.\nmeans distance, area geometric calculations sf return values come units attribute, defined units package (Pebesma, Mailund, Hiebert 2016).\nadvantageous, preventing confusion caused different units (CRSs use meters, use feet) providing information dimensionality.\ndemonstrated code chunk , calculates area Luxembourg:\noutput units square meters (m2), showing result represents two-dimensional space.\ninformation, stored attribute (interested readers can discover attributes(st_area(luxembourg))), can feed subsequent calculations use units, population density (measured people per unit area, typically per km2).\nReporting units prevents confusion.\ntake Luxembourg example, units remained unspecified, one incorrectly assume units hectares.\ntranslate huge number digestible size, tempting divide results million (number square meters square kilometer):However, result incorrectly given square meters.\nsolution set correct units units package:Units equal importance case raster data.\nHowever, far sf spatial package supports units, meaning people working raster data approach changes units analysis (example, converting pixel widths imperial decimal units) care.\nmy_rast object (see ) uses 
WGS84 projection decimal degrees units.\nConsequently, resolution also given decimal degrees know , since res() function simply returns numeric vector.used UTM projection, units change., res() command gives back numeric vector without unit, forcing us know unit UTM projection meters.","code":"\nluxembourg = world[world$name_long == \"Luxembourg\", ]\nst_area(luxembourg) # requires the s2 package in recent versions of sf\n#> 2.41e+09 [m^2]\nst_area(luxembourg) / 1000000\n#> 2409 [m^2]\nunits::set_units(st_area(luxembourg), km^2)\n#> 2409 [km^2]\nres(my_rast)\n#> [1] 0.000833 0.000833\nrepr = project(my_rast, \"EPSG:26912\")\nres(repr)\n#> [1] 83.5 83.5"},{"path":"spatial-class.html","id":"ex2","chapter":"2 Geographic data in R","heading":"2.6 Exercises","text":"E1. Use summary() geometry column world data object included spData package. output tell us :geometry type?number countries?coordinate reference system (CRS)?E2. Run code ‘generated’ map world Section 2.2.3 (Basic map-making).\nFind two similarities two differences image computer book.cex argument (see ?plot)?cex set sqrt(world$pop) / 10000?Bonus: experiment different ways visualize global population.E3. Use plot() create maps Nigeria context (see Section 2.2.3).Adjust lwd, col expandBB arguments plot().Challenge: read documentation text() annotate map.E4. Create empty SpatRaster object called my_raster 10 columns 10 rows.\nAssign random values 0 10 new raster plot .E5. Read-raster/nlcd.tif file spDataLarge package.\nkind information can get properties file?E6. 
Check CRS raster/nlcd.tif file spDataLarge package.\nkind information can learn ?","code":""},{"path":"attr.html","id":"attr","chapter":"3 Attribute data operations","heading":"3 Attribute data operations","text":"","code":""},{"path":"attr.html","id":"prerequisites-1","chapter":"3 Attribute data operations","heading":"Prerequisites","text":"chapter requires following packages installed attached:relies spData, loads datasets used code examples chapter:Also ensure installed tidyr package, tidyverse part, want run data ‘tidying’ operations Section 3.2.5.","code":"\nlibrary(sf) # vector data package introduced in Chapter 2\nlibrary(terra) # raster data package introduced in Chapter 2\nlibrary(dplyr) # tidyverse package for data frame manipulation\nlibrary(spData) # spatial data package introduced in Chapter 2"},{"path":"attr.html","id":"introduction","chapter":"3 Attribute data operations","heading":"3.1 Introduction","text":"\nAttribute data non-spatial information associated geographic (geometry) data.\nbus stop provides simple example: position typically represented latitude longitude coordinates (geometry data), addition name.\nElephant & Castle / New Kent Road stop London, example coordinates -0.098 degrees longitude 51.495 degrees latitude can represented POINT (-0.098 51.495) sfc representation described Chapter 2.\nAttributes, name, POINT feature (use simple features terminology) topic chapter.\nAnother example elevation value (attribute) specific grid cell raster data.\nUnlike vector data model, raster data model stores coordinate grid cell indirectly, meaning distinction attribute spatial information less clear.\nillustrate point, think pixel 3rd row 4th column raster matrix.\nspatial location defined index matrix: move origin four cells x direction (typically east right maps) three cells y direction (typically south ).\nraster’s resolution defines distance x- y-step specified header.\nheader vital component raster datasets specifies pixels relate spatial 
coordinates (see also Chapter 4).chapter teaches manipulate geographic objects based attributes names bus stops vector dataset elevations pixels raster dataset.\nvector data, means techniques subsetting aggregation (see Sections 3.2.1 3.2.3).\nSections 3.2.4 3.2.5 demonstrate join data onto simple feature objects using shared ID create new variables, respectively.\noperations spatial equivalent:\n[ operator base R, example, works equally subsetting objects based attribute spatial objects; can also join attributes two geographic datasets using spatial joins.\ngood news: skills developed chapter cross-transferable.deep dive various types vector attribute operations next section, raster attribute data operations covered.\nCreation raster layers containing continuous categorical attributes extraction cell values one layer (raster subsetting) (Section 3.3.1) demonstrated.\nSection 3.3.2 provides overview ‘global’ raster operations can used summarize entire raster datasets.\nChapter 4 extends methods presented spatial world.","code":""},{"path":"attr.html","id":"vector-attribute-manipulation","chapter":"3 Attribute data operations","heading":"3.2 Vector attribute manipulation","text":"\nGeographic vector datasets well supported R thanks sf class, extends base R’s data.frame.\nLike data frames, sf objects one column per attribute variable (‘name’) one row per observation feature (e.g., per bus station).\nsf objects differ basic data frames geometry column class sfc can contain range geographic entities (single ‘multi’ point, line, polygon features) per row.\ndescribed Chapter 2, demonstrated generic methods plot() summary() work sf objects.\nsf also provides generics allow sf objects behave like regular data frames, shown printing class’s methods:Many (aggregate(), cbind(), merge(), rbind() [) manipulating data frames.\nrbind(), example, binds rows data frames together, one ‘top’ .\n$<- creates new columns.\nkey feature sf objects store spatial non-spatial data way, 
columns data.frame.geometry column sf objects typically called geometry geom name can used.\nfollowing command, example, creates geometry column named g:st_sf(data.frame(n = world$name_long), g = world$geom)sf objects can also extend tidyverse classes data frames, tbl_df tbl.\nThus sf enables full power R’s data analysis capabilities unleashed geographic data, whether use base R tidyverse functions data analysis.\nsf objects can also used high-performance data processing package data.table although, documented issue Rdatatable/data.table#2273, fully compatible sf objects.\nusing capabilities worth re-capping discover basic properties vector data objects.\nLet’s start using base R functions learn world dataset spData package:\nworld contains ten non-geographic columns (one geometry list column) almost 200 rows representing world’s countries.\nfunction st_drop_geometry() keeps attributes data sf object, words removing geometry:Dropping geometry column working attribute data can useful; data manipulation processes can run faster work attribute data geometry columns always needed.\ncases, however, makes sense keep geometry column, explaining column ‘sticky’ (remains attribute operations unless specifically dropped).\nNon-spatial data operations sf objects change object’s geometry appropriate (e.g., dissolving borders adjacent polygons following aggregation).\nBecoming skilled geographic attribute data manipulation means becoming skilled manipulating data frames.many applications, tidyverse package dplyr (Wickham et al. 
2023) offers effective approach working data frames.\nTidyverse compatibility advantage sf predecessor sp, pitfalls avoid (see supplementary tidyverse-pitfalls vignette geocompx.org details).","code":"\nmethods(class = \"sf\") # methods for sf objects, first 12 shown\n#> [1] [ [[<- $<- aggregate \n#> [5] as.data.frame cbind coerce filter \n#> [9] identify initialize merge plot \nclass(world) # it's an sf object and a (tidy) data frame\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\ndim(world) # it is a 2 dimensional object, with 177 rows and 11 columns\n#> [1] 177 11\nworld_df = st_drop_geometry(world)\nclass(world_df)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\"\nncol(world_df)\n#> [1] 10"},{"path":"attr.html","id":"vector-attribute-subsetting","chapter":"3 Attribute data operations","heading":"3.2.1 Vector attribute subsetting","text":"Base R subsetting methods include operator [ function subset().\nkey dplyr subsetting functions filter() slice() subsetting rows, select() subsetting columns.\napproaches preserve spatial components attribute data sf objects, using operator $ dplyr function pull() return single attribute column vector lose geometry data, see.\nsection focuses subsetting sf data frames; details subsetting vectors non-geographic data frames recommend reading section section 2.7 Introduction R (R Core Team 2021) Chapter 4 Advanced R Programming (Wickham 2019), respectively.\n[ operator can subset rows columns.\nIndices placed inside square brackets placed directly data frame object name specify elements keep.\ncommand object[, j] means ‘return rows represented columns represented j’, j typically contain integers TRUEs FALSEs (indices can also character strings, indicating row column names).\nobject[5, 1:3], example, means ’return data containing 5th row columns 1 3: result data frame 1 row 3 columns, fourth geometry column ’s sf object.\nLeaving j empty returns rows columns, world[1:5, ] returns first five rows 11 columns.\nexamples demonstrate 
subsetting base R.\nGuess number rows columns sf data frames returned command check results computer (see end chapter exercises):demonstration utility using logical vectors subsetting shown code chunk .\ncreates new object, small_countries, containing nations whose surface area smaller 10,000 km2.intermediary i_small (short index representing small countries) logical vector can used subset seven smallest countries world surface area.\nconcise command, omits intermediary object, generates result:base R function subset() provides another way achieve result:\nBase R functions mature, stable widely used, making rock solid choice, especially contexts reproducibility reliability key.\ndplyr functions enable ‘tidy’ workflows people (authors book included) find intuitive productive interactive data analysis, especially combined code editors RStudio enable auto-completion column names.\nKey functions subsetting data frames (including sf data frames) dplyr functions demonstrated .select() selects columns name position.\nexample, select two columns, name_long pop, following command:Note: equivalent command base R (world[, c(\"name_long\", \"pop\")]), sticky geom column remains.\nselect() also allows selecting range columns help : operator:can remove specific columns - operator:Subset rename columns time new_name = old_name syntax:worth noting command concise base R equivalent, requires two lines code:select() also works ‘helper functions’ advanced subsetting operations, including contains(), starts_with() num_range() (see help page ?select details).dplyr verbs return data frame, can extract single column vector pull().\ncan get result base R list subsetting operators $ [[, three following commands return numeric vector:slice() row-equivalent select().\nfollowing code chunk, example, selects rows 1 6:filter() dplyr’s equivalent base R’s subset() function.\nkeeps rows matching given criteria, e.g., countries area certain threshold, high average life expectancy, shown following 
examples:standard set comparison operators can used filter() function, illustrated Table 3.1:TABLE 3.1: Comparison operators return Booleans (TRUE/FALSE).","code":"\nworld[1:6, ] # subset rows by position\nworld[, 1:3] # subset columns by position\nworld[1:6, 1:3] # subset rows and columns by position\nworld[, c(\"name_long\", \"pop\")] # columns by name\nworld[, c(T, T, F, F, F, F, F, T, T, F, F)] # by logical indices\nworld[, 888] # an index representing a non-existent column\ni_small = world$area_km2 < 10000\nsummary(i_small) # a logical vector\n#> Mode FALSE TRUE \n#> logical 170 7\nsmall_countries = world[i_small, ]\nsmall_countries = world[world$area_km2 < 10000, ]\nsmall_countries = subset(world, area_km2 < 10000)\nworld1 = select(world, name_long, pop)\nnames(world1)\n#> [1] \"name_long\" \"pop\" \"geom\"\n# all columns between name_long and pop (inclusive)\nworld2 = select(world, name_long:pop)\n# all columns except subregion and area_km2 (inclusive)\nworld3 = select(world, -subregion, -area_km2)\nworld4 = select(world, name_long, population = pop)\nworld5 = world[, c(\"name_long\", \"pop\")] # subset columns by name\nnames(world5)[names(world5) == \"pop\"] = \"population\" # rename column manually\npull(world, pop)\nworld$pop\nworld[[\"pop\"]]\nslice(world, 1:6)\nworld7 = filter(world, area_km2 < 10000) # countries with a small area\nworld7 = filter(world, lifeExp > 82) # with high life expectancy"},{"path":"attr.html","id":"chaining-commands-with-pipes","chapter":"3 Attribute data operations","heading":"3.2.2 Chaining commands with pipes","text":"\nKey workflows using dplyr functions ‘pipe’ operator %>% (since R 4.1.0 native pipe |>), takes name Unix pipe | (Grolemund Wickham 2016).\nPipes enable expressive code: output previous function becomes first argument next function, enabling chaining.\nillustrated , countries Asia filtered world dataset, next object subset columns (name_long continent) first five rows (result shown).chunk shows pipe operator 
allows commands written clear order:\nrun top bottom (line--line) left right.\nalternative piped operations nested function calls, harder read:Another alternative split operations multiple self-contained lines, recommended developing new R packages, approach advantage saving intermediate results distinct names can later inspected debugging purposes (approach disadvantages verbose cluttering global environment undertaking interactive analysis):approach advantages disadvantages, importance depend programming style applications.\ninteractive data analysis, focus chapter, find piped operations fast intuitive, especially combined RStudio/VSCode shortcuts creating pipes auto-completing variable names.","code":"\nworld7 = world |>\n filter(continent == \"Asia\") |>\n select(name_long, continent) |>\n slice(1:5)\nworld8 = slice(\n select(\n filter(world, continent == \"Asia\"),\n name_long, continent),\n 1:5)\nworld9_filtered = filter(world, continent == \"Asia\")\nworld9_selected = select(world9_filtered, continent)\nworld9 = slice(world9_selected, 1:5)"},{"path":"attr.html","id":"vector-attribute-aggregation","chapter":"3 Attribute data operations","heading":"3.2.3 Vector attribute aggregation","text":"\nAggregation involves summarizing data one ‘grouping variables’, typically columns data frame aggregated (geographic aggregation covered next chapter).\nexample attribute aggregation calculating number people per continent based country-level data (one row per country).\nworld dataset contains necessary ingredients: columns pop continent, population grouping variable, respectively.\naim find sum() country populations continent, resulting smaller data frame (aggregation form data reduction can useful early step working large datasets).\ncan done base R function aggregate() follows:result non-spatial data frame six rows, one per continent, two columns reporting name population continent (see Table 3.2 results top 3 populous continents).aggregate() generic function means 
behaves differently depending inputs.\nsf provides method aggregate.sf() activated automatically x sf object argument provided:resulting world_agg2 object spatial object containing 8 features representing continents world (open ocean).\ngroup_by() |> summarize() dplyr equivalent aggregate(), variable name provided group_by() function specifying grouping variable information summarized passed summarize() function, shown :approach may seem complex benefits: flexibility, readability, control new column names.\nflexibility illustrated command , calculates population also area number countries continent:previous code chunk Pop, Area N column names result, sum() n() aggregating functions.\naggregating functions return sf objects rows representing continents geometries containing multiple polygons representing land mass associated islands (works thanks geometric operation ‘union’, explained Section 5.2.7).\nLet’s combine learned far dplyr functions, chaining multiple commands summarize attribute data countries worldwide continent.\nfollowing command calculates population density (mutate()), arranges continents number countries contain (arrange()), keeps 3 populous continents (slice_max()), result presented Table 3.2):TABLE 3.2: top 3 populous continents ordered number countries.","code":"\nworld_agg1 = aggregate(pop ~ continent, FUN = sum, data = world,\n na.rm = TRUE)\nclass(world_agg1)\n#> [1] \"data.frame\"\nworld_agg2 = aggregate(world[\"pop\"], by = list(world$continent), FUN = sum, \n na.rm = TRUE)\nclass(world_agg2)\n#> [1] \"sf\" \"data.frame\"\nnrow(world_agg2)\n#> [1] 8\nworld_agg3 = world |>\n group_by(continent) |>\n summarize(pop = sum(pop, na.rm = TRUE))\nworld_agg4 = world |> \n group_by(continent) |> \n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n())\nworld_agg5 = world |> \n st_drop_geometry() |> # drop the geometry for speed\n select(pop, continent, area_km2) |> # subset the columns of interest \n group_by(continent) |> # group by 
continent and summarize:\n summarize(Pop = sum(pop, na.rm = TRUE), Area = sum(area_km2), N = n()) |>\n mutate(Density = round(Pop / Area)) |> # calculate population density\n slice_max(Pop, n = 3) |> # keep only the top 3\n arrange(desc(N)) # arrange in order of n. countries"},{"path":"attr.html","id":"vector-attribute-joining","chapter":"3 Attribute data operations","heading":"3.2.4 Vector attribute joining","text":"Combining data different sources common task data preparation.\nJoins combining tables based shared ‘key’ variable.\ndplyr multiple join functions including left_join() inner_join() — see vignette(\"two-table\") full list.\nfunction names follow conventions used database language SQL (Grolemund Wickham 2016, chap. 13); using join non-spatial datasets sf objects focus section.\ndplyr join functions work data frames sf objects, important difference geometry list column.\nresult data joins can either sf data.frame object.\ncommon type attribute join spatial data takes sf object first argument adds columns data.frame specified second argument.\ndemonstrate joins, combine data coffee production world dataset.\ncoffee data data frame called coffee_data spData package (see ?coffee_data details).\nthree columns:\nname_long names major coffee-producing nations coffee_production_2016 coffee_production_2017 contain estimated values coffee production units 60-kg bags year.\n‘left join’, preserves first dataset, merges world coffee_data.input datasets share ‘key variable’ (name_long) join worked without using argument (see ?left_join details).\nresult sf object identical original world object two new variables (column indices 11 12) coffee production.\ncan plotted map, illustrated Figure 3.1, generated plot() function .\nFIGURE 3.1: World coffee production (thousand 60-kg bags) country, 2017. 
Source: International Coffee Organization.\njoining work, ‘key variable’ must supplied datasets.\ndefault, dplyr uses variables matching names.\ncase, coffee_data world objects contained variable called name_long, explaining message Joining '= join_by(name_long)'.\nmajority cases variable names , two options:Rename key variable one objects match.Use argument specify joining variables.latter approach demonstrated renamed version coffee_data.Note name original object kept, meaning world_coffee new object world_coffee2 identical.\nAnother feature result number rows original dataset.\nAlthough 47 rows data coffee_data, 177 country records kept intact world_coffee world_coffee2:\nrows original dataset match assigned NA values new coffee production variables.\nwant keep countries match key variable?\ncase inner join can used.Note result inner_join() 45 rows compared 47 coffee_data.\nhappened remaining rows?\ncan identify rows match using setdiff() function follows:result shows Others accounts one row present world dataset name Democratic Republic Congo accounts :\nabbreviated, causing join miss .\nfollowing command uses string matching (regex) function stringr package confirm Congo, Dem. Rep. 
.fix issue, create new version coffee_data update name.\ninner_join()ing updated data frame returns result 46 coffee-producing nations.also possible join direction: starting non-spatial dataset adding variables simple features object.\ndemonstrated , starts coffee_data object adds variables original world dataset.\ncontrast previous joins, result another simple feature object, data frame form tidyverse tibble:\noutput join tends match first argument.section covers majority joining use cases.\ninformation, recommend reading chapter Relational data Grolemund Wickham (2016), join vignette geocompkg package accompanies book, documentation describing joins data.table packages.\nAdditionally, spatial joins covered next chapter (Section 4.2.5).","code":"\nworld_coffee = left_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nclass(world_coffee)\n#> [1] \"sf\" \"tbl_df\" \"tbl\" \"data.frame\"\nnames(world_coffee)\n#> [1] \"iso_a2\" \"name_long\" \"continent\" \n#> [4] \"region_un\" \"subregion\" \"type\" \n#> [7] \"area_km2\" \"pop\" \"lifeExp\" \n#> [10] \"gdpPercap\" \"geom\" \"coffee_production_2016\"\n#> [13] \"coffee_production_2017\"\nplot(world_coffee[\"coffee_production_2017\"])\ncoffee_renamed = rename(coffee_data, nm = name_long)\nworld_coffee2 = left_join(world, coffee_renamed, by = join_by(name_long == nm))\nworld_coffee_inner = inner_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nnrow(world_coffee_inner)\n#> [1] 45\nsetdiff(coffee_data$name_long, world$name_long)\n#> [1] \"Congo, Dem. Rep. 
of\" \"Others\"\ndrc = stringr::str_subset(world$name_long, \"Dem*.+Congo\")\ndrc\n#> [1] \"Democratic Republic of the Congo\"\ncoffee_data$name_long[grepl(\"Congo,\", coffee_data$name_long)] = drc\nworld_coffee_match = inner_join(world, coffee_data)\n#> Joining with `by = join_by(name_long)`\nnrow(world_coffee_match)\n#> [1] 46\ncoffee_world = left_join(coffee_data, world)\n#> Joining with `by = join_by(name_long)`\nclass(coffee_world)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"vec-attr-creation","chapter":"3 Attribute data operations","heading":"3.2.5 Creating attributes and removing spatial information","text":"\nOften, like create new column based already existing columns.\nexample, want calculate population density country.\nneed divide population column, pop, area column, area_km2 unit area square kilometers.\nUsing base R, can type:\nAlternatively, can use one dplyr functions - mutate() transmute().\nmutate() adds new columns penultimate position sf object (last one reserved geometry):difference mutate() transmute() latter drops existing columns (except sticky geometry column).\nunite() tidyr package (provides many useful functions reshaping datasets, including pivot_longer()) pastes together existing columns.\nexample, want combine continent region_un columns new column named con_reg.\nAdditionally, can define separator (: colon :) defines values input columns joined, original columns removed (: TRUE).resulting sf object new column called con_reg representing continent region country, e.g., South America:Americas Argentina South America countries.\ntidyr’s separate() function opposite unite(): splits one column multiple columns using either regular expression character positions.\ndplyr function rename() base R function setNames() useful renaming columns.\nfirst replaces old name new one.\nfollowing command, example, renames lengthy name_long column simply name:\nsetNames() changes column names , requires character vector name 
matching column.\nillustrated , outputs world object, short names:\nattribute data operations preserve geometry simple features.\nSometimes makes sense remove geometry, example speed-aggregation.\nst_drop_geometry(), manually commands select(world, -geom), shown .20","code":"\nworld_new = world # do not overwrite our original data\nworld_new$pop_dens = world_new$pop / world_new$area_km2\nworld_new2 = world |> \n mutate(pop_dens = pop / area_km2)\nworld_unite = world |>\n tidyr::unite(\"con_reg\", continent:region_un, sep = \":\", remove = TRUE)\nworld_separate = world_unite |>\n tidyr::separate(con_reg, c(\"continent\", \"region_un\"), sep = \":\")\nworld |> \n rename(name = name_long)\nnew_names = c(\"i\", \"n\", \"c\", \"r\", \"s\", \"t\", \"a\", \"p\", \"l\", \"gP\", \"geom\")\nworld_new_names = world |>\n setNames(new_names)\nworld_data = world |> st_drop_geometry()\nclass(world_data)\n#> [1] \"tbl_df\" \"tbl\" \"data.frame\""},{"path":"attr.html","id":"manipulating-raster-objects","chapter":"3 Attribute data operations","heading":"3.3 Manipulating raster objects","text":"contrast vector data model underlying simple features (represents points, lines polygons discrete entities space), raster data represent continuous surfaces.\nsection shows raster objects work creating scratch, building Section 2.3.2.\nunique structure, subsetting operations raster datasets work different way, demonstrated Section 3.3.1.\nfollowing code recreates raster dataset used Section 2.3.4, result illustrated Figure 3.2.\ndemonstrates rast() function works create example raster named elev (representing elevations).result raster object 6 rows 6 columns (specified nrow ncol arguments), minimum maximum spatial extent x y direction (xmin, xmax, ymin, ymax).\nvals argument sets values cell contains: numeric data ranging 1 36 case.\nRaster objects can also contain categorical values class logical factor variables R.\nfollowing code creates raster datasets shown Figure 3.2:\nraster object 
stores corresponding look-table “Raster Attribute Table” (RAT) list data frames, can viewed cats(grain) (see ?cats() information).\nelement list layer raster.\nalso possible use function levels() retrieving adding new replacing existing factor levels.\nFIGURE 3.2: Raster datasets numeric (left) categorical values (right).\n","code":"\nelev = rast(nrows = 6, ncols = 6,\n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = 1:36)\ngrain_order = c(\"clay\", \"silt\", \"sand\")\ngrain_char = sample(grain_order, 36, replace = TRUE)\ngrain_fact = factor(grain_char, levels = grain_order)\ngrain = rast(nrows = 6, ncols = 6, \n xmin = -1.5, xmax = 1.5, ymin = -1.5, ymax = 1.5,\n vals = grain_fact)\ngrain2 = grain # do not overwrite the original data\nlevels(grain2) = data.frame(value = c(0, 1, 2), wetness = c(\"wet\", \"moist\", \"dry\"))\nlevels(grain2)\n#> [[1]]\n#> value wetness\n#> 1 0 wet\n#> 2 1 moist\n#> 3 2 dry"},{"path":"attr.html","id":"raster-subsetting","chapter":"3 Attribute data operations","heading":"3.3.1 Raster subsetting","text":"Raster subsetting done base R operator [, accepts variety inputs:\nRow-column indexingCell IDsCoordinatesAnother spatial objectHere, show first two options since can considered non-spatial operations.\nneed spatial object subset another output spatial object, refer spatial subsetting.\nTherefore, latter two options shown next chapter (see Section 4.3.1).\nfirst two subsetting options demonstrated commands —\nreturn value top left pixel raster object elev (results shown).Subsetting multilayered raster objects return cell value(s) layer.\nexample, two_layers = c(grain, elev); two_layers[1] returns data frame one row two columns — one layer.\nextract values can also use values().Cell values can modified overwriting existing values conjunction subsetting operation.\nfollowing code chunk, example, sets upper left cell elev 0 (results shown):Leaving square brackets empty shortcut version values() retrieving values 
raster.\nMultiple cells can also modified way:Replacing values multilayered rasters can done matrix many columns layers rows replaceable cells (results shown):","code":"\n# row 1, column 1\nelev[1, 1]\n# cell ID 1\nelev[1]\nelev[1, 1] = 0\nelev[]\nelev[1, c(1, 2)] = 0\ntwo_layers = c(grain, elev) \ntwo_layers[1] = cbind(c(1), c(4))\ntwo_layers[]"},{"path":"attr.html","id":"summarizing-raster-objects","chapter":"3 Attribute data operations","heading":"3.3.2 Summarizing raster objects","text":"terra contains functions extracting descriptive statistics entire rasters.\nPrinting raster object console typing name returns minimum maximum values raster.\nsummary() provides common descriptive statistics – minimum, maximum, quartiles number NAs continuous rasters number cells class categorical rasters.\nsummary operations standard deviation (see ) custom summary statistics can calculated global().\nAdditionally, freq() function allows get frequency table categorical values.Raster value statistics can visualized variety ways.\nSpecific functions boxplot(), density(), hist() pairs() work also raster objects, demonstrated histogram created command (shown).\ncase desired visualization function work raster objects, one can extract raster data plotted help values() (Section 3.3.1).Descriptive raster statistics belong so-called global raster operations.\ntypical raster processing operations part map algebra scheme, covered next chapter (Section 4.3.2).\nfunction names clash packages (e.g., function \nname extract() exist terra \ntidyr packages). may lead unexpected results\nloading packages different order. addition calling\nfunctions verbosely full namespace (e.g.,\ntidyr::extract()) avoid attaching packages \nlibrary(), another way prevent function name clashes \nunloading offending package detach(). \nfollowing command, example, unloads terra\npackage (can also done package tab resides\ndefault right-bottom pane RStudio):\ndetach(\"package:terra\", unload = TRUE, force = TRUE). 
\nforce argument makes sure package detached\neven packages depend . , however, may lead \nrestricted usability packages depending detached package, \ntherefore recommended.\n","code":"\nglobal(elev, sd)\nfreq(grain)\n#> layer value count\n#> 1 1 clay 10\n#> 2 1 silt 13\n#> 3 1 sand 13\nhist(elev)"},{"path":"attr.html","id":"exercises-1","chapter":"3 Attribute data operations","heading":"3.4 Exercises","text":"exercises use us_states us_states_df datasets spData package.\nmust attached package, packages used attribute operations chapter (sf, dplyr, terra) commands library(spData) attempting exercises:us_states spatial object (class sf), containing geometry attributes (including name, region, area, population) states within contiguous United States.\nus_states_df data frame (class data.frame) containing name additional variables (including median income poverty level, years 2010 2015) US states, including Alaska, Hawaii Puerto Rico.\ndata comes United States Census Bureau, documented ?us_states ?us_states_df.E1. Create new object called us_states_name contains NAME column us_states object using either base R ([) tidyverse (select()) syntax.\nclass new object makes geographic?E2. Select columns us_states object contain population data.\nObtain result using different command (bonus: try find three ways obtaining result).\nHint: try use helper functions, contains matches dplyr (see ?contains).E3. Find states following characteristics (bonus find plot ):Belong Midwest region.Belong West region, area 250,000 km2and 2015 population greater 5,000,000 residents (hint: may need use function units::set_units() .numeric()).Belong South region, area larger 150,000 km2 total population 2015 larger 7,000,000 residents.E4. total population 2015 us_states dataset?\nminimum maximum total population 2015?E5. many states region?E6. minimum maximum total population 2015 region?\ntotal population 2015 region?E7. 
Add variables from `us_states_df` to `us_states`, and create a new object called `us_states_stats`.
What function did you use and why?
Which variable is the key in both datasets?
What is the class of the new object?

E8. `us_states_df` has two more rows than `us_states`.
How can you find them? (hint: try to use the `dplyr::anti_join()` function)

E9. What was the population density in 2015 in each state?
What was the population density in 2010 in each state?

E10. How much has population density changed between 2010 and 2015 in each state?
Calculate the change in percentages and map it.

E11. Change the columns' names in `us_states` to lowercase. (Hint: the helper functions `tolower()` and `colnames()` may help.)

E12. Using `us_states` and `us_states_df` create a new object called `us_states_sel`.
The new object should have only two variables, `median_income_15` and `geometry`.
Change the name of the `median_income_15` column to `Income`.

E13. Calculate the change in the number of residents living below the poverty level between 2010 and 2015 for each state. (Hint: See `?us_states_df` for documentation on the poverty level columns.)
Bonus: Calculate the change in the *percentage* of residents living below the poverty level in each state.

E14. What was the minimum, average and maximum state's number of people living below the poverty line in 2015 for each region?
Bonus: What is the region with the largest increase in people living below the poverty line?

E15. Create a raster from scratch with nine rows and columns and a resolution of 0.5 decimal degrees (WGS84).
Fill it with random numbers.
Extract the values of the four corner cells.

E16. What is the most common class of our example raster `grain`?

E17.
Plot the histogram and the boxplot of the `dem.tif` file from the **spDataLarge** package (`system.file("raster/dem.tif", package = "spDataLarge")`).

```r
library(sf)
library(dplyr)
library(terra)
library(spData)
data(us_states)
data(us_states_df)
```

# 4 Spatial data operations

## Prerequisites

This chapter requires the same packages used in Chapter 3:

```r
library(sf)
library(terra)
library(dplyr)
library(spData)
```

## 4.1 Introduction

Spatial operations, including spatial joins between vector datasets and local and focal operations on raster datasets, are a vital part of geocomputation.
This chapter shows how spatial objects can be modified in a multitude of ways based on their location and shape.
Many spatial operations have a non-spatial (attribute) equivalent, so concepts such as subsetting and joining datasets demonstrated in the previous chapter are applicable here.
This is especially true of vector operations: Section 3.2 on vector attribute manipulation provides the basis for understanding its spatial counterpart, namely spatial subsetting (covered in Section 4.2.1).
Spatial joining (Sections 4.2.5, 4.2.6 and 4.2.8) and aggregation (Section 4.2.7) also have non-spatial counterparts, covered in the previous chapter.

Spatial operations differ from non-spatial operations in a number of ways, however:
spatial joins, for example, can be done in a number of ways, including matching entities that intersect with or are within a certain distance of the target dataset, while the attribute joins discussed in Section 3.2.4 in the previous chapter can only be done in one way (except when using fuzzy joins, as described in the documentation of the **fuzzyjoin** package).
Different *types* of spatial relationship between objects, including intersects and disjoint, are described in Sections 4.2.2 and 4.2.4.
Another unique aspect of spatial objects is distance: all spatial objects are related through space, and distance calculations can be used to explore
the strength of this relationship, as described in the context of vector data in Section 4.2.3.

Spatial operations on raster objects include subsetting, covered in Section 4.3.1.
Map algebra covers a range of operations that modify raster cell values, with or without reference to surrounding cell values.
The concept of map algebra, vital for many applications, is introduced in Section 4.3.2; local, focal and zonal map algebra operations are covered in Sections 4.3.3, 4.3.4 and 4.3.5, respectively.
Global map algebra operations, which generate summary statistics representing an entire raster dataset, and distance calculations on rasters are discussed in Section 4.3.6.
Next, the relation between map algebra and vector operations is discussed in Section 4.3.7.
In the final section before the exercises (4.3.8), the process of merging two raster datasets is discussed and demonstrated with reference to a reproducible example.

## 4.2 Spatial operations on vector data

This section provides an overview of spatial operations on vector geographic data represented as simple features in the **sf** package.
Section 4.3 presents spatial operations on raster datasets using classes and functions from the **terra** package.

### 4.2.1 Spatial subsetting

Spatial subsetting is the process of taking a spatial object and returning a new object containing only features that *relate* in space to another object.
Analogous to *attribute subsetting* (covered in Section 3.2.1), subsets of `sf` data frames can be created with the square bracket (`[`) operator using the syntax `x[y, , op = st_intersects]`, where `x` is an `sf` object from which a subset of rows will be returned, `y` is the 'subsetting object', and `op = st_intersects` is an optional argument that specifies the topological relation (also known as the binary predicate) used to do the subsetting.
The default topological relation used when an `op` argument is not provided is `st_intersects()`: the command `x[y, ]` is identical to `x[y, , op = st_intersects]` shown above, but not to `x[y, , op = st_disjoint]` (the meaning of these and other topological relations is described in the next section).
The `filter()` function from the **tidyverse** can also be used,
but this approach is more verbose, as we will see in the examples below.

To demonstrate spatial subsetting, we will use the `nz` and `nz_height` datasets in the **spData** package, which contain geographic data on the 16 main regions and the 101 highest points in New Zealand, respectively (Figure 4.1), in a projected coordinate reference system.
The following code chunk creates an object representing Canterbury, then uses spatial subsetting to return all high points in the region.

FIGURE 4.1: Illustration of spatial subsetting with red triangles representing the 101 high points in New Zealand, clustered near the central Canterbury region (left). The points in Canterbury were created with the `[` subsetting operator (highlighted in gray, right).

Like attribute subsetting, the command `x[y, ]` (equivalent to `nz_height[canterbury, ]`) subsets features of a *target* `x` using the contents of a *source* object `y`.
Instead of `y` being a vector of class `logical` or `integer`, however, for spatial subsetting both `x` and `y` must be geographic objects.
Specifically, objects used for spatial subsetting in this way must have the class `sf` or `sfc`: both `nz` and `nz_height` are geographic vector data frames of class `sf`, and the result of the operation is another `sf` object representing the features in the target `nz_height` object that intersect with (in this case, high points that are located within) the `canterbury` region.

Various *topological relations* can be used for spatial subsetting, which determine the type of spatial relationship that features in the target object must have with the subsetting object to be selected.
These include *touches*, *crosses* or *within*, as we will see shortly in Section 4.2.2.
The default setting, `st_intersects`, is a 'catch all' topological relation that returns features in the target that *touch*, *cross* or are *within* the source 'subsetting' object.
Alternative spatial operators can be specified with the `op =` argument, as demonstrated in the following command, which returns the opposite of `st_intersects()`: points that do *not* intersect with Canterbury (see Section 4.2.2).

For many applications, this is all you'll need to know about spatial subsetting of vector data: it just works.
If you are impatient to learn about more topological relations, beyond `st_intersects()` and `st_disjoint()`, skip to the next section (4.2.2).
If you're interested in the details, including other ways of subsetting, read on.

Another way of doing spatial subsetting uses objects returned by topological operators.
These objects can be useful in their own right, for example when exploring the graph network of relationships
between contiguous regions, but they can also be used for subsetting, as demonstrated in the code chunk below.

The code chunk below creates an object of class `sgbp` (a sparse geometry binary predicate, a list of length `x` in the spatial operation) and then converts it into a logical vector `sel_logical` (containing only `TRUE` and `FALSE` values, something that can also be used by **dplyr**'s filter function).
The function `lengths()` identifies which features in `nz_height` intersect with *any* objects in `y`.
In this case 1 is the greatest possible value, but for more complex operations one could use the method to subset only features that intersect with, for example, 2 or more features from the source object.

The same result can also be achieved with the **sf** function `st_filter()`, which was created to increase compatibility between `sf` objects and **dplyr** data manipulation code:

At this point, there are three identical (in all but row names) versions of `canterbury_height`: one created using the `[` operator, one created via an intermediary selection object, and another using **sf**'s convenience function `st_filter()`.

The next section explores different types of spatial relation, also known as binary predicates, that can be used to identify whether or not two features are spatially related.

```r
canterbury = nz |> filter(Name == "Canterbury")
canterbury_height = nz_height[canterbury, ]
nz_height[canterbury, , op = st_disjoint]
sel_sgbp = st_intersects(x = nz_height, y = canterbury)
class(sel_sgbp)
#> [1] "sgbp" "list"
sel_sgbp
#> Sparse geometry binary predicate list of length 101, where the
#> predicate was `intersects'
#> first 10 elements:
#>  1: (empty)
#>  2: (empty)
#>  3: (empty)
#>  4: (empty)
#>  5: 1
#>  6: 1
....
sel_logical = lengths(sel_sgbp) > 0
canterbury_height2 = nz_height[sel_logical, ]
canterbury_height3 = nz_height |>
  st_filter(y = canterbury, .predicate = st_intersects)
```

### 4.2.2 Topological relations

Topological relations describe the spatial relationships between objects.
"Binary topological relationships", to give them their full name, are logical statements (in that the answer can only be `TRUE` or `FALSE`) about the spatial relationships between two objects defined by ordered sets of points (typically forming points, lines and polygons) in two dimensions (Egenhofer and Herring 1990).
That may sound rather abstract and, indeed, the definition and classification of topological relations is based on mathematical foundations first published in book form in 1966 (Spanier 1995), with the field of algebraic topology continuing into the 21st century (Dieck 2008).

Despite their mathematical origins, topological relations can be understood intuitively with reference to visualizations of commonly used functions that test for common types of spatial relationships.
Figure 4.2 shows a variety of geometry pairs and their associated relations.
The third and fourth pairs in Figure 4.2 (from left to right) demonstrate that, for some relations, order is important.
While the relations *equals*, *intersects*, *crosses*, *touches* and *overlaps* are symmetrical, meaning that if `function(x, y)` is true, `function(y, x)` will also be true, relations in which the order of the geometries matters, such as *contains* and *within*, are not.
Notice that each geometry pair has a "DE-9IM" string such as FF2F11212, described in the next section.

FIGURE 4.2: Topological relations between vector geometries, inspired by Figures 1 and 2 in Egenhofer and Herring (1990). The relations for which `function(x, y)` is true are printed for each geometry pair, with `x` represented in pink and `y` represented in blue.
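The symmetry property described above can be illustrated without **sf**, using 1-D intervals in base R. This is a toy analogue only; the helper names `intersects` and `contains` are made up for this sketch and are not **sf** functions:

```r
# Toy 1-D analogue of symmetric vs. asymmetric spatial relations
# (made-up helpers, not sf functions): intervals are c(min, max)
intersects = function(a, b) a[1] <= b[2] && b[1] <= a[2]
contains   = function(a, b) a[1] <= b[1] && b[2] <= a[2]
x = c(0, 4); y = c(1, 3)               # y lies inside x
c(intersects(x, y), intersects(y, x))  # TRUE TRUE: symmetric
c(contains(x, y), contains(y, x))      # TRUE FALSE: order matters
```

The same asymmetry is what distinguishes `st_contains(x, y)` from `st_within(x, y)` in two dimensions.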
The nature of the spatial relationship in each pair is described by its Dimensionally Extended 9-Intersection Model string.

In **sf**, functions for testing different types of topological relations are called 'binary predicates', as described in the vignette *Manipulating Simple Feature Geometries*, which can be viewed with the command `vignette("sf3")`, and in the help page `?geos_binary_pred`.
To see how topological relations work in practice, let's create a simple reproducible example, building on the relations illustrated in Figure 4.2 and consolidating knowledge of how vector geometries are represented, from the previous chapter (Section 2.2.4).
Note that to create tabular data representing the coordinates (x and y) of the polygon vertices, we use the base R function `cbind()` to create a matrix representing coordinate points, then a `POLYGON`, and finally an `sfc` object, as described in Chapter 2.

We create additional geometries to demonstrate spatial relations with the following commands which, when plotted on top of the polygon created above, relate in space to one another, as shown in Figure 4.3.
Note the use of the function `st_as_sf()` and its argument `coords` to efficiently convert from a data frame containing columns representing coordinates to an `sf` object containing points.

FIGURE 4.3: Points, line and polygon objects arranged to illustrate topological relations.

A simple query is: which of the points in `point_sf` intersect in some way with the polygon `polygon_sfc`?
The question can be answered by inspection (points 1 and 3 are touching and within the polygon, respectively).
It can also be answered with the spatial predicate `st_intersects()`, as follows.

The result should match your intuition:
positive (`1`) results are returned for the first and third points, and a negative result (represented by an empty vector) for the second, which is outside the polygon's border.
What may be unexpected is that the result comes in the form of a list of vectors.
This *sparse matrix* output only registers a relation if one exists, reducing the memory requirements of topological operations on multi-feature objects.
As we saw in the previous section, a *dense matrix* consisting of `TRUE` and `FALSE` values is returned when `sparse = FALSE`.

In the above output, each row represents a feature in the target (argument `x`) object and each column represents a feature in the selecting object (`y`).
In this case, there is only one feature in the `y` object `polygon_sfc`, so the result, which can be used for subsetting as we saw in Section 4.2.1, has only one column.

`st_intersects()` returns `TRUE` even in cases where the features just touch: *intersects* is a 'catch-all' topological operation which identifies many types of spatial relation, as illustrated in Figure 4.2.
More restrictive questions include: which points lie *within* the polygon, and which features have a *shared boundary* with `y`?
These can be answered as follows (results not shown):

Note that although the first point *touches* the boundary of the polygon, it is not within it; the third point is within the polygon but does not touch any part of its border.
The opposite of `st_intersects()` is `st_disjoint()`, which returns only objects that do not spatially relate in any way to the selecting object (note that `[, 1]` converts the result into a vector).

The function `st_is_within_distance()` detects features that *almost touch* the selection object, and has an additional `dist` argument.
It can be used to set how close target objects need to be before they are selected.
The 'within distance' binary spatial predicate is demonstrated in the code chunk below, the results of which show that every point is within 0.2 units of the polygon.

Note that although point 2 is more than 0.2 units of *distance* from the nearest vertex of `polygon_sfc`, it is still selected when the distance is set to 0.2.
This is because distance is measured to the nearest *edge*, in this case the part of the polygon that lies directly above point 2 in Figure 4.3.
(You can verify that the actual distance between point 2 and the polygon is 0.13 with the command `st_distance(point_sf, polygon_sfc)`.)

```r
polygon_matrix = cbind(
  x = c(0, 0, 1, 1, 0),
  y = c(0, 1, 1, 0.5, 0)
)
polygon_sfc = st_sfc(st_polygon(list(polygon_matrix)))
point_df = data.frame(
  x = c(0.2, 0.7, 0.4),
  y = c(0.1, 0.2, 0.8)
)
point_sf = st_as_sf(point_df, coords = c("x", "y"))
st_intersects(point_sf, polygon_sfc)
#> Sparse geometry binary predicate... `intersects'
#>  1: 1
#>  2: (empty)
#>  3: 1
st_intersects(point_sf, polygon_sfc, sparse = FALSE)
#>       [,1]
#> [1,]  TRUE
#> [2,] FALSE
#> [3,]  TRUE
st_within(point_sf, polygon_sfc)
st_touches(point_sf, polygon_sfc)
st_disjoint(point_sf, polygon_sfc, sparse = FALSE)[, 1]
#> [1] FALSE  TRUE FALSE
st_is_within_distance(point_sf, polygon_sfc, dist = 0.2, sparse = FALSE)[, 1]
#> [1] TRUE TRUE TRUE
```

### 4.2.3 Distance relations

While the topological relations presented in the previous section are binary (a feature either intersects with another or it does not), distance relations are *continuous*.
The distance between two `sf` objects is calculated with `st_distance()`, which is also used behind the scenes in Section 4.2.6 for distance-based joins.
This is illustrated in the code chunk below, which finds the distance between the highest point in New Zealand and the geographic centroid of the Canterbury region, created in Section 4.2.1.

There are two potentially surprising things about the result:

- It has `units`, telling us the distance is 100,000 meters, not 100,000 inches, or any other measure of distance
- It is returned as a matrix, even though the result contains only a single value

This second feature hints at another useful feature of `st_distance()`: its ability to return *distance matrices* between all combinations of features in objects `x` and `y`.
This is illustrated in the command below, which finds the distances between the first three features in `nz_height` and the Otago and Canterbury regions of New Zealand, represented by the object `co`.

Note that the distance between the second and third features in `nz_height` and the second feature in `co` is zero.
This demonstrates the fact that distances between points and polygons refer to the distance to *any part of the polygon*:
the second and third points in `nz_height` are *in* Otago, which can be verified by plotting them (result not shown):

```r
nz_highest = nz_height |> slice_max(n = 1, order_by = elevation)
canterbury_centroid = st_centroid(canterbury)
st_distance(nz_highest, canterbury_centroid)
#> Units: [m]
#>        [,1]
#> [1,] 115540
co = filter(nz, grepl("Canter|Otag", Name))
st_distance(nz_height[1:3, ], co)
#> Units: [m]
#>        [,1]  [,2]
#> [1,] 123537 15498
#> [2,]  94283     0
#> [3,]  93019     0
plot(st_geometry(co)[2])
plot(st_geometry(nz_height)[2:3], add = TRUE)
```

### 4.2.4 DE-9IM strings

Underlying the binary predicates demonstrated in the previous section is the Dimensionally Extended 9-Intersection Model (DE-9IM).
As the cryptic name suggests, this is not an easy topic to understand, but it is worth knowing about because it underlies many spatial operations and enables the creation of custom spatial predicates.
The model was originally labelled "DE + 9IM" by its inventors, referring to the "dimension of the intersections of boundaries, interiors, and exteriors of two features" (Clementini and Di Felice 1995), but it is now referred to as DE-9IM (Shen, Chen, and Liu 2018).
DE-9IM is applicable to 2-dimensional objects (points, lines and polygons) in Euclidean space, meaning that the model (and the software implementing it, such as GEOS) assumes you are working with data in a projected coordinate reference system, as described in Chapter 7.

To demonstrate how DE-9IM strings work, let's take a look at the various ways that the first geometry pair in Figure 4.2 relate.
Figure 4.4 illustrates the 9-intersection model (9IM), which shows the intersections between every combination of each object's interior, boundary and exterior: with each component of the first object `x` arranged as columns and each component of `y` arranged as rows, a facetted graphic is created with the intersection between each pair of elements highlighted.

FIGURE 4.4: Illustration of how the Dimensionally Extended 9 Intersection Model (DE-9IM) works. Colors in the legend represent the overlap between different components.
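The derivation of a DE-9IM string from such a matrix of intersection dimensions can be sketched in base R. This is a toy illustration only; in practice the strings are computed by GEOS via `st_relate()`:

```r
# Toy sketch: flatten a 3x3 matrix of intersection dimensions row-wise
# into a DE-9IM string ("F" would denote an empty intersection)
dims = matrix(c("2", "1", "2",
                "1", "1", "1",
                "2", "1", "2"), nrow = 3, byrow = TRUE)
paste(as.vector(t(dims)), collapse = "")  # "212111212"
```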
The thick lines highlight the 2-dimensional intersections, e.g., between the boundary of object `x` and the interior of object `y`, shown in the middle top facet.

DE-9IM strings are derived from the *dimension* of each type of relation.
In this case, the red intersections in Figure 4.4 have dimensions of 0 (points), 1 (lines) and 2 (polygons), as shown in Table 4.1.

TABLE 4.1: Table showing the relations between the interiors, boundaries and exteriors of geometries x and y.

Flattening this matrix 'row-wise' (meaning concatenating the first row, then the second, then the third) results in the string `212111212`.
Another example will serve to demonstrate the system:
the relation shown in Figure 4.2 (the third polygon pair in the third column and 1st row) can be defined in the DE-9IM system as follows:

- The intersections between the *interior* of the larger object `x` and the interior, boundary and exterior of `y` have dimensions of 2, 1 and 2, respectively
- The intersections between the *boundary* of the larger object `x` and the interior, boundary and exterior of `y` have dimensions of F, F and 1, respectively, where 'F' means 'false': the objects are disjoint
- The intersections between the *exterior* of `x` and the interior, boundary and exterior of `y` have dimensions of F, F and 2, respectively: the exterior of the larger object does not touch the interior or boundary of `y`, but the exteriors of the smaller and larger objects cover the same area

These three components, when concatenated, create the string `212`, `FF1`, `FF2`.
This is the same as the result obtained from the function `st_relate()` (see the source code of this chapter to see how the geometries in Figure 4.2 were created):

Understanding DE-9IM strings allows new binary spatial predicates to be developed.
The help page `?st_relate` contains function definitions for 'queen' and 'rook' relations, in which polygons share a border of only a point or of more than a point, respectively.
'Queen' relations mean that the 'boundary-boundary' relation (the cell in the second column and second row of Table 4.1, or the 5th element of the DE-9IM string) must not be empty, corresponding to the pattern `F***T****`, while for 'rook' relations the same element must be 1 (meaning a linear intersection).
These are implemented as follows:

Building on the object `x` created previously, we can use the newly created functions to find out which elements of the grid are in a 'queen' or 'rook' relation with the middle square of the grid, as follows:

FIGURE 4.5: Demonstration of custom binary spatial predicates for finding 'queen' (left) and 'rook' (right) relations to the central square in a grid with 9 geometries.

```r
xy2sfc = function(x, y) st_sfc(st_polygon(list(cbind(x, y))))
x = xy2sfc(x = c(0, 0, 1, 1, 0), y = c(0, 1, 1, 0.5, 0))
y = xy2sfc(x = c(0.7, 0.7, 0.9, 0.7), y = c(0.8, 0.5, 0.5, 0.8))
st_relate(x, y)
#>      [,1]
#> [1,] "212FF1FF2"
st_queen = function(x, y) st_relate(x, y, pattern = "F***T****")
st_rook = function(x, y) st_relate(x, y, pattern = "F***1****")
grid = st_make_grid(x, n = 3)
grid_sf = st_sf(grid)
grid_sf$queens = lengths(st_queen(grid, grid[5])) > 0
plot(grid, col = grid_sf$queens)
grid_sf$rooks = lengths(st_rook(grid, grid[5])) > 0
plot(grid, col = grid_sf$rooks)
```

### 4.2.5 Spatial joining

Joining two non-spatial datasets relies on a shared 'key' variable, as described in Section 3.2.4.
Spatial data joining applies the same concept, but instead relies on spatial relations, described in the previous section.
As with attribute data, joining adds new columns to the target object (the argument `x` in joining functions) from a source object (`y`).
The process is illustrated by the following example: imagine you have ten points randomly distributed across the Earth's surface and you ask, for the points that are on land, which countries are they in?
Implementing this idea in a reproducible example will build your geographic data handling skills and show how spatial joins work.
The starting point is to create points that are randomly scattered over the Earth's surface.

This scenario is illustrated in Figure 4.6, which shows that the `random_points` object (top left) lacks attribute data, while the `world` object (top right) has attributes, including country names, shown for a sample of countries in the legend.
Spatial joins are implemented with `st_join()`, as illustrated in the code chunk below.
The output is the `random_joined` object, illustrated in Figure 4.6 (bottom left).
Before creating the joined dataset, we use spatial subsetting to create `world_random`, which contains only countries that contain random points, to verify that the number of country names returned in the joined dataset should be four (Figure 4.6, top right panel).

FIGURE 4.6: Illustration of a spatial join.
A new attribute variable is added to the random points (top left) from the source `world` object (top right), resulting in the data represented in the final panel.

By default, `st_join()` performs a left join, meaning that the result is an object containing all rows from `x`, including rows with no match in `y` (see Section 3.2.4), but it can also do inner joins by setting the argument `left = FALSE`.
Like spatial subsetting, the default topological operator used by `st_join()` is `st_intersects()`, which can be changed by setting the `join` argument (see `?st_join` for details).
The example above demonstrates the addition of a column from a polygon layer to a point layer, but the approach works regardless of geometry types.
In such cases, for example where `x` contains polygons, each of which matches multiple objects in `y`, spatial joins will result in duplicate features, creating a new row for each match in `y`.

```r
set.seed(2018) # set seed for reproducibility
(bb = st_bbox(world)) # the world's bounds
#>   xmin   ymin   xmax   ymax
#> -180.0  -89.9  180.0   83.6
random_df = data.frame(
  x = runif(n = 10, min = bb[1], max = bb[3]),
  y = runif(n = 10, min = bb[2], max = bb[4])
)
random_points = random_df |>
  st_as_sf(coords = c("x", "y"), crs = "EPSG:4326") # set coordinates and CRS
world_random = world[random_points, ]
nrow(world_random)
#> [1] 4
random_joined = st_join(random_points, world["name_long"])
```

### 4.2.6 Distance-based joins

Sometimes two geographic datasets do not intersect but still have a strong geographic relationship due to their proximity.
The datasets `cycle_hire` and `cycle_hire_osm`, already attached in the **spData** package, provide a good example.
Plotting them shows that they are often closely related but do not touch, as shown in Figure 4.7, a base version of which is created with the following code:

We can check if any points are the same with `st_intersects()`, as shown below:

FIGURE 4.7: The spatial distribution of cycle hire points in London based on official data (blue) and OpenStreetMap data (red).

Imagine that we need to join the `capacity` variable in `cycle_hire_osm` onto the official 'target' data contained in `cycle_hire`.
In this case, a non-overlapping join is needed.
The simplest method is to use the binary predicate `st_is_within_distance()`, as demonstrated below using a threshold distance of 20 m.
One can set the threshold distance in metric units also for unprojected data (e.g., lon/lat CRSs such as WGS84) if the spherical geometry engine (S2) is enabled, as it is in **sf** by default (see Section 2.2.9).

This shows that there are 438 points in the target object `cycle_hire` within the threshold distance of `cycle_hire_osm`.
How to retrieve the *values* associated with the respective `cycle_hire_osm` points?
The solution is again with `st_join()`, but with an additional `dist` argument (set to 20 m below):

Note that the number of rows in the joined result is greater than in the target.
This is because some cycle hire stations in `cycle_hire` have multiple matches in `cycle_hire_osm`.
To aggregate the values for the overlapping points and return the mean, we can use the aggregation methods learned in Chapter 3, resulting in an object with the same number of rows as the target.

The capacity of nearby stations can be verified by comparing a plot of the capacity of the source `cycle_hire_osm` data with the results in the new object (plots not shown):

The result of this join has used a spatial operation to change the attribute data associated with simple features; the geometry associated with each feature has remained unchanged.

```r
plot(st_geometry(cycle_hire), col = "blue")
plot(st_geometry(cycle_hire_osm), add = TRUE, pch = 3, col = "red")
any(st_intersects(cycle_hire, cycle_hire_osm, sparse = FALSE))
#> [1] FALSE
sel = st_is_within_distance(cycle_hire, cycle_hire_osm,
                            dist = units::set_units(20, "m"))
summary(lengths(sel) > 0)
#>    Mode   FALSE    TRUE
#> logical     304     438
z = st_join(cycle_hire, cycle_hire_osm, st_is_within_distance,
            dist = units::set_units(20, "m"))
nrow(cycle_hire)
#> [1] 742
nrow(z)
#> [1] 762
z = z |>
  group_by(id) |>
  summarize(capacity = mean(capacity))
nrow(z) == nrow(cycle_hire)
#> [1] TRUE
plot(cycle_hire_osm["capacity"])
plot(z["capacity"])
```

### 4.2.7 Spatial aggregation

As with attribute data aggregation, spatial data aggregation condenses data: aggregated outputs have fewer rows than non-aggregated inputs.
Statistical aggregating functions, such as the mean average or sum, summarise multiple values of a variable and return a single value per grouping
variable.
Section 3.2.3 demonstrated how `aggregate()` and `group_by() |> summarize()` condense data based on attribute variables; this section shows how the same functions work with spatial objects.

Returning to the example of New Zealand, imagine you want to find out the average height of high points in each region: it is the geometry of the source (`y`, or `nz` in this case) that defines how values in the target object (`x`, or `nz_height`) are grouped.
This can be done in a single line of code with base R's `aggregate()` method.

The result of the previous command is an `sf` object with the same geometry as the (spatial) aggregating object (`nz`), which you can verify with the command `identical(st_geometry(nz), st_geometry(nz_agg))`.
The result of the previous operation is illustrated in Figure 4.8, which shows the average value of features in `nz_height` within each of New Zealand's 16 regions.
The same result can also be generated by piping the output from `st_join()` into the 'tidy' functions `group_by()` and `summarize()`, as follows:

FIGURE 4.8: Average height of the top 101 high points across the regions of New Zealand.

The resulting `nz_agg` objects have the geometry of the aggregating object `nz` but with a new column summarizing the values of `x` in each region using the function `mean()`.
Other functions could be used instead of `mean()` here, including `median()`, `sd()` and other functions that return a single value per group.
Note: one difference between the `aggregate()` and `group_by() |> summarize()` approaches is that the former results in `NA` values for unmatching region names, while the latter preserves region names.
The 'tidy' approach is thus more flexible in terms of aggregating functions and the column names of the results.
Aggregating operations that also create new geometries are covered in Section 5.2.7.

```r
nz_agg = aggregate(x = nz_height, by = nz, FUN = mean)
nz_agg2 = st_join(x = nz, y = nz_height) |>
  group_by(Name) |>
  summarize(elevation = mean(elevation, na.rm = TRUE))
```

### 4.2.8 Joining incongruent layers

Spatial congruence is an important concept related to spatial aggregation.
An *aggregating object* (which we will refer to as `y`) is *congruent* with the target object (`x`) if the two objects have shared borders.
Often this is the case for administrative boundary data, whereby larger units, such as Middle Layer Super Output Areas (MSOAs) in the UK or districts in many other European countries, are composed of many
smaller units.

*Incongruent* aggregating objects, by contrast, do not share common borders with the target (Qiu, Zhang, and Zhou 2012).
This is problematic for spatial aggregation (and other spatial operations), as illustrated in Figure 4.9: aggregating the centroid of each sub-zone will not return accurate results.
*Areal interpolation* overcomes this issue by transferring values from one set of areal units to another, using a range of algorithms including simple area-weighted approaches and more sophisticated approaches such as 'pycnophylactic' methods (Waldo R. Tobler 1979).

FIGURE 4.9: Illustration of congruent (left) and incongruent (right) areal units with respect to larger aggregating zones (translucent red borders).

The **spData** package contains a dataset named `incongruent` (colored polygons with black borders in the right panel of Figure 4.9) and a dataset named `aggregating_zones` (the two polygons with the translucent blue border in the right panel of Figure 4.9).
Let us assume that the `value` column of `incongruent` refers to the total regional income in million Euros.
How can we transfer the values of the underlying nine spatial polygons into the two polygons of `aggregating_zones`?

The simplest useful method for this is *area-weighted* spatial interpolation, which transfers values from the `incongruent` object to a new column in `aggregating_zones` in proportion with the area of overlap: the larger the spatial intersection between input and output features, the larger the corresponding value.
This is implemented in `st_interpolate_aw()`, as demonstrated in the code chunk below.

In our case it is meaningful to sum up the values of the intersections falling into the aggregating zones, since total income is a so-called spatially extensive variable (which increases with area), assuming income is evenly distributed across the smaller zones (hence the warning message below).
This would be different for spatially intensive variables such as *average* income or percentages, which do not increase as the area increases.
`st_interpolate_aw()` works equally with spatially intensive variables: set the `extensive` parameter to `FALSE` and it will use an average rather than a sum function when doing the aggregation.

```r
iv = incongruent["value"] # keep only the values to be transferred
agg_aw = st_interpolate_aw(iv, aggregating_zones, extensive = TRUE)
#> Warning in st_interpolate_aw.sf(iv, aggregating_zones, extensive = TRUE):
#> st_interpolate_aw assumes attributes are constant or uniform over areas of x
agg_aw$value
#> [1] 19.6 25.7
```

## 4.3 Spatial operations on raster data

This section builds on Section 3.3, which highlights various basic methods for manipulating raster datasets, to demonstrate more advanced and explicitly spatial raster operations, and uses the objects `elev` and `grain` manually created in Section 3.3.
For the reader's convenience, these datasets can also be found in the **spData** package.

```r
elev = rast(system.file("raster/elev.tif", package = "spData"))
grain = rast(system.file("raster/grain.tif", package = "spData"))
```

### 4.3.1 Spatial subsetting

The previous chapter (Section 3.3) demonstrated how to retrieve values associated with specific cell IDs or row and column combinations.
Raster objects can also be extracted by location (coordinates) and other spatial objects.
To use coordinates for subsetting, one can 'translate' the coordinates into a cell ID with the **terra** function `cellFromXY()`.
An alternative is to use `terra::extract()` (be careful, there is also a function called `extract()` in the **tidyverse**) to extract values.
Both methods are demonstrated below to find the value of the cell that covers a point located at coordinates of 0.1, 0.1.

Raster objects can also be subset with another raster object, as demonstrated in the code chunk below:

This amounts to retrieving the values of the first raster object (in this case `elev`) that fall within the extent of a second raster (here: `clip`), as illustrated in Figure 4.10.

FIGURE 4.10: Original raster (left). Raster mask (middle).
Output of masking raster (right).

The example above returned the values of specific cells, but in many cases spatial outputs from subsetting operations on raster datasets are needed.
This can be done by setting the `drop` argument of the `[` operator to `FALSE`.
The code below returns the first two cells of `elev`, i.e., the first two cells of the top row, as a raster object (only the first 2 lines of the output are shown):

Another common use case of spatial subsetting is when a raster with `logical` (or `NA`) values is used to mask another raster with the same extent and resolution, as illustrated in Figure 4.10.
In that case, the `[` and `mask()` functions can be used (results not shown).

In the code chunk below, we create a mask object called `rmask` with values randomly assigned to `NA` and `TRUE`.
Next, we want to keep those values of `elev` which are `TRUE` in `rmask`.
In other words, we want to mask `elev` with `rmask`.

The above approach can also be used to replace some values (e.g., those expected to be wrong) with `NA`.

These operations are in fact Boolean local operations, since we compare two rasters cell-wise.
The next subsection explores these and related operations in more detail.

```r
id = cellFromXY(elev, xy = matrix(c(0.1, 0.1), ncol = 2))
elev[id]
# the same as
terra::extract(elev, matrix(c(0.1, 0.1), ncol = 2))
clip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,
            resolution = 0.3, vals = rep(1, 9))
elev[clip]
# we can also use extract
# terra::extract(elev, ext(clip))
elev[1:2, drop = FALSE] # spatial subsetting with cell IDs
#> class       : SpatRaster
#> dimensions  : 1, 2, 1 (nrow, ncol, nlyr)
#> ...
# create raster mask
rmask = elev
values(rmask) = sample(c(NA, TRUE), 36, replace = TRUE)
# spatial subsetting
elev[rmask, drop = FALSE] # with [ operator
# we can also use mask
# mask(elev, rmask)
elev[elev < 20] = NA
```

### 4.3.2 Map algebra

The term 'map algebra' was coined in the late 1970s to describe a "set of conventions, capabilities, and techniques" for the analysis of geographic raster and (although less prominently) vector data (Tomlin 1994).
In this context, we define map algebra more narrowly, as operations that modify or summarize raster cell values, with reference to surrounding cells, zones, or statistical functions that apply to every cell.
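A defining property of map algebra, expanded on below, is that operations act on each cell's value in place. This can be sketched with a plain base-R matrix standing in for a raster (an illustrative analogy, not **terra** code):

```r
# Base-R sketch: map algebra keeps one-to-one locational correspondence,
# so cell-wise '*' leaves each value at its cell, unlike matrix '%*%'
a = matrix(1:4, nrow = 2)
(a * a)[1, 2]    # 9 = 3 * 3, computed from that cell alone
(a %*% a)[1, 2]  # 15, mixes values from several cells
```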
algebra operations tend fast, raster datasets implicitly store coordinates, hence old adage “raster faster vector corrector”.\nlocation cells raster datasets can calculated using matrix position resolution origin dataset (stored header).\nprocessing, however, geographic position cell barely relevant long make sure cell position still processing.\nAdditionally, two raster datasets share extent, projection resolution, one treat matrices processing.way map algebra works terra package.\nFirst, headers raster datasets queried (cases map algebra operations work one dataset) checked ensure datasets compatible.\nSecond, map algebra retains -called one--one locational correspondence, meaning cells move.\ndiffers matrix algebra, values change position, example multiplying dividing matrices.Map algebra (cartographic modeling raster data) divides raster operations four subclasses (Tomlin 1990), working one several grids simultaneously:Local per-cell operationsFocal neighborhood operations.\noften output cell value result 3 x 3 input cell blockZonal operations similar focal operations, surrounding pixel grid new values computed can irregular sizes shapesGlobal per-raster operations.\nmeans output cell derives value potentially one several entire rastersThis typology classifies map algebra operations number cells used pixel processing step type output.\nsake completeness, mention raster operations can also classified discipline terrain, hydrological analysis, image classification.\nfollowing sections explain type map algebra operations can used, reference worked examples.","code":""},{"path":"spatial-operations.html","id":"local-operations","chapter":"4 Spatial data operations","heading":"4.3.3 Local operations","text":"\nLocal operations comprise cell--cell operations one several layers.\nincludes adding subtracting values raster, squaring multiplying rasters.\nRaster algebra also allows logical operations finding raster cells greater specific value (5 example ).\nterra package 
supports operations , demonstrated (Figure 4.11):\nFIGURE 4.11: Examples different local operations elev raster object: adding two rasters, squaring, applying logarithmic transformation, performing logical operation.\nAnother good example local operations classification intervals numeric values groups grouping digital elevation model low (class 1), middle (class 2) high elevations (class 3).\nUsing classify() command, need first construct reclassification matrix, first column corresponds lower second column upper end class.\nthird column represents new value specified ranges column one two., assign raster values ranges 0–12, 12–24 24–36 reclassified take values 1, 2 3, respectively.classify() function can also used want reduce number classes categorical rasters.\nperform several additional reclassifications Chapter 14.Apart applying arithmetic operators directly, one can also use app(), tapp() lapp() functions.\nefficient, hence, preferable presence large raster datasets.\nAdditionally, allow save output file directly.\napp() function applies function cell raster used summarize (e.g., calculating sum) values multiple layers one layer.\ntapp() extension app(), allowing us select subset layers (see index argument) want perform certain operation.\nFinally, lapp() function allows us apply function cell using layers arguments – application lapp() presented .calculation normalized difference vegetation index (NDVI) well-known local (pixel--pixel) raster operation.\nreturns raster values -1 1; positive values indicate presence living plants (mostly > 0.2).\nNDVI calculated red near-infrared (NIR) bands remotely sensed imagery, typically satellite systems Landsat Sentinel.\nVegetation absorbs light heavily visible light spectrum, especially red channel, reflecting NIR light. 
Here’s the NDVI formula:\\[\n\\begin{split}\nNDVI&= \\frac{\\text{NIR} - \\text{Red}}{\\text{NIR} + \\text{Red}}\\\\\n\\end{split}\n\\]Let’s calculate NDVI for the multispectral satellite file of Zion National Park.\nThe raster object has four satellite bands from the Landsat 8 satellite — blue, green, red, and near-infrared (NIR).\nImportantly, Landsat level-2 products are stored as integers to save disk space, and thus we need to convert them to floating-point numbers before doing any calculations.\nFor that purpose, we need to apply a scaling factor (0.0000275) and add an offset (-0.2) to the original values.\nThe proper values should now be in the range from 0 to 1, which is not the case here, probably due to the presence of clouds and other atmospheric effects, and thus we need to replace values below 0 with 0.\nThe next step is to implement the NDVI formula in an R function.\nThis function accepts two numerical arguments, nir and red, and returns a numerical vector with NDVI values.\nIt can be used as the fun argument of lapp().\nWe just need to remember that our function expects two bands (not four as in the original raster), and that they need to be in the NIR, red order.\nThat is why we subset the input raster with multi_rast[[c(4, 3)]] before doing any calculations.\nThe result, shown in the right panel of Figure 4.12, can be compared to the RGB image of the same area (left panel of the same Figure).\nIt allows us to see that the largest NDVI values are connected to northern areas of dense forest, while the lowest values are related to the lake in the north and to snowy mountain ridges.\nFIGURE 4.12: RGB image (left) and NDVI values (right) calculated for the example satellite file of Zion National Park\nPredictive mapping is another interesting application of local raster operations.\nThe response variable corresponds to measured or observed points in space, for example, species richness, the presence of landslides, tree disease or crop yield.\nConsequently, we can easily retrieve space- or airborne predictor variables from various rasters (elevation, pH, precipitation, temperature, land cover, soil class, etc.).\nSubsequently, we model our response as a function of our predictors using lm(), glm(), gam() or a machine-learning technique.\nSpatial predictions on raster objects can therefore be made by applying the estimated coefficients to the predictor raster values, and summing the output raster values (see Chapter 15).","code":"\nelev + elev\nelev^2\nlog(elev)\nelev > 5\nrcl = matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3,
byrow = TRUE)\nrcl\n#> [,1] [,2] [,3]\n#> [1,] 0 12 1\n#> [2,] 12 24 2\n#> [3,] 24 36 3\nrecl = classify(elev, rcl = rcl)\nmulti_raster_file = system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmulti_rast = rast(multi_raster_file)\nmulti_rast = (multi_rast * 0.0000275) - 0.2\nmulti_rast[multi_rast < 0] = 0\nndvi_fun = function(nir, red){\n (nir - red) / (nir + red)\n}\nndvi_rast = lapp(multi_rast[[c(4, 3)]], fun = ndvi_fun)"},{"path":"spatial-operations.html","id":"focal-operations","chapter":"4 Spatial data operations","heading":"4.3.4 Focal operations","text":"\nWhile local functions operate on one cell, though possibly from multiple layers, focal operations take into account a central (focal) cell and its neighbors.\nThe neighborhood (also named kernel, filter or moving window) under consideration is typically of size 3-by-3 cells (that is, the central cell and its eight surrounding neighbors), but can take on any other size or (not necessarily rectangular) shape as defined by the user.\nA focal operation applies an aggregation function to all cells within the specified neighborhood, uses the corresponding output as the new value for the central cell, and moves on to the next central cell (Figure 4.13).\nOther names for this operation are spatial filtering and convolution (Burrough, McDonnell, and Lloyd 2015).\nIn R, we can use the focal() function to perform spatial filtering.\nWe define the shape of the moving window with a matrix whose values correspond to weights (see the w parameter in the code chunk below).\nSecondly, the fun parameter lets us specify the function we wish to apply to this neighborhood.\nHere, we choose the minimum, but any other summary function, including sum(), mean(), or var() can be used.\nThe function also accepts additional arguments, for example, whether to remove NAs in the process (na.rm = TRUE) or not (na.rm = FALSE).\nFIGURE 4.13: Input raster (left) and resulting output raster (right) due to a focal operation - finding the minimum value in 3-by-3 moving windows.\nWe can quickly check if the output meets our expectations.\nIn our example, the minimum value has to be always the upper left corner of the moving window (remember we have created the input raster by row-wise incrementing the cell values by one starting at the upper left corner).\nIn this example, the weighting matrix consists only of 1s, meaning each cell has the same weight on the output, but this can be changed.Focal
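The rescaling and NDVI steps above are plain band arithmetic. The sketch below replays them on two made-up Landsat digital numbers: the band values are hypothetical, while the scaling factor, offset, and NDVI formula are those given in the text.

```r
# Landsat level-2 rescaling and NDVI as plain arithmetic; the digital
# numbers (DNs) below are invented for illustration
scale_l2 = function(dn) dn * 0.0000275 - 0.2            # integer DN -> reflectance
ndvi_fun = function(nir, red) (nir - red) / (nir + red) # NDVI as defined in the text
nir = scale_l2(30000)  # hypothetical NIR band DN -> 0.625
red = scale_l2(12000)  # hypothetical red band DN -> 0.13
ndvi = ndvi_fun(nir, red)
```

The result is about 0.66, above the rough 0.2 threshold the text associates with living plants.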
functions or filters play a dominant role in image processing.\nLow-pass or smoothing filters use the mean function to remove extremes.\nIn the case of categorical data, we can replace the mean with the mode, which is the most common value.\nBy contrast, high-pass filters accentuate features.\nThe line detection Laplace and Sobel filters might serve as an example here.\nCheck the focal() help page for how to use them in R (this is also used in the exercises at the end of this chapter).Terrain processing, the calculation of topographic characteristics such as slope, aspect and flow directions, relies on focal functions.\nterrain() can be used to calculate these metrics, although some terrain algorithms, including the Zevenbergen and Thorne method to compute slope, are not implemented in this terra function.\nMany other algorithms — including curvatures, contributing areas and wetness indices — are implemented in open source desktop geographic information system (GIS) software.\nChapter 10 shows how to access such GIS functionality from within R.","code":"\nr_focal = focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)"},{"path":"spatial-operations.html","id":"zonal-operations","chapter":"4 Spatial data operations","heading":"4.3.5 Zonal operations","text":"\nJust like focal operations, zonal operations apply an aggregation function to multiple raster cells.\nHowever, a second raster, usually with categorical values, defines the zonal filters (or ‘zones’) in the case of zonal operations, as opposed to a neighborhood window in the case of focal operations presented in the previous section.\nConsequently, raster cells defining the zonal filter do not necessarily have to be neighbors.\nOur grain size raster is a good example, as illustrated in the right panel of Figure 3.2: different grain sizes are spread irregularly throughout the raster.\nFinally, the result of a zonal operation is a summary table grouped by zone, which is why this operation is also known as zonal statistics in the GIS world.\nThis is in contrast to focal operations which return a raster object by default.\nThe following code chunk uses the zonal() function to calculate the mean elevation associated with each grain size class.\nThis returns the statistics for each category, here the mean altitude for each grain size class.\nNote that it is also possible to get a raster with calculated statistics for each zone by setting the as.raster argument to TRUE.","code":"\nz = zonal(elev, grain, fun = \"mean\")\nz\n#>
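The moving-window logic behind focal() can be sketched in base R. The naive loop below applies a 3-by-3 minimum filter to the book's row-wise 1:36 grid; edge cells are left NA here, which matches what terra returns by default when the window is incomplete and NAs are not removed. The function name focal_min is made up.

```r
# Naive base-R sketch of focal(elev, w = matrix(1, 3, 3), fun = min);
# border cells stay NA because their 3-by-3 window is incomplete
focal_min = function(m) {
  out = matrix(NA_real_, nrow(m), ncol(m))
  for (i in 2:(nrow(m) - 1)) {
    for (j in 2:(ncol(m) - 1)) {
      out[i, j] = min(m[(i - 1):(i + 1), (j - 1):(j + 1)])
    }
  }
  out
}
elev_m = matrix(1:36, nrow = 6, byrow = TRUE)  # the book's row-wise 1:36 grid
r_min = focal_min(elev_m)
r_min[2, 2]  # 1: the minimum sits in the window's upper-left corner
```

This also confirms the check described in the text: because the input increments row-wise, each output value equals the cell one row up and one column left of the focal cell.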
grain elev\n#> 1 clay 14.8\n#> 2 silt 21.2\n#> 3 sand 18.7"},{"path":"spatial-operations.html","id":"global-operations-and-distances","chapter":"4 Spatial data operations","heading":"4.3.6 Global operations and distances","text":"Global operations special case zonal operations entire raster dataset representing single zone.\ncommon global operations descriptive statistics entire raster dataset minimum maximum – already discussed Section 3.3.2.Aside , global operations also useful computation distance weight rasters.\nfirst case, one can calculate distance cell specific target cell.\nexample, one might want compute distance nearest coast (see also terra::distance()).\nmight also want consider topography, means, interested pure distance like also avoid crossing mountain ranges going coast.\n, can weight distance elevation additional altitudinal meter ‘prolongs’ Euclidean distance (Exercises 8 9 end chapter exactly ).\nVisibility viewshed computations also belong family global operations (exercises Chapter 10, compute viewshed raster).","code":""},{"path":"spatial-operations.html","id":"map-algebra-counterparts-in-vector-processing","chapter":"4 Spatial data operations","heading":"4.3.7 Map algebra counterparts in vector processing","text":"Many map algebra operations counterpart vector processing (Liu Mason 2009).\nComputing distance raster (global operation) considering maximum distance (logical focal operation) equivalent vector buffer operation (Section 5.2.5).\nReclassifying raster data (either local zonal function depending input) equivalent dissolving vector data (Section 4.2.5).\nOverlaying two rasters (local operation), one contains NULL NA values representing mask, similar vector clipping (Section 5.2.5).\nQuite similar spatial clipping intersecting two layers (Section 4.2.1).\ndifference two layers (vector raster) simply share overlapping area (see Figure 5.8 example).\nHowever, careful wording.\nSometimes words slightly different meanings raster vector 
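Conceptually, the zonal operation above is a grouped aggregation, which base R expresses with tapply(). A minimal sketch, using the elev values 1:36 and a made-up, deterministic grain layout (the chapter's real grain raster assigns classes randomly):

```r
# Zonal statistics as grouped aggregation: group one vector of cell
# values by the zones defined in another, as zonal(elev, grain, "mean") does
elev_vals  = 1:36                                   # the elev cell values
grain_vals = rep(c("clay", "silt", "sand"), each = 12)  # invented zone layout
z_means = tapply(elev_vals, grain_vals, mean)
```

The zones need not be contiguous: tapply(), like zonal(), only cares about each cell's class, not its position.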
data models.\naggregating polygon geometries means dissolving boundaries, raster data geometries means increasing cell sizes thereby reducing spatial resolution.\nZonal operations dissolve cells one raster accordance zones (categories) another raster dataset using aggregating function.","code":""},{"path":"spatial-operations.html","id":"merging-rasters","chapter":"4 Spatial data operations","heading":"4.3.8 Merging rasters","text":"\nSuppose like compute NDVI (see Section 4.3.3), additionally want compute terrain attributes elevation data observations within study area.\ncomputations rely remotely sensed information.\ncorresponding imagery often divided scenes covering specific spatial extent, frequently, study area covers one scene.\n, need merge scenes covered study area.\neasiest case, can just merge scenes, put side side.\npossible, example, digital elevation data.\nfollowing code chunk first download SRTM elevation data Austria Switzerland (country codes, see geodata function country_codes()).\nsecond step, merge two rasters one.terra’s merge() command combines two images, case overlap, uses value first raster.merging approach little use overlapping values correspond .\nfrequently case want combine spectral imagery scenes taken different dates.\nmerge() command still work see clear border resulting image.\nhand, mosaic() command lets define function overlapping area.\ninstance, compute mean value – might smooth clear border merged result likely make disappear.\ndetailed introduction remote sensing R, see Wegmann, Leutner, Dech (2016).","code":"\naut = geodata::elevation_30s(country = \"AUT\", path = tempdir())\nch = geodata::elevation_30s(country = \"CHE\", path = tempdir())\naut_ch = merge(aut, ch)"},{"path":"spatial-operations.html","id":"exercises-2","chapter":"4 Spatial data operations","heading":"4.4 Exercises","text":"E1. 
established Section 4.2 Canterbury region New Zealand containing 101 highest points country.\nmany high points Canterbury region contain?Bonus: plot result using plot() function show New Zealand, canterbury region highlighted yellow, high points Canterbury represented red crosses (hint: pch = 7) high points parts New Zealand represented blue circles. See help page ?points details illustration different pch values.E2. region second highest number nz_height points, many ?E3. Generalizing question regions: many New Zealand’s 16 regions contain points belong top 101 highest points country? regions?Bonus: create table listing regions order number points name.E4. Test knowledge spatial predicates finding plotting US states relate spatial objects.starting point exercise create object representing Colorado state USA. command\ncolorado = us_states[us_states$NAME == \"Colorado\",] (base R) filter() function (tidyverse) plot resulting object context US states.Create new object representing states geographically intersect Colorado plot result (hint: concise way subsetting method [).Create another object representing objects touch (shared boundary ) Colorado plot result (hint: remember can use argument op = st_intersects spatial relations spatial subsetting operations base R).Bonus: create straight line centroid District Columbia near East coast centroid California near West coast USA (hint: functions st_centroid(), st_union() st_cast() described Chapter 5 may help) identify states long East-West line crosses.E5. Use dem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\")), reclassify elevation three classes: low (<300), medium high (>500).\nSecondly, read NDVI raster (ndvi = rast(system.file(\"raster/ndvi.tif\", package = \"spDataLarge\"))) compute mean NDVI mean elevation altitudinal class.E6. Apply line detection filter rast(system.file(\"ex/logo.tif\", package = \"terra\")).\nPlot result.\nHint: Read ?terra::focal().E7. 
Calculate Normalized Difference Water Index (NDWI; (green - nir)/(green + nir)) Landsat image.\nUse Landsat image provided spDataLarge package (system.file(\"raster/landsat.tif\", package = \"spDataLarge\")).\nAlso, calculate correlation NDVI NDWI area (hint: can use layerCor() function).E8. StackOverflow post (stackoverflow.com/questions/35555709) shows compute distances nearest coastline using raster::distance().\nTry something similar terra::distance(): retrieve digital elevation model Spain, compute raster represents distances coast across country (hint: use geodata::elevation_30s()).\nConvert resulting distances meters kilometers.\nNote: may wise increase cell size input raster reduce compute time operation (aggregate()).E9. Try modify approach used exercise weighting distance raster elevation raster; every 100 altitudinal meters increase distance coast 10 km.\nNext, compute visualize difference raster created using Euclidean distance (E7) raster weighted elevation.","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(spData)"},{"path":"geometry-operations.html","id":"geometry-operations","chapter":"5 Geometry operations","heading":"5 Geometry operations","text":"","code":""},{"path":"geometry-operations.html","id":"prerequisites-3","chapter":"5 Geometry operations","heading":"Prerequisites","text":"chapter uses packages Chapter 4 addition spDataLarge, installed Chapter 2:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)"},{"path":"geometry-operations.html","id":"introduction-2","chapter":"5 Geometry operations","heading":"5.1 Introduction","text":"far book explained structure geographic datasets (Chapter 2), manipulate based non-geographic attributes (Chapter 3) spatial relations (Chapter 4).\nchapter focuses manipulating geographic elements spatial objects, example creating buffers, simplifying converting vector geometries, aggregating resampling raster data.\nreading — attempting exercises end — understand control 
geometry column sf objects extent geographic location pixels represented rasters relation geographic objects.Section 5.2 covers transforming vector geometries ‘unary’ ‘binary’ operations.\nUnary operations work single geometry isolation, including simplification (lines polygons), creation buffers centroids, shifting/scaling/rotating single geometries using ‘affine transformations’ (Sections 5.2.1 5.2.4).\nBinary transformations modify one geometry based shape another, including clipping geometry unions, covered Sections 5.2.5 5.2.7.\nType transformations (polygon line, example) demonstrated Section 5.2.8.Section 5.3 covers geometric transformations raster objects.\ninvolves changing size number underlying pixels, assigning new values.\nteaches change resolution (also called raster aggregation disaggregation), extent origin raster.\noperations especially useful one like align raster datasets diverse sources.\nAligned raster objects share one--one correspondence pixels, allowing processed using map algebra operations, described Section 4.3.2.interaction raster vector objects covered Chapter 6.\npresents raster values can ‘masked’ ‘extracted’ vector geometries.\nImportantly also shows ‘polygonize’ rasters ‘rasterize’ vector datasets, making two data models interchangeable.","code":""},{"path":"geometry-operations.html","id":"geo-vec","chapter":"5 Geometry operations","heading":"5.2 Geometric operations on vector data","text":"section operations way change geometry vector (sf) objects.\nadvanced spatial data operations presented previous chapter (Section 4.2), drill geometry:\nfunctions discussed section work objects class sfc addition objects class sf.","code":""},{"path":"geometry-operations.html","id":"simplification","chapter":"5 Geometry operations","heading":"5.2.1 Simplification","text":"\nSimplification process generalization vector objects (lines polygons) usually use smaller scale maps.\nAnother reason simplifying objects reduce amount memory, disk space 
network bandwidth consume:\nmay wise simplify complex geometries publishing interactive maps.\nsf package provides st_simplify(), uses Douglas-Peucker algorithm reduce vertex count.\nst_simplify() uses dTolerance control level generalization map units (see Douglas Peucker 1973 details).\nFigure 5.1 illustrates simplification LINESTRING geometry representing river Seine tributaries.\nsimplified geometry created following command:\nFIGURE 5.1: Comparison original simplified geometry seine object.\nresulting seine_simp object copy original seine fewer vertices.\napparent, result visually simpler (Figure 5.1, right) consuming less memory original object, verified :\nSimplification also applicable polygons.\nillustrated using us_states, representing contiguous United States.limitation st_simplify() simplifies objects per-geometry basis.\nmeans ‘topology’ lost, resulting overlapping ‘holey’ areal units illustrated Figure 5.2 (right top panel).\nms_simplify() rmapshaper provides alternative.\ndefault uses Visvalingam algorithm, overcomes limitations Douglas-Peucker algorithm (Visvalingam Whyatt 1993).\n\nfollowing code chunk uses function simplify us_states.\nresult 1% vertices input (set using argument keep) number objects remains intact set keep_shapes = TRUE:22\nalternative process simplification smoothing boundaries polygon linestring geometries, implemented smoothr package.\nSmoothing interpolates edges geometries necessarily lead fewer vertices, can especially useful working geometries arise spatially vectorizing raster (topic covered Chapter 6).\nsmoothr implements three techniques smoothing: Gaussian kernel regression, Chaikin’s corner cutting algorithm, spline interpolation, described package vignette website.\nNote similar st_simplify(), smoothing algorithms don’t preserve ‘topology’.\nworkhorse function smoothr smooth(), method argument specifies smoothing technique use.\nexample using Gaussian kernel regression smooth borders US states using 
method=ksmooth.\nsmoothness argument controls bandwidth Gaussian used smooth geometry default value 1.Finally, visual comparison original dataset simplified smoothed versions shown Figure 5.2.\nDifferences can observed outputs Douglas-Peucker (st_simplify), Visvalingam (ms_simplify), Gaussian kernel regression (smooth(method=ksmooth) algorithms.\nFIGURE 5.2: Polygon simplification action, comparing original geometry contiguous United States simplified versions, generated functions sf (top-right), rmapshaper (bottom-left), smoothr (bottom-right) packages.\n","code":"\nseine_simp = st_simplify(seine, dTolerance = 2000) # 2000 m\nobject.size(seine)\n#> 18096 bytes\nobject.size(seine_simp)\n#> 9112 bytes\nus_states_simp1 = st_simplify(us_states, dTolerance = 100000) # 100 km\n# proportion of points to retain (0-1; default 0.05)\nus_states_simp2 = rmapshaper::ms_simplify(us_states, keep = 0.01,\n keep_shapes = TRUE)\nus_states_simp3 = smoothr::smooth(us_states, method = \"ksmooth\", smoothness = 6)"},{"path":"geometry-operations.html","id":"centroids","chapter":"5 Geometry operations","heading":"5.2.2 Centroids","text":"\nCentroid operations identify center geographic objects.\nLike statistical measures central tendency (including mean median definitions ‘average’), many ways define geographic center object.\ncreate single point representations complex vector objects.commonly used centroid operation geographic centroid.\ntype centroid operation (often referred ‘centroid’) represents center mass spatial object (think balancing plate finger).\nGeographic centroids many uses, example create simple point representation complex geometries, estimate distances polygons.\ncan calculated sf function st_centroid() demonstrated code , generates geographic centroids regions New Zealand tributaries River Seine, illustrated black points Figure 5.3.Sometimes geographic centroid falls outside boundaries parent objects (think doughnut).\ncases point surface operations can used guarantee 
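For intuition, here is a minimal base-R sketch of the Douglas-Peucker idea that st_simplify() relies on: a vertex is kept only if it lies farther than a tolerance from the chord between the segment endpoints, recursing on the farthest vertex. It is a sketch for open lines only (a closed ring would need to be split first), and dp_simplify is a made-up name.

```r
# Minimal Douglas-Peucker sketch for an open line given as a 2-column
# coordinate matrix; tol plays the role of st_simplify()'s dTolerance
dp_simplify = function(pts, tol) {
  n = nrow(pts)
  if (n < 3) return(pts)
  a = pts[1, ]; b = pts[n, ]
  ab = b - a
  len = sqrt(sum(ab^2))  # chord length (assumed > 0 for an open line)
  # perpendicular distance of every point to the chord a-b
  d = abs(ab[2] * (pts[, 1] - a[1]) - ab[1] * (pts[, 2] - a[2])) / len
  k = which.max(d)
  if (d[k] <= tol) return(pts[c(1, n), , drop = FALSE])  # drop all interior points
  left  = dp_simplify(pts[1:k, , drop = FALSE], tol)     # recurse on both halves
  right = dp_simplify(pts[k:n, , drop = FALSE], tol)
  rbind(left[-nrow(left), , drop = FALSE], right)        # k appears once
}
wiggly = cbind(0:4, c(0, 0.01, 0, 0.01, 0))  # nearly straight: collapses to ends
corner = cbind(c(0, 1, 2), c(0, 1, 0))       # sharp corner: all kept
```

With tol = 0.1, the wiggly line reduces to its two endpoints, while the corner vertex, 1 unit from its chord, survives.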
point parent object (e.g., labeling irregular multipolygon objects island states), illustrated red points Figure 5.3.\nNotice red points always lie parent objects.\ncreated st_point_on_surface() follows:23\nFIGURE 5.3: Centroids (black points) ‘points surface’ (red points) New Zealand’s regions (left) Seine (right) datasets.\ntypes centroids exist, including Chebyshev center visual center.\nexplore possible calculate using R, ’ll see Chapter 11.","code":"\nnz_centroid = st_centroid(nz)\nseine_centroid = st_centroid(seine)\nnz_pos = st_point_on_surface(nz)\nseine_pos = st_point_on_surface(seine)"},{"path":"geometry-operations.html","id":"buffers","chapter":"5 Geometry operations","heading":"5.2.3 Buffers","text":"\nBuffers polygons representing area within given distance geometric feature:\nregardless whether input point, line polygon, output polygon.\nUnlike simplification (often used visualization reducing file size) buffering tends used geographic data analysis.\nmany points within given distance line?\ndemographic groups within travel distance new shop?\nkinds questions can answered visualized creating buffers around geographic entities interest.Figure 5.4 illustrates buffers different sizes (5 50 km) surrounding river Seine tributaries.\nbuffers created commands , show command st_buffer() requires least two arguments: input geometry distance, provided units CRS (case meters).\nFIGURE 5.4: Buffers around Seine dataset 5 km (left) 50 km (right). 
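The geographic centroid of a single polygon ring can be computed with the 'shoelace' formula, which is essentially what st_centroid() evaluates per ring. A base-R sketch on a made-up 2 x 2 square (poly_centroid is a hypothetical helper; real multi-part geometries need area-weighted combination of their rings):

```r
# Shoelace centroid of one closed ring given as x and y coordinate
# vectors (first vertex repeated at the end)
poly_centroid = function(x, y) {
  n = length(x)
  i = 1:(n - 1)
  cross = x[i] * y[i + 1] - x[i + 1] * y[i]  # signed cross products per edge
  area = sum(cross) / 2                      # signed ring area
  cx = sum((x[i] + x[i + 1]) * cross) / (6 * area)
  cy = sum((y[i] + y[i + 1]) * cross) / (6 * area)
  c(cx, cy)
}
ctr = poly_centroid(c(0, 2, 2, 0, 0), c(0, 0, 2, 2, 0))  # center of a 2 x 2 square
```

For a doughnut-shaped polygon this center of mass can fall in the hole, which is exactly why st_point_on_surface() exists.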
Note colors, reflect fact one buffer created per geometry feature.\nst_buffer() additional arguments.\nimportant ones :nQuadSegs (GEOS engine used), means ‘number segments per quadrant’ set default 30 (meaning circles created buffers composed \\(4 \\times 30 = 120\\) lines).\nUnusual cases may useful include memory consumed output buffer operation major concern (case reduced) high precision needed (case increased)max_cells (S2 engine used), larger value, smooth buffer , calculations take longerendCapStyle joinStyle (GEOS engine used), control appearance buffer’s edgessingleSide (GEOS engine used), controls whether buffer created one sides input geometry","code":"\nseine_buff_5km = st_buffer(seine, dist = 5000)\nseine_buff_50km = st_buffer(seine, dist = 50000)"},{"path":"geometry-operations.html","id":"affine-transformations","chapter":"5 Geometry operations","heading":"5.2.4 Affine transformations","text":"\nAffine transformation transformation preserves lines parallelism.\nHowever, angles length necessarily preserved.\nAffine transformations include, among others, shifting (translation), scaling rotation.\nAdditionally, possible use combination .\nAffine transformations essential part geocomputation.\nexample, shifting needed labels placement, scaling used non-contiguous area cartograms (see Section 9.6), many affine transformations applied reprojecting improving geometry created based distorted wrongly projected map.\nsf package implements affine transformation objects classes sfg sfc.Shifting moves every point distance map units.\ndone adding numerical vector vector object.\nexample, code shifts y-coordinates 100,000 meters north, leaves x-coordinates untouched (Figure 5.5, left panel).Scaling enlarges shrinks objects factor.\ncan applied either globally locally.\nGlobal scaling increases decreases coordinates values relation origin coordinates, keeping geometries topological relations intact.\ncan done subtraction multiplication sfg sfc object.Local scaling 
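Under the hood, a point buffer is a polygon approximating a circle with 4 * nQuadSegs vertices per full turn. The base-R sketch below builds such a ring directly; point_buffer is a made-up helper, whereas st_buffer() constructs the equivalent geometry via GEOS.

```r
# Approximate a circular buffer around a point as a closed ring of
# 4 * nQuadSegs segments (30 per quadrant, mirroring the GEOS default)
point_buffer = function(x, y, dist, nQuadSegs = 30) {
  theta = seq(0, 2 * pi, length.out = 4 * nQuadSegs + 1)  # closing point repeated
  cbind(x + dist * cos(theta), y + dist * sin(theta))
}
ring = point_buffer(0, 0, dist = 1)
nrow(ring)  # 121 vertices: 120 segments plus the repeated closing point
```

Raising nQuadSegs smooths the ring at the cost of more vertices, which is the memory/precision trade-off the text describes.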
treats geometries independently requires points around geometries going scaled, e.g., centroids.\nexample , geometry shrunk factor two around centroids (Figure 5.5, middle panel).\nachieve , object firstly shifted way center coordinates 0, 0 ((nz_sfc - nz_centroid_sfc)).\nNext, sizes geometries reduced half (* 0.5).\nFinally, object’s centroid moved back input data coordinates (+ nz_centroid_sfc).Rotation two-dimensional coordinates requires rotation matrix:\\[\nR =\n\\begin{bmatrix}\n\\cos \\theta & -\\sin \\theta \\\\ \n\\sin \\theta & \\cos \\theta \\\\\n\\end{bmatrix}\n\\]rotates points clockwise direction.\nrotation matrix can implemented R :rotation function accepts one argument - rotation angle degrees.\nRotation done around selected points, centroids (Figure 5.5, right panel).\nSee vignette(\"sf3\") examples.\nFIGURE 5.5: Illustrations affine transformations: shift, scale rotate.\nFinally, newly created geometries can replace old ones st_set_geometry() function:","code":"\nnz_sfc = st_geometry(nz)\nnz_shift = nz_sfc + c(0, 100000)\nnz_centroid_sfc = st_centroid(nz_sfc)\nnz_scale = (nz_sfc - nz_centroid_sfc) * 0.5 + nz_centroid_sfc\nrotation = function(a){\n r = a * pi / 180 #degrees to radians\n matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)\n} \nnz_rotate = (nz_sfc - nz_centroid_sfc) * rotation(30) + nz_centroid_sfc\nnz_scale_sf = st_set_geometry(nz, nz_scale)"},{"path":"geometry-operations.html","id":"clipping","chapter":"5 Geometry operations","heading":"5.2.5 Clipping","text":"\nSpatial clipping form spatial subsetting involves changes geometry columns least affected features.Clipping can apply features complex points:\nlines, polygons ‘multi’ equivalents.\nillustrate concept start simple example:\ntwo overlapping circles center point one unit away radius one (Figure 5.6).\nFIGURE 5.6: Overlapping circles.\nImagine want select one circle , space covered x y.\ncan done using function st_intersection(), illustrated using objects named x y 
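The rotation matrix above can be exercised on a bare coordinate matrix, independent of sf: the same matrix multiplication is applied to every vertex of an sfc object. The two test points below are made up.

```r
# The text's rotation matrix applied to plain coordinates; rotation is
# clockwise, as stated in the text
rotation = function(a) {
  r = a * pi / 180  # degrees to radians
  matrix(c(cos(r), sin(r), -sin(r), cos(r)), nrow = 2, ncol = 2)
}
pts = rbind(c(1, 0), c(0, 1))        # two points on the unit axes
rotated = round(pts %*% rotation(90), 10)  # 90-degree clockwise rotation
```

The point (1, 0) maps to (0, -1) and (0, 1) maps to (1, 0), confirming the clockwise direction.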
represent left- right-hand circles (Figure 5.7).\nFIGURE 5.7: Overlapping circles gray color indicating intersection .\nsubsequent code chunk demonstrates works combinations ‘Venn’ diagram representing x y, inspired Figure 5.1 book R Data Science (Grolemund Wickham 2016).\nFIGURE 5.8: Spatial equivalents logical operators.\n","code":"\nb = st_sfc(st_point(c(0, 1)), st_point(c(1, 1))) # create 2 points\nb = st_buffer(b, dist = 1) # convert points to circles\nplot(b, border = \"grey\")\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\"), cex = 3) # add text\nx = b[1]\ny = b[2]\nx_and_y = st_intersection(x, y)\nplot(b, border = \"grey\")\nplot(x_and_y, col = \"lightgrey\", border = \"grey\", add = TRUE) # intersecting area"},{"path":"geometry-operations.html","id":"subsetting-and-clipping","chapter":"5 Geometry operations","heading":"5.2.6 Subsetting and clipping","text":"\nClipping objects can change geometry can also subset objects, returning features intersect (partly intersect) clipping/subsetting object.\nillustrate point, subset points cover bounding box circles x y Figure 5.8.\npoints inside just one circle, inside inside neither.\nst_sample() used generate simple random distribution points within extent circles x y, resulting output illustrated Figure 5.9, raising question: subset points return point intersects x y?\nFIGURE 5.9: Randomly distributed points within bounding box enclosing circles x y. 
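Because both buffers are unit circles centered at (0, 1) and (1, 1), deciding whether a point falls inside both x and y reduces to distance arithmetic, the kind of computation the spatial predicates perform in general form via GEOS. A base-R sketch with made-up sample points (the chapter's p is random):

```r
# A point lies in a circle when its squared distance to the center is at
# most the squared radius; circle centers match the x and y buffers
in_circle = function(px, py, cx, cy, r = 1) (px - cx)^2 + (py - cy)^2 <= r^2
px = c(0.5, -0.9, 2.0)               # three invented sample points
py = c(1.0,  1.0, 1.0)
sel = in_circle(px, py, 0, 1) & in_circle(px, py, 1, 1)  # inside x AND y
```

Only the first point lies in the lens-shaped intersection, mirroring the logical-AND combination of st_intersects() results.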
point intersects objects x y highlighted.\ncode chunk demonstrates three ways achieve result.\ncan use intersection x y (represented x_and_y previous code chunk) subsetting object directly, shown first line code chunk .\ncan also find intersection input points represented p subsetting/clipping object x_and_y, demonstrated second line code chunk .\nsecond approach return features partly intersect x_and_y modified geometries spatially extensive features cross border subsetting object.\nthird approach create subsetting object using binary spatial predicate st_intersects(), introduced previous chapter.\nresults identical (except superficial differences attribute names), implementation differs substantially:Although example rather contrived provided educational rather applied purposes, encourage reader reproduce results deepen understanding handling geographic vector objects R, raises important question: implementation use?\nGenerally, concise implementations favored, meaning first approach .\nreturn question choosing different implementations technique algorithm Chapter 11.","code":"\nbb = st_bbox(st_union(x, y))\nbox = st_as_sfc(bb)\nset.seed(2024)\np = st_sample(x = box, size = 10)\np_xy1 = p[x_and_y]\nplot(box, border = \"grey\", lty = 2)\nplot(x, add = TRUE, border = \"grey\")\nplot(y, add = TRUE, border = \"grey\")\nplot(p, add = TRUE, cex = 3.5)\nplot(p_xy1, cex = 5, col = \"red\", add = TRUE)\ntext(x = c(-0.5, 1.5), y = 1, labels = c(\"x\", \"y\"), cex = 3)\nbb = st_bbox(st_union(x, y))\nbox = st_as_sfc(bb)\nset.seed(2024)\np = st_sample(x = box, size = 10)\nx_and_y = st_intersection(x, y)\n# way #1\np_xy1 = p[x_and_y]\n# way #2\np_xy2 = st_intersection(p, x_and_y)\n# way #3\nsel_p_xy = st_intersects(p, x, sparse = FALSE)[, 1] & \n st_intersects(p, y, sparse = FALSE)[, 1]\np_xy3 = p[sel_p_xy]"},{"path":"geometry-operations.html","id":"geometry-unions","chapter":"5 Geometry operations","heading":"5.2.7 Geometry unions","text":"\nsaw Section 3.2.3, spatial 
aggregation can silently dissolve the geometries of touching polygons in the same group.\nThis is demonstrated in the code chunk below, in which the 48 US states and the District of Columbia (us_states) are aggregated into four regions using base and dplyr functions (see the results in Figure 5.10):\nFIGURE 5.10: Spatial aggregation on contiguous polygons, illustrated by aggregating the population of US states into regions, with population represented by color. Note the operation automatically dissolves boundaries between states.\nWhat is going on in terms of the geometries?\nBehind the scenes, both aggregate() and summarize() combine the geometries and dissolve the boundaries between them using st_union().\nThis is demonstrated in the code chunk below, which creates a united western US. The function can also take two geometries and unite them, as demonstrated in the code chunk below, which creates a united western block incorporating Texas (challenge: reproduce and plot the result):","code":"\nregions = aggregate(x = us_states[, \"total_pop_15\"], by = list(us_states$REGION),\n FUN = sum, na.rm = TRUE)\nregions2 = us_states |> \n group_by(REGION) |>\n summarize(pop = sum(total_pop_15, na.rm = TRUE))\nus_west = us_states[us_states$REGION == \"West\", ]\nus_west_union = st_union(us_west)\ntexas = us_states[us_states$NAME == \"Texas\", ]\ntexas_union = st_union(us_west_union, texas)"},{"path":"geometry-operations.html","id":"type-trans","chapter":"5 Geometry operations","heading":"5.2.8 Type transformations","text":"\nGeometry casting is a powerful operation that enables the transformation of the geometry type.\nIt is implemented in the st_cast() function from the sf package.\nImportantly, st_cast() behaves differently on single simple feature geometry (sfg) objects, simple feature geometry column (sfc) objects, and simple features objects.\nLet’s create a multipoint to illustrate how geometry casting works on simple feature geometry (sfg) objects. In this case, st_cast() can be useful to transform the new object into a linestring or a polygon (Figure 5.11).\nFIGURE 5.11: Examples of a linestring and a polygon cast from a multipoint geometry.\nConversion from multipoint to linestring is a common operation that creates a line object from ordered point observations, such as GPS measurements or geotagged media.\nThis, in turn, allows us to perform spatial operations such as calculating the length of the path 
traveled.\nConversion from multipoint to linestring or polygon is often used to calculate an area, for example from a set of GPS measurements taken around a lake or the corners of a building lot.\nThe transformation process can also be reversed using st_cast(). Geometry casting on simple features geometry column (sfc) and simple features objects works the same as for sfg objects in most cases.\nOne important difference is the conversion between multi-types and non-multi-types.\nAs a result of this process, multi-objects of sfc or sf are split into many non-multi-objects.\nLet’s say we have the following sf objects: POI - POINT type (with one point by definition); MPOI - MULTIPOINT type with four points; LIN - LINESTRING type with one linestring containing five points; MLIN - MULTILINESTRING type with two linestrings (one with five points and one with two points); POL - POLYGON type with one polygon (created using five points); MPOL - MULTIPOLYGON type consisting of two polygons (each consisting of five points); GC - GEOMETRYCOLLECTION type with two geometries, a MULTIPOINT (four points) and a LINESTRING (five points).\nTable 5.1 shows possible geometry type transformations on the simple feature objects listed above.\nSingle simple feature geometries (represented by the first column in the table) can be transformed into multiple geometry types, represented by the columns of Table 5.1.\nSome transformations are not possible: you cannot convert a single point into a multilinestring or a polygon, for example, which explains why the cells [1, 4:5] of the table contain NA.\nSome transformations split single features from the input into multiple sub-features, ‘expanding’ sf objects (adding new rows with duplicate attribute values).\nWhen a multipoint geometry consisting of five pairs of coordinates is transformed into a ‘POINT’ geometry, for example, the output will contain five features.\nTABLE 5.1: Geometry casting on simple feature geometries (see Section 2.1), with input type by row and output type by column.\nNote: Values like (1) represent the number of features; NA means the operation is not possible.\nLet’s try to apply geometry type transformations on a new object, multilinestring_sf, as an example (on the left in Figure 5.12). You can imagine it as a road or river network.\nThe new object has only one row that defines all the lines.\nThis restricts the number of operations that can be done, for example it prevents adding names to each line segment or calculating lengths of single lines.\nThe st_cast() function can be used in this situation, as it separates one multilinestring into three linestrings.\nFIGURE 5.12: Examples of type casting between MULTILINESTRING (left) and LINESTRING (right).\nThe newly created object allows for attribute creation (see Section 3.2.5) and length measurements:","code":"\nmultipoint = st_multipoint(matrix(c(1, 3, 5, 1, 3, 1), ncol = 2))\nlinestring = st_cast(multipoint, \"LINESTRING\")\npolyg = st_cast(multipoint, \"POLYGON\")\nmultipoint_2 = st_cast(linestring, \"MULTIPOINT\")\nmultipoint_3 = st_cast(polyg, \"MULTIPOINT\")\nall.equal(multipoint, multipoint_2)\n#> [1] TRUE\nall.equal(multipoint, multipoint_3)\n#> [1] TRUE\nmultilinestring_list = list(matrix(c(1, 4, 5, 3), ncol = 2), \n matrix(c(4, 4, 4, 1), ncol = 2),\n matrix(c(2, 4, 2, 2), ncol = 2))\nmultilinestring = st_multilinestring(multilinestring_list)\nmultilinestring_sf = st_sf(geom = st_sfc(multilinestring))\nmultilinestring_sf\n#> Simple feature collection with 1 feature and 0 fields\n#> Geometry type: MULTILINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 MULTILINESTRING ((1 5, 4 3)...\nlinestring_sf2 = st_cast(multilinestring_sf, \"LINESTRING\")\nlinestring_sf2\n#> Simple feature collection with 3 features and 0 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom\n#> 1 LINESTRING (1 5, 4 3)\n#> 2 LINESTRING (4 4, 4 1)\n#> 3 LINESTRING (2 2, 4 2)\nlinestring_sf2$name = c(\"Riddle Rd\", \"Marshall Ave\", \"Foulke St\")\nlinestring_sf2$length = st_length(linestring_sf2)\nlinestring_sf2\n#> Simple feature collection with 3 features and 2 fields\n#> Geometry type: LINESTRING\n#> Dimension: XY\n#> Bounding box: xmin: 1 ymin: 1 xmax: 4 ymax: 5\n#> CRS: NA\n#> geom name length\n#> 1 LINESTRING (1 5, 4 3) Riddle Rd 3.61\n#> 2 LINESTRING (4 4, 4 1) Marshall Ave 3.00\n#> 3 LINESTRING (2 2, 4 2) Foulke St 2.00"},{"path":"geometry-operations.html","id":"geo-ras","chapter":"5 Geometry 
operations","heading":"5.3 Geometric operations on raster data","text":"\nGeometric raster operations include shift, flipping, mirroring, scaling, rotation warping images.\noperations necessary variety applications including georeferencing, used allow images overlaid accurate map known CRS (Liu Mason 2009).\nvariety georeferencing techniques exist, including:Georectification based known ground control pointsOrthorectification, also accounts local topographyImage registration used combine images thing shot different sensors aligning one image another (terms coordinate system resolution)R rather unsuitable first two points since often require manual intervention usually done help dedicated GIS software (see also Chapter 10).\nhand, aligning several images possible R section shows among others .\noften includes changing extent, resolution origin image.\nmatching projection course also required already covered Section 7.8.case, reasons perform geometric operation single raster image.\ninstance, Chapter 14 define metropolitan areas Germany 20 km2 pixels 500,000 inhabitants.\noriginal inhabitant raster, however, resolution 1 km2 decrease (aggregate) resolution factor 20 (see Section 14.5).\nAnother reason aggregating raster simply decrease run-time save disk space.\ncourse, approach recommended task hand allows coarser resolution raster data.","code":""},{"path":"geometry-operations.html","id":"geometric-intersections","chapter":"5 Geometry operations","heading":"5.3.1 Geometric intersections","text":"\nSection 4.3.1 shown extract values raster overlaid spatial objects.\nretrieve spatial output, can use almost subsetting syntax.\ndifference make clear like keep matrix structure setting drop argument FALSE.\nreturn raster object containing cells whose midpoints overlap clip.operation can also use intersect() crop() command.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nclip = rast(xmin = 0.9, xmax = 1.8, ymin = -0.45, ymax = 0.45,\n 
resolution = 0.3, vals = rep(1, 9))\nelev[clip, drop = FALSE]\n#> class : SpatRaster \n#> dimensions : 2, 1, 1 (nrow, ncol, nlyr)\n#> resolution : 0.5, 0.5 (x, y)\n#> extent : 1, 1.5, -0.5, 0.5 (xmin, xmax, ymin, ymax)\n#> coord. ref. : lon/lat WGS 84 (EPSG:4326) \n#> source(s) : memory\n#> varname : elev \n#> name : elev \n#> min value : 18 \n#> max value : 24"},{"path":"geometry-operations.html","id":"extent-and-origin","chapter":"5 Geometry operations","heading":"5.3.2 Extent and origin","text":"\nmerging performing map algebra rasters, resolution, projection, origin /extent match. Otherwise, add values one raster resolution 0.2 decimal degrees second raster resolution 1 decimal degree?\nproblem arises like merge satellite imagery different sensors different projections resolutions.\ncan deal mismatches aligning rasters.simplest case, two images differ regard extent.\nfollowing code adds one row two columns side raster setting new values NA (Figure 5.13).\nFIGURE 5.13: Original raster (left) raster (right) extended one row top bottom two columns left right.\nPerforming algebraic operation two objects differing extents R, terra package returns error.However, can align extent two rasters extend().\nInstead telling function many rows columns added (done ), allow figure using another raster object.\n, extend elev object extent elev_2.\nvalues newly added rows columns set NA.\norigin raster cell corner closest coordinates (0, 0).\norigin() function returns coordinates origin.\nexample cell corner exists coordinates (0, 0), necessarily case.two rasters different origins, cells overlap completely make map algebra impossible.\nchange origin, use origin().24\nFigure 5.14 reveals effect changing origin way.\nFIGURE 5.14: Rasters identical values different origins.\nNote changing resolution (next section) frequently also changes origin.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nelev_2 = extend(elev, c(1, 2))\nelev_3 = elev + elev_2\n#> 
Error: [+] extents do not match\nelev_4 = extend(elev, elev_2)\norigin(elev_4)\n#> [1] 0 0\n# change the origin\norigin(elev_4) = c(0.25, 0.25)"},{"path":"geometry-operations.html","id":"aggregation-and-disaggregation","chapter":"5 Geometry operations","heading":"5.3.3 Aggregation and disaggregation","text":"\nRaster datasets can also differ regard resolution.\nmatch resolutions, one can either decrease (aggregate()) increase (disagg()) resolution one raster.25As example, change spatial resolution dem (found spDataLarge package) factor 5 (Figure 5.15).\nAdditionally, output cell value going correspond mean input cells (note one use functions well, median(), sum(), etc.):\nFIGURE 5.15: Original raster (left). Aggregated raster (right).\nTable 5.2 compares properties original aggregated raster.\nNotice “decreasing” resolution aggregate() increases resolution \\((30.85, 30.85)\\) \\((154.25, 154.25)\\).\ndone decreasing number rows (nrow) columns (ncol) (see Section 2.3).\nextent slightly adjusted accommodate new grid size.TABLE 5.2: Properties original aggregated raster.\ndisagg() function increases resolution raster objects.\ncomes two methods compute values newly created cells: default method (method = \"near\") simply gives output cells value input cell, hence duplicates values, translates ‘blocky’ output.\nbilinear method uses four nearest pixel centers input image (salmon colored points Figure 5.16) compute average weighted distance (arrows Figure 5.16).\nvalue output cell represented square upper left corner Figure 5.16.\nFIGURE 5.16: distance-weighted average four closest input cells determine output using bilinear method disaggregation.\nComparing values dem dem_disagg tells us identical (can also use compareGeom() .equal()).\nHowever, hardly expected, since disaggregating simple interpolation technique.\nimportant keep mind disaggregating results finer resolution; corresponding values, however, accurate lower resolution source.","code":"\ndem = 
rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ndem_agg = aggregate(dem, fact = 5, fun = mean)\ndem_disagg = disagg(dem_agg, fact = 5, method = \"bilinear\")\nidentical(dem, dem_disagg)\n#> [1] FALSE"},{"path":"geometry-operations.html","id":"resampling","chapter":"5 Geometry operations","heading":"5.3.4 Resampling","text":"\nThe methods of aggregation and disaggregation are only suitable when we want to change the resolution of a raster by an aggregation/disaggregation factor.\nHowever, what do we do when we have two rasters with different resolutions and origins?\nThis is the role of resampling – a process of computing values for new pixel locations.\nIn short, this process takes the values of the original raster and recalculates new values for a target raster with a custom resolution and origin (Figure 5.17).\nFIGURE 5.17: Resampling of an original (input) raster into a target raster with a custom resolution and origin.\n\nThere are several methods for estimating values for a raster with different resolutions/origins, as shown in Figure 5.18.\nThe main resampling methods include: Nearest neighbor: assigns the value of the nearest cell of the original raster to the cell of the target one; this is a fast and simple technique that is usually suitable for resampling categorical rasters. Bilinear interpolation: assigns a weighted average of the four nearest cells from the original raster to the cell of the target one (Figure 5.16); this is the fastest method appropriate for continuous rasters. Cubic interpolation: uses values of the 16 nearest cells of the original raster to determine the output cell value, applying third-order polynomial functions; used for continuous rasters, it results in a smoother surface than bilinear interpolation but is more computationally demanding. Cubic spline interpolation: also uses values of the 16 nearest cells of the original raster to determine the output cell value, but applies cubic splines (piecewise third-order polynomial functions); used for continuous rasters. Lanczos windowed sinc resampling: uses values of the 36 nearest cells of the original raster to determine the output cell value. 
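As an aside, the bilinear weighting described above is simple arithmetic and can be sketched in base R. This is a minimal illustration, not terra's internal implementation; the helper name `bilinear_at` and the unit-square placement of the four input cell centres are assumptions made for this example only.

```r
# Bilinear interpolation for a single output location, assuming the four
# nearest input cell centres sit at the corners of the unit square:
# v00 = value at (0,0), v10 at (1,0), v01 at (0,1), v11 at (1,1).
# Each input value is weighted by how close the target point (x, y) is
# to that corner, and the weights sum to 1.
bilinear_at = function(v00, v10, v01, v11, x, y) {
  (1 - x) * (1 - y) * v00 +  # weight shrinks with distance from (0,0)
  x       * (1 - y) * v10 +
  (1 - x) * y       * v01 +
  x       * y       * v11
}

bilinear_at(10, 20, 30, 40, 0.5, 0.5)  # centre of the square: 25, the mean
bilinear_at(10, 20, 30, 40, 0, 0)      # exactly on a cell centre: 10
```

At a cell centre, the result collapses to that cell's value, which is why bilinear resampling reproduces the input exactly where grids align and smooths between cells elsewhere.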
Used continuous rasters26The explanation highlights nearest neighbor resampling suitable categorical rasters, methods can used (different outcomes) continuous rasters.\nPlease note also, methods gain complexity processing time top bottom.\nMoreover, resampling can done using statistics (e.g., minimum mode) contributing cells.apply resampling, terra package provides resample() function.\naccepts input raster (x), raster target spatial properties (y), resampling method (method).need raster target spatial properties see resample() function works.\nexample, create target_rast, often use already existing raster object.Next, need provide two raster objects first two arguments one resampling methods described .Figure 5.18 shows comparison different resampling methods dem object.\nFIGURE 5.18: Visual comparison original raster five different resampling methods.\nresample() function also additional resampling methods, including sum, min, q1, med, q3, max, average, mode, rms.\ncalculate given statistic based values non-NA contributing grid cells.\nexample, sum useful raster cell represents spatially extensive variable (e.g., number people).\neffect using sum, resampled raster total number people original one.\nsee section 7.8, raster reprojection special case resampling target raster different CRS original raster.geometry operations terra user-friendly, rather fast, work large raster objects.\nHowever, cases, terra performant either extensive rasters many raster files, alternatives considered.established alternatives come GDAL library.\ncontains several utility functions, including:gdalinfo - lists various information raster file, including resolution, CRS, bounding box, moregdal_translate - converts raster data different file formatsgdal_rasterize - converts vector data raster filesgdalwarp - allows raster mosaicing, resampling, cropping, reprojecting","code":"\ntarget_rast = rast(xmin = 794650, xmax = 798250, \n ymin = 8931750, ymax = 8935350,\n resolution = 300, crs = 
\"EPSG:32717\")\ndem_resampl = resample(dem, y = target_rast, method = \"bilinear\")"},{"path":"geometry-operations.html","id":"exercises-3","chapter":"5 Geometry operations","heading":"5.4 Exercises","text":"E1. Generate plot simplified versions nz dataset.\nExperiment different values keep (ranging 0.5 0.00005) ms_simplify() dTolerance (100 100,000) st_simplify().value form result start break method, making New Zealand unrecognizable?Advanced: different geometry type results st_simplify() compared geometry type ms_simplify()? problems create can resolved?E2. first exercise Chapter Spatial data operations established Canterbury region 70 101 highest points New Zealand.\nUsing st_buffer(), many points nz_height within 100 km Canterbury?E3. Find geographic centroid New Zealand.\nfar geographic centroid Canterbury?E4. world maps north-orientation.\nworld map south-orientation created reflection (one affine transformations mentioned chapter) world object’s geometry.\nWrite code .\nHint: can use rotation() function chapter transformation.\nBonus: create upside-map country.E5. Run code Section 5.2.6. reference objects created section, subset point p contained within x y.Using base subsetting operators.Using intermediary object created st_intersection().E6. Calculate length boundary lines US states meters.\nstate longest border shortest?\nHint: st_length function computes length LINESTRING MULTILINESTRING geometry.E7. 
Read srtm.tif file R (srtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))).\nraster resolution 0.00083 0.00083 degrees.\nChange resolution 0.01 0.01 degrees using method available terra package.\nVisualize results.\nCan notice differences results resampling methods?","code":""},{"path":"raster-vector.html","id":"raster-vector","chapter":"6 Raster-vector interactions","heading":"6 Raster-vector interactions","text":"","code":""},{"path":"raster-vector.html","id":"prerequisites-4","chapter":"6 Raster-vector interactions","heading":"Prerequisites","text":"chapter requires following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)"},{"path":"raster-vector.html","id":"introduction-3","chapter":"6 Raster-vector interactions","heading":"6.1 Introduction","text":"\nchapter focuses interactions raster vector geographic data models, introduced Chapter 2.\nincludes several main techniques:\nraster cropping masking using vector objects (Section 6.2),\nextracting raster values using different types vector data (Section 6.3),\nraster-vector conversion (Sections 6.4 6.5).\nconcepts demonstrated using data previous chapters understand potential real-world applications.","code":""},{"path":"raster-vector.html","id":"raster-cropping","chapter":"6 Raster-vector interactions","heading":"6.2 Raster cropping","text":"\nMany geographic data projects involve integrating data many different sources, remote sensing images (rasters) administrative boundaries (vectors).\nOften extent input raster datasets larger area interest.\ncase, raster cropping masking useful unifying spatial extent input data.\noperations reduce object memory use associated computational resources subsequent analysis steps may necessary preprocessing step creating attractive maps involving raster data.use two objects illustrate raster cropping:SpatRaster object srtm representing elevation (meters sea level) southwestern UtahA vector (sf) object zion representing Zion National 
ParkBoth target cropping objects must projection.\nfollowing code chunk therefore reads datasets spDataLarge package installed Chapter 2, also ‘reprojects’ zion (topic covered Chapter 7):use crop() terra package crop srtm raster.\nfunction reduces rectangular extent object passed first argument based extent object passed second argument.\nfunctionality demonstrated command , generates Figure 6.1(B).\nRelated crop() terra function mask(), sets values outside bounds object passed second argument NA.\nfollowing command therefore masks every cell outside Zion National Park boundaries (Figure 6.1(C)).Importantly, want use crop() mask() together cases.\ncombination functions () limit raster’s extent area interest (b) replace values outside area NA.27Changing settings mask() yields different results.\nSetting inverse = TRUE mask everything inside bounds park (see ?mask details) (Figure 6.1(D)), setting updatevalue = 0 set pixels outside national park 0.\nFIGURE 6.1: Raster cropping raster masking.\n","code":"\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nzion = read_sf(system.file(\"vector/zion.gpkg\", package = \"spDataLarge\"))\nzion = st_transform(zion, st_crs(srtm))\nsrtm_cropped = crop(srtm, zion)\nsrtm_masked = mask(srtm, zion)\nsrtm_cropped = crop(srtm, zion)\nsrtm_final = mask(srtm_cropped, zion)\nsrtm_inv_masked = mask(srtm, zion, inverse = TRUE)"},{"path":"raster-vector.html","id":"raster-extraction","chapter":"6 Raster-vector interactions","heading":"6.3 Raster extraction","text":"\nRaster extraction process identifying returning values associated ‘target’ raster specific locations, based (typically vector) geographic ‘selector’ object.\nresults depend type selector used (points, lines polygons) arguments passed terra::extract() function.\nreverse raster extraction — assigning raster cell values based vector objects — rasterization, described Section 6.4.\nbasic example extracting value raster cell specific points.\npurpose, use 
zion_points, contain sample 30 locations within Zion National Park (Figure 6.2).\nfollowing command extracts elevation values srtm creates data frame points’ IDs (one value per vector’s row) related srtm values point.\nNow, can add resulting object zion_points dataset cbind() function:\nFIGURE 6.2: Locations points used raster extraction.\n\nRaster extraction also works line selectors.\n, extracts one value raster cell touched line.\nHowever, line extraction approach recommended obtain values along transects, hard get correct distance pair extracted raster values.case, better approach split line many points extract values points.\ndemonstrate , code creates zion_transect, straight line going northwest southeast Zion National Park, illustrated Figure 6.3() (see Section 2.2 recap vector data model):utility extracting heights linear selector illustrated imagining planning hike.\nmethod demonstrated provides ‘elevation profile’ route (line need straight), useful estimating long take due long climbs.first step add unique id transect.\nNext, st_segmentize() function can add points along line(s) provided density (dfMaxLength) convert points st_cast().Now, large set points, want derive distance first point transects subsequent points.\ncase, one transect, code, principle, work number transects:Finally, can extract elevation values point transects combine information main object.resulting zion_transect can used create elevation profiles, illustrated Figure 6.3(B).\nFIGURE 6.3: Location line used () raster extraction (B) elevation along line.\n\nfinal type geographic vector object raster extraction polygons.\nLike lines, polygons tend return many raster values per polygon.\ndemonstrated command , results data frame column names ID (row number polygon) srtm (associated elevation values):results can used generate summary statistics raster values per polygon, example characterize single region compare many regions.\nshown code , creates object zion_srtm_df containing summary 
statistics elevation values Zion National Park (see Figure 6.4()):preceding code chunk used dplyr provide summary statistics cell values per polygon ID, described Chapter 3.\nresults provide useful summaries, example maximum height park around 2,661 meters sea level (summary statistics, standard deviation, can also calculated way).\none polygon example, data frame single row returned; however, method works multiple selector polygons used.similar approach works counting occurrences categorical raster values within polygons.\nillustrated land cover dataset (nlcd) spDataLarge package Figure 6.4(B), demonstrated code :\nFIGURE 6.4: Area used () continuous (B) categorical raster extraction.\n\nAlthough terra package offers rapid extraction raster values within polygons, extract() can still bottleneck processing large polygon datasets.\nexactextractr package offers significantly faster alternative extracting pixel values exact_extract() function.\nexact_extract() function also computes, default, fraction raster cell overlapped polygon, precise (see note details).","code":"\ndata(\"zion_points\", package = \"spDataLarge\")\nelevation = terra::extract(srtm, zion_points)\nzion_points = cbind(zion_points, elevation)\nzion_transect = cbind(c(-113.2, -112.9), c(37.45, 37.2)) |>\n st_linestring() |> \n st_sfc(crs = crs(srtm)) |>\n st_sf(geometry = _)\nzion_transect$id = 1:nrow(zion_transect)\nzion_transect = st_segmentize(zion_transect, dfMaxLength = 250)\nzion_transect = st_cast(zion_transect, \"POINT\")\nzion_transect = zion_transect |> \n group_by(id) |> \n mutate(dist = st_distance(geometry)[, 1]) \nzion_elev = terra::extract(srtm, zion_transect)\nzion_transect = cbind(zion_transect, zion_elev)\nzion_srtm_values = terra::extract(x = srtm, y = zion)\ngroup_by(zion_srtm_values, ID) |> \n summarize(across(srtm, list(min = min, mean = mean, max = max)))\n#> # A tibble: 1 × 4\n#> ID srtm_min srtm_mean srtm_max\n#> \n#> 1 1 1122 1818. 
2661\nnlcd = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\nzion2 = st_transform(zion, st_crs(nlcd))\nzion_nlcd = terra::extract(nlcd, zion2)\nzion_nlcd |> \n group_by(ID, levels) |>\n count()\n#> # A tibble: 7 × 3\n#> # Groups: ID, levels [7]\n#> ID levels n\n#> \n#> 1 1 Developed 4205\n#> 2 1 Barren 98285\n#> 3 1 Forest 298299\n#> 4 1 Shrubland 203700\n#> # ℹ 3 more rows"},{"path":"raster-vector.html","id":"rasterization","chapter":"6 Raster-vector interactions","heading":"6.4 Rasterization","text":"\nRasterization conversion vector objects representation raster objects.\nUsually, output raster used quantitative analysis (e.g., analysis terrain) modeling.\nsaw Chapter 2, raster data model characteristics make conducive certain methods.\nFurthermore, process rasterization can help simplify datasets resulting values spatial resolution: rasterization can seen special type geographic data aggregation.terra package contains function rasterize() work.\nfirst two arguments , x, vector object rasterized , y, ‘template raster’ object defining extent, resolution CRS output.\ngeographic resolution input raster major impact results: low (cell size large), result may miss full geographic variability vector data; high, computational times may excessive.\nsimple rules follow deciding appropriate geographic resolution, heavily dependent intended use results.\nOften target resolution imposed user, example output rasterization needs aligned existing raster.\ndemonstrate rasterization action, use template raster extent CRS input vector data cycle_hire_osm_projected (dataset cycle hire points London illustrated Figure 6.5()) spatial resolution 1000 meters:Rasterization flexible operation: results depend nature template raster, also type input vector (e.g., points, polygons) variety arguments taken rasterize() function.illustrate flexibility, try three different approaches rasterization.\nFirst, create raster representing presence absence cycle hire points (known 
presence/absence rasters).\ncase rasterize() requires argument addition x y, aforementioned vector raster objects (results illustrated Figure 6.5(B)).fun argument specifies summary statistics used convert multiple observations close proximity associate cells raster object.\ndefault fun = \"last\" used, options fun = \"length\" can used, case count number cycle hire points grid cell (results operation illustrated Figure 6.5(C)).new output, ch_raster2, shows number cycle hire points grid cell.\ncycle hire locations different numbers bicycles described capacity variable, raising question, ’s capacity grid cell?\ncalculate must sum field (\"capacity\"), resulting output illustrated Figure 6.5(D), calculated following command (summary functions mean used).\nFIGURE 6.5: Examples point rasterization.\n\nAnother dataset based California’s polygons borders (created ) illustrates rasterization lines.\ncasting polygon objects multilinestring, template raster created resolution 0.5 degree:considering line polygon rasterization, one useful additional argument touches.\ndefault FALSE, changed TRUE, cells touched line polygon border get value.\nLine rasterization touches = TRUE demonstrated code (Figure 6.6()).Compare polygon rasterization, touches = FALSE default, selects raster cells whose centroids inside selector polygon, illustrated Figure 6.6(B).\nFIGURE 6.6: Examples line polygon rasterizations.\n","code":"\ncycle_hire_osm = spData::cycle_hire_osm\ncycle_hire_osm_projected = st_transform(cycle_hire_osm, \"EPSG:27700\")\nraster_template = rast(ext(cycle_hire_osm_projected), resolution = 1000,\n crs = crs(cycle_hire_osm_projected))\nch_raster1 = rasterize(cycle_hire_osm_projected, raster_template)\nch_raster2 = rasterize(cycle_hire_osm_projected, raster_template, \n fun = \"length\")\nch_raster3 = rasterize(cycle_hire_osm_projected, raster_template, \n field = \"capacity\", fun = sum, na.rm = TRUE)\ncalifornia = dplyr::filter(us_states, NAME == 
\"California\")\ncalifornia_borders = st_cast(california, \"MULTILINESTRING\")\nraster_template2 = rast(ext(california), resolution = 0.5,\n crs = st_crs(california)$wkt)\ncalifornia_raster1 = rasterize(california_borders, raster_template2,\n touches = TRUE)\ncalifornia_raster2 = rasterize(california, raster_template2) "},{"path":"raster-vector.html","id":"spatial-vectorization","chapter":"6 Raster-vector interactions","heading":"6.5 Spatial vectorization","text":"\nSpatial vectorization counterpart rasterization (Section 6.4), opposite direction.\ninvolves converting spatially continuous raster data spatially discrete vector data points, lines polygons.\nsimplest form vectorization convert centroids raster cells points.\n.points() exactly non-NA raster grid cells (Figure 6.7).\nNote, also used st_as_sf() convert resulting object sf class.\nFIGURE 6.7: Raster point representation elev object.\n\nAnother common type spatial vectorization creation contour lines representing lines continuous height temperatures (isotherms), example.\nuse real-world digital elevation model (DEM) artificial raster elev produces parallel lines (task reader: verify explain happens).\nContour lines can created terra function .contour(), wrapper around built-R function filled.contour(), demonstrated (shown):Contours can also added existing plots functions contour(), rasterVis::contourplot().\n\nillustrated Figure 6.8, isolines can labeled.\nFIGURE 6.8: Digital elevation model hillshading, showing southern flank Mt. 
Mongón overlaid contour lines.\n\nfinal type vectorization involves conversion rasters polygons.\ncan done terra::.polygons(), converts raster cell polygon consisting five coordinates, stored memory (explaining rasters often fast compared vectors!).illustrated converting grain object polygons subsequently dissolving borders polygons attribute values (also see dissolve argument .polygons()).\nFIGURE 6.9: Vectorization () raster (B) polygons (dissolve = FALSE) aggregated polygons (dissolve = TRUE).\naggregated polygons grain dataset rectilinear boundaries arise defined connecting rectangular pixels.\nsmoothr package described Chapter 5 can used smooth edges polygons.\nsmoothing removes sharp edges polygon boundaries, smoothed polygons exact spatial coverage original pixels.\nCaution therefore taken using smoothed polygons analysis.","code":"\nelev = rast(system.file(\"raster/elev.tif\", package = \"spData\"))\nelev_point = as.points(elev) |> \n st_as_sf()\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\ncl = as.contour(dem) |> \n st_as_sf()\nplot(dem, axes = FALSE)\nplot(cl, add = TRUE)\ngrain = rast(system.file(\"raster/grain.tif\", package = \"spData\"))\ngrain_poly = as.polygons(grain) |> \n st_as_sf()"},{"path":"raster-vector.html","id":"exercises-4","chapter":"6 Raster-vector interactions","heading":"6.6 Exercises","text":"following exercises use vector (zion_points) raster dataset (srtm) spDataLarge package.\nalso use polygonal ‘convex hull’ derived vector dataset (ch) represent area interest:E1. Crop srtm raster using (1) zion_points dataset (2) ch dataset.\ndifferences output maps?\nNext, mask srtm using two datasets.\nCan see difference now?\ncan explain ?E2. 
Firstly, extract values from srtm at the points represented in zion_points.\nNext, extract average values of srtm using a 90 buffer around each point from zion_points and compare these two sets of values.\nWhen would extracting values by buffers be more suitable than by points alone?\nBonus: Implement extraction using the exactextractr package and compare the results.\nE3. Subset points higher than 3100 meters in New Zealand (the nz_height object) and create a template raster with a resolution of 3 km for the extent of the new point dataset.\nUsing these two new objects:\nCount the numbers of the highest points in each grid cell.\nFind the maximum elevation in each grid cell.\nE4. Aggregate the raster counting high points in New Zealand (created in the previous exercise), reduce its geographic resolution by half (so cells are 6 x 6 km) and plot the result.\nResample the lower resolution raster back to the original resolution of 3 km.\nHow have the results changed?\nName two advantages and two disadvantages of reducing raster resolution.\nE5. Polygonize the grain dataset and filter all squares representing clay.\nName two advantages and two disadvantages of vector data over raster data.\nWhen might it be useful to convert rasters to vectors in your work?","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(spData)\nzion_points_path = system.file(\"vector/zion_points.gpkg\", package = \"spDataLarge\")\nzion_points = read_sf(zion_points_path)\nsrtm = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\nch = st_combine(zion_points) |>\n st_convex_hull() |> \n st_as_sf()"},{"path":"reproj-geo-data.html","id":"reproj-geo-data","chapter":"7 Reprojecting geographic data","heading":"7 Reprojecting geographic data","text":"","code":""},{"path":"reproj-geo-data.html","id":"prerequisites-5","chapter":"7 Reprojecting geographic data","heading":"Prerequisites","text":"This chapter requires the following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)"},{"path":"reproj-geo-data.html","id":"reproj-intro","chapter":"7 Reprojecting geographic data","heading":"7.1 Introduction","text":"Section 2.4 introduced coordinate reference systems (CRSs), with a focus on the two major types: geographic (‘lon/lat’, with units in degrees longitude and latitude) and projected (typically with units of meters from a datum) coordinate systems.\nThis chapter builds on that knowledge and goes further.\nIt demonstrates how to set and transform geographic data from one CRS to another and, furthermore, highlights specific issues that can arise due to ignoring CRSs that you should be aware of, especially if your data is stored with lon/lat coordinates.\nIn many projects there is no need to worry about, let alone convert between, different CRSs.\nNonetheless, it is important to know if your data is in a projected or geographic coordinate reference system, and the consequences for geometry operations.\nIf you know this information, CRSs should just work behind the scenes: people often suddenly need to learn about CRSs when things go wrong.\nHaving a clearly defined CRS that the project data is in, and understanding how and why to use different CRSs, can ensure that things don’t go wrong.\nFurthermore, learning about coordinate systems will deepen your knowledge of geographic datasets and how to use them effectively.\nThis chapter teaches the fundamentals of CRSs, demonstrates the consequences of using different CRSs (including what can go wrong), and how to ‘reproject’ datasets from one coordinate system to another.\nIn the next section we introduce CRSs in R, followed by Section 7.3 which shows how to get and set CRSs associated with spatial objects.\nSection 7.4 demonstrates the importance of knowing what CRS your data is in with reference to a worked example of creating buffers.\nWe tackle the questions of when to reproject and which CRS to use in Section 7.5 and Section 7.6, respectively.\nFinally, we cover reprojecting vector and raster objects in Sections 7.7 and 7.8 and modifying map projections in Section 7.9.","code":""},{"path":"reproj-geo-data.html","id":"crs-in-r","chapter":"7 Reprojecting geographic data","heading":"7.2 Coordinate reference systems","text":"\nMost modern geographic tools that require CRS conversions, including core R-spatial packages and desktop GIS software such as QGIS, interface with PROJ, an open source C++ library that “transforms coordinates from one coordinate reference system (CRS) to another”.\nCRSs can be described in many ways, including the following:\nSimple yet potentially ambiguous statements such as “it’s in lon/lat coordinates”\nFormalized yet now outdated ‘proj4 strings’ (also known as ‘proj-string’) such as +proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs\nWith an identifying ‘authority:code’ text string such as EPSG:4326\nEach refers to the same thing: the ‘WGS84’ coordinate
system that forms the basis of Global Positioning System (GPS) coordinates and many other datasets.\nBut which one is correct?\nThe short answer is that the third way to identify CRSs is preferable: EPSG:4326 is understood by the sf (and by extension stars) and terra packages covered in this book, plus many other software projects for working with geographic data, including QGIS and PROJ.\nEPSG:4326 is future-proof.\nFurthermore, although it is machine readable, “EPSG:4326” is short, easy to remember and highly ‘findable’ online (searching for EPSG:4326 yields a dedicated page on the website epsg.io, for example).\nThe more concise identifier 4326 is also understood by sf, but we recommend the more explicit AUTHORITY:CODE representation to prevent ambiguity and to provide context.\nThe longer answer is that none of the three descriptions is sufficient, and more detail is needed for unambiguous CRS handling and transformations: due to the complexity of CRSs, it is not possible to capture all relevant information about them in such short text strings.\nFor this reason, the Open Geospatial Consortium (OGC, which also developed the simple features specification that the sf package implements) developed an open standard format for describing CRSs called WKT (Well-Known Text).\nThis is detailed in a 100+ page document that “defines the structure and content of a text string implementation of the abstract model for coordinate reference systems described in ISO 19111:2019” (Open Geospatial Consortium 2019).\nThe WKT representation of the WGS84 CRS, which has the identifier EPSG:4326, is as follows:\nThe output of the command shows how the CRS identifier (also known as a Spatial Reference Identifier or SRID) works: it is simply a look-up, providing a unique identifier associated with a more complete WKT representation of the CRS.\nThis raises the question: what happens if there is a mismatch between the identifier and the longer WKT representation of a CRS?\nOn this point the Open Geospatial Consortium (2019) is clear: the verbose WKT representation takes precedence over the identifier:\nShould any attributes or values given in the cited identifier be in conflict with attributes or values given explicitly in the WKT description, the WKT values shall prevail.\nThe convention of referring to CRS identifiers in the form AUTHORITY:CODE, which is also used by geographic software written in other languages, allows a wide range of formally defined coordinate systems to be referred to.28\nThe most commonly used authority in CRS identifiers is EPSG, an acronym for the European Petroleum Survey Group, which published a standardized list of CRSs (the EPSG was taken over by the Geomatics Committee of the International Association of Oil & Gas Producers in 2005).\nOther authorities can also be used in CRS identifiers.\nESRI:54030, for example, refers to ESRI’s implementation of the Robinson projection, which has the following WKT string (only the first eight lines are shown):\nWKT strings are exhaustive, detailed, and precise, allowing for unambiguous CRS storage and transformations.\nThey contain all relevant information about any given CRS, including its datum and ellipsoid, prime meridian, projection, and units.29\nRecent PROJ versions (6+) still allow the use of proj-strings to define coordinate operations, but some proj-string keys (+nadgrids, +towgs84, +k, +init=epsg:) are either no longer supported or are discouraged.\nAdditionally, only three datums (i.e., WGS84, NAD83, and NAD27) can be directly set in a proj-string.\nLonger explanations of the evolution of CRS definitions and the PROJ library can be found in Bivand (2021), chapter 2 of Pebesma and Bivand (2023b), and a blog post by Floris Vanderhaeghe, available at inbo.github.io/tutorials/tutorials/spatial_crs_coding/.\nAlso, as outlined in the PROJ documentation, there are different versions of the WKT CRS format, including WKT1 and two variants of WKT2, the latter of which (the WKT2, 2018 specification) corresponds to ISO 19111:2019 (Open Geospatial Consortium 2019).","code":"\nst_crs(\"EPSG:4326\")\n#> Coordinate Reference System:\n#> User input: EPSG:4326 \n#> wkt:\n#> GEOGCRS[\"WGS 84\",\n#> ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n#> MEMBER[\"World Geodetic System 1984 (Transit)\"],\n#> MEMBER[\"World Geodetic System 1984 (G730)\"],\n#> MEMBER[\"World Geodetic System 1984 (G873)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1150)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1674)\"],\n#> MEMBER[\"World Geodetic System 1984 (G1762)\"],\n#> MEMBER[\"World Geodetic System 1984 (G2139)\"],\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]],\n#> ENSEMBLEACCURACY[2.0]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> CS[ellipsoidal,2],\n#> AXIS[\"geodetic latitude (Lat)\",north,\n#> ORDER[1],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#>
AXIS[\"geodetic longitude (Lon)\",east,\n#> ORDER[2],\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n#> USAGE[\n#> SCOPE[\"Horizontal component of 3D system.\"],\n#> AREA[\"World.\"],\n#> BBOX[-90,-180,90,180]],\n#> ID[\"EPSG\",4326]]\nst_crs(\"ESRI:54030\")\n#> Coordinate Reference System:\n#> User input: ESRI:54030 \n#> wkt:\n#> PROJCRS[\"World_Robinson\",\n#> BASEGEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n...."},{"path":"reproj-geo-data.html","id":"crs-setting","chapter":"7 Reprojecting geographic data","heading":"7.3 Querying and setting coordinate systems","text":"\nLet’s look at how CRSs are stored in R spatial objects and how they can be queried and set.\nFirst, we will look at getting and setting CRSs in vector geographic data objects, starting with the following example:\nOur new object, new_vector, is a data frame of class sf that represents countries worldwide (see the help page ?spData::world for details).\nThe CRS can be retrieved with the sf function st_crs().\nThe output is a list containing two main components:\nUser input (in this case WGS 84, a synonym for EPSG:4326, which in this case was taken from the input file), corresponding to the CRS identifiers described above\nwkt, containing the full WKT string with all relevant information about the CRS\nThe input element is flexible: depending on the input file or user input, it can contain the AUTHORITY:CODE representation (e.g., EPSG:4326), the CRS’s name (e.g., WGS 84), or even a proj-string definition.\nThe wkt element stores the WKT representation, which is used when saving the object to a file or for coordinate operations.\nAbove, we can see that the new_vector object has the WGS84 ellipsoid, uses the Greenwich prime meridian, and has latitude and longitude axis order.\nIn this case, we also have some additional elements, such as USAGE explaining the area suitable for the use of this CRS, and ID pointing to the CRS’s identifier: EPSG:4326.\nThe st_crs function also has one helpful feature – we can retrieve some additional information about the used CRS.\nFor example, try to run:\nst_crs(new_vector)$IsGeographic to check if the CRS is geographic or not\nst_crs(new_vector)$units_gdal to find out the CRS units\nst_crs(new_vector)$srid to extract the ‘SRID’ identifier (when available)\nst_crs(new_vector)$proj4string to extract the proj-string representation\nIn cases when a CRS is missing or the wrong CRS is set, the st_set_crs() function can be used (in this case the WKT string remains unchanged, as the CRS was already set correctly when the file was read in):\nGetting and setting CRSs works in a similar way for raster geographic data objects.\nThe crs() function from the terra package accesses CRS information from a SpatRaster object (note the use of the cat() function to print it nicely).\nThe output is the WKT string representation of the CRS.\nThe same function, crs(), can also be used to set a CRS for raster objects.\nHere, we can use either an identifier (recommended in most cases) or a complete WKT string representation.\nAlternative methods to set the crs include proj-strings or CRSs extracted from other existing objects with crs(), although these approaches may be less future-proof.\nImportantly, the st_crs() and crs() functions do not alter the coordinates’ values or geometries.\nTheir role is only to set the metadata information about the object’s CRS.\nIn some cases the CRS of a geographic object is unknown, as is the case in the london dataset created in the code chunk below, building on the example of London introduced in Section 2.2:\nThe output NA shows that sf does not know what the CRS is and is unwilling to guess (NA literally means ‘not available’).\nUnless a CRS is manually specified or is loaded from a source that has CRS metadata, sf does not make any explicit assumptions about which coordinate system is in use, other than to say “I don’t know”.\nThis behavior makes sense given the diversity of available CRSs, but differs from approaches such as the GeoJSON file format specification, which makes the simplifying assumption that all coordinates have a lon/lat CRS: EPSG:4326.\nDatasets without a specified CRS can cause problems: all geographic coordinates have a coordinate reference system, and software can only make good decisions around plotting and geometry operations if it knows what type of CRS it is working with.\nThus, it is important to always check the CRS of a dataset and to set it if it is missing.","code":"\nvector_filepath = system.file(\"shapes/world.gpkg\", package = \"spData\")\nnew_vector = read_sf(vector_filepath)\nst_crs(new_vector) # get CRS\n#> Coordinate Reference System:\n#> User input: WGS 84 \n#> wkt:\n#> ...\nnew_vector = st_set_crs(new_vector, \"EPSG:4326\") # set CRS\nraster_filepath = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nmy_rast = rast(raster_filepath)\ncat(crs(my_rast)) # get CRS\n#> GEOGCRS[\"WGS
84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n....\ncrs(my_rast) = \"EPSG:26912\" # set CRS\nlondon = data.frame(lon = -0.1, lat = 51.5) |> \n st_as_sf(coords = c(\"lon\", \"lat\"))\nst_is_longlat(london)\n#> [1] NA\nlondon_geo = st_set_crs(london, \"EPSG:4326\")\nst_is_longlat(london_geo)\n#> [1] TRUE"},{"path":"reproj-geo-data.html","id":"geom-proj","chapter":"7 Reprojecting geographic data","heading":"7.4 Geometry operations on projected and unprojected data","text":"Since sf version 1.0.0, R’s ability to work with geographic vector datasets that have lon/lat CRSs has improved substantially, thanks to its integration with the S2 spherical geometry engine introduced in Section 2.2.9.\nAs shown in Figure 7.1, sf uses either GEOS or S2 depending on the type of CRS and on whether S2 is disabled (it is enabled by default).\nGEOS is always used for projected data and data with no CRS; for geographic data, S2 is used by default but can be disabled with sf::sf_use_s2(FALSE).\nFIGURE 7.1: Behavior of geometry operations in the sf package depending on the input data’s CRS.\nTo demonstrate the importance of CRSs, we will create a buffer of 100 km around the london object from the previous section.\nWe will also create a deliberately faulty buffer with a ‘distance’ of 1 degree, which is roughly equivalent to 100 km (1 degree is about 111 km at the equator).\nBefore diving into the code, it may be worth skipping briefly ahead to peek at Figure 7.2 to get a visual handle on the outputs that you should be able to reproduce with the following code chunks.\nThe first stage is to create three buffers around the london and london_geo objects created above, with boundary distances of 1 degree and 100 km (or 100,000 m, which can be expressed as 1e5 in scientific notation) from central London:\nIn the first line above, sf assumes that the input is projected and generates a result with a buffer in units of degrees, which is problematic, as we will see.\nIn the second line, sf silently uses the spherical geometry engine S2, introduced in Chapter 2, to calculate the extent of the buffer using the default value of max_cells = 1000 — set to 100 in line three — the consequences of which will become apparent shortly.\nTo highlight the impact of sf’s use of the S2 geometry engine for unprojected (geographic) coordinate systems, we will temporarily disable it with the command sf_use_s2() (which is on, TRUE, by default), in the code chunk below.\nLike london_buff_no_crs, the new buffer of london_geo is a geographic abomination: it has units of degrees, which makes no sense in the vast majority of cases:\nThe warning message above hints at issues with performing planar geometry operations on lon/lat data.\nWhen spherical geometry operations are turned off, with the command sf::sf_use_s2(FALSE), buffers (and other geometric operations) may result in worthless outputs, because they use units of latitude and longitude, a poor substitute for proper units of distance such as meters.\nDo not interpret the warning about the geographic (longitude/latitude) CRS as meaning “the CRS should not be set”: it almost always should be!\nIt is better understood as a suggestion to reproject the data onto a projected CRS.\nThis suggestion does not always need to be heeded: performing spatial and geometric operations makes little or no difference in some cases (e.g., spatial subsetting).\nBut for operations involving distances, such as buffering, the only way to ensure a good result (without using spherical geometry engines) is to create a projected copy of the data and run the operation on that.\nThis is done in the code chunk below.\nThe result is a new object that is identical to london, but created using a suitable CRS (the British National Grid, which has the EPSG code 27700 in this case) that has units of meters.\nWe can verify that the CRS has changed using st_crs() as follows (some of the output has been replaced by ...):\nNotable components of the CRS description include the EPSG code (EPSG: 27700) and the detailed wkt string (only the first five lines are shown).30\nThe fact that the units of the CRS, described in the LENGTHUNIT field, are meters (rather than degrees) tells us that this is a projected CRS: st_is_longlat(london_proj) now returns FALSE and geometry operations on london_proj will work without a warning.\nBuffer operations on london_proj will use GEOS, and results will be returned with proper units of distance.\nThe following line of code creates a buffer around projected data of exactly 100 km:\nThe geometries of the three london_buff* objects that have a specified CRS created above (london_buff_s2, london_buff_lonlat and london_buff_projected) are illustrated in Figure 7.2.\nFIGURE 7.2: Buffers around London showing results created with the S2 spherical geometry engine on lon/lat data (left), projected data (middle) and lon/lat data without using spherical geometry (right).
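The rough degree-to-kilometer equivalence quoted above (1 degree is about 111 km at the equator) can be sketched as a pair of tiny helpers; `deg_to_km()` and `km_to_deg()` are illustrative names, not functions from sf or terra:

```r
# Rule of thumb from the text: 1 degree of latitude is roughly 111 km.
# Illustrative helpers only; for real work, reproject and buffer in
# proper distance units rather than degree-based approximations.
deg_to_km = function(deg) deg * 111
km_to_deg = function(km) km / 111

deg_to_km(1)   # 111 km
km_to_deg(100) # about 0.9 degrees, the 'roughly equivalent' buffer distance
```

This also shows why `st_buffer(london, dist = 1)` on lon/lat coordinates is only a crude stand-in for a 100 km buffer.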
The left plot illustrates the result of buffering unprojected data with sf, which calls Google’s S2 spherical geometry engine by default with max cells set to 1000 (thin line). The thick, blocky line illustrates the result of the same operation with max cells set to 100.\nIt is clear from Figure 7.2 that buffers based on s2 and properly projected CRSs are not ‘squashed’, meaning that every part of the buffer boundary is equidistant to London.\nThe results that are generated from lon/lat CRSs when s2 is not used, either because the input lacks a CRS or because sf_use_s2() is turned off, are heavily distorted, with the result elongated in the north-south axis, highlighting the dangers of using algorithms that assume projected data on lon/lat inputs (as GEOS does).\nThe results generated using S2 are also distorted, however, although less dramatically.\nBoth buffer boundaries in Figure 7.2 (left) are jagged, although this may only be apparent for the thick boundary representing a buffer created with the s2 argument max_cells set to 100.\nThe lesson is that results obtained from lon/lat data via S2 will be different from results obtained from using projected data.\nThe difference between S2-derived buffers and GEOS-derived buffers on projected data reduces as the value of max_cells increases: the ‘right’ value for this argument may depend on many factors, and the default value of 1000 is often reasonable.\nWhen choosing max_cells values, speed of computation should be balanced against resolution of results.\nIn situations where smooth curved boundaries are advantageous, transforming to a projected CRS before buffering (or performing other geometry operations) may be appropriate.\nThe importance of CRSs (primarily whether they are projected or geographic) and the impact of sf’s default setting to use S2 for buffers on lon/lat data is clear from the example above.\nThe subsequent sections go into more depth, exploring which CRS to use when projected CRSs are needed and the details of reprojecting vector and raster objects.","code":"\nlondon_buff_no_crs = st_buffer(london, dist = 1) # incorrect: no CRS\nlondon_buff_s2 = st_buffer(london_geo, dist = 100000) # silent use of s2\nlondon_buff_s2_100_cells = st_buffer(london_geo, dist = 100000, max_cells = 100) \nsf::sf_use_s2(FALSE)\n#> Spherical geometry (s2) switched off\nlondon_buff_lonlat = st_buffer(london_geo, dist = 1) # incorrect result\n#> Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =\n#> endCapStyle, : st_buffer does not correctly buffer longitude/latitude data\n#> dist is assumed to be in decimal degrees (arc_degrees).\nsf::sf_use_s2(TRUE)\n#> Spherical geometry (s2) switched on\nlondon_proj = data.frame(x = 530000, y = 180000) |> \n st_as_sf(coords = c(\"x\", \"y\"), crs = \"EPSG:27700\")\nst_crs(london_proj)\n#> Coordinate Reference System:\n#> User input: EPSG:27700 \n#> wkt:\n#> PROJCRS[\"OSGB36 / British National Grid\",\n#> BASEGEOGCRS[\"OSGB36\",\n#> DATUM[\"Ordnance Survey of Great Britain 1936\",\n#> ELLIPSOID[\"Airy 1830\",6377563.396,299.3249646,\n#> LENGTHUNIT[\"metre\",1]]],\n....\nlondon_buff_projected = st_buffer(london_proj, 100000)"},{"path":"reproj-geo-data.html","id":"whenproject","chapter":"7 Reprojecting geographic data","heading":"7.5 When to reproject?","text":"\nThe previous section showed how to set the CRS manually, with st_set_crs(london, \"EPSG:4326\").\nIn real-world applications, however, CRSs are usually set automatically when data is read in.\nIn many projects the main CRS-related task is to transform objects from one CRS into another.\nBut when should data be transformed?\nAnd into which CRS?\nThere are no clear-cut answers to these questions, and CRS selection always involves trade-offs (Maling 1992).\nHowever, there are some general principles provided in this section that can help you decide.\nFirst it’s worth considering when to transform.\nIn some cases transformation to a geographic CRS is essential, such as when publishing data online with the leaflet package.\nAnother case is when two objects with different CRSs must be compared or combined, as shown when we try to find the distance between two sf objects with different CRSs:\nTo make the london and london_proj objects geographically comparable, one of them must be transformed into the CRS of the other.\nBut which CRS to use?\nThe answer depends on context: many projects, especially those involving web mapping, require outputs in EPSG:4326, in which case it is worth transforming the projected object.\nIf, however, the project requires planar geometry operations rather than spherical geometry operations (e.g., to create buffers with smooth edges), it may be worth transforming data with a geographic CRS into an equivalent object with a projected CRS, such as the British National Grid (EPSG:27700).\nThis is the subject of Section
7.7.","code":"\nst_distance(london_geo, london_proj)\n# > Error: st_crs(x) == st_crs(y) is not TRUE"},{"path":"reproj-geo-data.html","id":"which-crs","chapter":"7 Reprojecting geographic data","heading":"7.6 Which CRS to use?","text":"\nThe question of which CRS to use is tricky, and there is rarely a ‘right’ answer:\n“There exist no all-purpose projections, all involve distortion when far from the center of the specified frame” (Bivand, Pebesma, and Gómez-Rubio 2013).\nAdditionally, you should not be attached to just one projection for every task.\nIt is possible to use one projection for some part of the analysis, another projection for a different part, and even another for visualization.\nAlways try to pick the CRS that serves your goal best!\nWhen selecting geographic CRSs, the answer is often WGS84.\nIt is used not only for web mapping, but also because GPS datasets and thousands of raster and vector datasets are provided in this CRS by default.\nWGS84 is the most common CRS in the world, so it is worth knowing its EPSG code: 4326.31\nThis ‘magic number’ can be used to convert objects with unusual projected CRSs into something that is widely understood.\nWhat about when a projected CRS is required?\nIn some cases, it is not something that we are free to decide:\n“often the choice of projection is made by a public mapping agency” (Bivand, Pebesma, and Gómez-Rubio 2013).\nThis means that when working with local data sources, it is likely preferable to work with the CRS in which the data was provided, to ensure compatibility, even if the official CRS is not the most accurate.\nThe example of London was easy to answer because (a) the British National Grid (with its associated EPSG code 27700) is well known, and (b) the original dataset (london) already had that CRS.\nA commonly used default is Universal Transverse Mercator (UTM), a set of CRSs that divides the Earth into 60 longitudinal wedges and 20 latitudinal segments.\nAlmost every place on Earth has a UTM code, such as “60H” which refers to northern New Zealand where R was invented.\nUTM EPSG codes run sequentially from 32601 to 32660 for northern hemisphere locations and from 32701 to 32760 for southern hemisphere locations.\nTo show how the system works, let’s create a function, lonlat2UTM(), to calculate the EPSG code associated with any point on the planet as follows:\nThe following command uses this function to identify the UTM zone and associated EPSG code for Auckland and London:\nThe transverse Mercator projection used by UTM CRSs is conformal but distorts areas and distances with increasing severity with distance from the center of the UTM zone.\nDocumentation from the GIS software Manifold therefore suggests restricting the longitudinal extent of projects using UTM zones to 6 degrees from the central meridian (manifold.net).\nTherefore, we recommend using UTM only when your focus is on preserving angles for a relatively small area!\nCurrently, there are also tools helping us to select a proper CRS, which include the crsuggest package (K. Walker (2022)).\nThe main function in this package, suggest_crs(), takes a spatial object with a geographic CRS and returns a list of possible projected CRSs that could be used for the given area.32\nAnother helpful tool is the webpage https://jjimenezshaw.github.io/crs-explorer/ which lists CRSs based on a selected location and type.\nImportant note: while these tools are helpful in many situations, you need to be aware of the properties of the recommended CRS before you apply it.\nIn cases where an appropriate CRS is not immediately clear, the choice of CRS should depend on the properties that are most important to preserve in the subsequent maps and analysis.\nAll CRSs are either equal-area, equidistant, conformal (with shapes remaining unchanged), or some combination of compromises of those (Section 2.4.2).\nCustom CRSs with local parameters can be created for a region of interest, and multiple CRSs can be used in projects when no single CRS suits all tasks.\n‘Geodesic calculations’ can provide a fall-back if no CRSs are appropriate (see proj.org/geodesic.html).\nRegardless of the projected CRS used, the results may not be accurate for geometries covering hundreds of kilometers.\nWhen deciding on a custom CRS, we recommend the following:33\nA Lambert azimuthal equal-area (LAEA) projection for a custom local projection (set the latitude and longitude of origin to the center of the study area), which is an equal-area projection at all locations but distorts shapes beyond thousands of kilometers\nAzimuthal equidistant (AEQD) projections for a specifically accurate straight-line distance between a point and the center point of the local projection\nLambert conformal conic (LCC) projections for regions covering thousands of kilometers, with the cone set to keep distance and area properties reasonable between the secant lines\nStereographic (STERE) projections for polar regions, but taking care not to rely on area and distance calculations thousands of kilometers from the center\nOne possible approach to automatically select a projected CRS specific to a local dataset is to create an azimuthal equidistant (AEQD) projection for the center-point of the study area.\nThis involves creating a custom CRS (with no EPSG code) with units of meters based on the center point of a dataset.\nNote that this approach should be used with caution: no other datasets will be compatible with the custom CRS created, and the results may not be accurate when used on extensive datasets covering hundreds of kilometers.\nThe principles outlined in this section apply equally to vector and raster datasets.\nSome features of CRS transformation, however, are unique to each geographic data model.\nWe will cover the particularities of vector data transformation in Section 7.7 and raster transformation in Section 7.8.\nNext, Section 7.9 shows how to create custom map projections.","code":"\nlonlat2UTM = function(lonlat) {\n utm = (floor((lonlat[1] + 180) / 6) %% 60) + 1\n if (lonlat[2] > 0) {\n utm + 32600\n } else{\n utm + 32700\n }\n}\nlonlat2UTM(c(174.7, -36.9))\n#> [1] 32760\nlonlat2UTM(st_coordinates(london))\n#> [1] 32630"},{"path":"reproj-geo-data.html","id":"reproj-vec-geom","chapter":"7 Reprojecting geographic data","heading":"7.7 Reprojecting vector geometries","text":"\nChapter 2 demonstrated how vector geometries are made up of points, and how points form the basis of more complex objects such as lines and polygons.\nReprojecting vectors thus consists of transforming the coordinates of these points, which form the vertices of lines and polygons.\nSection 7.5 contains an example in which at least one sf object must be transformed into an equivalent object with a different CRS to calculate the distance between two objects.\nNow that a transformed version of london has been created, using the sf function st_transform(), the distance between the two representations of London can be found.34\nIt may come as a surprise that london and london2 are over 2 km apart!35\nFunctions for querying and reprojecting CRSs are demonstrated below with reference to cycle_hire_osm, an sf object from spData that represents ‘docking stations’ where you can hire bicycles in London.\nThe CRS of sf objects can be queried, and — as we learned in Section 7.1 — set with the function st_crs().\nThe output is printed as multiple lines of text containing information about the coordinate system:\nAs we saw in Section 7.3, the main CRS components, User input and wkt, are printed as a single entity.
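The custom AEQD approach recommended above can be sketched as a small helper that builds a proj-string centered on a given lon/lat point; `aeqd_crs()` is a hypothetical function name, and the proj-string form is a simpler (if less future-proof) alternative to a full WKT definition:

```r
# Build a custom azimuthal equidistant (AEQD) proj-string centered on a
# point given in lon/lat degrees. Illustrative helper, not part of sf/terra.
aeqd_crs = function(lon, lat) {
  sprintf("+proj=aeqd +lat_0=%.5f +lon_0=%.5f +x_0=0 +y_0=0 +units=m", lat, lon)
}

aeqd_crs(-113.0263, 37.29818)
#> [1] "+proj=aeqd +lat_0=37.29818 +lon_0=-113.02630 +x_0=0 +y_0=0 +units=m"
# A transform could then be, e.g.: st_transform(zion, aeqd_crs(-113.0263, 37.29818))
```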
The output of st_crs() is in fact a named list of class crs with two elements, single character strings named input and wkt, as shown in the output of the following code chunk:\nAdditional elements can be retrieved with the $ operator, including Name, proj4string and epsg (see ?st_crs and the CRS and transformation tutorial on the GDAL website for details):\nAs mentioned in Section 7.2, the WKT representation, stored in the $wkt element of the crs_lnd object, is the ultimate source of truth.\nThis means that the outputs of the previous code chunk are queries from the wkt representation provided by PROJ, rather than inherent attributes of the object and its CRS.\nBoth the wkt and User Input elements of the CRS are changed when the object’s CRS is transformed.\nIn the code chunk below, we create a new version of cycle_hire_osm with a projected CRS (only the first 4 lines of the CRS output are shown for brevity):\nThe resulting object has a new CRS with an EPSG code of 27700.\nBut how to find out more details about this EPSG code, or any other code?\nOne option is to search for it online; another is to look at the properties of the CRS object:\nThe result shows that the EPSG code 27700 represents the British National Grid, a result that could also have been found by searching online for “EPSG 27700”.","code":"\nlondon2 = st_transform(london_geo, \"EPSG:27700\")\nst_distance(london2, london_proj)\n#> Units: [m]\n#> [,1]\n#> [1,] 2016\nst_crs(cycle_hire_osm)\n#> Coordinate Reference System:\n#> User input: EPSG:4326 \n#> wkt:\n#> GEOGCS[\"WGS 84\",\n#> DATUM[\"WGS_1984\",\n#> SPHEROID[\"WGS 84\",6378137,298.257223563,\n....\ncrs_lnd = st_crs(london_geo)\nclass(crs_lnd)\n#> [1] \"crs\"\nnames(crs_lnd)\n#> [1] \"input\" \"wkt\"\ncrs_lnd$Name\n#> [1] \"WGS 84\"\ncrs_lnd$proj4string\n#> [1] \"+proj=longlat +datum=WGS84 +no_defs\"\ncrs_lnd$epsg\n#> [1] 4326\ncycle_hire_osm_projected = st_transform(cycle_hire_osm, \"EPSG:27700\")\nst_crs(cycle_hire_osm_projected)\n#> Coordinate Reference System:\n#> User input: EPSG:27700 \n#> wkt:\n#> PROJCRS[\"OSGB36 / British National Grid\",\n#> ...\ncrs_lnd_new = st_crs(\"EPSG:27700\")\ncrs_lnd_new$Name\n#> [1] \"OSGB36 / British National Grid\"\ncrs_lnd_new$proj4string\n#> [1] \"+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000\n+y_0=-100000 +ellps=airy +units=m +no_defs\"\ncrs_lnd_new$epsg\n#> [1] 27700"},{"path":"reproj-geo-data.html","id":"reproj-ras","chapter":"7 Reprojecting geographic data","heading":"7.8 Reprojecting raster geometries","text":"\nThe projection concepts described in the previous section apply equally to rasters.\nHowever, there are important differences in the reprojection of vectors and rasters:\ntransforming a vector object involves changing the coordinates of every vertex, but this does not apply to raster data.\nRasters are composed of rectangular cells of the same size (expressed in map units, such as degrees or meters), so it is usually impracticable to transform the coordinates of pixels separately.\nThus, raster reprojection involves creating a new raster object, often with a different number of columns and rows than the original.\nThe attributes must subsequently be re-estimated, allowing the new pixels to be ‘filled’ with appropriate values.\nIn other words, raster reprojection can be thought of as two separate spatial operations: a vector reprojection of the raster extent to another CRS (Section 7.7), and computation of new pixel values through resampling (Section 5.3.4).\nThus, in most cases when both raster and vector data are used, it is better to avoid reprojecting rasters and to reproject vectors instead.\nThe raster reprojection process is done with project() from the terra package.\nLike the st_transform() function demonstrated in the previous section, project() takes a spatial object (a raster dataset in this case) and a CRS representation as its second argument.\nAs a side note, the second argument can also be an existing raster object with a different CRS.\nLet’s take a look at two examples of raster transformation: using categorical and continuous data.\nLand cover data are usually represented by categorical maps.\nThe nlcd.tif file provides information for a small area in Utah, USA, obtained from the National Land Cover Database 2011 in the NAD83 / UTM zone 12N CRS, as shown in the output of the code chunk below (only the first line of the output is shown).\nIn this region, eight land cover classes were distinguished (a full list of NLCD2011 land cover classes can be found at mrlc.gov):\nWhen reprojecting categorical rasters, the estimated values must be the same as those of the original.\nThis could be done using the nearest neighbor method (near), which sets each new cell value to the value of the nearest cell (center) of the input raster.\nAn example is reprojecting cat_raster to WGS84, a geographic CRS well suited for web mapping.\nThe first step is to obtain the definition of this
CRS.\nThe second step is to reproject the raster with the project() function which, in the case of categorical data, uses the nearest neighbor method (near).\nMany properties of the new object differ from the previous one, including the number of columns and rows (and therefore the number of cells), resolution (transformed from meters into degrees), and extent, as illustrated in Table 7.1 (note that the number of categories increases from 8 to 9 because a new category is created for the added NA values — the land cover classes are preserved).\nTABLE 7.1: Key attributes of the original (cat_raster) and projected (cat_raster_wgs84) categorical raster datasets.\nReprojecting numeric rasters (with numeric, or in this case integer, values) follows an almost identical procedure.\nThis is demonstrated below with srtm.tif from spDataLarge, from the Shuttle Radar Topography Mission (SRTM), which represents height in meters above sea level (elevation) in the WGS84 CRS:\nWe will reproject this dataset into a projected CRS, but not with the nearest neighbor method, which is appropriate for categorical data.\nInstead, we will use the bilinear method, which computes the output cell value based on the four nearest cells in the original raster.36\nThe values in the projected dataset are the distance-weighted average of the values from these four cells:\nthe closer the input cell is to the center of the output cell, the greater its weight.\nThe following commands create a text string representing WGS 84 / UTM zone 12N, and reproject the raster into this CRS, using the bilinear method (output not shown).\nRaster reprojection on numeric variables also leads to changes in values and spatial properties, such as the number of cells, resolution, and extent.\nThese changes are demonstrated in Table 7.2.37\nTABLE 7.2: Key attributes of the original (con_raster) and projected (con_raster_ea) continuous raster datasets.","code":"\ncat_raster = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))\ncrs(cat_raster)\n#> PROJCRS[\"NAD83 / UTM zone 12N\",\n#> ...\nunique(cat_raster)\n#> levels\n#> 1 Water\n#> 2 Developed\n#> 3 Barren\n#> 4 Forest\n#> 5 Shrubland\n#> 6 Herbaceous\n#> 7 Cultivated\n#> 8 Wetlands\ncat_raster_wgs84 = project(cat_raster, \"EPSG:4326\", method = \"near\")\ncon_raster = rast(system.file(\"raster/srtm.tif\", package = \"spDataLarge\"))\ncat(crs(con_raster))\n#> GEOGCRS[\"WGS 84\",\n#> DATUM[\"World Geodetic System 1984\",\n#> ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n#> LENGTHUNIT[\"metre\",1]]],\n#> PRIMEM[\"Greenwich\",0,\n#> ANGLEUNIT[\"degree\",0.0174532925199433]],\n....\ncon_raster_ea = project(con_raster, \"EPSG:32612\", method = \"bilinear\")\ncat(crs(con_raster_ea))"},{"path":"reproj-geo-data.html","id":"mapproj","chapter":"7 Reprojecting geographic data","heading":"7.9 Custom map projections","text":"\nEstablished CRSs captured by AUTHORITY:CODE identifiers such as EPSG:4326 are well suited for many applications.\nHowever, it is desirable to use alternative projections or to create custom CRSs in some cases.\nSection 7.6 mentioned reasons for using custom CRSs and provided several possible approaches.\nHere, we show how to apply these ideas in R.\nOne approach is to take an existing WKT definition of a CRS, modify some of its elements, and then use the new definition for reprojecting.\nThis can be done for spatial vectors with st_crs() and st_transform(), and for spatial rasters with crs() and project(), as demonstrated in the following example, which transforms the zion object to a custom azimuthal equidistant (AEQD) CRS.\nUsing a custom AEQD CRS requires knowing the coordinates of the center point of a dataset in degrees (geographic CRS).\nIn our case, this information can be extracted by calculating the centroid of the zion area and transforming it into WGS84.\nNext, we can use the newly obtained values to update the WKT definition of the AEQD CRS seen below.\nNotice that we modified just two values – \"Central_Meridian\" to the longitude and \"Latitude_Of_Origin\" to the latitude of our centroid.\nThis approach’s last step is to transform our original object (zion) to our new custom CRS (zion_aeqd).\nCustom projections can also be made interactively, for example, using the Projection Wizard web application (Šavrič, Jenny, and Jenny 2016).\nThis website allows you to select a spatial extent of your data and a distortion property, and returns a list of possible projections.\nThe list also contains WKT definitions of the projections that you can copy and use for reprojections.\nAlso, see Open Geospatial Consortium (2019) for details on creating custom CRS definitions with WKT strings.\nPROJ strings can also be used to create custom projections, accepting the limitations inherent to projections, especially for geometries covering large geographic areas, as mentioned in Section 7.2.\nMany projections have been developed and can be set with the +proj= element of PROJ strings, with dozens of projections described in detail on the PROJ website alone.\nWhen mapping the world while preserving area relationships, the Mollweide projection, illustrated in Figure 7.3, is a popular and often sensible choice (Jenny et al. 2017).\nTo use this projection, we need to specify it using the proj-string element, \"+proj=moll\", in the st_transform function:\nFIGURE 7.3: Mollweide projection of the world.\nIt is often desirable to minimize distortion for all spatial properties (area, direction, distance) when mapping the world.\nOne of the most popular projections to achieve this is the Winkel tripel, illustrated in Figure 7.4.38\nThe result was created with the following command:\nFIGURE 7.4: Winkel tripel projection of the world.\nMoreover, proj-string parameters can be modified in most CRS definitions, for example the center of the projection can be adjusted using the +lon_0 and +lat_0 parameters.\nThe code below transforms the coordinates to the Lambert azimuthal equal-area projection centered on the longitude and latitude of New York City (Figure 7.5).\nFIGURE 7.5: Lambert azimuthal equal-area projection of the world centered on New York City.\nMore information on CRS modifications can be found in the Using PROJ documentation.","code":"\nzion = read_sf(system.file(\"vector/zion.gpkg\", package = \"spDataLarge\"))\nzion_centr = st_centroid(zion)\nzion_centr_wgs84 = st_transform(zion_centr, \"EPSG:4326\")\nst_as_text(st_geometry(zion_centr_wgs84))\n#> [1] \"POINT (-113 37.3)\"\nmy_wkt = 'PROJCS[\"Custom_AEQD\",\n GEOGCS[\"GCS_WGS_1984\",\n DATUM[\"WGS_1984\",\n SPHEROID[\"WGS_1984\",6378137.0,298.257223563]],\n PRIMEM[\"Greenwich\",0.0],\n UNIT[\"Degree\",0.0174532925199433]],\n PROJECTION[\"Azimuthal_Equidistant\"],\n PARAMETER[\"Central_Meridian\",-113.0263],\n PARAMETER[\"Latitude_Of_Origin\",37.29818],\n UNIT[\"Meter\",1.0]]'\nzion_aeqd = st_transform(zion, my_wkt)\nworld_mollweide = st_transform(world, crs = \"+proj=moll\")\nworld_wintri = st_transform(world, crs = \"+proj=wintri\")\nworld_laea2 = st_transform(world,\n crs = \"+proj=laea +x_0=0 +y_0=0 +lon_0=-74 +lat_0=40\")"},{"path":"reproj-geo-data.html","id":"exercises-5","chapter":"7 Reprojecting geographic
data","heading":"7.10 Exercises","text":"E1. Create a new object called nz_wgs by transforming the nz object into the WGS84 CRS.Create an object of class crs for both and use this to query their CRSs.With reference to the bounding box of each object, what units does each CRS use?Remove the CRS from nz_wgs and plot the result: what is wrong with this map of New Zealand and why?E2. Transform the world dataset to the transverse Mercator projection (\"+proj=tmerc\") and plot the result.\nWhat has changed and why?\nTry to transform it back into WGS 84 and plot the new object.\nWhy does the new object differ from the original one?E3. Transform the continuous raster (con_raster) into NAD83 / UTM zone 12N using the nearest neighbor interpolation method.\nWhat has changed?\nHow does it influence the results?E4. Transform the categorical raster (cat_raster) into WGS 84 using the bilinear interpolation method.\nWhat has changed?\nHow does it influence the results?","code":""},{"path":"read-write.html","id":"read-write","chapter":"8 Geographic data I/O","heading":"8 Geographic data I/O","text":"","code":""},{"path":"read-write.html","id":"prerequisites-6","chapter":"8 Geographic data I/O","heading":"Prerequisites","text":"This chapter requires the following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)"},{"path":"read-write.html","id":"introduction-4","chapter":"8 Geographic data I/O","heading":"8.1 Introduction","text":"This chapter is about reading and writing geographic data.\nGeographic data input is essential for geocomputation: real-world applications are impossible without data.\nData output is also vital, enabling others to use the valuable new or improved datasets resulting from your work.\nTaken together, these processes of input/output can be referred to as data I/O.\nGeographic data I/O is often done in haste at the beginning or end of projects, or otherwise ignored.\nHowever, data import and export are fundamental to the success or otherwise of projects: small I/O mistakes made at the beginning of projects (e.g., using an out-of-date dataset) can lead to large problems down the line.There are many geographic file formats, each with advantages and disadvantages, as described in Section 8.2.\nReading and writing these file formats is covered in Sections 8.3 and 8.4, respectively.\nIn terms of where to find data, Section 8.5 describes geoportals and how to import data from them.\nDedicated packages that ease geographic data import, from sources including OpenStreetMap, are described in Section 8.6.\nIf you want to put your data 'into production' in web services (and want to be sure that your data adheres to established standards), geographic metadata is important, as described in Section 8.7.\nAnother possibility is to obtain spatial data by using web services, as outlined in Section 8.8.\nThe final Section 8.9 demonstrates methods for saving visual outputs (maps), in preparation for Chapter 9 on visualization.","code":""},{"path":"read-write.html","id":"file-formats","chapter":"8 Geographic data I/O","heading":"8.2 File formats","text":"\nGeographic datasets are usually stored as files or in spatial databases.\nFile formats can either store vector or raster data, while spatial databases such as PostGIS can store both (see also Section 10.7).\nToday the variety of file formats may seem bewildering, but there has been much consolidation and standardization since the beginnings of GIS software in the 1960s, when the first widely distributed program for spatial analysis (SYMAP) was created at Harvard University (Coppock and Rhind 1991).\nGDAL (pronounced \"goo-dal\", the double \"o\" making a reference to object-orientation), the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between geographic file formats since its release in 2000.\nGDAL provides a unified and high-performance interface for reading and writing many raster and vector data formats.\nMany open and proprietary GIS programs, including GRASS GIS, ArcGIS and QGIS, use GDAL behind their GUIs to do the legwork of ingesting and spitting out geographic data in appropriate formats.GDAL provides access to more than 200 vector and raster data formats.\nTable 8.1 presents basic information about selected, often used spatial file formats.\nTABLE 8.1: Selected spatial file formats.\n\nAn important development ensuring the standardization and open-sourcing of file formats was the founding of the Open Geospatial Consortium (OGC) in 1994.\nBeyond defining the simple features data model (see Section 2.2.1), the OGC also coordinates the development of open standards, for example as used in file formats such as GML, KML and GeoPackage.\nOpen file formats of the kind endorsed by the OGC have several advantages over proprietary formats: the standards are published, they ensure transparency, and they open up the possibility for users to further develop and adjust the file formats to their specific needs.ESRI Shapefile is the most popular vector data exchange format; however, it is not an open format (though its specification is open).\nIt was developed in the early 1990s and has a number of limitations.\nFirst of all, it is a multi-file format, which consists of at least three files.\nIt supports only 255 columns, its column names are restricted to ten characters, and its file size is limited to 2 GB.\nFurthermore, ESRI Shapefile does not support all possible geometry types; for example, it is unable to distinguish between a polygon and a multipolygon.\nDespite these limitations, a viable alternative was missing for a long time.\nIn recent years, GeoPackage emerged and seems to be a suitable replacement candidate for ESRI Shapefile.\nGeoPackage is a format for exchanging geospatial information and an OGC standard.\nThe GeoPackage standard describes the rules for storing geospatial information in a tiny SQLite container.\nHence, GeoPackage is a lightweight spatial database container, which allows the storage of vector and raster data, but also non-spatial data and extensions.\nAside from GeoPackage, other geospatial data exchange formats are worth checking out (Table 8.1).\nThe GeoTIFF format is probably the most prominent raster data format.\nIt allows spatial information, such as the CRS, to be embedded within a TIFF file.\nSimilar to ESRI Shapefile, this format was first developed in the 1990s, but as an open format.\nAdditionally, GeoTIFF is still being expanded and improved.\nOne of the most significant recent additions to the GeoTIFF format is its variant called COG (Cloud Optimized GeoTIFF).\nRaster objects saved as COGs can be hosted on HTTP servers, so other people can read parts of the file without downloading the whole file (see Sections 8.3.2 and 8.4.2).There are many geographic file formats beyond those shown in Table 8.1, and new data formats capable of representing geographic data are still being developed.\nRecent examples are formats based on the GeoArrow and Zarr specifications.\nGDAL's documentation is a good resource for learning about the available vector and raster drivers.\nFurthermore, some data formats can store data models (types) beyond the vector and raster data models introduced in Section 2.2.1.\nThis includes the LAS and LAZ formats for storing lidar point clouds, and NetCDF and HDF for storing multidimensional arrays.Spatial data is also often stored using tabular (non-spatial) text formats, including CSV files and Excel spreadsheets.
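The claim above that GeoPackage is a lightweight database container can be illustrated from R. The sketch below (assuming the sf and spData packages are installed) lists the layers stored in the world.gpkg file shipped with spData:

```r
library(sf)
library(spData)
# A GeoPackage is a single SQLite file that can hold many layers;
# st_layers() inspects the container without reading the data itself
f = system.file("shapes/world.gpkg", package = "spData")
st_layers(f)
```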
For example, it can be convenient to share spatial samples with people who do not use GIS tools, or to exchange data with software that does not accept other spatial data formats.\nHowever, this approach has downsides: it is challenging for storing geometries more complex than POINTs, and it omits important spatial metadata such as the CRS.","code":""},{"path":"read-write.html","id":"data-input","chapter":"8 Geographic data I/O","heading":"8.3 Data input (I)","text":"Executing commands such as sf::read_sf() (the main function we use for loading vector data) or terra::rast() (the main function used for loading raster data) silently sets off a chain of events that reads data from files.\nMany R packages provide example datasets (e.g., the dataset spData::world used in earlier chapters) and functions to get geographic datasets from a range of data sources.\nAll of them load the data into R or, more precisely, assign objects to your workspace.\nThis means that objects imported into R are stored in RAM, can be listed with ls() (and are viewable in the 'Environment' panels of development environments), and can be accessed from the .GlobalEnv of the R session.","code":""},{"path":"read-write.html","id":"iovec","chapter":"8 Geographic data I/O","heading":"8.3.1 Vector data","text":"\nSpatial vector data comes in a wide variety of file formats.\nMost popular representations such as .geojson and .gpkg files can be imported directly into R with the sf function read_sf() (or the equivalent st_read()), which uses GDAL's vector drivers behind the scenes.\nThe st_drivers() function returns a data frame containing name and long_name in the first two columns, and the features of each driver available to GDAL (and therefore to sf), including the ability to write data and store raster data, in the subsequent columns, as illustrated for key file formats in Table 8.3.\nTABLE 8.3: Popular drivers/formats for reading/writing vector data.\nThe following commands show the first three drivers reported by the computer's GDAL installation (results can vary depending on the GDAL version installed) and a summary of their features.\nNote that the majority of drivers can write data, while only a dozen or so formats can efficiently represent raster data in addition to vector data (see ?st_drivers() for details):The first argument of read_sf() is dsn, which should be a text string or an object containing a single text string.\nThe content of the text string may vary between different drivers.\nIn most cases, as with the ESRI Shapefile (.shp) or the GeoPackage format (.gpkg), the dsn is a file name.\nread_sf() guesses the driver based on the file extension, as illustrated for a .gpkg file below: For some drivers, dsn could be provided as a folder name, access credentials for a database, or a GeoJSON string representation (see the examples in the read_sf() help page for details).Some vector driver formats can store multiple data layers.\nBy default, read_sf() automatically reads the first layer of the file specified in dsn; however, using the layer argument you can specify any other layer.\nThe read_sf() function also allows reading just parts of the file into RAM, with two possible mechanisms.\nThe first one is related to the query argument, which allows specifying what part of the data to read with an OGR SQL query text.\nThe example below extracts data for Tanzania only (Figure 8.1A).\nIt is done by specifying that we want to get all columns (SELECT *) from the \"world\" layer for which name_long equals \"Tanzania\":If you do not know the names of the available columns, a good approach is to just read one row of the data with 'SELECT * FROM world WHERE FID = 1'.\nFID represents a feature ID — most often, it is the row number; however, its values depend on the file format used.\nFor example, FID starts from 0 in ESRI Shapefile, from 1 in some other file formats, or can even be arbitrary.The second mechanism uses the wkt_filter argument.\nThis argument expects well-known text representing the study area for which we want to extract the data.\nLet's try it using a small example — we want to read the polygons from our file that intersect with a buffer of 50,000 meters around Tanzania's borders.\nTo do it, we need to prepare our \"filter\" by (a) creating the buffer (Section 5.2.3), (b) converting the sf buffer object into an sfc geometry object with st_geometry(), and (c) translating the geometries into their well-known text representation with st_as_text():Now, we can apply this \"filter\" using the wkt_filter argument.The result, shown in Figure 8.1(B), contains Tanzania and every country within its 50-km buffer.\nFIGURE 8.1: Reading a subset of the vector data using (A) a query and (B) a wkt filter.\nNaturally, some options are specific to certain drivers.\nFor example, consider coordinates stored in a spreadsheet format (.csv).\nTo read such files as spatial objects, we naturally have to specify the names of the columns (X and Y in our example below) representing the coordinates.\nWe can do this with the help of the options parameter.\nTo find out about possible options, please refer to the 'Open Options' section of the corresponding GDAL driver description.\nFor the comma-separated value (CSV) format, visit https://gdal.org/drv_csv.html.
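The 'read one row' trick mentioned above can be sketched as follows (assuming the world.gpkg file from spData used in the earlier examples):

```r
library(sf)
library(spData)
f = system.file("shapes/world.gpkg", package = "spData")
# Read a single feature to discover the available column names
# without loading the whole layer (FID numbering is format-dependent)
one_row = read_sf(f, query = 'SELECT * FROM world WHERE FID = 1')
names(one_row)
```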
Instead of columns describing 'XY' coordinates, a single column can also contain the geometry information.\nWell-known text (WKT), well-known binary (WKB), and the GeoJSON formats are examples of this.\nFor instance, the world_wkt.csv file has a column named WKT representing the polygons of the world's countries.\nWe again use the options parameter to indicate this.\nAs a final example, we show that read_sf() also reads KML files.\nA KML file stores geographic information in XML format — a data format for the creation of web pages and the transfer of data in an application-independent way (Nolan and Lang 2014).\nHere, we access a KML file from the web.\nThis file contains more than one layer.\nst_layers() lists all available layers.\nWe choose the first layer, Placemarks, and say so with the help of the layer parameter in read_sf().The examples presented in this section so far have all used the sf package for geographic data import.\nIt is fast and flexible, but it may be worth looking at other packages such as duckdb, an R interface to the DuckDB database system, which has a spatial extension.","code":"\nsf_drivers = st_drivers()\nhead(sf_drivers, n = 3)\nsummary(sf_drivers[-c(1:2)])\nf = system.file(\"shapes/world.gpkg\", package = \"spData\")\nworld = read_sf(f)\ntanzania = read_sf(f, query = 'SELECT * FROM world WHERE name_long = \"Tanzania\"')\ntanzania_buf = st_buffer(tanzania, 50000)\ntanzania_buf_geom = st_geometry(tanzania_buf)\ntanzania_buf_wkt = st_as_text(tanzania_buf_geom)\ntanzania_neigh = read_sf(f, wkt_filter = tanzania_buf_wkt)\ncycle_hire_txt = system.file(\"misc/cycle_hire_xy.csv\", package = \"spData\")\ncycle_hire_xy = read_sf(cycle_hire_txt,\n options = c(\"X_POSSIBLE_NAMES=X\", \"Y_POSSIBLE_NAMES=Y\"))\nworld_txt = system.file(\"misc/world_wkt.csv\", package = \"spData\")\nworld_wkt = read_sf(world_txt, options = \"GEOM_POSSIBLE_NAMES=WKT\")\nu = \"https://developers.google.com/kml/documentation/KML_Samples.kml\"\ndownload.file(u, \"KML_Samples.kml\")\nst_layers(\"KML_Samples.kml\")\n#> Driver: LIBKML \n#> Available layers:\n#> layer_name geometry_type features fields crs_name\n#> 1 Placemarks 3 11 WGS 84\n#> 2 Styles and Markup 1 11 WGS 84\n#> 3 Highlighted Icon 1 11 WGS 84\n....\nkml = read_sf(\"KML_Samples.kml\", layer = \"Placemarks\")"},{"path":"read-write.html","id":"raster-data-read","chapter":"8 Geographic data I/O","heading":"8.3.2 Raster data","text":"\nSimilar to vector data, raster data comes in many file formats, with some supporting multilayer files.\nterra's rast() command reads a single layer when a file with just one layer is provided.It also works when you want to read a multilayer file.The previous examples read spatial information from files stored on your hard drive.\nHowever, GDAL also allows reading data directly from online resources, such as HTTP/HTTPS/FTP web resources.\nThe only thing we need to do is to add the /vsicurl/ prefix before the path to the file.\nLet's try it by connecting to global monthly snow probability at 500-m resolution for the period 2000-2012.\nSnow probability for December is stored as a Cloud Optimized GeoTIFF (COG) file (see Section 8.2) on zenodo.org.\nTo read the online file, we just need to provide its URL together with the /vsicurl/ prefix.\nBecause the input data is a COG, we are not actually reading the file into RAM, but rather creating a connection to it without obtaining any values.\nIts values will be read only when we apply a value-based operation (e.g., crop() or extract()).\nThis also allows us to read just a tiny portion of the data without downloading the entire file.\nFor example, we can get the snow probability for December in Reykjavik (70%) by specifying its coordinates and applying the extract() function:This way, we just downloaded a single value instead of the whole, large GeoTIFF file.\nThe above example shows just one simple (but useful) case; there is more to explore.\nThe /vsicurl/ prefix works not only for raster but also for vector file formats.\nIt allows reading vectors directly from online storage with read_sf() by just adding the prefix before the vector file URL.Importantly, /vsicurl/ is not the only prefix provided by GDAL — many more exist, such as /vsizip/ to read spatial files from ZIP archives without decompressing them beforehand, or /vsis3/ for on-the-fly reading of files available in AWS S3 buckets.\nYou can learn more at https://gdal.org/user/virtual_file_systems.html.Like vector data, raster datasets can also be stored in and read from spatial databases, notably PostGIS.\nSee Section 10.7 for details.","code":"\nraster_filepath = system.file(\"raster/srtm.tif\", package = \"spDataLarge\")\nsingle_layer = rast(raster_filepath)\nmultilayer_filepath = 
system.file(\"raster/landsat.tif\", package = \"spDataLarge\")\nmultilayer_rast = rast(multilayer_filepath)\nmyurl = paste0(\"/vsicurl/https://zenodo.org/record/5774954/files/\",\n \"clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif\")\nsnow = rast(myurl)\nsnow\n#> class : SpatRaster \n#> dimensions : 35849, 86400, 1 (nrow, ncol, nlyr)\n#> resolution : 0.00417, 0.00417 (x, y)\n#> extent : -180, 180, -62, 87.4 (xmin, xmax, ymin, ymax)\n#> coord. ref. : lon/lat WGS 84 (EPSG:4326) \n#> source : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif \n#> name : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0\nrey = data.frame(lon = -21.94, lat = 64.15)\nsnow_rey = extract(snow, rey)\nsnow_rey\n#> ID clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0\n#> 1 1 70"},{"path":"read-write.html","id":"data-output","chapter":"8 Geographic data I/O","heading":"8.4 Data output (O)","text":"Writing geographic data allows you to convert from one format to another and to save newly created objects.\nDepending on the data type (vector or raster), object class (e.g., sf or SpatRaster), and the type and amount of stored information (e.g., object size, range of values), it is important to know how to store spatial files in the most efficient way.\nThe next two sections demonstrate how to do this.","code":""},{"path":"read-write.html","id":"vector-data-1","chapter":"8 Geographic data I/O","heading":"8.4.1 Vector data","text":"The counterpart of read_sf() is write_sf().\nIt allows you to write sf objects to a wide range of geographic vector file formats, including the most common ones such as .geojson, .shp and .gpkg.\nBased on the file name, write_sf() automatically decides which driver to use.\nThe speed of the writing process also depends on the driver.Note: if you try to write to the same data source again, the function will overwrite the file:Instead of overwriting the file, we could add a new layer to the file by specifying the layer argument.\nThis is supported by several spatial formats, including GeoPackage.Alternatively, you can use st_write(), since it is equivalent to write_sf().\nHowever, it has different defaults — it does not overwrite files (it returns an error when you try to do so) and it shows a short summary of the written file format and the object.The layer_options argument can also be used for many different purposes.\nOne of them is to write spatial data to a text file.\nThis can be done by specifying GEOMETRY inside layer_options.\nIt can be either AS_XY for simple point datasets (it creates two new columns for the coordinates) or AS_WKT for more complex spatial data (one new column is created, containing the well-known text representation of the spatial objects).","code":"\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world.gpkg\")\nwrite_sf(obj = world, dsn = \"world_many_layers.gpkg\", layer = \"second_layer\")\nst_write(obj = world, dsn = \"world2.gpkg\")\n#> Writing layer `world2' to data source `world2.gpkg' using driver `GPKG'\n#> Writing 177 features with 10 fields and geometry type Multi Polygon.\nwrite_sf(cycle_hire_xy, \"cycle_hire_xy.csv\", layer_options = \"GEOMETRY=AS_XY\")\nwrite_sf(world_wkt, \"world_wkt.csv\", layer_options = \"GEOMETRY=AS_WKT\")"},{"path":"read-write.html","id":"raster-data-write","chapter":"8 Geographic data I/O","heading":"8.4.2 Raster data","text":"\nThe writeRaster() function saves SpatRaster objects to files on disk.\nThe function expects input regarding the output data type and file format, and also accepts GDAL options specific to the selected file format (see ?writeRaster for details).\nThe terra package offers seven data types when saving a raster: INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S and FLT8S, which determine the bit representation of the raster object written to disk (Table 8.5).\nWhich data type to use depends on the range of values of your raster object.\nThe more values a data type can represent, the larger the file will be on disk.\nUnsigned integers (INT1U, INT2U, INT4U) are suitable for categorical data, while floating-point numbers (FLT4S and FLT8S) usually represent continuous data.\nwriteRaster() uses FLT4S as the default.\nWhile this works in most cases, the size of the output file will be unnecessarily large if you save binary or categorical data.\nTherefore, we recommend using the data type that needs the least storage space but is still able to represent all values (check the range of values with the summary() function).TABLE 8.5: Data types supported by the terra package.By default, the output file format is derived from the filename.\nNaming a file *.tif will create a GeoTIFF file, as demonstrated below:
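The advice above on choosing a minimal sufficient data type can be sketched as follows (a sketch assuming the terra and spDataLarge packages; the SRTM elevations used earlier fit comfortably in an unsigned 16-bit integer):

```r
library(terra)
# Load the SRTM elevation raster used earlier in the chapter
srtm = rast(system.file("raster/srtm.tif", package = "spDataLarge"))
summary(values(srtm))  # check the value range before picking a type
# Elevations here are positive and far below 65535, so INT2U suffices
# and yields a smaller file than the FLT4S default
writeRaster(srtm, "srtm_int2u.tif", datatype = "INT2U", overwrite = TRUE)
```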
Some raster file formats have additional options that can be set by providing GDAL parameters to the options argument of writeRaster().\nGeoTIFF files are written by terra with LZW compression by default, gdal = c(\"COMPRESS=LZW\").\nTo change or disable the compression, we need to modify this argument.\nAdditionally, we can save our raster object as a COG (Cloud Optimized GeoTIFF, Section 8.2) with the filetype = \"COG\" option.To learn more about the compression of GeoTIFF files, we recommend Paul Ramsey's comprehensive blog post, GeoTiff Compression for Dummies, which can be found online.","code":"\nwriteRaster(single_layer, filename = \"my_raster.tif\", datatype = \"INT2U\")\nwriteRaster(x = single_layer, filename = \"my_raster.tif\",\n gdal = c(\"COMPRESS=NONE\"), overwrite = TRUE)\nwriteRaster(x = single_layer, filename = \"my_raster.tif\",\n filetype = \"COG\", overwrite = TRUE)"},{"path":"read-write.html","id":"retrieving-data","chapter":"8 Geographic data I/O","heading":"8.5 Geoportals","text":"\nA vast and ever-increasing amount of geographic data is available on the internet, much of which is free to access and use (with appropriate credit given to its providers).\nIn some ways there is now too much data, in the sense that there are often multiple places to access the same dataset.\nSome datasets are of poor quality.\nIn this context, it is vital to know where to look, so this first section covers some of the most important sources.\nVarious 'geoportals' (web services providing geospatial datasets, such as Data.gov) are a good place to start, providing a wide range of data but often only for specific locations (as illustrated in the continuously updated Wikipedia page on the topic).\nSome global geoportals overcome this issue.\nThe GEOSS portal and the Copernicus Data Space Ecosystem, for example, contain many raster datasets with global coverage.\nA wealth of vector datasets can be accessed from the SEDAC portal, run by the National Aeronautics and Space Administration (NASA), and the European Union's INSPIRE geoportal, with global and regional coverage.Most geoportals provide a graphical interface allowing datasets to be queried based on characteristics such as spatial and temporal extent, the United States Geological Survey's EarthExplorer being a prime example.\nExploring datasets interactively in a browser is an effective way of understanding the available layers.\nDownloading data is best done with code, however, from reproducibility and efficiency 
perspectives.\nDownloads can be initiated from the command line using a variety of techniques, primarily via URLs and APIs (see the Copernicus APIs for example).\nFiles hosted on static URLs can be downloaded with download.file(), as illustrated in the code chunk below, which accesses the PeRL: Permafrost Region Pond and Lake Database from pangaea.de:","code":"\ndownload.file(url = \"https://hs.pangaea.de/Maps/PeRL/PeRL_permafrost_landscapes.zip\",\n destfile = \"PeRL_permafrost_landscapes.zip\", \n mode = \"wb\")\nunzip(\"PeRL_permafrost_landscapes.zip\")\ncanada_perma_land = read_sf(\"PeRL_permafrost_landscapes/canada_perma_land.shp\")"},{"path":"read-write.html","id":"geographic-data-packages","chapter":"8 Geographic data I/O","heading":"8.6 Geographic data packages","text":"\nMany R packages have been developed for accessing geographic data, some of which are presented in Table 8.6.\nThese provide interfaces to one or more spatial libraries or geoportals and aim to make data access even quicker from the command line.\nTABLE 8.6: Selected R packages for geographic data retrieval.\nIt should be emphasized that Table 8.6 represents only a small number of the available geographic data packages.\nFor example, a large number of R packages exist to obtain various socio-demographic data, such as tidycensus and tigris (USA), cancensus (Canada), eurostat and giscoR (European Union), or idbr (international databases) — read Analyzing US Census Data (K. E. Walker 2022) to find examples of how to analyze such data.\nSimilarly, several R packages exist that give access to spatial data for various regions and countries, such as bcdata (Province of British Columbia), geobr (Brazil), RCzechia (Czech Republic), or rgugik (Poland).Each data package has its own syntax for accessing data.\nThis diversity is demonstrated in the subsequent code chunks, which show how to get data using three packages from Table 8.6.\nCountry borders are often useful, and these can be accessed with the ne_countries() function from the rnaturalearth package (Massicotte and South 2023) as follows:Country borders can also be accessed with other packages, such as geodata, giscoR, or rgeoboundaries.A second example downloads a series of rasters containing global monthly precipitation sums with a spatial resolution of 10 minutes (~18.5 km at the equator) using the geodata package (Hijmans 2023a).\nThe result is a multilayer object of class SpatRaster.A third example uses the osmdata package (Padgham et al. 2023) to find parks in the OpenStreetMap (OSM) database.\nAs illustrated in the code chunk below, queries begin with the function opq() (short for OpenStreetMap query), the first argument of which is a bounding box, or a text string representing a bounding box (the city of Leeds, in this case).\nThe result is passed to a function for selecting which OSM elements we're interested in (parks, in this case), represented by key-value pairs.\nNext, it is passed to the function osmdata_sf(), which does the work of downloading the data and converting it into a list of sf objects (see vignette('osmdata') for further details):A limitation of the osmdata package is that it is rate-limited, meaning that it cannot download large OSM datasets (e.g., all the OSM data for a large city).\nTo overcome this limitation, the osmextract package was developed, which can be used to download and import binary .pbf files containing compressed versions of the OSM database for predefined regions.OpenStreetMap is a vast global database of crowd-sourced data that is growing daily, and there is a wider ecosystem of tools enabling easy access to its data, from the Overpass turbo web service for rapid development and testing of OSM queries, to osm2pgsql for importing the data into a PostGIS database.\nAlthough the quality of datasets derived from OSM varies, the data source and wider OSM ecosystem have many advantages: they provide datasets that are available globally, free of charge, and constantly improving thanks to an army of volunteers.\nUsing OSM encourages 'citizen science' and contributions back to the digital commons (you can start editing data representing a part of the world you know well at www.openstreetmap.org).\nFurther examples of OSM data in action are provided in Chapters 10, 13 and 14.Sometimes, packages come with built-in datasets.\nThese can be accessed in four ways: by attaching the package (if the package uses 'lazy loading', as spData does), with data(dataset, package = mypackage), by referring to the dataset with mypackage::dataset, or with system.file(filepath, package = mypackage) to access raw data files.\nThe following code chunk illustrates the latter two options using the world dataset (already loaded by attaching its parent package with library(spData)):The last example, system.file(\"shapes/world.gpkg\", package = \"spData\"), returns the path to the world.gpkg file, which is stored inside the \"shapes/\" folder of the spData package.Another way to obtain spatial information is to perform geocoding — transforming a description of a location, usually an address, into its coordinates.\nThis is usually done by sending a query to an online service and getting the location as a result.\nMany such services exist, differing in the geocoding method used, usage limitations, costs, or application programming interface (API) key requirements.\nR has several packages for geocoding; however, tidygeocoder seems to connect to the largest number of geocoding services through a consistent interface.\nIts main function, geocode, takes a data frame with addresses and adds coordinates as \"lat\" and \"long\" columns.\nThis function also allows selecting a geocoding service with the method argument and has many additional parameters.Let's try this package by searching for the coordinates of the John Snow blue plaque located on a building in the Soho district of London.The resulting data frame can be converted into an sf object with st_as_sf().tidygeocoder also allows performing the opposite process, called reverse geocoding, used to get a set of information (name, address, etc.) based on a pair of coordinates.
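Reverse geocoding, mentioned above, can be sketched with tidygeocoder's reverse_geocode() function (a sketch; the coordinates and the "osm" method follow the forward-geocoding example, and the addr column name is our choice):

```r
library(tidygeocoder)
# From coordinates back to a human-readable address via the OSM service
coords_df = data.frame(lat = 51.5135, long = -0.1321)
rev_df = reverse_geocode(coords_df, lat = lat, long = long,
                         method = "osm", address = "addr")
rev_df$addr
```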
Geographic data can also be imported into R from various 'bridges' to geographic software, as described in Chapter 10.","code":"\nlibrary(rnaturalearth)\nusa_sf = ne_countries(country = \"United States of America\", returnclass = \"sf\")\nlibrary(geodata)\nworldclim_prec = worldclim_global(\"prec\", res = 10, path = tempdir())\nclass(worldclim_prec)\nlibrary(osmdata)\nparks = opq(bbox = \"leeds uk\") |> \n add_osm_feature(key = \"leisure\", value = \"park\") |> \n osmdata_sf()\nworld2 = spData::world\nworld3 = read_sf(system.file(\"shapes/world.gpkg\", package = \"spData\"))\nlibrary(tidygeocoder)\ngeo_df = data.frame(address = \"54 Frith St, London W1D 4SJ, UK\")\ngeo_df = geocode(geo_df, address, method = \"osm\")\ngeo_df\ngeo_sf = st_as_sf(geo_df, coords = c(\"long\", \"lat\"), crs = \"EPSG:4326\")"},{"path":"read-write.html","id":"geographic-metadata","chapter":"8 Geographic data I/O","heading":"8.7 Geographic metadata","text":"Geographic metadata are a cornerstone of geographic information management, used to describe datasets, data structures and services.\nThey help make datasets FAIR (Findable, Accessible, Interoperable, Reusable) and are defined by ISO/OGC standards, in particular the ISO 19115 standard and its underlying schemas.\nThese standards are widely used within spatial data infrastructures and are handled through metadata catalogs.Geographic metadata can be managed with geometa, a package that allows writing, reading and validating geographic metadata according to the ISO/OGC standards.\nIt already supports various international standards for geographic metadata information, such as ISO 19110 (feature catalogue), ISO 19115-1 and 19115-2 (geographic metadata for vector and gridded/imagery datasets), ISO 19119 (geographic metadata for services), and ISO 19136 (Geographic Markup Language), providing methods to read, validate and write geographic metadata from R using the ISO/TS 19139 (XML) technical specification.\n\nGeographic metadata can be created with geometa as follows, which creates and saves a metadata file:The package comes with examples and is extended by packages such as geoflow to ease and automate the management of metadata.In the field of standard geographic information management, the distinction between data and metadata is less clear.\nThe Geography Markup Language (GML) standard and file format covers both data and metadata, for example.\nThe geometa package allows exporting GML (ISO 19136) objects from geometry objects modeled with sf.\nThis functionality enables the use of geographic metadata (allowing the inclusion of metadata with detailed geographic and temporal extents, rather than simple bounding boxes, for example) for the provision of services that extend the GML standard (e.g., the Open Geospatial Consortium Web Coverage Service, OGC-WCS).","code":"\nlibrary(geometa)\n# create a metadata\nmd = ISOMetadata$new()\n#... fill the metadata 'md' object\n# validate metadata\nmd$validate()\n# XML representation of the ISOMetadata\nxml = md$encode()\n# save metadata\nmd$save(\"my_metadata.xml\")\n# read a metadata from an XML file\nmd = readISO19139(\"my_metadata.xml\")"},{"path":"read-write.html","id":"geographic-web-services","chapter":"8 Geographic data I/O","heading":"8.8 Geographic web services","text":"\nIn an effort to standardize web APIs for accessing spatial data, the Open Geospatial Consortium (OGC) has created a number of standard specifications for web services (collectively known as OWS, short for OGC Web Services).\nThese services complement and use the core standards developed to model geographic information, such as the ISO/OGC Spatial Schema (ISO 19107:2019) and Simple Features (ISO 19125-1:2004), and to format data, such as the Geographic Markup Language (GML).\nThe specifications cover common access services for data and metadata.\nVector data can be accessed with the Web Feature Service (WFS), whereas grid/imagery data can be accessed with the Web Coverage Service (WCS).\nMap image representations, such as tiles, can be accessed with the Web Map Service (WMS) and the Web Map Tile Service (WMTS).\nMetadata is covered by means of the Catalogue Service for the Web (CSW).\nFinally, standard processing is handled through the Web Processing Service (WPS) or the Web Coverage Processing Service (WCPS).Various open-source projects have adopted these protocols, such as GeoServer and MapServer for data handling, and GeoNetwork and PyCSW for metadata handling, leading to a standardization of queries.\nIntegrated tools for Spatial Data 
Infrastructures (SDI), GeoNode, GeOrchestra Examind also adopted standard webservices, either directly using previously mentioned open-source tools.Like web APIs, OWS APIs use ‘base URL’, ‘endpoint’ ‘URL query arguments’ following ? request data (see best-practices-api-packages vignette httr package).many requests can made OWS service. examples illustrate requests can made directly httr straightforward ows4R package (OGC Web-Services R).Let’s start examples using httr package, can useful understanding web services work.\nOne fundamental requests getCapabilities, demonstrated httr functions GET() modify_url() .\nfollowing code chunk demonstrates API queries can constructed dispatched, case discover capabilities service run Fisheries Aquaculture Division Food Agriculture Organization United Nations (UN-FAO).code chunk demonstrates API requests can constructed programmatically GET() function, takes base URL list query parameters can easily extended.\nresult request saved res, object class response defined httr package, list containing information request, including URL.\ncan seen executing browseURL(res$url), results can also read directly browser.\nOne way extracting contents request follows:Data can downloaded WFS services GetFeature request specific typeName (illustrated code chunk ).Available names differ depending accessed web feature service.\nOne can extract programmatically using web technologies (Nolan Lang 2014) scrolling manually contents GetCapabilities output browser.order keep geometry validity along data access chain, since standards underlying open-source server solutions (GeoServer) built Simple Features access, important deactivate new default behavior introduced sf, use S2 geometry model data access time.\ndone code sf::sf_use_s2(FALSE).\nAlso note use write_disk() ensure results written disk rather loaded memory, allowing imported sf.many everyday tasks, however, higher-level interface may appropriate, number R packages, tutorials, developed 
precisely purpose.\npackage ows4R developed working OWS services.\nprovides stable interface common access services, WFS, WCS data, CSW metadata, WPS processing.\nOGC services coverage described README package, hosted github.com/eblondel/ows4R, new standard protocols investigation/development.Based example, code shows perform getCapabilities getFeatures operations package.\nows4R package relies principle clients.\ninteract OWS service (WFS), client created follows:operations accessible client object, e.g., getCapabilities getFeatures.explained , accessing data OGC services, handling sf features done deactivating new default behavior introduced sf, sf::sf_use_s2(FALSE).\ndone default ows4R.Additional examples available vignettes, access raster data WCS, access metadata CSW.","code":"\nlibrary(httr)\nbase_url = \"https://www.fao.org\"\nendpoint = \"/fishery/geoserver/wfs\"\nq = list(request = \"GetCapabilities\")\nres = GET(url = modify_url(base_url, path = endpoint), query = q)\nres$url\n#> [1] \"https://www.fao.org/fishery/geoserver/wfs?request=GetCapabilities\"\ntxt = content(res, \"text\")\nxml = xml2::read_xml(txt)\nxml\n#> {xml_document} ...\n#> [1] \\n GeoServer WFS...\n#> [2] \\n UN-FAO Fishe...\n#> ...\nlibrary(sf)\nsf::sf_use_s2(FALSE)\nqf = list(request = \"GetFeature\", typeName = \"fifao:FAO_MAJOR\")\nfile = tempfile(fileext = \".gml\")\nGET(url = base_url, path = endpoint, query = qf, write_disk(file))\nfao_areas = read_sf(file)\nlibrary(ows4R)\nWFS = WFSClient$new(\n url = \"https://www.fao.org/fishery/geoserver/wfs\",\n serviceVersion = \"1.0.0\",\n logger = \"INFO\"\n)\nlibrary(ows4R)\ncaps = WFS$getCapabilities()\nfeatures = WFS$getFeatures(\"fifao:FAO_MAJOR\")"},{"path":"read-write.html","id":"visual-outputs","chapter":"8 Geographic data I/O","heading":"8.9 Visual outputs","text":"\nR supports many different static interactive graphics formats.\nChapter 9 covers map-making detail, worth mentioning ways output visualizations .\ngeneral method save 
a static plot is to open a graphic device, create the plot, and close it, for example:\nOther available graphic devices include pdf(), bmp(), jpeg(), and tiff().\nYou can specify several properties of the output plot, including width, height and resolution.\nAdditionally, several graphic packages provide their own functions to save graphical output.\nFor example, the tmap package has the tmap_save() function.\nYou can save a tmap object to different graphic formats or as an HTML file by specifying the object name and a file path to a new file.\nOn the other hand, you can save interactive maps created with the mapview package as an HTML file or an image using the mapshot2() function:","code":"\npng(filename = \"lifeExp.png\", width = 500, height = 350)\nplot(world[\"lifeExp\"])\ndev.off()\nlibrary(tmap)\ntmap_obj = tm_shape(world) + tm_polygons(col = \"lifeExp\")\ntmap_save(tmap_obj, filename = \"lifeExp_tmap.png\")\nlibrary(mapview)\nmapview_obj = mapview(world, zcol = \"lifeExp\", legend = TRUE)\nmapshot2(mapview_obj, url = \"my_interactive_map.html\")"},{"path":"read-write.html","id":"exercises-6","chapter":"8 Geographic data I/O","heading":"8.10 Exercises","text":"E1. List and describe three types of vector, raster, and geodatabase formats.\nE2. Name at least two differences between the sf functions read_sf() and st_read().\nE3. Read the cycle_hire_xy.csv file from the spData package as a spatial object (Hint: it is located in the misc folder).\nWhat is the geometry type of the loaded object?\nE4. Download the borders of Germany using rnaturalearth, and create a new object called germany_borders.\nWrite this new object to a file of the GeoPackage format.\nE5. Download the global monthly minimum temperature with a spatial resolution of 5 minutes using the geodata package.\nExtract the June values, and save them to a file named tmin_june.tif (hint: use terra::subset()).\nE6. Create a static map of Germany's borders, and save it to a PNG file.\nE7. Create an interactive map using the data from the cycle_hire_xy.csv file.\nExport this map to a file called cycle_hire.html.","code":""},{"path":"adv-map.html","id":"adv-map","chapter":"9 Making maps with R","heading":"9 Making maps with R","text":"","code":""},{"path":"adv-map.html","id":"prerequisites-7","chapter":"9 Making maps with R","heading":"Prerequisites","text":"This chapter requires the following packages that we are already using:\nThe main package used in this chapter is tmap.\nWe recommend installing the development version from the r-universe repository, which is updated more frequently than the CRAN version:\nIt also uses the following visualization packages (also install shiny if you want to develop interactive mapping applications):\nYou will also need to read in a couple of datasets as follows from Section 4.3:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(spData)\nlibrary(spDataLarge)\ninstall.packages(\"tmap\", repos = c(\"https://r-tmap.r-universe.dev\",\n \"https://cloud.r-project.org\"))\nlibrary(tmap) # for static and interactive maps\nlibrary(leaflet) # for interactive maps\nlibrary(ggplot2) # tidyverse data visualization package\nnz_elev = rast(system.file(\"raster/nz_elev.tif\", package = \"spDataLarge\"))"},{"path":"adv-map.html","id":"introduction-5","chapter":"9 Making maps with R","heading":"9.1 Introduction","text":"A satisfying and important aspect of geographic research is communicating the results.\nMap-making — the art of cartography — is an ancient skill that involves communication, attention to detail, and an element of creativity.\nStatic mapping in R is straightforward with the plot() function, as we saw in Section 2.2.3.\nIt is possible to create advanced maps using base R methods (Murrell 2016).\nThe focus of this chapter, however, is cartography with dedicated map-making packages.\nWhen learning a new skill, it makes sense to gain depth-of-knowledge in one area before branching out.\nMap-making is no exception, hence this chapter's coverage of one package (tmap) in depth rather than many superficially.\nIn addition to being fun and creative, cartography also has important practical applications.\nA carefully crafted map can be the best way of communicating the results of your work, but poorly designed maps can leave a bad impression.\nCommon design issues
include poor placement, size and readability of text and careless selection of colors, as outlined in the style guide of the Journal of Maps.\nFurthermore, poor map-making can hinder the communication of results (Brewer 2015):\nAmateur-looking maps can undermine your audience's ability to understand important information and weaken the presentation of a professional data investigation.\nMaps have been used for several thousand years for a wide variety of purposes.\nHistoric examples include maps of buildings and land ownership from the Old Babylonian dynasty more than 3000 years ago and Ptolemy's world map in his masterpiece Geography nearly 2000 years ago (Talbert 2014).\nMap-making has historically been an activity undertaken only by, or on behalf of, the elite.\nThis has changed with the emergence of open source mapping software such as the R package tmap and the 'print layout' in QGIS, which allow anyone to make high-quality maps, enabling 'citizen science'.\nMaps are also often the best way to present the findings of geocomputational research in a way that is accessible.\nMap-making is therefore a critical part of geocomputation and its emphasis on not only describing, but also changing the world.\nThis chapter shows how to make a wide range of maps.\nThe next section covers a range of static maps, including aesthetic considerations, facets and inset maps.\nSections 9.3 to 9.5 cover animated and interactive maps (including web maps and mapping applications).\nFinally, Section 9.6 covers a range of alternative map-making packages including ggplot2 and cartogram.","code":""},{"path":"adv-map.html","id":"static-maps","chapter":"9 Making maps with R","heading":"9.2 Static maps","text":"\nStatic maps are the most common type of visual output from geocomputation.\nThey are usually stored in standard formats including .png and .pdf for graphical raster and vector outputs, respectively.\nInitially, static maps were the only type of maps that R could produce.\nThings advanced with the release of sp (see Pebesma and Bivand 2005), and many map-making techniques, functions, and packages have been developed since then.\nHowever, despite the innovation of interactive mapping, static plotting was still the emphasis of geographic data visualization in R a decade later (Cheshire and Lovelace 2015).\nThe generic plot() function is often the fastest way to create static maps from vector and raster spatial objects (see Sections 2.2.3 and 2.3.3).\nSometimes, when simplicity and speed are priorities, especially during the development phase of a project, plot() excels.\nThe base R approach is also extensible, with plot() offering dozens of arguments.\nAnother approach is the grid package, which allows low-level control of static maps, as illustrated in chapter 14 of Murrell (2016).\nThis part of the book focuses on tmap and emphasizes the essential aesthetic and layout options.\ntmap is a powerful and flexible map-making package with sensible defaults.\nIt has a concise syntax that allows the creation of attractive maps with minimal code, which will be familiar to ggplot2 users.\nIt also has the unique capability to generate static and interactive maps using the same code via tmap_mode().\nFinally, it accepts a wider range of spatial classes (including sf and terra objects) than alternatives such as ggplot2.","code":""},{"path":"adv-map.html","id":"tmap-basics","chapter":"9 Making maps with R","heading":"9.2.1 tmap basics","text":"\nLike ggplot2, tmap is based on the idea of a 'grammar of graphics' (Wilkinson and Wills 2005).\nThis involves a separation between the input data and the aesthetics (how the data are visualized): each input dataset can be 'mapped' in a range of different ways including location on the map (defined by the data's geometry), color, and other visual variables.\nThe basic building block is tm_shape() (which defines input data: a vector or raster object), followed by one or more layer elements such as tm_fill() and tm_dots().\nThis layering is demonstrated in the chunk below, which generates the maps presented in Figure 9.1:\nFIGURE 9.1: New Zealand's shape plotted with fill (left), border (middle) and fill and border (right) layers added using tmap functions.\nThe object passed to tm_shape() in this case is nz, an sf object representing the regions of New Zealand (see Section 2.2.1 for more on sf objects).\nLayers are added to represent nz visually, with tm_fill() and tm_borders() creating shaded areas (left panel) and border outlines (middle panel) in Figure 9.1, respectively.\nThis is an intuitive approach to map-making:\nthe common task of adding new layers is undertaken with the addition operator +, followed by tm_*().\nThe asterisk (*) refers to a wide range of layer types with self-explanatory names including:\ntm_fill(): shaded areas for (multi)polygons\ntm_borders(): border outlines for (multi)polygons\ntm_polygons(): both, shaded areas and border outlines for (multi)polygons\ntm_lines(): lines
for (multi)linestrings\ntm_symbols(): symbols for (multi)points, (multi)linestrings, and (multi)polygons\ntm_raster(): colored cells of raster data (there is also tm_rgb() for rasters with three layers)\ntm_text(): text information for (multi)points, (multi)linestrings, and (multi)polygons\nThis layering is illustrated in the right panel of Figure 9.1, the result of adding a border on top of the fill layer.","code":"\n# Add fill layer to nz shape\ntm_shape(nz) +\n tm_fill() \n# Add border layer to nz shape\ntm_shape(nz) +\n tm_borders() \n# Add fill and border layers to nz shape\ntm_shape(nz) +\n tm_fill() +\n tm_borders() "},{"path":"adv-map.html","id":"map-obj","chapter":"9 Making maps with R","heading":"9.2.2 Map objects","text":"A useful feature of tmap is its ability to store objects representing maps.\nThe code chunk below demonstrates this by saving the last plot in Figure 9.1 as an object of class tmap (note the use of tm_polygons(), which condenses tm_fill() + tm_borders() into a single function):\nmap_nz can be plotted later, for example by adding additional layers (as shown below) or simply by running map_nz in the console, which is equivalent to print(map_nz).\nNew shapes can be added with + tm_shape(new_obj).\nIn this case, new_obj represents a new spatial object to be plotted on top of the preceding layers.\nWhen a new shape is added in this way, all subsequent aesthetic functions refer to it, until another new shape is added.\nThis syntax allows the creation of maps with multiple shapes and layers, as illustrated in the next code chunk, which uses the function tm_raster() to plot a raster layer (with col_alpha set to make the layer semi-transparent):\nBuilding on the previously created map_nz object, the preceding code creates a new map object, map_nz1, which contains another shape (nz_elev) representing the average elevation across New Zealand (see Figure 9.2, left).\nMore shapes and layers can be added, as illustrated in the code chunk below, which creates nz_water, representing New Zealand's territorial waters, and adds the resulting lines to the existing map object.\nThere is no limit to the number of layers or shapes that can be added to tmap objects, and the same shape can even be used multiple times.\nThe final map illustrated in Figure 9.2 is created by adding a layer representing high points (stored in the object nz_height) onto the previously created map_nz2 object with tm_symbols() (see ?tm_symbols for details on tmap's point plotting functions).\nThe resulting map, which has four layers, is illustrated in the right-hand panel of Figure 9.2:\nA useful and little known feature of tmap is that multiple map objects can be arranged in a single 'metaplot' with tmap_arrange().\nThis is demonstrated in the code chunk below, which plots map_nz1 to map_nz3, resulting in Figure 9.2.\nFIGURE 9.2: Maps with additional layers added to the final map of Figure 9.1.\nAdditional elements can also be added with the + operator.\nAesthetic settings, however, are controlled by arguments to the layer functions.","code":"\nmap_nz = tm_shape(nz) + tm_polygons()\nclass(map_nz)\n#> [1] \"tmap\"\nmap_nz1 = map_nz +\n tm_shape(nz_elev) + tm_raster(col_alpha = 0.7)\nnz_water = st_union(nz) |>\n st_buffer(22200) |> \n st_cast(to = \"LINESTRING\")\nmap_nz2 = map_nz1 +\n tm_shape(nz_water) + tm_lines()\nmap_nz3 = map_nz2 +\n tm_shape(nz_height) + tm_symbols()\ntmap_arrange(map_nz1, map_nz2, map_nz3)"},{"path":"adv-map.html","id":"visual-variables","chapter":"9 Making maps with R","heading":"9.2.3 Visual variables","text":"\nThe plots in the previous section demonstrate tmap's default aesthetic settings.\nGray shades are used for the tm_fill() and tm_symbols() layers, and a continuous black line is used to represent lines created with tm_lines().\nOf course, these default values and other aesthetics can be overridden.\nThe purpose of this section is to show how.\nThere are two main types of map aesthetics: those that change with the data and those that are constant.\nUnlike ggplot2, which uses the helper function aes() to represent variable aesthetics, tmap accepts aesthetic arguments directly, depending on the selected layer type:\nfill: fill color of a polygon\ncol: color of a polygon border, line, point, or raster\nlwd: line width\nlty: line type\nsize: size of a symbol\nshape: shape of a symbol\nAdditionally, we may customize the fill and border color transparency using fill_alpha and col_alpha.\nTo map a variable to an aesthetic, pass its column name to the corresponding argument, and to set a fixed aesthetic, pass the desired value instead.48\nThe impact of setting fixed values is illustrated in Figure 9.3.\nFIGURE 9.3: Impact of changing commonly used fill and border aesthetics to fixed values.\nLike base R plots, arguments defining aesthetics can also receive values that vary.\nUnlike the base R code below (which generates the left panel of Figure 9.4), tmap aesthetic arguments will not accept a numeric vector:\nInstead, fill (and other aesthetics that can vary, such as lwd for line layers and size for point layers) requires a character string naming an attribute associated with the geometry to be plotted.\nThus, one can achieve the desired result as follows (Figure 9.4, right panel):\nFIGURE 9.4: Comparison of base (left) and tmap (right) handling of a numeric color field.\nEach visual variable has three related additional arguments, with the suffixes .scale, .legend, and .free.\nFor example, the tm_fill() function has the arguments fill, fill.scale, fill.legend, and fill.free.\nThe .scale argument determines how the provided values are represented on the map and in the legend (Section 9.2.4), while the .legend argument is used to customize the legend settings, such as its title, orientation, or position (Section 9.2.5).\nThe .free argument is relevant only for maps with many facets, to determine whether each facet has the same or a different scale and legend.","code":"\nma1 = tm_shape(nz) + tm_polygons(fill = \"red\")\nma2 = tm_shape(nz) + tm_polygons(fill = \"red\", fill_alpha = 0.3)\nma3 = tm_shape(nz) + tm_polygons(col = \"blue\")\nma4 = tm_shape(nz) + tm_polygons(lwd = 3)\nma5 = tm_shape(nz) + tm_polygons(lty = 2)\nma6 = tm_shape(nz) + tm_polygons(fill = \"red\", fill_alpha = 0.3,\n col = \"blue\", lwd = 3, lty = 2)\ntmap_arrange(ma1, ma2, ma3, ma4, ma5, ma6)\nplot(st_geometry(nz), col = nz$Land_area) # works\ntm_shape(nz) + tm_fill(fill = nz$Land_area) # fails\n#> Error: palette should be a character value\ntm_shape(nz) + tm_fill(fill = \"Land_area\")"},{"path":"adv-map.html","id":"scales","chapter":"9 Making maps with R","heading":"9.2.4 Scales","text":"\nScales control how the values are represented on the map and in the legend, and they largely depend on the selected visual variable.\nFor example, for the visual variable col, col.scale controls how the colors of spatial objects relate to the provided values; and for the visual variable size, size.scale controls how the sizes represent the provided values.\nBy default, the used scale is tm_scale(), which selects the visual settings automatically given the input data type (factor, numeric, or integer).\nLet's see how the scales work by customizing polygons' fill colors.\nColor settings are an important part of map design – they can have a major impact on how spatial variability is portrayed
as illustrated in Figure 9.5.\nThis figure shows four ways of coloring regions in New Zealand depending on median income, from left to right (as demonstrated in the code chunk below):\nThe default setting uses 'pretty' breaks, described in the next paragraph\nbreaks allows you to manually set the breaks\nn sets the number of bins into which numeric variables are categorized\nvalues defines the color scheme, for example, BuGn\nFIGURE 9.5: Illustration of color settings. The results show (from left to right): default settings, manual breaks, n breaks, and the impact of changing the palette.\nWe are also able to customize scales using a family of functions that start with the tm_scale_ prefix.\nThe most important ones are tm_scale_intervals(), tm_scale_continuous(), and tm_scale_categorical().\nThe tm_scale_intervals() function splits the input data values into a set of intervals.\nIn addition to manually setting breaks, tmap allows users to specify algorithms to create breaks automatically with the style argument.\nThe default is tm_scale_intervals(style = \"pretty\"), which rounds breaks into whole numbers where possible and spaces them evenly.\nOther options are listed below and presented in Figure 9.6.\nstyle = \"equal\": divides input values into bins of equal range and is appropriate for variables with a uniform distribution (not recommended for variables with a skewed distribution as the resulting map may end up having little color diversity)\nstyle = \"quantile\": ensures the same number of observations fall into each category (with the potential downside that bin ranges can vary widely)\nstyle = \"jenks\": identifies groups of similar values in the data and maximizes the differences between categories\nstyle = \"log10_pretty\": a common logarithmic (the logarithm to base 10) version of the regular pretty style used for variables with a right-skewed distribution\nFIGURE 9.6: Different interval scale methods set using the style argument in tmap.\nThe tm_scale_continuous() function presents a continuous color field and is particularly suited for continuous rasters (Figure 9.7, left panel).\nIn the case of variables with a skewed distribution, we can also use its variants – tm_scale_continuous_log() and tm_scale_continuous_log1p().\nFinally, tm_scale_categorical() is designed to represent categorical values and ensures that each category receives a unique color (Figure 9.7, right panel).\nFIGURE 9.7: Continuous and categorical scales in tmap.\nPalettes define the color ranges associated with the bins, as determined by the tm_scale_*() functions and their breaks and n arguments described above.\nThe values argument expects a vector of colors or a new color palette name, which can be found interactively with cols4all::c4a_gui().\nYou can also add a - as a color palette name prefix to reverse the palette order.\nThere are three main groups of color palettes: categorical, sequential and diverging (Figure 9.8), and each of them serves a different purpose.49\nCategorical palettes consist of easily distinguishable colors and are most appropriate for categorical data without any particular order, such as state names or land cover classes.\nColors should be intuitive: rivers should be blue, for example, and pastures green.\nAvoid too many categories: maps with large legends and many colors can be uninterpretable.50\nThe second group is sequential palettes.\nThese follow a gradient, for example from light to dark colors (light colors often tend to represent lower values), and are appropriate for continuous (numeric) variables.\nSequential palettes can be single-color (greens goes from light to dark green, for example) or multi-color/hue (yl_gn_bu is a gradient from light yellow to blue via green, for example), as demonstrated in the code chunk below — the output is not shown, run the code yourself to see the results!\nThe third group, diverging palettes, typically range between three distinct colors (purple-white-green in Figure 9.8) and are usually created by joining two single-color sequential palettes with the darker colors at each end.\nTheir main purpose is to visualize the difference from an important reference point, e.g., a certain temperature, the median household income or the mean probability of a drought event.\nThe reference point's value can be adjusted in tmap using the midpoint argument.\nFIGURE 9.8: Examples of categorical, sequential and diverging palettes.\nThere are two important principles for consideration when working with colors: perceptibility and accessibility.\nFirstly, colors on maps should match our perception.\nThis means that certain colors are viewed through our experience and also cultural lenses.\nFor example, green colors usually represent vegetation or lowlands, and blue is connected with water or coolness.\nColor palettes should also be easy to understand to effectively convey information.\nIt should be clear which values are lower and which are higher, and colors should change gradually.\nSecondly, changes in colors should be accessible to the largest number of people.\nTherefore, it is important to use colorblind
friendly palettes as often as possible.51","code":"\ntm_shape(nz) + tm_polygons(fill = \"Median_income\")\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n fill.scale = tm_scale(breaks = c(0, 30000, 40000, 50000)))\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n fill.scale = tm_scale(n = 10))\ntm_shape(nz) + tm_polygons(fill = \"Median_income\",\n fill.scale = tm_scale(values = \"BuGn\"))\ntm_shape(nz) + \n tm_polygons(\"Median_income\", fill.scale = tm_scale(values = \"greens\"))\ntm_shape(nz) + \n tm_polygons(\"Median_income\", fill.scale = tm_scale(values = \"yl_gn_bu\"))\ntm_shape(nz) + \n tm_polygons(\"Median_income\",\n fill.scale = tm_scale_continuous(values = \"pu_gn_div\", \n midpoint = 28000))"},{"path":"adv-map.html","id":"legends","chapter":"9 Making maps with R","heading":"9.2.5 Legends","text":"\nAfter we have decided on the visual variable and its properties, we should move our attention toward the related map legend style.\nUsing the tm_legend() function, we may change its title, position, orientation, or even disable it.\nThe most important argument in this function is title, which sets the title of the associated legend.\nIn general, a map legend title should provide two pieces of information: what the legend represents and what the units of the presented variable are.\nThe following code chunk demonstrates this functionality by providing a more attractive name than the variable name Land_area (note the use of expression() to create superscript text):\nThe default legend orientation in tmap is \"portrait\", however, an alternative legend orientation, \"landscape\", is also possible.\nOther than that, we can also customize the location of the legend using the position argument.\nThe legend position (and also the position of several other map elements in tmap) can be customized using one of a few functions.\nThe two most important are:\ntm_pos_out(): the default, adds the legend outside of the map frame area.\nWe can customize its location with two values that represent the horizontal position (\"left\", \"center\", or \"right\") and the vertical position (\"bottom\", \"center\", or \"top\")\ntm_pos_in(): puts the legend inside of the map frame area.\nWe may decide on its position using two arguments, where the first one can be \"left\", \"center\", or \"right\", and the second one can be \"bottom\", \"center\", or \"top\".\nAlternatively, we may just provide a vector of two values (two numbers between 0 and 1) – in this case, the legend will be put inside the map frame.","code":"\nlegend_title = expression(\"Area (km\"^2*\")\")\ntm_shape(nz) +\n tm_polygons(fill = \"Land_area\", fill.legend = tm_legend(title = legend_title))\ntm_shape(nz) +\n tm_polygons(fill = \"Land_area\",\n fill.legend = tm_legend(title = legend_title,\n orientation = \"landscape\",\n position = tm_pos_out(\"center\", \"bottom\")))"},{"path":"adv-map.html","id":"layouts","chapter":"9 Making maps with R","heading":"9.2.6 Layouts","text":"\nThe map layout refers to the combination of all map elements into a cohesive map.\nMap elements include among others the objects to be mapped, the map grid, the scale bar, the title and the margins, while the color settings covered in the previous section relate to the palette and breakpoints used to affect how the map looks.\nBoth may result in subtle changes that can have an equally large impact on the impression left by your maps.\nAdditional map elements such as graticules, north arrows, scale bars and map titles have their own functions: tm_graticules(), tm_compass(), tm_scalebar(), and tm_title() (Figure 9.9).52\nFIGURE 9.9: Map with additional elements: a north arrow and a scale bar.\ntmap also allows a wide variety of layout settings to be changed, some of which, produced using the following code (see args(tm_layout) or ?tm_layout for a full list), are illustrated in Figure 9.10.\nFIGURE 9.10: Layout options specified by (from left to right) the scale, bg.color, and frame arguments.\nThe other arguments in tm_layout() provide control over many more aspects of the map in relation to the canvas on which it is placed.\nHere are some useful layout settings (illustrated in Figure 9.11):\nMargin settings including inner.margin and outer.margin\nFont settings controlled by fontface and fontfamily\nLegend settings including options such as legend.show (whether or not to show the legend), legend.orientation, legend.position, and legend.frame\nFrame width (frame.lwd) and an option to allow double lines (frame.double.line)\nColor settings controlling color.sepia.intensity (how yellowy the map looks) and color.saturation (color-grayscale)\nFIGURE 9.11: Selected layout options.\n","code":"\nmap_nz + \n tm_graticules() +\n tm_compass(type = \"8star\", position = c(\"left\", \"top\")) +\n
tm_scalebar(breaks = c(0, 100, 200), text.size = 1, position = c(\"left\", \"top\")) +\n tm_title(\"New Zealand\")\nmap_nz + tm_layout(scale = 4)\nmap_nz + tm_layout(bg.color = \"lightblue\")\nmap_nz + tm_layout(frame = FALSE)"},{"path":"adv-map.html","id":"faceted-maps","chapter":"9 Making maps with R","heading":"9.2.7 Faceted maps","text":"\nFaceted maps, also referred to as 'small multiples', are composed of many maps arranged side-by-side, and sometimes stacked vertically (Meulemans et al. 2017).\nFacets enable the visualization of how spatial relationships change with respect to another variable, such as time.\nThe changing populations of settlements, for example, can be represented in a faceted map with each panel representing the population at a particular moment in time.\nThe time dimension could be represented via another visual variable such as color.\nHowever, this risks cluttering the map because it will involve multiple overlapping points (cities do not tend to move over time!).\nTypically all the individual facets in a faceted map contain the same geometry data repeated multiple times, once for each column in the attribute data (this is the default plotting method for sf objects, see Chapter 2).\nHowever, facets can also represent shifting geometries such as the evolution of a point pattern over time.\nThis use case of a faceted plot is illustrated in Figure 9.12.\nFIGURE 9.12: Faceted map showing the top 30 largest urban agglomerations from 1970 to 2030 based on population projections by the United Nations.\nThe preceding code chunk demonstrates key features of faceted maps created using the tm_facets_wrap() function:\nShapes that do not have a facet variable are repeated (the countries in world in this case)\nThe by argument which varies depending on a variable (\"year\" in this case)\nThe nrow/ncol setting specifying the number of rows and columns that facets should be arranged into\nAlternatively, it is possible to use the tm_facets_grid() function that allows facets based on up to three different variables: one for rows, one for columns, and possibly one for pages.\nIn addition to their utility for showing changing spatial relationships, faceted maps are also useful as the foundation for animated maps (see Section 9.3).","code":"\nurb_1970_2030 = urban_agglomerations |> \n filter(year %in% c(1970, 1990, 2010, 2030))\n\ntm_shape(world) +\n tm_polygons() +\n tm_shape(urb_1970_2030) +\n tm_symbols(fill = \"black\", col = \"white\", size = \"population_millions\") +\n tm_facets_wrap(by = \"year\", nrow = 2)"},{"path":"adv-map.html","id":"inset-maps","chapter":"9 Making maps with R","heading":"9.2.8 Inset maps","text":"\nAn inset map is a smaller map rendered within or next to the main map.\nIt could serve many different purposes, including providing a context (Figure 9.13) or bringing some non-contiguous regions closer to ease their comparison (Figure 9.14).\nIt could be also used to focus on a smaller area in more detail or to cover the same area as the map, but representing a different topic.\nIn the example below, we create a map of the central part of New Zealand's Southern Alps.\nThe inset map will show where the main map is in relation to the whole of New Zealand.\nThe first step is to define the area of interest, which can be done by creating a new spatial object, nz_region.\nIn the second step, we create a base-map showing New Zealand's Southern Alps area.\nThis is the place where the most important message is stated.\nThe third step consists of the inset map creation.\nIt gives a context and helps to locate the area of interest.\nImportantly, this map needs to clearly indicate the location of the main map, for example by stating its borders.\nOne of the main differences between regular charts (e.g., scatterplots) and maps is that the input data determine the aspect ratio of maps.\nThus, in this case, we need to calculate the aspect ratios of our two main datasets, nz_region and nz.\nThe following function, norm_dim(), returns the normalized width (\"w\") and height (\"h\") of the object (as \"snpc\" units understood by the graphic device).\nNext, knowing the aspect ratios, we need to specify the sizes and locations of our two maps – the main map and the inset map – using the viewport() function.\nA viewport is part of a graphics device we use to draw the graphical elements at a given moment.\nThe viewport of our main map is just a representation of its aspect ratio.\nOn the other hand, the viewport of the inset map needs to specify its size and location.\nHere, we make the inset map twice smaller than the main one by multiplying the width and height by 0.5, and we locate it 0.5 cm from the bottom right of the main map frame.\nFinally, we combine the two maps by creating a new, blank canvas, printing out the main map, and then placing the inset map inside of the main map viewport.\nFIGURE 9.13: Inset map providing context – location of the central part of the Southern Alps in New Zealand.\nInset maps can be saved to file either by using a graphic device (see Section 8.9) or the tmap_save()
function and its arguments: insets_tm and insets_vp.\nInset maps are also used to create one map of non-contiguous areas.\nProbably the most often used example is a map of the United States, which consists of the contiguous United States, Hawaii and Alaska.\nIt is very important to find the best projection for each individual inset in these types of cases (see Chapter 7 to learn more).\nWe can use the US National Atlas Equal Area projection for the map of the contiguous United States by putting its EPSG code in the crs argument of tm_shape().\nThe rest of our objects, hawaii and alaska, already have proper projections; therefore, we just need to create two separate maps:\nThe final map is created by combining, resizing and arranging the three maps:\nFIGURE 9.14: Map of the United States.\nThe code presented above is compact and can be used as the basis for other inset maps, but the results, in Figure 9.14, provide a poor representation of the locations and sizes of Hawaii and Alaska.\nFor a more in-depth approach, see the us-map vignette from geocompkg.","code":"\nnz_region = st_bbox(c(xmin = 1340000, xmax = 1450000,\n ymin = 5130000, ymax = 5210000),\n crs = st_crs(nz_height)) |> \n st_as_sfc()\nnz_height_map = tm_shape(nz_elev, bbox = nz_region) +\n tm_raster(col.scale = tm_scale_continuous(values = \"YlGn\"),\n col.legend = tm_legend(position = c(\"left\", \"top\"))) +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 1) +\n tm_scalebar(position = c(\"left\", \"bottom\"))\nnz_map = tm_shape(nz) + tm_polygons() +\n tm_shape(nz_height) + tm_symbols(shape = 2, col = \"red\", size = 0.1) + \n tm_shape(nz_region) + tm_borders(lwd = 3) +\n tm_layout(bg.color = \"lightblue\")\nlibrary(grid)\nnorm_dim = function(obj){\n bbox = st_bbox(obj)\n width = bbox[[\"xmax\"]] - bbox[[\"xmin\"]]\n height = bbox[[\"ymax\"]] - bbox[[\"ymin\"]]\n w = width / max(width, height)\n h = height / max(width, height)\n return(unit(c(w, h), \"snpc\"))\n}\nmain_dim = norm_dim(nz_region)\nins_dim = norm_dim(nz)\nmain_vp = viewport(width = main_dim[1], height = main_dim[2])\nins_vp = viewport(width = ins_dim[1] * 0.5, height = ins_dim[2] * 0.5,\n x = unit(1, \"npc\") - unit(0.5, \"cm\"), y = unit(0.5, \"cm\"),\n just = c(\"right\", \"bottom\"))\ngrid.newpage()\nprint(nz_height_map, vp = main_vp)\npushViewport(main_vp)\nprint(nz_map, vp = ins_vp)\nus_states_map = tm_shape(us_states, crs = \"EPSG:9311\") + \n tm_polygons() + \n tm_layout(frame = FALSE)\nhawaii_map = tm_shape(hawaii) +\n tm_polygons() + \n tm_title(\"Hawaii\") +\n tm_layout(frame = FALSE, bg.color = NA, \n title.position = c(\"LEFT\", \"BOTTOM\"))\nalaska_map = tm_shape(alaska) +\n tm_polygons() + \n tm_title(\"Alaska\") +\n tm_layout(frame = FALSE, bg.color = NA)\nus_states_map\nprint(hawaii_map, vp = grid::viewport(0.35, 0.1, width = 0.2, height = 0.1))\nprint(alaska_map, vp = grid::viewport(0.15, 0.15, width = 0.3, height = 0.3))"},{"path":"adv-map.html","id":"animated-maps","chapter":"9 Making maps with R","heading":"9.3 Animated maps","text":"\nFaceted maps, described in Section 9.2.7, can show how spatial distributions of variables change (e.g., over time), but the approach has disadvantages.\nFacets become tiny when there are many of them.\nFurthermore, the fact that each facet is physically separated on the screen or page means that subtle differences between facets can be hard to detect.\nAnimated maps solve these issues.\nAlthough they depend on digital publication, this is becoming less of an issue as more and more content moves online.\nAnimated maps can still enhance paper reports: you can always link readers to a webpage containing an animated (or interactive) version of a printed map to help make it come alive.\nThere are several ways to generate animations in R, including with animation packages such as gganimate, which builds on ggplot2 (see Section 9.6).\nThis section focuses on creating animated maps with tmap because its syntax will be familiar from previous sections and because of the flexibility of the approach.\nFigure 9.15 is a simple example of an animated map.\nUnlike the faceted plot, it does not squeeze multiple maps into a single screen and allows the reader to see how the spatial distribution of the world's most populous agglomerations evolves over time (see the book's website for the animated version).\nFIGURE 9.15: Animated map showing the top 30 largest urban agglomerations from 1950 to 2030 based on population projections by the United Nations.
Animated version available online : r.geocompx.org.\nanimated map illustrated Figure 9.15 can created using tmap techniques generate faceted maps, demonstrated Section 9.2.7.\ntwo differences, however, related arguments tm_facets_wrap():nrow = 1, ncol = 1 added keep one moment time one layerfree.coords = FALSE, maintains map extent map iterationThese additional arguments demonstrated subsequent code chunk53:resulting urb_anim represents set separate maps year.\nfinal stage combine save result .gif file tmap_animation().\nfollowing command creates animation illustrated Figure 9.15, elements missing, add exercises:Another illustration power animated maps provided Figure 9.16.\nshows development states United States, first formed east incrementally west finally interior.\nCode reproduce map can found script code/09-usboundaries.R book GitHub repository.\nFIGURE 9.16: Animated map showing population growth, state formation boundary changes United States, 1790-2010. Animated version available online r.geocompx.org.\n","code":"\nurb_anim = tm_shape(world) + tm_polygons() + \n tm_shape(urban_agglomerations) + tm_symbols(size = \"population_millions\") +\n tm_facets_wrap(by = \"year\", nrow = 1, ncol = 1, free.coords = FALSE)\ntmap_animation(urb_anim, filename = \"urb_anim.gif\", delay = 25)"},{"path":"adv-map.html","id":"interactive-maps","chapter":"9 Making maps with R","heading":"9.4 Interactive maps","text":"\nstatic animated maps can enliven geographic datasets, interactive maps can take new level.\nInteractivity can take many forms, common useful ability pan around zoom part geographic dataset overlaid ‘web map’ show context.\nLess advanced interactivity levels include pop-ups appear click different features, kind interactive label.\nadvanced levels interactivity include ability tilt rotate maps, demonstrated mapdeck example , provision “dynamically linked” sub-plots automatically update user pans zooms (Pezanowski et al. 
2018).\nThe most important type of interactivity, however, is the display of geographic data on interactive 'slippy' web maps.\nThe release of the leaflet package in 2015 (which uses the leaflet JavaScript library) revolutionized interactive web map creation from within R, and a number of packages have built on these foundations adding new features (e.g., leaflet.extras2) and making the creation of web maps as simple as creating static maps (e.g., mapview and tmap).\nThis section illustrates each approach in the opposite order.\nWe will explore how to make slippy maps with tmap (the syntax of which we have already learned), mapview, mapdeck and finally leaflet (which provides low-level control over interactive maps).\nA unique feature of tmap mentioned in Section 9.2 is its ability to create static and interactive maps using the same code.\nMaps can be viewed interactively at any point by switching to view mode, using the command tmap_mode(\"view\").\nThis is demonstrated in the code below, which creates an interactive map of New Zealand based on the tmap object map_nz, created in Section 9.2.2, and illustrated in Figure 9.17:\nFIGURE 9.17: Interactive map of New Zealand created with tmap in view mode. Interactive version available online at: r.geocompx.org.\nNow that the interactive mode has been 'turned on', all maps produced with tmap will launch interactively (another way to create interactive maps is with the tmap_leaflet() function).\nNotable features of this interactive mode include the ability to specify the basemap with tm_basemap() (or tmap_options()), as demonstrated below (result not shown): An impressive and little-known feature of tmap's view mode is that it also works with faceted plots.\nThe argument sync in tm_facets_wrap() can be used in this case to produce multiple maps with synchronized zoom and pan settings, as illustrated in Figure 9.18, which was produced by the following code:\nFIGURE 9.18: Faceted interactive maps of global coffee production in 2016 and 2017 in sync, demonstrating tmap's view mode in action.\nSwitch tmap back to plotting mode with the same function: If you are not proficient with tmap, the quickest way to create interactive maps in R may be with mapview.\nThe following 'one liner' is a reliable way to interactively explore a wide range of geographic data formats:\nFIGURE 9.19: Illustration of mapview in action.\nmapview has a concise syntax, yet, is powerful.\nBy default, it provides standard GIS functionality such as mouse position information, attribute queries (via pop-ups), a scale bar, and zoom-to-layer buttons.\nIt also offers advanced controls including the ability to 'burst' datasets into multiple layers and the addition of multiple layers with + followed by the name of a geographic object.\nAdditionally, it provides automatic coloring of attributes via the zcol argument.\nIn essence, it can be considered a data-driven leaflet API (see below for more information about leaflet).\nGiven that mapview always expects a spatial object (including sf and SpatRaster) as its first argument, it works well at the end of piped expressions.\nConsider the following example where sf is used to intersect lines and polygons, and the result is visualized with mapview (Figure 9.20).\nFIGURE 9.20: Using mapview at the end of an sf-based pipe expression.\nOne important thing to keep in mind is that mapview layers are added via the + operator (similar to ggplot2 or tmap).\nBy default, mapview uses the leaflet JavaScript library to render the output maps, which is user-friendly and has a lot of features.\nHowever, some alternative rendering libraries can be more performant (work more smoothly on larger datasets).\nmapview allows setting alternative rendering libraries (\"leafgl\" and \"mapdeck\") with mapviewOptions().\nFor more information on mapview, see the package's website at: r-spatial.github.io/mapview/.\nThere are other ways to create interactive maps with R.\nThe googleway package, for example, provides an interactive mapping interface that is flexible and extensible\n(see the googleway-vignette for details).\nAnother approach by the same author is mapdeck, which provides access to Uber's Deck.gl framework.\nIts use of WebGL enables it to interactively visualize large datasets of up to millions of points.\nThe package uses Mapbox access tokens, which you must register for before using the package.\nA unique feature of mapdeck is its provision of interactive 2.5D perspectives, illustrated in Figure 9.21.\nThis means you can pan, zoom and rotate around the maps, and view the data 'extruded' from the map.\nFigure 9.21, generated by the following code chunk, visualizes road traffic crashes in the UK, with bar height representing casualties per area.\nFIGURE 9.21: Map generated by mapdeck, representing road traffic casualties across the UK. 
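The zcol and 'burst' behaviors described above can be sketched minimally as follows (a hedged illustration, assuming the mapview package is installed; the franconia polygons ship with mapview):

```r
library(mapview)

# color polygons by the values of the 'district' attribute
mapview(franconia, zcol = "district")

# burst = TRUE additionally splits the layer into one toggleable
# map layer per unique value of the zcol variable
mapview(franconia, zcol = "district", burst = TRUE)
```

Both calls open an interactive slippy map; the second adds a layer-control entry per district, which is useful for comparing subsets of a dataset interactively.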
Height of the 1-km cells represents the number of crashes.\nYou can zoom and drag the map in the browser, in addition to rotating and tilting it while pressing Cmd/Ctrl.\nMultiple layers can be added with the pipe operator, as demonstrated in the mapdeck vignettes.\nmapdeck also supports sf objects, as can be seen by replacing the add_grid() function call in the preceding code chunk with add_polygon(data = lnd, layer_id = \"polygon_layer\"), to add polygons representing London to an interactive tilted map.\nLast but not least is leaflet, which is the most mature and widely used interactive mapping package in R.\nleaflet provides a relatively low-level interface to the Leaflet JavaScript library and many of its arguments can be understood by reading the documentation of the original JavaScript library (see leafletjs.com).\nLeaflet maps are created with leaflet(), the result of which is a leaflet map object that can be piped to other leaflet functions.\nThis allows multiple map layers and control settings to be added interactively, as demonstrated in the code below which generates Figure 9.22 (see rstudio.github.io/leaflet/ for details).\nFIGURE 9.22: The leaflet package in action, showing cycle hire points in London. See interactive version online.\n","code":"\ntmap_mode(\"view\")\nmap_nz\nmap_nz + tm_basemap(server = \"OpenTopoMap\")\nworld_coffee = left_join(world, coffee_data, by = \"name_long\")\nfacets = c(\"coffee_production_2016\", \"coffee_production_2017\")\ntm_shape(world_coffee) + tm_polygons(facets) + \n tm_facets_wrap(nrow = 1, sync = TRUE)\ntmap_mode(\"plot\")\n#> ℹ tmap mode set to \"plot\".\nmapview::mapview(nz)\nlibrary(mapview)\noberfranken = subset(franconia, district == \"Oberfranken\")\ntrails |>\n st_transform(st_crs(oberfranken)) |>\n st_intersection(oberfranken) |>\n st_collection_extract(\"LINESTRING\") |>\n mapview(color = \"red\", lwd = 3, layer.name = \"trails\") +\n mapview(franconia, zcol = \"district\") +\n breweries\nlibrary(mapdeck)\nset_token(Sys.getenv(\"MAPBOX\"))\ncrash_data = read.csv(\"https://git.io/geocompr-mapdeck\")\ncrash_data = na.omit(crash_data)\nms = mapdeck_style(\"dark\")\nmapdeck(style = ms, pitch = 45, location = c(0, 52), zoom = 4) |>\n add_grid(data = crash_data, lat = \"lat\", lon = \"lng\", 
cell_size = 1000,\n elevation_scale = 50, colour_range = hcl.colors(6, \"plasma\"))\npal = colorNumeric(\"RdYlBu\", domain = cycle_hire$nbikes)\nleaflet(data = cycle_hire) |> \n addProviderTiles(providers$CartoDB.Positron) |>\n addCircles(col = ~pal(nbikes), opacity = 0.9) |> \n addPolygons(data = lnd, fill = FALSE) |> \n addLegend(pal = pal, values = ~nbikes) |> \n setView(lng = -0.1, 51.5, zoom = 12) |> \n addMiniMap()"},{"path":"adv-map.html","id":"mapping-applications","chapter":"9 Making maps with R","heading":"9.5 Mapping applications","text":"\nThe interactive web maps demonstrated in Section 9.4 can go far.\nCareful selection of layers to display, basemaps and pop-ups can be used to communicate the main results of many projects involving geocomputation.\nBut the web-mapping approach to interactivity has limitations: Although the map is interactive in terms of panning, zooming and clicking, the code is static, meaning the user interface is fixed; all map content is generally static in a web map, meaning that web maps cannot scale to handle large datasets easily; additional layers of interactivity, such as graphs showing relationships between variables and 'dashboards', are difficult to create using the web-mapping approach.\nOvercoming these limitations involves going beyond static web mapping toward geospatial frameworks and map servers.\nProducts in this field include GeoDjango (which extends the Django web framework and is written in Python), MapServer (a framework for developing web applications, largely written in C and C++) and GeoServer (a mature and powerful map server written in Java).\nEach of these is scalable, enabling maps to be served to thousands of people daily, assuming there is sufficient public interest in your maps!\nThe bad news is that such server-side solutions require much skilled developer time to set up and maintain, often involving teams of people with roles such as a dedicated geospatial database administrator (DBA).\nFortunately for R programmers, web-mapping applications can now be rapidly created with shiny.\nAs described in the open source book Mastering Shiny, shiny is an R package and framework for converting R code into interactive web applications (Wickham 2021).\nYou can embed interactive maps in shiny apps thanks to functions such as leaflet::renderLeaflet().\nThis section gives some context, teaches the basics of shiny from a web-mapping perspective, and culminates in a full-screen mapping application in less than 100 lines of code.\nshiny is well documented at shiny.posit.co, which highlights the two components of every shiny app: the 'front end' (the bit the user sees) and the 'back end' code.\nIn shiny apps, these elements are typically created in objects named ui and server within an R script named app.R, which lives in an 'app folder'.\nThis allows web-mapping applications to be represented in a single file, such as the CycleHireApp/app.R file in the book's GitHub repo.\nBefore considering large apps, it is worth seeing a minimal example, named 'lifeApp', in action.\nThe code below defines and launches — with the command shinyApp() — lifeApp, which provides an interactive slider allowing users to make countries appear with progressively lower levels of life expectancy (see Figure 9.23):\nFIGURE 9.23: Screenshot showing a minimal example of a web-mapping application created with shiny.\nThe user interface (ui) of lifeApp is created by fluidPage().\nThis contains input and output 'widgets' — in this case, a sliderInput() (many other *Input() functions are available) and a leafletOutput().\nThese are arranged row-wise by default, explaining why the slider interface is placed directly above the map in Figure 9.23 (see ?column for adding content column-wise).\nThe server side (server) is a function with input and output arguments.\noutput is a list of objects containing elements generated by render*() functions — renderLeaflet(), which in this example generates output$map.\nInput elements such as input$life referred to in the server must relate to elements that exist in the ui — defined by inputId = \"life\" in the code below.\nThe function shinyApp() combines both the ui and server elements and serves the results interactively via a new R process.\nWhen you move the slider in the map shown in Figure 9.23, you are actually causing R code to re-run, although this is hidden from view in the user interface.\nBuilding on this basic example and knowing where to find help (see ?shiny), the best way forward now may be to stop reading and start programming!\nThe recommended next step is to open the previously mentioned CycleHireApp/app.R script in an integrated development environment (IDE) of choice, modify it and re-run it repeatedly.\nThe example contains some of the components of a web mapping application implemented in shiny and should 'shine' a light on how they behave.\nThe CycleHireApp/app.R script contains shiny functions that go beyond those demonstrated in the simple 'lifeApp' example (Figure 9.24).\nThese include reactive() and observe() (for creating outputs that respond to the user interface — see ?reactive) and leafletProxy() (for modifying a leaflet object that has already been created).\nSuch elements are critical to the creation of web-mapping applications implemented in shiny.\nA range of 'events' can be programmed, including advanced functionality such as drawing new layers or subsetting data, as described in the shiny section of RStudio's leaflet website.\nExperimenting with apps such as CycleHireApp will build not only your knowledge of web-mapping applications in R, but also your practical skills.\nChanging the contents of setView(), for example, will change the starting bounding box that the user sees when the app is initiated.\nSuch experimentation should not be done at random, but with reference to relevant documentation, starting with ?shiny, and motivated by a desire to solve problems such as those posed in the exercises.\nshiny used in this way can make prototyping mapping applications faster and more accessible than ever (deploying shiny apps, see https://shiny.posit.co/deploy/, is a separate topic beyond the scope of this chapter).\nEven if your applications are eventually deployed using different technologies, shiny undoubtedly allows web-mapping applications to be developed in relatively few lines of code (86 in the case of CycleHireApp).\nThat does not stop shiny apps getting rather large.\nThe Propensity to Cycle Tool (PCT) hosted at pct.bike, for example, is a national mapping tool funded by the UK's Department for Transport.\nThe PCT is used by dozens of people each day and has multiple interactive elements based on more than 1000 lines of code (Lovelace et al. 2017).\nWhile such apps undoubtedly take time and effort to develop, shiny provides a framework for reproducible prototyping that should aid the development process.\nOne potential problem with the ease of developing prototypes with shiny is the temptation to start programming too early, before the purpose of the mapping application has been envisioned in detail.\nFor that reason, despite advocating shiny, we recommend starting with the longer established technology of pen and paper as the first stage for interactive mapping projects.\nThis way your prototype web applications should be limited not by technical considerations, but by your motivations and imagination.\nFIGURE 9.24: CycleHireApp, a simple web-mapping application for finding the closest cycle hiring station based on your location and requirement for cycles. 
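The reactive() / observe() / leafletProxy() pattern described above can be sketched minimally as follows. This is a hedged fragment, not the CycleHireApp implementation: the input name input$life mirrors the lifeApp example, and the map update logic is an assumption for illustration.

```r
library(shiny)
library(leaflet)
library(spData)  # provides the 'world' dataset used in lifeApp

server = function(input, output, session) {
  # render the base map once
  output$map = renderLeaflet({
    leaflet() |> setView(lng = 0, lat = 20, zoom = 2)
  })
  # observe() re-runs whenever input$life changes; leafletProxy()
  # updates the already-rendered map in place instead of rebuilding it
  observe({
    leafletProxy("map") |>
      clearShapes() |>
      addPolygons(data = world[world$lifeExp < input$life, ])
  })
}
```

Compared with re-running renderLeaflet() on every input change, the proxy approach keeps the user's current zoom and pan position, which is usually the desired behavior in mapping apps.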
Interactive version available online at: r.geocompx.org.\n","code":"\nlibrary(shiny) # for shiny apps\nlibrary(leaflet) # renderLeaflet function\nlibrary(spData) # loads the world dataset \nui = fluidPage(\n sliderInput(inputId = \"life\", \"Life expectancy\", 49, 84, value = 80),\n leafletOutput(outputId = \"map\")\n )\nserver = function(input, output) {\n output$map = renderLeaflet({\n leaflet() |> \n # addProviderTiles(\"OpenStreetMap.BlackAndWhite\") |>\n addPolygons(data = world[world$lifeExp < input$life, ])})\n}\nshinyApp(ui, server)"},{"path":"adv-map.html","id":"other-mapping-packages","chapter":"9 Making maps with R","heading":"9.6 Other mapping packages","text":"tmap provides a powerful interface for creating a wide range of static maps (Section 9.2) and also supports interactive maps (Section 9.4).\nBut there are many other options for creating maps in R.\nThe aim of this section is to provide a taste of some of these and pointers to additional resources: map-making is a surprisingly active area of R package development, so there is more to learn than can be covered here.\nThe most mature option is to use the plot() methods provided by the core spatial packages sf and terra, covered in Sections 2.2.3 and 2.3.3, respectively.\nWhat we have not mentioned in those sections was that plot methods for vector and raster objects can be combined so that the results draw onto the same plot area (elements such as keys in sf plots and multi-band rasters will interfere with this).\nThis behavior is illustrated in the subsequent code chunk which generates Figure 9.25.\nplot() has many other options which can be explored by following links in the ?plot help page and the fifth sf vignette sf5.\nFIGURE 9.25: Map of New Zealand created with plot(). The legend to the right refers to elevation (1000 m above sea level).\nThe tidyverse plotting package ggplot2 also supports sf objects with geom_sf().\nIts syntax is similar to that used by tmap:\nan initial ggplot() call is followed by one or more layers, that are added with + geom_*(), where * represents a layer type such as geom_sf() (for sf objects) or geom_points() (for points).\nggplot2 plots graticules by default.\nThe default settings for the graticules can be overridden using scale_x_continuous(), scale_y_continuous() or coord_sf(datum = NA).\nOther notable features include the use of unquoted variable names encapsulated in aes() to indicate which aesthetics vary, and switching data sources using the data argument, as demonstrated in this code chunk which creates Figure 9.26: Another benefit of maps based on ggplot2 is that they can easily be given a level of interactivity when printed using the function ggplotly() from the plotly package.\nTry plotly::ggplotly(g1), for example, and compare the result with other plotly mapping functions described at: blog.cpsievert..\nAn advantage of ggplot2 is that it has a strong user community and many add-on packages.\nThis includes ggspatial, which enhances ggplot2's mapping capabilities by providing options to add a north arrow (annotation_north_arrow()) and a scale bar (annotation_scale()), or to add background tiles (annotation_map_tile()).\nIt also accepts various spatial data classes with layer_spatial().\nThus, we are able to plot SpatRaster objects from terra using this function, as seen in Figure 9.26.\nFIGURE 9.26: Comparison of a map of New Zealand created with ggplot2 alone (left) and with ggplot2 and ggspatial (right).\nAt the same time, ggplot2 has a few drawbacks, for example the geom_sf() function is not always able to create a desired legend from the spatial data.\nGood additional ggplot2 resources can be found in the open source ggplot2 book (Wickham 2016) and in the descriptions of the multitude of 'ggpackages' such as ggrepel and tidygraph.\nWe have covered mapping with sf, terra and ggplot2 first because these packages are highly flexible, allowing for the creation of a wide range of static maps.\nBefore we cover mapping packages for plotting a specific type of map (next paragraph), it is worth considering alternatives to the packages already covered for general-purpose mapping (Table 9.1).\nTABLE 9.1: Selected general-purpose mapping packages.\nTable 9.1 shows a range of mapping packages that are available, and there are many others not listed in this table.\nOf note is 
mapsf, which can generate a range of geographic visualizations including choropleth, 'proportional symbol' and 'flow' maps.\nThese are documented in the mapsf vignette.\nSeveral packages focus on specific map types, as illustrated in Table 9.3.\nSuch packages create cartograms that distort geographical space, create line maps, transform polygons into regular or hexagonal grids, visualize complex data on grids representing geographic topologies, and create 3D visualizations.\nTABLE 9.3: Selected specific-purpose mapping packages, with associated metrics.\nAll of the aforementioned packages, however, have different approaches to data preparation and map creation.\nIn the next paragraph, we focus solely on the cartogram package (Jeworutzki 2023).\nTherefore, we suggest that you read the geogrid, geofacet, linemap, tanaka, and rayshader documentations to learn more about the others.\nA cartogram is a map in which the geometry is proportionately distorted to represent a mapping variable.\nCreation of this type of map is possible in R with cartogram, which allows for creating contiguous and non-contiguous area cartograms.\nIt is not a mapping package per se, but it allows for construction of distorted spatial objects that can be plotted using any generic mapping package.\nThe cartogram_cont() function creates contiguous area cartograms.\nIt accepts an sf object and the name of a variable (column) as inputs.\nAdditionally, it is possible to modify the itermax argument — the maximum number of iterations for the cartogram transformation.\nFor example, we can represent median income in New Zealand's regions as a contiguous cartogram (Figure 9.27, right panel) as follows:\nFIGURE 9.27: Comparison of a standard map (left) and a contiguous area cartogram (right).\ncartogram also offers creation of non-contiguous area cartograms using cartogram_ncont() and Dorling cartograms using cartogram_dorling().\nNon-contiguous area cartograms are created by scaling down each region based on the provided weighting variable.\nDorling cartograms consist of circles with their area proportional to the weighting variable.\nThe code chunk below demonstrates the creation of non-contiguous area and Dorling cartograms of US states' population (Figure 9.28):\nFIGURE 9.28: Comparison of a non-contiguous area cartogram (left) and a Dorling cartogram (right).\n","code":"\ng = st_graticule(nz, lon = c(170, 175), lat = c(-45, -40, 
-35))\nplot(nz_water, graticule = g, axes = TRUE, col = \"blue\")\nterra::plot(nz_elev / 1000, add = TRUE, axes = FALSE)\nplot(st_geometry(nz), add = TRUE)\nlibrary(ggplot2)\ng1 = ggplot() + geom_sf(data = nz, aes(fill = Median_income)) +\n geom_sf(data = nz_height) +\n scale_x_continuous(breaks = c(170, 175))\ng1\nlibrary(ggspatial)\nggplot() + \n layer_spatial(nz_elev) +\n geom_sf(data = nz, fill = NA) +\n annotation_scale() +\n scale_x_continuous(breaks = c(170, 175)) +\n scale_fill_continuous(na.value = NA)\nlibrary(cartogram)\nnz_carto = cartogram_cont(nz, \"Median_income\", itermax = 5)\ntm_shape(nz_carto) + tm_polygons(\"Median_income\")\nus_states9311 = st_transform(us_states, \"EPSG:9311\")\nus_states9311_ncont = cartogram_ncont(us_states9311, \"total_pop_15\")\nus_states9311_dorling = cartogram_dorling(us_states9311, \"total_pop_15\")"},{"path":"adv-map.html","id":"exercises-7","chapter":"9 Making maps with R","heading":"9.7 Exercises","text":"These exercises rely on a new object, africa.\nCreate it using the world and worldbank_df datasets from the spData package as follows: We will also use the zion and nlcd datasets from spDataLarge: E1. Create a map showing the geographic distribution of the Human Development Index (HDI) across Africa with base graphics (hint: use plot()) and tmap packages (hint: use tm_shape(africa) + ...). Name two advantages of each based on the experience. Name three other mapping packages and an advantage of each. Bonus: create three more maps of Africa using these three packages. E2. Extend the tmap created for the previous exercise so the legend has three bins: \"High\" (HDI above 0.7), \"Medium\" (HDI between 0.55 and 0.7) and \"Low\" (HDI below 0.55).\nBonus: improve the map aesthetics, for example by changing the legend title, class labels and color palette. E3. Represent africa's subregions on the map.\nChange the default color palette and legend title.\nNext, combine this map and the map created in the previous exercise into a single plot. E4. Create a land cover map of Zion National Park. Change the default colors to match your perception of the land cover categories. Add a scale bar and north arrow and change the position of both to improve the map's aesthetic appeal. Bonus: Add an inset map of Zion National Park's location in the context of the state of Utah. (Hint: an object representing Utah can be subset from the us_states dataset.) E5. Create facet maps of countries in Eastern Africa: with one facet showing HDI and the other representing population growth (hint: using variables HDI and pop_growth, respectively); with a 'small multiple' per country. E6. Building on the previous facet map examples, create animated maps of East Africa: showing each country in order; showing each country in order with a legend also showing the HDI. E7. Create an interactive map of HDI in Africa: with tmap; with mapview; with leaflet. Bonus: for each approach, add a legend (if not automatically provided) and a scale bar. E8. Sketch on paper ideas for a web-mapping application that could be used to make transport or land-use policies more evidence-based: in the city you live in, for a couple of users per day; in the country you live in, for dozens of users per day; worldwide for hundreds of users per day and with large data serving requirements. E9. Update the code in coffeeApp/app.R so that instead of centering on Brazil the user can select which country to focus on: using textInput(); using selectInput(). E10. Reproduce Figure 9.1 and Figure 9.7 as closely as possible using the ggplot2 package. E11. Join us_states and us_states_df together and calculate a poverty rate for each state using the new dataset.\nNext, construct a continuous area cartogram based on total population.\nFinally, create and compare two maps of the poverty rate: (1) a standard choropleth map and (2) a map using the created cartogram boundaries.\nWhat is the information provided by the first and the second map?\nHow do they differ from each other? E12. 
Visualize the population growth in Africa.\nNext, compare it with maps of a hexagonal and a regular grid created using the geogrid package.","code":"\nlibrary(spData)\nafrica = world |> \n filter(continent == \"Africa\", !is.na(iso_a2)) |> \n left_join(worldbank_df, by = \"iso_a2\") |> \n select(name, subregion, gdpPercap, HDI, pop_growth) |> \n st_transform(\"ESRI:102022\") |> \n st_make_valid() |> \n st_collection_extract(\"POLYGON\")\nzion = read_sf((system.file(\"vector/zion.gpkg\", package = \"spDataLarge\")))\nnlcd = rast(system.file(\"raster/nlcd.tif\", package = \"spDataLarge\"))"},{"path":"gis.html","id":"gis","chapter":"10 Bridges to GIS software","heading":"10 Bridges to GIS software","text":"","code":""},{"path":"gis.html","id":"prerequisites-8","chapter":"10 Bridges to GIS software","heading":"Prerequisites","text":"This chapter requires QGIS, SAGA and GRASS GIS to be installed and the following packages to be attached:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(qgisprocess)\nlibrary(Rsagacmd)\nlibrary(rgrass)\nlibrary(rstac)\nlibrary(gdalcubes)"},{"path":"gis.html","id":"introduction-6","chapter":"10 Bridges to GIS software","heading":"10.1 Introduction","text":"A defining feature of interpreted languages with an interactive console — technically a read-eval-print loop (REPL) — such as R is the way you interact with them:\nrather than relying on pointing and clicking on different parts of a screen, you type commands into the console and execute them with the Enter key.\nA common and effective workflow when using interactive development environments such as RStudio or VS Code is to type code into source files in a source editor and control interactive execution of the code with a shortcut such as Ctrl+Enter.\nCommand line interfaces (CLIs) are not unique to R: most early computing environments relied on a command line 'shell' and it was only after the invention and widespread adoption of the computer mouse in the 1990s that graphical user interfaces (GUIs) became common.\nGRASS GIS, the longest-standing continuously developed open source GIS software, for example, relied on its CLI before it gained a GUI (Landa 2008).\nMost popular GIS software projects are GUI-driven.\nYou can interact with QGIS, SAGA, GRASS GIS and gvSIG from system terminals and embedded CLIs, but their design encourages most people to interact with them by 'pointing and clicking'.\nAn unintended consequence of this is that most GIS users miss out on the advantages of CLI-driven and scriptable approaches.\nAccording to the creator of the popular QGIS software (Sherman 2008): With the advent of 'modern' GIS software, most people want to point and click their way through life. That's good, but there is a tremendous amount of flexibility and power waiting for you with the command line. Many times you can do something on the command line in a fraction of the time you can do it with a GUI.\nThe 'CLI vs GUI' debate does not have to be adversarial: both ways of working have advantages, depending on a range of factors including the task (with drawing new features being well suited to GUIs), the level of reproducibility desired, and the user's skillset.\nGRASS GIS is a good example of GIS software that is primarily based on a CLI but which also has a prominent GUI.\nLikewise, while R is focused on its CLI, IDEs such as RStudio provide a GUI for improving accessibility.\nSoftware cannot be neatly categorized into CLI or GUI-based.\nHowever, interactive command-line interfaces have several important advantages in terms of: Automating repetitive tasks; Enabling transparency and reproducibility; Encouraging software development by providing tools to modify existing functions and implement new ones; Developing future-proof and efficient programming skills which are in high demand; Improving touch typing, a key skill in the digital age.\nOn the other hand, good GUIs also have advantages, including: 'Shallow' learning curves, meaning geographic data can be explored and visualized without hours of learning a new language; Support for 'digitizing' (creating new vector datasets), including trace, snap and topological tools; Enabling georeferencing (matching raster images to existing maps) with ground control points and orthorectification; Support for stereoscopic mapping (e.g., LiDAR and structure from motion).\nAnother advantage of dedicated GIS software projects is that they provide access to hundreds of 'geoalgorithms' via 'GIS bridges' (Neteler and Mitasova 2008).\nSuch bridges to these computational recipes for enhancing R's capabilities for solving geographic data problems are the topic of this chapter.\nR is a natural choice for people wanting to build bridges between reproducible data analysis workflows and GIS because it originated as an interface language.\nA key feature of R (and its predecessor S) is that it provides access to statistical algorithms in other languages (particularly FORTRAN and C), from a powerful high-level functional language with an intuitive REPL environment, which C and FORTRAN lacked (Chambers 2016).\nR continues this tradition with interfaces to numerous languages, notably C++.\nAlthough R was not designed as a command line GIS, its ability to interface with dedicated GISs gives it astonishing geospatial capabilities.\nWith GIS bridges, R can replicate more diverse workflows, with the additional reproducibility, scalability and productivity benefits of controlling them from a programming environment and a consistent CLI.\nFurthermore, R outperforms GISs in some areas of geocomputation, including interactive/animated map-making (see Chapter 9) and spatial statistical modeling (see Chapter 12).\nThis chapter focuses on 'bridges' to three mature open source GIS products, summarized in Table 10.1: QGIS, via the package qgisprocess (Dunnington et al. 2024; Section 10.2); SAGA, via Rsagacmd (Pawley 2023; Section 10.3); GRASS GIS, via rgrass (Bivand 2023; Section 10.4).\nThere have also been major developments enabling open source GIS software to write and execute R scripts inside QGIS (see docs.qgis.org) and GRASS GIS (see grasswiki.osgeo.org).\nTABLE 10.1: Comparison of three open-source GIS. 
Hybrid refers to support for both vector and raster operations.\nIn addition to the three R-GIS bridges mentioned above, this chapter also provides a brief introduction to R interfaces to spatial libraries (Section 10.6), spatial databases (Section 10.7), and cloud-based processing of Earth observation data (Section 10.8).","code":""},{"path":"gis.html","id":"rqgis","chapter":"10 Bridges to GIS software","heading":"10.2 qgisprocess: a bridge to QGIS and beyond","text":"QGIS is the most popular open-source GIS (Table 10.1; Graser and Olaya (2015)).\nQGIS provides a unified interface to QGIS's native geoalgorithms, GDAL, and — when they are installed — other providers such as GRASS GIS and SAGA (Graser and Olaya 2015).\nSince version 3.14 (released in summer 2020), QGIS ships with the qgis_process command-line utility for accessing its bounty of functionality for geocomputation.\nqgis_process provides access to 300+ geoalgorithms in the standard QGIS installation and 1,000+ via plugins to external providers such as GRASS GIS and SAGA.\nThe qgisprocess package provides access to qgis_process from R.\nThe package requires QGIS — and the relevant plugins GRASS GIS and SAGA, used in this chapter — to be installed and available to the system.\nFor installation instructions, see qgisprocess's documentation.\nA quick way to get up-and-running with qgisprocess if you have Docker installed is via the qgis image developed as part of this project.\nAssuming Docker is installed and you have sufficient computational resources, you can run an R session with qgisprocess and relevant plugins with the following command (see the geocompx/docker repository for details):docker run -e DISABLE_AUTH=true -p 8786:8787 ghcr.io/geocompx/docker:qgisThis package automatically tries to detect a QGIS installation and complains if it cannot find it.\nThere are a few possible solutions when the configuration fails: you can set options(qgisprocess.path = \"path/to/your_qgis_process\"), or set up the R_QGISPROCESS_PATH environment variable.\nThese approaches can also be used when you have more than one QGIS installation and want to decide which one to use.\nFor more details, please refer to the qgisprocess 'getting started' vignette.\nNext, we can find which plugins (meaning different software) are available on our computer: This tells us that the GRASS GIS (grassprovider) and SAGA (processing_saga_nextgen) plugins are available on the system but are not yet enabled.\nSince we need both later in the chapter, let's enable them.\nPlease note that aside from installing SAGA on your system, you also need to install the QGIS Python plugin Processing Saga NextGen.\nThis can be done within QGIS with the Plugin Manager or programmatically with the help of the Python package qgis-plugin-manager (at least on Linux).\nqgis_providers() lists the name of each software and the corresponding count of available geoalgorithms.\nThe output table affirms that we can use QGIS geoalgorithms (native, qgis, 3d, pdal) and external ones from the third-party providers GDAL, SAGA and GRASS GIS through the QGIS interface.\nNow, we are ready for geocomputation with QGIS and friends, from within R!\nLet's try two example case studies.\nThe first one shows how to unite two polygonal datasets with different borders (Section 10.2.1).\nThe second one focuses on deriving new information from a digital elevation model represented as a raster (Section 10.2.2).","code":"\nlibrary(qgisprocess)\n#> Attempting to load the cache ... Success!\n#> QGIS version: 3.30.3-'s-Hertogenbosch\n#> ...\nqgis_plugins()\n#> # A tibble: 4 × 2\n#> name enabled\n#> \n#> 1 grassprovider FALSE\n#> 2 otbprovider FALSE\n#> 3 processing TRUE\n#> 4 processing_saga_nextgen FALSE\nqgis_enable_plugins(c(\"grassprovider\", \"processing_saga_nextgen\"), \n quiet = TRUE)\nqgis_providers()\n#> # A tibble: 7 × 3\n#> provider provider_title algorithm_count\n#> \n#> 1 gdal GDAL 56\n#> 2 grass GRASS 306\n#> 3 qgis QGIS 50\n#> 4 3d QGIS (3D) 1\n#> 5 native QGIS (native c++) 243\n#> 6 pdal QGIS (PDAL) 17\n#> 7 sagang SAGA Next Gen 509"},{"path":"gis.html","id":"qgis-vector","chapter":"10 Bridges to GIS software","heading":"10.2.1 Vector data","text":"Consider a situation when you have two polygon objects with different spatial units (e.g., regions, administrative units).\nOur goal is to merge these two objects into one, containing all of the boundary lines and related attributes.\nWe use again the incongruent polygons we have already encountered in Section 4.2.8 (Figure 10.1).\nBoth polygon datasets are available in the spData package, and for both we would like to use a geographic CRS (see also Chapter 7).\nFIGURE 10.1: Two areal units: incongruent (black lines) and aggregating zones (red borders).\nThe first step is to find an algorithm that can merge two vector objects.\nTo list all of the available algorithms, 
we can use the qgis_algorithms() function.\nThis function returns a data frame containing all of the available providers and the algorithms they contain.\nTo find an algorithm, we can use the qgis_search_algorithms() function.\nAssuming that the short description of the function contains the word \"union\", we can run the following code to find the algorithm of interest: One of the algorithms on the above list, \"native:union\", sounds promising.\nThe next step is to find out what that algorithm does and how we can use it.\nThis is the role of qgis_show_help(), which returns a short summary of what the algorithm does, along with its arguments and outputs.\nThis makes its output rather long.\nThe following command returns a data frame with each row representing an argument required by \"native:union\", and columns with the name, description, type, default value, available values, and acceptable values associated with each: The arguments, contained in union_arguments$name, are INPUT, OVERLAY, OVERLAY_FIELDS_PREFIX, and OUTPUT.\nunion_arguments$acceptable_values contains a list with the possible input values for each argument.\nMany functions require inputs representing paths to a vector layer; qgisprocess functions accept sf objects for such arguments.\nObjects from the terra and stars packages can be used where a \"path to a raster layer\" is expected.\nWhile this can be convenient, we recommend providing the path to your spatial data on disk when you read it from disk anyway before submitting it to a qgisprocess algorithm: the first thing qgisprocess does when executing a geoalgorithm is to export the spatial data living in your R session back to disk in a format known to QGIS such as .gpkg or .tif files.\nThis can increase algorithm runtimes.\nThe main function of qgisprocess is qgis_run_algorithm(), which sends inputs to QGIS and returns the outputs.\nIt accepts the algorithm name and a set of named arguments shown in the help list, and performs the expected calculations.\nIn our case, three arguments seem important: INPUT, OVERLAY, and OUTPUT.\nThe first one, INPUT, is our main vector object incongr_wgs, while the second one, OVERLAY, is aggzone_wgs.\nThe last argument, OUTPUT, is an output file name, which qgisprocess will automatically choose and create in tempdir() if none is provided.\nRunning the line of code below will save our two input objects to temporary .gpkg files, run the selected algorithm on them, and return a temporary .gpkg file as the output.\nThe qgisprocess package stores the qgis_run_algorithm() result as a list containing, in this case, the path to the output file.\nWe can either read this file back into R with read_sf() (e.g., union_sf = read_sf(union[[1]])) or directly with st_as_sf(): Note that the QGIS union operation merges the two input layers into one layer using the intersection and the symmetrical difference of the two input layers (which, by the way, is also the default behavior of a union operation in GRASS GIS and SAGA).\nThis is not the same as st_union(incongr_wgs, aggzone_wgs) (see Exercises)!\nThe result, union_sf, is a multipolygon with a larger number of features than the two input objects.\nNotice, however, that many of these polygons are small and do not represent real areas but are rather a result of our two datasets having a different level of detail.\nThese artifacts of error are called sliver polygons (see the red-colored polygons in the left panel of Figure 10.2).\nOne way to identify slivers is to find polygons with comparatively small areas, here, e.g., below 25000 m2, and next to remove them.\nLet's search for an appropriate algorithm.\nThis time we found an algorithm, v.clean, that is included not in QGIS, but in GRASS GIS.\nGRASS GIS's v.clean is a powerful tool for cleaning the topology of spatial vector data.\nImportantly, we can use it through qgisprocess.\nSimilarly to the previous step, we should start by reading the algorithm's help.\nWe have omitted the output here, because the help text is quite long and contains a lot of arguments.\nThis is because v.clean is a multi-tool — it can clean different types of geometries and solve different types of topological problems.\nFor this example, let's focus on just a few of the arguments; however, we encourage you to visit the algorithm's documentation to learn more about v.clean's capabilities.\nThe main argument for this algorithm is input — our vector object.\nNext, we need to select a tool — a cleaning method.\nAbout a dozen tools exist in v.clean, allowing us to remove duplicate geometries, remove small angles between lines, or remove small areas, among others.\nIn this case, we are interested in the latter tool, rmarea.\nSeveral of the tools, rmarea included, expect an additional argument threshold, whose behavior depends on the selected tool.\nIn our case, the rmarea tool removes all areas smaller than or equal to the provided threshold.\nNote that the threshold must be specified in square meters regardless of the coordinate reference system of the input layer.\nLet's run this algorithm and convert its output into a new sf object clean_sf.\nThe result, shown in the right panel of Figure 10.2, looks as expected — the sliver polygons are now removed.\nFIGURE 10.2: Sliver polygons colored in red (left panel). 
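The sliver-identification idea described above (small polygons below the 25,000 m2 threshold) can also be sanity-checked from R with sf alone before running v.clean. A hedged sketch, assuming the union_sf object created by the earlier qgis_run_algorithm() step:

```r
library(sf)

# compute polygon areas; st_area() returns values in m^2
areas = st_area(union_sf)

# flag candidate slivers at or below the threshold passed to rmarea
slivers = union_sf[as.numeric(areas) <= 25000, ]
nrow(slivers)  # number of candidate sliver polygons
```

Inspecting these polygons (e.g., with plot() or mapview()) is a quick way to confirm that the chosen threshold removes only artifacts, not genuine small areas.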
Cleaned polygons (right panel).\n","code":"\ndata(\"incongruent\", \"aggregating_zones\", package = \"spData\")\nincongr_wgs = st_transform(incongruent, \"EPSG:4326\")\naggzone_wgs = st_transform(aggregating_zones, \"EPSG:4326\")\n# output not shown\nqgis_algorithms()\nqgis_search_algorithms(\"union\")\n#> # A tibble: 2 × 5\n#> provider provider_title group algorithm algorithm_title \n#> \n#> 1 native QGIS (native c++) Vector overlay native:multiunion Union (multiple)\n#> 2 native QGIS (native c++) Vector overlay native:union Union \nalg = \"native:union\"\nunion_arguments = qgis_get_argument_specs(alg)\nunion_arguments\n#> # A tibble: 5 × 6\n#> name description qgis_type default_value available_values acceptable_...\n#> \n#> 1 INPUT Input layer source \n#> 2 OVERLAY Overlay la… source \n#> 3 OVERLA… Overlay fi… string \n#> 4 OUTPUT Union sink \n#> 5 GRID_S… Grid size number \n\n#> [[1]]\n#> [1] \"A numeric value\" \n#> [2] \"field:FIELD_NAME to use a data defined value taken from the FIELD_NAME\n#> field\" \n#> [3] \"expression:SOME EXPRESSION to use a data defined value calculated using\n#> a custom QGIS expression\"\nunion = qgis_run_algorithm(alg,\n INPUT = incongr_wgs, OVERLAY = aggzone_wgs\n)\nunion\n#> $ OUTPUT: 'qgis_outputVector' chr \"/tmp/...gpkg\"\nunion_sf = st_as_sf(union)\nqgis_search_algorithms(\"clean\")\n#> # A tibble: 1 × 5\n#> provider provider_title group algorithm algorithm_title\n#> \n#> 1 grass GRASS Vector (v.*) grass:v.clean v.clean\nqgis_show_help(\"grass:v.clean\")\nqgis_get_argument_specs(\"grass:v.clean\") |>\n select(name, description) |>\n slice_head(n = 4)\n#> # A tibble: 4 × 2\n#> name description\n#> \n#> 1 input Layer to clean\n#> 2 type Input feature type\n#> 3 tool Cleaning tool\n#> 4 threshold Threshold (comma separated for each tool)\nclean = qgis_run_algorithm(\"grass:v.clean\",\n input = union_sf, \n tool = \"rmarea\", threshold = 25000\n)\nclean_sf = st_as_sf(clean)"},{"path":"gis.html","id":"qgis-raster","chapter":"10 
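The sliver check described above boils down to "compute each polygon's planar area and flag anything at or below the threshold". A minimal, language-agnostic sketch of that logic (in pure Python for self-containment; the chapter itself delegates this to GRASS GIS's `rmarea` tool, and the coordinates below are made up) using the shoelace formula:

```python
# Sketch of sliver detection: compute planar polygon areas with the
# shoelace formula and flag those at or below a threshold.
# Coordinates are assumed to be in meters (projected CRS), matching
# v.clean's requirement that the threshold is given in square meters.

def shoelace_area(ring):
    """Area of a simple polygon given as [(x, y), ...] vertices."""
    n = len(ring)
    s = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def find_slivers(polygons, threshold=25_000):
    """Return indices of polygons with area <= threshold (m^2)."""
    return [i for i, ring in enumerate(polygons)
            if shoelace_area(ring) <= threshold]

polys = [
    [(0, 0), (1000, 0), (1000, 1000), (0, 1000)],  # 1,000,000 m^2
    [(0, 0), (100, 0), (100, 100), (0, 100)],      # 10,000 m^2 -> sliver
]
print(find_slivers(polys))  # → [1]
```

Real sliver removal also has to repair the shared boundaries of neighboring polygons, which is why a topology-aware tool such as `v.clean` is used rather than a plain area filter.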
### 10.2.2 Raster data

Digital elevation models (DEMs) contain elevation information for each raster cell. They are used for many purposes, including satellite navigation, water flow models, surface analysis, and visualization. Here, we are interested in deriving new information from a DEM raster that could be used as predictors in statistical learning. Various terrain parameters can be helpful, for example, for the prediction of landslides (see Chapter 12).

For this section, we will use `dem.tif`, a digital elevation model of the Mongón study area (downloaded from the Land Process Distributed Active Archive Center; see also `?dem.tif`). It has a resolution of about 30 x 30 meters and uses a projected CRS.

The **terra** package's `terrain()` command already allows the calculation of several fundamental topographic characteristics, such as slope, aspect, TPI (Topographic Position Index), TRI (Topographic Ruggedness Index), roughness, and flow directions. However, GIS programs offer many more terrain characteristics, some of which can be more suitable in certain contexts. For example, the topographic wetness index (TWI) has been found useful in studying hydrological and biological processes (Sørensen, Zinko, and Seibert 2006). Let's search the algorithm list for this index using "wetness" as the keyword.

The output of the code below suggests that the desired algorithm exists in the SAGA software. Though SAGA is a hybrid GIS, its main focus has been on raster processing, and here particularly on digital elevation models (soil properties, terrain attributes, climate parameters). Hence, SAGA is especially good at the fast processing of large (high-resolution) raster datasets (Conrad et al. 2015).

The `"sagang:sagawetnessindex"` algorithm is actually a modified TWI that results in a more realistic soil moisture potential for cells located in valley floors (Böhner and Selige 2006). Here, we stick with the default values for all arguments, so we specify only one argument, the input `DEM`. Of course, before applying an algorithm you should make sure that the parameter values are in correspondence with your study aim.

Before running the SAGA algorithm from within QGIS, we also change the default raster output format from .tif to SAGA's native raster format, .sdat. Hence, all output rasters that we do not specify ourselves will from now on be written in the .sdat format. Depending on the software versions (SAGA, GDAL) you are using, this might not be necessary, but it often saves you the trouble of trying to read in output rasters created by SAGA.

`"sagang:sagawetnessindex"` returns not one but four rasters: catchment area, catchment slope, modified catchment area, and topographic wetness index. We can read a selected output by providing its name to the `qgis_as_terra()` function. Since we are now done with the SAGA processing within QGIS, we change the raster output format back to .tif.

You can see the TWI map in the left panel of Figure 10.3. The topographic wetness index is unitless: its low values represent areas that will not accumulate water, while higher values show areas that will accumulate water at increasing levels.

Information from digital elevation models can also be categorized, for example, into geomorphons: geomorphological phenotypes consisting of ten classes that represent terrain forms, such as slopes, ridges, and valleys (Jasiewicz and Stepinski 2013). These phenotypes are used in many studies, including landslide susceptibility, ecosystem services, human mobility, and digital soil mapping. The original implementation of the geomorphons algorithm was created in GRASS GIS, and we can find it in the **qgisprocess** list as `"grass:r.geomorphon"`.

Calculation of geomorphons requires an input DEM (`elevation`) and can be customized with a set of optional arguments. These include `search`, the length at which the line-of-sight is calculated, and `-m`, a flag specifying that the search value will be provided in meters (and not as a number of cells). More information about the additional arguments can be found in the original paper and the GRASS GIS documentation.

The output, `dem_geomorph$forms`, contains a raster file with ten categories, each representing a terrain form. We can read it into R with `qgis_as_terra()`, and then visualize it (Figure 10.3, right panel) or use it in subsequent calculations. Interestingly, there are connections between the geomorphons and the TWI values, as shown in Figure 10.3. The largest TWI values mostly occur in valleys and hollows, while the lowest values are seen, as expected, on ridges.

FIGURE 10.3: Topographic wetness index (TWI, left panel) and geomorphons (right panel) derived for the Mongón study area.

```r
library(qgisprocess)
library(terra)
dem = system.file("raster/dem.tif", package = "spDataLarge")
qgis_search_algorithms("wetness") |>
  dplyr::select(provider_title, algorithm) |>
  head(2)
#> # A tibble: 2 × 2
#>   provider_title algorithm
#> 1 SAGA Next Gen  sagang:sagawetnessindex
#> 2 SAGA Next Gen  sagang:topographicwetnessindexonestep
qgis_show_help("sagang:sagawetnessindex")
options(qgisprocess.tmp_raster_ext = ".sdat")
dem_wetness = qgis_run_algorithm("sagang:sagawetnessindex",
  DEM = dem
)
dem_wetness_twi = qgis_as_terra(dem_wetness$TWI)
# plot(dem_wetness_twi)
options(qgisprocess.tmp_raster_ext = ".tif")
qgis_search_algorithms("geomorphon")
#> [1] "grass:r.geomorphon" "sagang:geomorphons"
qgis_show_help("grass:r.geomorphon")
# output not shown
dem_geomorph = qgis_run_algorithm("grass:r.geomorphon",
  elevation = dem,
  `-m` = TRUE, search = 120
)
dem_geomorph_terra = qgis_as_terra(dem_geomorph$forms)
```

## 10.3 SAGA

The System for Automated Geoscientific Analyses (SAGA; Table 10.1) provides the possibility to execute SAGA modules via a command-line interface (`saga_cmd.exe` on Windows and just `saga_cmd` on Linux) (see the SAGA wiki on modules). In addition, there is a Python interface (the SAGA Python API). **Rsagacmd** uses the former to run SAGA from within R.

We will use **Rsagacmd** in this section to delineate areas with similar values of the normalized difference vegetation index (NDVI) of the Mongón study area in Peru from September 2000 (Figure 10.4, left panel), using a seeded region growing algorithm from SAGA.

To start using **Rsagacmd**, we need to run the `saga_gis()` function. It serves two main purposes:
- It dynamically creates a new object that contains links to all valid SAGA libraries and tools
- It sets general package options, such as `raster_backend` (the R package to use for handling raster data), `vector_backend` (the R package to use for handling vector data), and `cores` (the maximum number of CPU cores used for processing; default: all)

The `saga` object contains connections to all of the available SAGA tools. It is organized as a list of libraries (groups of tools), and inside each library there is a list of tools. We can access any tool with the `$` sign (remember to use the TAB key for autocompletion).

The seeded region growing algorithm works in two main steps (Adams and Bischof 1994; Böhner, Selige, and Ringeler 2006). First, initial cells ("seeds") are generated by finding the cells with the smallest variance in local windows of a specified size. Second, a region growing algorithm is used to merge neighboring pixels of the seeds to create homogeneous areas.

In our example, we first point to the `seed_generation` tool in the `imagery_segmentation` library. We also assign it to the `sg` object, so that we do not have to retype the whole tool code in the next steps. If we just type `sg`, we get a quick summary of the tool and a data frame with its parameters, descriptions, and defaults. You may also use `tidy(sg)` to extract just the parameters' table. The `seed_generation` tool takes a raster dataset as its first argument (`features`); optional arguments include `band_width`, which specifies the size of the initial polygons.

The output is a list of three objects: `variance`, a raster map of local variance; `seed_grid`, a raster map with the generated seeds; and `seed_points`, a spatial vector object with the generated seeds.

The second SAGA tool we use is `seeded_region_growing`. It requires two inputs: the `seed_grid` calculated in the previous step and the `ndvi` raster object. Additionally, we can specify several parameters, such as `normalize` to standardize the input features, `neighbour` (4- or 8-neighborhood), and `method`. The last parameter can be set to either 0 or 1 (region growing is based on the raster cells' values and their positions, or just on the values). For a more detailed description of the method, see Böhner, Selige, and Ringeler (2006). Here, we only change `method` to 1, meaning that our output regions will be created based on the similarity of their NDVI values only.

The tool returns a list of three objects: `segments`, `similarity`, and `table`. The `similarity` object is a raster showing the similarity between the seeds and the other cells, and `table` is a data frame storing information about the input seeds. Finally, `ndvi_srg$segments` is a raster with our resulting areas (Figure 10.4, right panel). We can convert it into polygons with `as.polygons()` and `st_as_sf()` (Section 6.5).

FIGURE 10.4: Normalized difference vegetation index (NDVI, left panel) and NDVI-based segments derived using the seeded region growing algorithm for the Mongón study area.

The resulting polygons (segments) represent areas with similar values. They can also be aggregated into larger polygons using various techniques, such as clustering (e.g., k-means), regionalization (e.g., SKATER), or supervised classification methods. You can try this in the Exercises.

R also has other tools to achieve the goal of creating polygons with similar values (so-called segments). These include the **SegOptim** package (Gonçalves et al. 2019), which allows running several image segmentation algorithms, and **supercells** (Nowosad and Stepinski 2022), which implements the superpixels algorithm SLIC to work with geospatial data.

```r
ndvi = rast(system.file("raster/ndvi.tif", package = "spDataLarge"))
library(Rsagacmd)
saga = saga_gis(raster_backend = "terra", vector_backend = "sf")
sg = saga$imagery_segmentation$seed_generation
ndvi_seeds = sg(ndvi, band_width = 2)
# plot(ndvi_seeds$seed_grid)
srg = saga$imagery_segmentation$seeded_region_growing
ndvi_srg = srg(ndvi_seeds$seed_grid, ndvi, method = 1)
plot(ndvi_srg$segments)
ndvi_segments = ndvi_srg$segments |>
  as.polygons() |>
  st_as_sf()
```
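NDVI, the quantity being segmented here, is defined as (NIR − Red) / (NIR + Red), which for Sentinel-2 band names is `(B08 - B04) / (B08 + B04)`. A quick numeric sanity check of the formula (shown in Python for self-containment; the reflectance values below are illustrative, not taken from the chapter's data):

```python
# NDVI = (NIR - Red) / (NIR + Red); values fall in [-1, 1].
# The reflectance values used here are made up for illustration.

def ndvi(nir, red):
    return (nir - red) / (nir + red)

# Healthy vegetation reflects strongly in NIR relative to red light,
# so it yields NDVI values close to 1; sparse cover sits nearer 0.
print(round(ndvi(nir=0.45, red=0.05), 2))  # → 0.8
print(round(ndvi(nir=0.20, red=0.15), 2))  # → 0.14
```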
## 10.4 GRASS GIS

The U.S. Army Construction Engineering Research Laboratory (USA-CERL) created the core of the Geographical Resources Analysis Support System (GRASS GIS) (Table 10.1; Neteler and Mitasova 2008) from 1982 to 1995. Academia has continued this work since 1997. Similar to SAGA, GRASS GIS focused on raster processing in the beginning, only later, since GRASS GIS 6.0, adding advanced vector functionality (Bivand, Pebesma, and Gómez-Rubio 2013).

GRASS GIS stores the input data in an internal database. With regard to vector data, GRASS GIS is by default a topological GIS, i.e., it stores the geometry of adjacent features only once. SQLite is the default database driver for vector attribute management, and attributes are linked to the geometry, i.e., to the GRASS GIS database, via keys (GRASS GIS vector management).

Before one can use GRASS GIS, one has to set up the GRASS GIS database (also from within R), and users might find this process a bit intimidating in the beginning. First of all, the GRASS GIS database requires its own directory, which, in turn, contains a location (see the GRASS GIS Database help pages at grass.osgeo.org for more information). The location stores the geodata for one project or one area. Within one location, several mapsets can exist, which typically refer to different users or different tasks. Each location also has a PERMANENT mapset, a mandatory mapset that is created automatically. In order to share geographic data with all users of a project, the database owner can add spatial data to the PERMANENT mapset. In addition, the PERMANENT mapset stores the projection, the spatial extent, and the default resolution for raster data. So, to sum up: the GRASS GIS database may contain many locations (all data in one location have the same CRS), and each location can store many mapsets (groups of datasets). Please refer to Neteler and Mitasova (2008) and the GRASS GIS quick start for more information on the GRASS GIS spatial database system.

To quickly use GRASS GIS from within R, we will use the **link2GI** package; however, one can also set up the GRASS GIS database step by step. See GRASS within R for how to do so. Please note that the code instructions in the following paragraphs might be hard to follow when using GRASS GIS for the first time, but by running the code line by line and examining the intermediate results, the reasoning behind it should become even clearer.

Here, we introduce **rgrass** with one of the most interesting problems in GIScience: the traveling salesman problem. Suppose a traveling salesman would like to visit 24 customers. Additionally, the salesman would like to start and finish the journey at home, which makes a total of 25 locations, while covering the shortest distance possible. There is a single best solution to this problem; however, checking all of the possible solutions is (mostly) impossible for modern computers (Longley 2015). In our case, the number of possible solutions corresponds to (25 - 1)! / 2, i.e., the factorial of 24 divided by 2 (since we do not differentiate between the forward and backward direction). Even if one iteration could be done in a nanosecond, this would still correspond to 9,837,145 years. Luckily, there are clever, almost optimal solutions which run in a tiny fraction of this inconceivable amount of time. GRASS GIS provides one of these solutions (for more details, see v.net.salesman). In our use case, we would like to find the shortest path between the first 25 bicycle stations (instead of customers) on London's streets (and we simply assume that the first bike station corresponds to the home of our traveling salesman).

Aside from the cycle hire points data, we need the street network for this area. We can download it from OpenStreetMap with the help of the **osmdata** package (see also Section 8.5). To do this, we constrain the query of the street network (which in OSM language is called "highway") to the bounding box of the points, and attach the corresponding data as an sf-object. `osmdata_sf()` returns a list with several spatial objects (points, lines, polygons, etc.); here, we only keep the line objects and their related ids.

Now that we have the data, we can go on and initiate a GRASS GIS session. Luckily, `linkGRASS()` of the **link2GI** package lets one set up the GRASS GIS environment with just one line of code. The only thing you need to provide is a spatial object which determines the projection and the extent of the spatial database. First, `linkGRASS()` finds all GRASS GIS installations on your computer. Since we have set `ver_select` to `TRUE`, we can interactively choose one of the found GRASS GIS installations. If there is just one installation, `linkGRASS()` automatically chooses it. Second, `linkGRASS()` establishes a connection to GRASS GIS.

Before we can use GRASS GIS geoalgorithms, we also need to add data to GRASS GIS's spatial database. Luckily, the convenience function `write_VECT()` does this for us. (Use `write_RAST()` for raster data.) In our case, we add the street and cycle hire point data, while using only the first attribute column, and name them `london_streets` and `points` in GRASS GIS. The **rgrass** package expects its inputs and gives its outputs as **terra** objects. Therefore, we need to convert our **sf** spatial vectors to **terra**'s `SpatVector`s using the `vect()` function to be able to use `write_VECT()`.

Now, both datasets exist in the GRASS GIS database. To perform our network analysis, we need a topologically clean street network. GRASS GIS's `"v.clean"` takes care of the removal of duplicates, small angles, and dangles, among others. Here, we break the lines at each intersection to ensure that the subsequent routing algorithm can actually turn right or left at intersections, and save the output in a GRASS GIS object named `streets_clean`.

It is likely that a few of our cycling station points will not lie exactly on a street segment. However, to find the shortest route between them, we need to connect them to the nearest street segment. `"v.net"`'s connect-operator does exactly this. We save its output in `streets_points_con`.

The resulting clean dataset serves as input for the `"v.net.salesman"` algorithm, which finally finds the shortest route between all cycle hire stations. One of its arguments is `center_cats`, which requires a numeric range as input. This range represents the points for which the shortest route should be calculated. Since we would like to calculate the route for all cycle stations, we set it to `1-25`. To access the GRASS GIS help page of the traveling salesman algorithm, run `execGRASS("g.manual", entry = "v.net.salesman")`.

To see our result, we read it into R, convert it into an sf-object keeping only the geometry, and visualize it with the help of the **mapview** package (Figure 10.5 and Section 9.4).

FIGURE 10.5: Shortest route (blue line) between 24 cycle hire stations (blue dots) on the OSM street network of London.

There are a few important considerations to note in the process:

- We used GRASS GIS's spatial database, which allows faster processing. This means that we only exported geographic data at the beginning; then we created new objects within the database but only imported the final result back into R. To find out which datasets are currently available, run `execGRASS("g.list", type = "vector,raster", flags = "p")`.
- We could have also accessed an already existing GRASS GIS spatial database from within R. Prior to importing data into R, you might want to perform some (spatial) subsetting. Use `"v.select"` and `"v.extract"` for vector data. `"db.select"` lets you select subsets of the attribute table of a vector layer without returning the corresponding geometry.
- You can also start R from within a running GRASS GIS session (for more information, please refer to Bivand, Pebesma, and Gómez-Rubio 2013).
- Refer to the excellent GRASS GIS online help or `execGRASS("g.manual", flags = "i")` for more information on each available GRASS GIS geoalgorithm.

```r
data("cycle_hire", package = "spData")
points = cycle_hire[1:25, ]
library(osmdata)
b_box = st_bbox(points)
london_streets = opq(b_box) |>
  add_osm_feature(key = "highway") |>
  osmdata_sf()
london_streets = london_streets[["osm_lines"]]
london_streets = select(london_streets, osm_id)
library(rgrass)
link2GI::linkGRASS(london_streets, ver_select = TRUE)
write_VECT(terra::vect(london_streets), vname = "london_streets")
write_VECT(terra::vect(points[, 1]), vname = "points")
execGRASS(
  cmd = "v.clean", input = "london_streets", output = "streets_clean",
  tool = "break", flags = "overwrite"
)
execGRASS(
  cmd = "v.net", input = "streets_clean", output = "streets_points_con",
  points = "points", operation = "connect", threshold = 0.001,
  flags = c("overwrite", "c")
)
execGRASS(
  cmd = "v.net.salesman", input = "streets_points_con",
  output = "shortest_route", center_cats = paste0("1-", nrow(points)),
  flags = "overwrite"
)
route = read_VECT("shortest_route") |>
  st_as_sf() |>
  st_geometry()
mapview::mapview(route) + points
```
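The brute-force estimate used above, (25 − 1)!/2 tours at one tour evaluated per nanosecond, can be verified with a few lines of arithmetic (shown in Python for self-containment; 365-day years are assumed):

```python
import math

# Number of distinct tours through 25 fixed locations: the start is fixed,
# the remaining 24 stops can be ordered in 24! ways, and each tour equals
# its reversal, hence the division by 2.
tours = math.factorial(25 - 1) // 2

# At one tour per nanosecond, total runtime in (365-day) years.
seconds = tours / 1e9
years = seconds / (365 * 24 * 3600)
print(f"{tours} tours, about {years:,.0f} years")  # → roughly 9,837,145 years
```

This confirms the figure quoted in the text and motivates why `v.net.salesman` uses a heuristic rather than exhaustive search.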
## 10.5 When to use what?

Recommending a single R-GIS interface is hard, since the choice depends on personal preferences, the tasks at hand, and your familiarity with different GIS software packages, which in turn probably depends on your domain. As mentioned previously, SAGA is especially good at the fast processing of large (high-resolution) raster datasets and is frequently used by hydrologists, climatologists and soil scientists (Conrad et al. 2015). GRASS GIS, on the other hand, is the only GIS presented here supporting a topologically based spatial database, which is especially useful for network analyses but also for simulation studies. QGIS is much more user-friendly compared to GRASS GIS and SAGA, especially for first-time GIS users, and is probably the most popular open-source GIS. Therefore, **qgisprocess** is an appropriate choice for most use cases. Its main advantages are:

- Unified access to several GIS, and therefore the provision of >1000 geoalgorithms (Table 10.1), including duplicated functionality, e.g., you can perform overlay-operations using QGIS-, SAGA- or GRASS GIS-geoalgorithms
- Automatic data format conversions (SAGA uses .sdat grid files and GRASS GIS uses a database format, but QGIS handles the corresponding conversions)
- Automatic passing of geographic R objects to QGIS geoalgorithms and back into R
- Convenience functions to support named arguments and automatic default value retrieval (inspired by **rgrass**)

By all means, there are use cases when you should certainly use one of the other R-GIS bridges. Though QGIS is the only GIS providing a unified interface to several GIS software packages, it only provides access to a subset of the corresponding third-party geoalgorithms (for more information, please refer to Muenchow, Schratz, and Brenning (2017)). Therefore, to use the complete set of SAGA and GRASS GIS functions, stick with **Rsagacmd** and **rgrass**. In addition, if you would like to run simulations with the help of a geodatabase (Krug, Roura-Pascual, and Richardson 2010), use **rgrass** directly, since **qgisprocess** always starts a new GRASS GIS session for each call. Finally, if you need topologically correct data and/or spatial database management functionality such as multi-user access, we recommend the use of GRASS GIS.

Please note that there are a number of further GIS software packages that have a scripting interface but for which there is no dedicated R package that accesses them: gvSIG, OpenJump, and the Orfeo Toolbox.

## 10.6 Bridges to GDAL

As discussed in Chapter 8, GDAL is a low-level library that supports many geographic data formats. GDAL is so effective that most GIS programs use GDAL in the background for importing and exporting geographic data, rather than reinventing the wheel and using bespoke read-write code. But GDAL offers more than data I/O. It has geoprocessing tools for vector and raster data, functionality to create tiles for serving raster data online, and rapid rasterization of vector data. Since GDAL is a command-line tool, all its commands can be accessed from within R via the `system()` command.

The code chunk below demonstrates this functionality. `linkGDAL()` searches the computer for a working GDAL installation and adds the location of the executable files to the PATH variable, allowing GDAL to be called (this is usually only needed under Windows). Now we can use the `system()` function to call any of the GDAL tools. For example, `ogrinfo` provides metadata of a vector dataset. Here, we call this tool with two additional flags: `-al` to list all features of all layers and `-so` to get a summary only (and not a complete geometry list).

Other commonly used GDAL tools include:

- `gdalinfo`: provides metadata of a raster dataset
- `gdal_translate`: converts between different raster file formats
- `ogr2ogr`: converts between different vector file formats
- `gdalwarp`: reprojects, transforms, and clips raster datasets
- `gdaltransform`: transforms coordinates

Visit https://gdal.org/programs/ to see the complete list of GDAL tools and to read their help files.

The 'link' to GDAL provided by **link2GI** could be used as a foundation for doing more advanced GDAL work from the R or system CLI. TauDEM (https://hydrology.usu.edu/taudem/) and the Orfeo Toolbox (https://www.orfeo-toolbox.org/) are other spatial data processing libraries/programs offering a command-line interface; the above example shows how such libraries can be accessed from the system command line via R. This in turn could be the starting point for creating a proper interface to these libraries in the form of new R packages.

Before diving into a project to create a new bridge, however, it is important to be aware of the power of existing R packages and that `system()` calls may not be platform-independent (they may fail on some computers). On the other hand, **sf** and **terra** bring most of the power provided by GDAL, GEOS and PROJ to R via the R/C++ interface provided by **Rcpp**, which avoids `system()` calls.

```r
link2GI::linkGDAL()
our_filepath = system.file("shapes/world.gpkg", package = "spData")
cmd = paste("ogrinfo -al -so", our_filepath)
system(cmd)
#> INFO: Open of `.../spData/shapes/world.gpkg'
#>       using driver `GPKG' successful.
#>
#> Layer name: world
#> Geometry: Multi Polygon
#> Feature Count: 177
#> Extent: (-180.000000, -89.900000) - (179.999990, 83.645130)
#> Layer SRS WKT:
#> ...
```
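The `system()` pattern shown above is just command-string assembly followed by a shell call. A minimal sketch of the same idea (in Python for self-containment; the file names are placeholders, and the commands are only printed here, since actually running them requires a local GDAL installation):

```python
import shlex

# Build GDAL command lines the same way R's paste() + system() does.
# ogr2ogr and gdalwarp are real GDAL tools; the paths are hypothetical.
def gdal_cmd(tool, *flags_and_args):
    return " ".join([tool, *map(shlex.quote, flags_and_args)])

# Vector format conversion: GeoPackage -> GeoJSON
cmd1 = gdal_cmd("ogr2ogr", "-f", "GeoJSON", "world.geojson", "world.gpkg")

# Raster reprojection to Web Mercator
cmd2 = gdal_cmd("gdalwarp", "-t_srs", "EPSG:3857", "input.tif", "output.tif")

print(cmd1)  # → ogr2ogr -f GeoJSON world.geojson world.gpkg
print(cmd2)  # → gdalwarp -t_srs EPSG:3857 input.tif output.tif
# To execute: subprocess.run(shlex.split(cmd1), check=True)
```

Quoting each argument (here via `shlex.quote`) guards against paths containing spaces, one of the portability pitfalls of raw `system()` calls mentioned later in this chapter.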
..."},{"path":"gis.html","id":"postgis","chapter":"10 Bridges to GIS software","heading":"10.7 Bridges to spatial databases","text":"\nSpatial database management systems (spatial DBMSs) store spatial non-spatial data structured way.\ncan organize large collections data related tables (entities) via unique identifiers (primary foreign keys) implicitly via space (think instance spatial join).\nuseful geographic datasets tend become big messy quite quickly.\nDatabases enable storing querying large datasets efficiently based spatial non-spatial fields, provide multi-user access topology support.important open source spatial database PostGIS (Obe Hsu 2015).72\nR bridges spatial DBMSs PostGIS important, allowing access huge data stores without loading several gigabytes geographic data RAM, likely crashing R session.\nremainder section shows PostGIS can called R, based “Hello real-world” PostGIS Action, Second Edition (Obe Hsu 2015).73The subsequent code requires working internet connection, since accessing PostgreSQL/PostGIS database living QGIS Cloud (https://qgiscloud.com/).74\nfirst step create connection database providing name, host name, user information.new object, conn, just established link R session database.\nstore data.Often first question , ‘tables can found database?’.\ncan answered dbListTables() follows:answer five tables.\n, interested restaurants highways tables.\nformer represents locations fast-food restaurants US, latter principal US highways.\nfind attributes available table, can run dbListFields:Now, know available datasets, can perform queries – ask database questions.\nquery needs provided language understandable database – usually, SQL.\nfirst query select US Route 1 state Maryland (MD) highways table.\nNote read_sf() allows us read geographic data database provided open connection database query.\nAdditionally, read_sf() needs know column represents geometry (: wkb_geometry).results sf-object named us_route type MULTILINESTRING.mentioned , 
also possible ask non-spatial questions, also query datasets based spatial properties.\nshow , next example adds 35-kilometer (35,000 m) buffer around selected highway (Figure 10.6).Note spatial query using functions (ST_Union(), ST_Buffer()) already familiar .\nfind also sf-package, though written lowercase characters (st_union(), st_buffer()).\nfact, function names sf package largely follow PostGIS naming conventions.75The last query find Hardee’s restaurants (HDE) within 35-km buffer zone (Figure 10.6).Please refer Obe Hsu (2015) detailed explanation spatial SQL query.\nFinally, good practice close database connection follows:76\nFIGURE 10.6: Visualization output previous PostGIS commands showing highway (black line), buffer (light yellow) four restaurants (red points) within buffer.\nUnlike PostGIS, sf supports spatial vector data.\nquery manipulate raster data stored PostGIS database, use rpostgis package (Bucklin Basille 2018) /use command line tools rastertopgsql comes part PostGIS installation.subsection brief introduction PostgreSQL/PostGIS.\nNevertheless, like encourage practice storing geographic non-geographic data spatial DBMS attaching subsets R’s global environment needed (geo-)statistical analysis.\nPlease refer Obe Hsu (2015) detailed description SQL queries presented comprehensive introduction PostgreSQL/PostGIS general.\nPostgreSQL/PostGIS formidable choice open-source spatial database.\ntrue lightweight SQLite/SpatiaLite database engine GRASS GIS uses SQLite background (see Section 10.4).datasets big PostgreSQL/PostGIS require massive spatial data management query performance, may worth exploring large-scale geographic querying distributed computing systems.\nsystems outside scope book, worth mentioning open source software providing functionality exists.\nProminent projects space include GeoMesa Apache Sedona.\napache.sedona package provides interface latter.","code":"\nlibrary(RPostgreSQL)\nconn = dbConnect(\n drv = PostgreSQL(),\n dbname = 
\"rtafdf_zljbqm\", host = \"db.qgiscloud.com\",\n port = \"5432\", user = \"rtafdf_zljbqm\", password = \"d3290ead\"\n)\ndbListTables(conn)\n#> [1] \"spatial_ref_sys\" \"topology\" \"layer\" \"restaurants\"\n#> [5] \"highways\"\ndbListFields(conn, \"highways\")\n#> [1] \"qc_id\" \"wkb_geometry\" \"gid\" \"feature\"\n#> [5] \"name\" \"state\"\nquery = paste(\n \"SELECT *\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\"\n)\nus_route = read_sf(conn, query = query, geom = \"wkb_geometry\")\nquery = paste(\n \"SELECT ST_Union(ST_Buffer(wkb_geometry, 35000))::geometry\",\n \"FROM highways\",\n \"WHERE name = 'US Route 1' AND state = 'MD';\"\n)\nbuf = read_sf(conn, query = query)\nquery = paste(\n \"SELECT *\",\n \"FROM restaurants r\",\n \"WHERE EXISTS (\",\n \"SELECT gid\",\n \"FROM highways\",\n \"WHERE\",\n \"ST_DWithin(r.wkb_geometry, wkb_geometry, 35000) AND\",\n \"name = 'US Route 1' AND\",\n \"state = 'MD' AND\",\n \"r.franchise = 'HDE');\"\n)\nhardees = read_sf(conn, query = query)\nRPostgreSQL::postgresqlCloseConnection(conn)"},{"path":"gis.html","id":"cloud","chapter":"10 Bridges to GIS software","heading":"10.8 Bridges to cloud technologies and services","text":"recent years, cloud technologies become prominent internet.\nalso includes use store process spatial data.\nMajor cloud computing providers (Amazon Web Services, Microsoft Azure / Planetary Computer, Google Cloud Platform, others) offer vast catalogs open Earth observation data, complete Sentinel-2 archive, platforms.\ncan use R directly connect process data archives, ideally machine cloud region.Three promising developments make working image archives cloud platforms easier efficient SpatioTemporal Asset Catalog (STAC), cloud-optimized GeoTIFF (COG) image file format, concept data cubes.\nSection 10.8.1 introduces individual developments briefly describes can used R.Besides hosting large data archives, numerous cloud-based services process Earth observation data launched last 
years.\nincludes OpenEO initiative – unified interface programming languages (including R) various cloud-based services.\ncan find information OpenEO Section 10.8.2.","code":""},{"path":"gis.html","id":"staccog","chapter":"10 Bridges to GIS software","heading":"10.8.1 STAC, COGs, and data cubes in the cloud","text":"SpatioTemporal Asset Catalog (STAC) general description format spatiotemporal data used describe variety datasets cloud platforms including imagery, synthetic aperture radar (SAR) data, point clouds.\nBesides simple static catalog descriptions, STAC-API presents web service query items (e.g., images) catalogs space, time, properties.\nR, rstac package (Simoes, Souza, et al. 2021) allows connect STAC-API endpoints search items.\nexample , request images Sentinel-2 Cloud-Optimized GeoTIFF (COG) dataset Amazon Web Services intersect predefined area time interest.\nresult contains found images metadata (e.g., cloud cover) URLs pointing actual files AWS.Cloud storage differs local hard disks traditional image file formats perform well cloud-based geoprocessing.\nCloud-optimized GeoTIFF makes reading rectangular subsets image reading images lower resolution much efficient.\nR user, install anything work COGs GDAL (package using ) can already work COGs.\nHowever, keep mind availability COGs big plus browsing catalogs data providers.larger areas interest, requested images still relatively difficult work : may use different map projections, may spatially overlap, spatial resolution often depends spectral band.\ngdalcubes package (Appel Pebesma 2019) can used abstract individual images create process image collections four-dimensional data cubes.code shows minimal example create lower resolution (250 m) maximum NDVI composite Sentinel-2 images returned previous STAC-API search.filter images cloud cover, provide property filter function applied STAC result item creating image collection.\nfunction receives available metadata image input list returns single logical 
value images function yields TRUE considered.\ncase, ignore images 10% cloud cover.\ndetails, please refer tutorial presented OpenGeoHub summer school 2021.77The combination STAC, COGs, data cubes forms cloud-native workflow analyze (large) collections satellite imagery cloud.\ntools already form backbone, example, sits package, allows land use land cover classification big Earth observation data R.\npackage builds EO data cubes image collections available cloud services performs land classification data cubes using various machine deep learning algorithms.\ninformation sits, visit https://e-sensing.github.io/sitsbook/ read related article (Simoes, Camara, et al. 2021).","code":"\nlibrary(rstac)\n# Connect to the STAC-API endpoint for Sentinel-2 data\n# and search for images intersecting our AOI\ns = stac(\"https://earth-search.aws.element84.com/v0\")\nitems = s |>\n stac_search(collections = \"sentinel-s2-l2a-cogs\",\n bbox = c(7.1, 51.8, 7.2, 52.8),\n datetime = \"2020-01-01/2020-12-31\") |>\n post_request() |>\n items_fetch()\nlibrary(gdalcubes)\n# Filter images by cloud cover and create an image collection object\ncloud_filter = function(x) {\n x[[\"eo:cloud_cover\"]] < 10\n}\ncollection = stac_image_collection(items$features, \n property_filter = cloud_filter)\n# Define extent, resolution (250m, daily) and CRS of the target data cube\nv = cube_view(srs = \"EPSG:3857\", extent = collection, dx = 250, dy = 250,\n dt = \"P1D\") # \"P1D\" is an ISO 8601 duration string\n# Create and process the data cube\ncube = raster_cube(collection, v) |>\n select_bands(c(\"B04\", \"B08\")) |>\n apply_pixel(\"(B08-B04)/(B08+B04)\", \"NDVI\") |>\n reduce_time(\"max(NDVI)\")\n# gdalcubes_options(parallel = 8)\n# plot(cube, zlim = c(0, 1))"},{"path":"gis.html","id":"openeo","chapter":"10 Bridges to GIS software","heading":"10.8.2 openEO","text":"OpenEO (Schramm et al. 
2021) is an initiative to support interoperability among cloud services by defining a common language for processing the data.\nThe initial idea, described in an r-spatial.org blog post, aims at making it possible for users to change between cloud services with as few code changes as possible.\nThe standardized processes use a multidimensional data cube model as an interface to the data.\nImplementations are available for eight different backends (see https://hub.openeo.org), to which users can connect with R, Python, JavaScript, QGIS, or a web editor to define (and chain) processes on collections.\nSince the functionality and data availability differ among the backends, the openeo R package (Lahn 2021) dynamically loads the available processes and collections from the connected backend.\nAfterwards, users can load image collections, apply and chain processes, submit jobs, and explore and plot the results.\nThe following code will connect to the openEO platform backend, request available datasets, processes, and output formats, define a process graph to compute a maximum NDVI image from Sentinel-2 data, and finally execute the graph after logging in to the backend.\nThe openEO platform backend includes a free tier, and registration is possible with existing institutional or internet platform accounts.","code":"\nlibrary(openeo)\ncon = connect(host = \"https://openeo.cloud\")\np = processes() # load available processes\ncollections = list_collections() # load available collections\nformats = list_file_formats() # load available output formats\n# Load Sentinel-2 collection\ns2 = p$load_collection(id = \"SENTINEL2_L2A\",\n spatial_extent = list(west = 7.5, east = 8.5,\n north = 51.1, south = 50.1),\n temporal_extent = list(\"2021-01-01\", \"2021-01-31\"),\n bands = list(\"B04\", \"B08\"))\n# Compute NDVI vegetation index\ncompute_ndvi = p$reduce_dimension(data = s2, dimension = \"bands\",\n reducer = function(data, context) {\n (data[2] - data[1]) / (data[2] + data[1])\n })\n# Compute maximum over time\nreduce_max = p$reduce_dimension(data = compute_ndvi, dimension = \"t\",\n reducer = function(x, y) {\n max(x)\n })\n# Export as GeoTIFF\nresult = p$save_result(reduce_max, formats$output$GTiff)\n# Login, see https://docs.openeo.cloud/getting-started/r/#authentication\nlogin(login_type = \"oidc\", provider = \"egi\", \n config = list(client_id = \"...\", secret = \"...\"))\n# Execute processes\ncompute_result(graph = result, output_file = tempfile(fileext = \".tif\"))"},{"path":"gis.html","id":"exercises-8","chapter":"10 Bridges to GIS software","heading":"10.9 Exercises","text":"E1. Compute global solar irradiation for an area of system.file(\"raster/dem.tif\", package = \"spDataLarge\") for March 21 at 11:00 using r.sun from GRASS GIS via qgisprocess.\nE2. Compute the catchment area and catchment slope of system.file(\"raster/dem.tif\", package = \"spDataLarge\") using Rsagacmd.\nE3. Continue working on the ndvi_segments object created in the SAGA section.\nExtract average NDVI values from the ndvi raster and group them into six clusters using kmeans().\nVisualize the results.\nE4. Attach data(random_points, package = \"spDataLarge\") and read system.file(\"raster/dem.tif\", package = \"spDataLarge\") into R.\nSelect a point randomly from random_points and find all dem pixels that can be seen from this point (hint: a viewshed can be calculated using GRASS GIS).\nVisualize your result.\nFor example, plot a hillshade, the digital elevation model, your viewshed output, and the point.\nAdditionally, give mapview a try.\nE5. Use gdalinfo via a system call for a raster file stored on disk of your choice.\nWhat kind of information can you find there?\nE6. Use gdalwarp to decrease the resolution of your raster file (for example, if the resolution is 0.5, change it to 1). Note: the -tr and -r flags will be used in this exercise.\nE7. Query all Californian highways from the PostgreSQL/PostGIS database living in the QGIS Cloud introduced in this chapter.\nE8. 
The ndvi.tif raster (system.file(\"raster/ndvi.tif\", package = \"spDataLarge\")) contains an NDVI calculated for the Mongón study area based on Landsat data from September 22, 2000.\nUse rstac, gdalcubes, and terra to download Sentinel-2 images for the same area from \n2020-08-01 to 2020-10-31, calculate the NDVI, and compare your results with those stored in ndvi.tif.","code":""},{"path":"algorithms.html","id":"algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11 Scripts, algorithms and functions","text":"","code":""},{"path":"algorithms.html","id":"prerequisites-9","chapter":"11 Scripts, algorithms and functions","heading":"Prerequisites","text":"This chapter has minimal software prerequisites and primarily uses base R.\nOnly the sf package is used, to check the results of an algorithm we will develop to calculate the area of polygons.\nIn terms of prior knowledge, the chapter assumes an understanding of the geographic classes introduced in Chapter 2, which can be used to represent a wide range of input file formats (see Chapter 8).","code":""},{"path":"algorithms.html","id":"intro-algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11.1 Introduction","text":"Chapter 1 established that geocomputation is not only about using existing tools, but also developing new ones, “in the form of shareable R scripts and functions”.\nThis chapter teaches these building blocks of reproducible code.\nIt also introduces low-level geometric algorithms, of the type used in Chapter 10.\nReading it should help you to understand how such algorithms work and to write code that can be used many times, by many people, on multiple datasets.\nThe chapter cannot, by itself, make you a skilled programmer.\nProgramming is hard and requires plenty of practice (Abelson, Sussman, and Sussman 1996):\nTo appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs — many of them.\nThere are strong reasons for learning to program.\nAlthough this chapter does not teach programming itself — see resources such as Wickham (2019), Gillespie and Lovelace (2016), and Xiao (2016), which teach programming in R and other languages — it does provide starting points, focused on geometry and data, that form a good foundation for developing programming skills.\nThe chapter also demonstrates and highlights the importance of reproducibility.\nThe advantages of reproducibility go 
beyond allowing others to replicate your work:\nreproducible code is often better in every way than code written to be run only once, including in terms of computational efficiency, ‘scalability’ (the capability of code to run on large datasets), and ease of adapting and maintaining it.Scripts are the basis of reproducible R code, a topic covered in Section 11.2.\nAlgorithms are recipes for modifying inputs using a series of steps, resulting in an output, as described in Section 11.3.\nTo ease sharing and reproducibility, algorithms can be placed into functions.\nThat is the topic of Section 11.4.\nThe example of finding the centroid of a polygon will be used to tie these concepts together.\nChapter 5 already introduced a centroid function, st_centroid(), but this example highlights how seemingly simple operations are the result of comparatively complex code, affirming the following observation (Wise 2001):One of the intriguing things about spatial data problems is that things that appear to be trivially easy to a human can be surprisingly difficult on a computer.The example also reflects a secondary aim of the chapter which, following Xiao (2016), is “not to duplicate what is available out there, but to show how things out there work”.","code":""},{"path":"algorithms.html","id":"scripts","chapter":"11 Scripts, algorithms and functions","heading":"11.2 Scripts","text":"If functions distributed in packages are the building blocks of R code, scripts are the glue that holds them together.\nScripts should be stored and executed in a logical order to create reproducible workflows, either manually or with workflow automation tools such as targets (Landau 2021).\nIf you are new to programming, scripts may seem intimidating when you first encounter them, but they are simply plain text files.\nScripts are usually saved with a file extension representing the language they contain, such as .py for scripts written in Python or .rs for scripts written in Rust.\nR scripts should be saved with the .R extension and named to reflect what they do.\nAn example is 11-hello.R, a script file stored in the code folder of the book’s repository.\n11-hello.R is a simple script containing two lines of code, one of which is a comment:The contents of this script are not particularly exciting, but they demonstrate the point: scripts do not need to be complicated.\nSaved scripts can be called and executed in their entirety from the R command line with the source() function, as demonstrated below.\nThe output of the command shows that the comment is ignored but the print() command is executed:You can also call R scripts from system command line shells such as bash and PowerShell, assuming the Rscript executable is configured and available, for example as follows:There are few strict rules on what can go into script files, and nothing to prevent you from saving broken, non-reproducible code.\nLines of code that do not contain valid R should be commented out, by adding a # to the start of the line, to prevent errors, as shown on line 1 of the 11-hello.R script.\nThere are, however, some conventions worth following:Write the script in order: just like the script of a film, scripts should have a clear order such as ‘setup’, ‘data processing’ and ‘save results’ (roughly equivalent to the ‘beginning’, ‘middle’ and ‘end’ of a film).Add comments to the script so other people (and your future self) can understand it.\nAt a minimum, a comment should state the purpose of the script (see Figure 11.1) and (for long scripts) divide it into sections.\nThis can be done in RStudio, for example, with the shortcut Ctrl+Shift+R, which creates ‘foldable’ code section headings.Above all, scripts should be reproducible: self-contained scripts that will work on any computer are more useful than scripts that only run on your computer, on a good day.\nThis involves attaching required packages at the beginning, reading in data from persistent sources (such as a reliable website) and ensuring that previous steps have been taken.78It is hard to enforce reproducibility in R scripts, but there are tools that can help.\nBy default, RStudio ‘code-checks’ R scripts and underlines faulty code with a red wavy line, as illustrated below:\nFIGURE 11.1: Code checking in RStudio. 
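The script conventions described above (a clear setup/processing/save order, plus a defensive check that fails early with a clear message when an input is missing) are language-independent. As a hedged illustration, here is a minimal Python sketch following the same conventions; the file names and record structure are hypothetical.

```python
# Aim: sketch the 'setup' -> 'data processing' -> 'save results' script
# structure described in the text, with a defensive input check.

# Setup: imports first
import json
import os
import tempfile

def main(in_path, out_path):
    # Defensive check: fail early with a clear message if the input is missing
    if not os.path.exists(in_path):
        raise FileNotFoundError(f"No file, {in_path} is missing!")
    # Data processing: read records and compute a simple summary
    with open(in_path) as f:
        data = json.load(f)
    result = {"n_records": len(data)}
    # Save results
    with open(out_path, "w") as f:
        json.dump(result, f)
    return result

# Demonstration with a temporary input file (hypothetical data)
tmp = tempfile.mkdtemp()
in_path = os.path.join(tmp, "input.json")
out_path = os.path.join(tmp, "results.json")
with open(in_path, "w") as f:
    json.dump([{"id": 1}, {"id": 2}], f)
result = main(in_path, out_path)  # returns {"n_records": 2}
```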
In this example, the script 11-centroid-alg.R is highlighted for an unclosed curly bracket on line 19.\nThe contents of this section apply to any type of R script.\nA particular consideration is that scripts for geocomputation tend to have external dependencies, such as the GDAL dependency needed by core R packages for working with geographic data, which was made heavy use of in Chapter 8 on data import and export.\nGIS software dependencies may be needed to run more specialist geoalgorithms, as outlined in Chapter 10.\nScripts working with geographic data also often require input datasets to be available in specific formats.\nSuch dependencies should be mentioned as comments in the script or in a suitable place in the project it is part of, as described for dependencies with tools such as the renv package and Docker.‘Defensive’ programming techniques and good error messages can save time by checking dependencies and communicating with users when certain requirements are not met.\nIf statements, implemented with if () in R, can be used to send messages or run lines of code if, and only if, certain conditions are met.\nThe following lines of code, for example, send a message to users if a certain file is missing:The work undertaken by the 11-centroid-alg.R script is demonstrated in the reproducible example below, which creates a pre-requisite object named poly_mat, representing a square with sides 9 units in length.\nThe example shows that source() works with URLs, assuming you have an internet connection.\nIf you do not, the same script can be called with source(\"code/11-centroid-alg.R\"), assuming you have previously downloaded the github.com/geocompx/geocompr repository and are running R from the geocompr folder.","code":"\n# Aim: provide a minimal R script\nprint(\"Hello geocompr\")\nsource(\"code/11-hello.R\")\n#> [1] \"Hello geocompr\"Rscript code/11-hello.R\nif (!file.exists(\"required_geo_data.gpkg\")) {\n message(\"No file, required_geo_data.gpkg is missing!\")\n} \n#> No file, required_geo_data.gpkg is missing!\npoly_mat = cbind(\n x = c(0, 9, 9, 0, 0),\n y = c(0, 0, 9, 9, 0)\n)\n# Short URL to code/11-centroid-alg.R in the geocompr repo\nsource(\"https://t.ly/0nzj\")#> [1] \"The area is: 81\"\n#> [1] \"The coordinates of the centroid are: 4.5, 4.5\""},{"path":"algorithms.html","id":"geometric-algorithms","chapter":"11 Scripts, algorithms and functions","heading":"11.3 Geometric 
algorithms","text":"Algorithms can be understood as the computing equivalent of baking recipes.\nThey are complete sets of instructions which, when undertaken on the inputs, result in useful (tasty) outcomes.\nInputs are ingredients such as flour and sugar in the case of baking, and data and input parameters in the case of algorithms.\nAs tasty cakes may result from a baking recipe, successful algorithms should have computational outcomes with environmental, social or other benefits.\nBefore diving into a reproducible example, a brief history will show how algorithms relate to scripts (covered in Section 11.2) and functions (which can be used to generalize algorithms and make them portable and easy-to-use, as we’ll see in Section 11.4).The word “algorithm” originated in 9th century Baghdad with the publication of Hisab al-jabr w’al-muqabala, an early math textbook.\nThe book was translated into Latin and became so popular that the author’s last name, al-Khwārizmī, was “immortalized as a scientific term: Al-Khwarizmi became Alchoarismi, Algorismi and, eventually, algorithm” (Bellos 2011).\nIn the computing age, algorithm refers to a series of steps that solves a problem, resulting in a pre-defined output.\nInputs must be formally defined in a suitable data structure (Wise 2001).\nAlgorithms often start as flow charts or pseudocode showing the aim of the process before being implemented in code.\nTo ease usability, common algorithms are often packaged inside functions, which may hide some or all of the steps taken (unless you look at the function’s source code, see Section 11.4).Geoalgorithms, such as those we encountered in Chapter 10, are algorithms that take geographic data in and, generally, return geographic results (alternative terms for the same thing include GIS algorithms and geometric algorithms).\nThat may sound simple, but it is a deep subject with an entire academic field, Computational Geometry, dedicated to its study (Berg et al. 2008) and numerous books on the subject.\nO’Rourke (1998), for example, introduces the subject with a range of progressively harder geometric algorithms using reproducible and freely available C code.An example of a geometric algorithm is one that finds the centroid of a polygon.\nThere are many approaches to centroid calculation, some of which work only on specific types of spatial data.\nFor the purposes of this section, we choose an approach that is easy to visualize: breaking the polygon into many triangles and finding the centroid of each of these, an approach discussed by Kaiser and Morin (1993) alongside other centroid algorithms (and mentioned briefly in O’Rourke 1998).\nIt helps to further break down this approach into discrete tasks before writing any code (subsequently referred to as steps 1 to 4, which could also be presented as a schematic diagram or pseudocode):Divide the polygon into contiguous trianglesFind the centroid of each triangleFind the area of each triangleFind the area-weighted mean of the triangle centroidsThese steps may sound straightforward, but converting words into working code requires some work and plenty of trial-and-error, even when the inputs are constrained:\nthe algorithm will only work for convex polygons, which contain no internal angles greater than 180°, so no star shapes are allowed (the packages decido and sfdct can triangulate non-convex polygons using external libraries, as shown in the algorithm vignette hosted at geocompx.org).The simplest data structure representing a polygon is a matrix of x and y coordinates in which each row represents a vertex tracing the polygon’s border in order, and in which the first and last rows are identical (Wise 2001).\nIn this case, we will create a polygon with five vertices in base R, building on an example from GIS Algorithms (Xiao 2016; see github.com/gisalgs for Python code), as illustrated in Figure 11.2:Now that we have an example dataset, we are ready to undertake step 1 outlined above.\nThe code below shows how this can be done by creating a single triangle (T1), which demonstrates the method; it also demonstrates step 2 by calculating the centroid based on the formula \\(1/3(a + b + c)\\) where \\(a\\) to \\(c\\) are the coordinates representing the triangle’s vertices:\nFIGURE 11.2: Illustration of the polygon centroid calculation problem.\nStep 3 is to find the area of each triangle, so that a weighted mean, accounting for the disproportionate impact of large triangles, can be calculated.\nThe formula to calculate the area of a triangle is as follows (Kaiser and Morin 1993):\\[\n\\frac{A_x ( B_y − C_y ) + B_x ( C_y − A_y ) + C_x ( A_y − B_y )}{ 2 }\n\\]\nWhere \\(A\\) to \\(C\\) are the triangle’s three points and \\(x\\) and \\(y\\) refer to the x and y dimensions.\nA translation of this formula into R code that works with the data in the matrix representation of triangle T1 is as follows (the function abs() ensures a positive result):The code chunk above outputs the correct result.79\nThe problem with the code is that it is clunky: it must be re-typed if we want to run it on another triangle matrix.\nTo make the code more generalizable, we will see how it can be converted into a function in Section 11.4.Step 4 requires steps 2 and 3 to be undertaken not just on one triangle (as demonstrated above) but on all triangles.\nThis requires iteration to create all the triangles representing the polygon, as illustrated in Figure 11.3.\nlapply() and vapply() are used to iterate over each triangle here because they provide a concise solution in base R:80\nFIGURE 11.3: Illustration of the iterative centroid algorithm with triangles. X represents the area-weighted centroid in iterations 2 and 3.\nWe are now in a position to complete step 4 to calculate the total area with sum(A) and the centroid coordinates of the polygon with weighted.mean(C[, 1], A) and weighted.mean(C[, 2], A) (exercise for alert readers: verify that these commands work).\nTo demonstrate the link between algorithms and scripts, the contents of this section have been condensed into 11-centroid-alg.R.\nWe saw at the end of Section 11.2 how the script can calculate the centroid of a square.\nThe great thing about scripting the algorithm is that it also works on the new poly_mat object (see the exercises below to verify these results with reference to st_centroid()):The example above shows how low-level geographic operations can be developed from first principles with base R.\nIt also shows that if a tried-and-tested solution already exists, it may not be worth re-inventing the wheel:\nif we aimed only to find the centroid of a polygon, it would have been quicker to represent poly_mat as an sf object and use the pre-existing sf::st_centroid() function instead.\nHowever, the great benefit of writing algorithms from 1st principles is that you will understand every step of the process, something that cannot be guaranteed when using other people’s code.\nA further consideration is performance: R may be slow compared with low-level languages such as C++ for number crunching (see Section 1.4) and optimization is difficult.\nIf the aim is to develop new methods, computational efficiency should not be prioritized.\nThis is captured in the saying “premature optimization is the root of all evil (or at least most of it) in programming” (Knuth 1974).Algorithm development is hard.\nThis should be apparent from the amount of work that has gone into developing a centroid algorithm in base R that is just one, rather inefficient, approach to the 
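The four steps above are implemented in the book with base R (see the code fields and the condensed script 11-centroid-alg.R). As a language-neutral illustration of the same algorithm, here is a minimal Python sketch; the function names mirror the book's R helpers (t_area(), t_centroid(), poly_centroid()) but the code itself is an assumption-free reimplementation from the formulas in the text, valid for convex polygons only.

```python
# Steps 1-4 of the triangle-based centroid algorithm described above.
# A polygon is a closed ring of (x, y) tuples: first vertex == last vertex.

def t_area(t):
    """Area of a triangle from the Kaiser and Morin (1993) formula."""
    (ax, ay), (bx, by), (cx, cy) = t
    return abs(ax * (by - cy) + bx * (cy - ay) + cx * (ay - by)) / 2

def t_centroid(t):
    """Centroid of a triangle: the mean of its three vertices."""
    (ax, ay), (bx, by), (cx, cy) = t
    return ((ax + bx + cx) / 3, (ay + by + cy) / 3)

def poly_centroid(poly):
    """Total area and area-weighted centroid of a convex closed ring."""
    origin = poly[0]
    # Step 1: fan triangulation from the first vertex
    tris = [(origin, poly[i], poly[i + 1]) for i in range(1, len(poly) - 2)]
    # Steps 2 and 3: centroid and area of each triangle
    areas = [t_area(t) for t in tris]
    cents = [t_centroid(t) for t in tris]
    # Step 4: area-weighted mean of the triangle centroids
    total = sum(areas)
    cx = sum(a * c[0] for a, c in zip(areas, cents)) / total
    cy = sum(a * c[1] for a, c in zip(areas, cents)) / total
    return total, (cx, cy)

# The book's example polygon: area 245, centroid approximately (8.83, 9.22)
poly = [(10, 0), (20, 15), (12, 20), (0, 10), (0, 0), (10, 0)]
area, centroid = poly_centroid(poly)
```

Running the same function on the square from Section 11.2 reproduces the script's output (area 81, centroid at 4.5, 4.5), matching sf::st_centroid() for these convex inputs.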
problem limited real-world applications (convex polygons uncommon practice).\nexperience lead appreciation low-level geographic libraries GEOS CGAL (Computational Geometry Algorithms Library) run fast work wide range input geometry types.\ngreat advantage open source nature libraries source code readily available study, comprehension (skills confidence) modification.81","code":"\n# generate a simple matrix representation of a polygon:\nx_coords = c(10, 20, 12, 0, 0, 10)\ny_coords = c(0, 15, 20, 10, 0, 0)\npoly_mat = cbind(x_coords, y_coords)\n# create a point representing the origin:\nOrigin = poly_mat[1, ]\n# create 'triangle matrix':\nT1 = rbind(Origin, poly_mat[2:3, ], Origin) \nC1 = (T1[1,] + T1[2,] + T1[3,]) / 3\n# calculate the area of the triangle represented by matrix T1:\nabs(T1[1, 1] * (T1[2, 2] - T1[3, 2]) +\n T1[2, 1] * (T1[3, 2] - T1[1, 2]) +\n T1[3, 1] * (T1[1, 2] - T1[2, 2])) / 2\n#> [1] 85\ni = 2:(nrow(poly_mat) - 2)\nT_all = lapply(i, function(x) {\n rbind(Origin, poly_mat[x:(x + 1), ], Origin)\n})\n\nC_list = lapply(T_all, function(x) (x[1, ] + x[2, ] + x[3, ]) / 3)\nC = do.call(rbind, C_list)\n\nA = vapply(T_all, function(x) {\n abs(x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2]) ) / 2\n }, FUN.VALUE = double(1))\nsource(\"code/11-centroid-alg.R\")\n#> [1] \"The area is: 245\"\n#> [1] \"The coordinates of the centroid are: 8.83, 9.22\""},{"path":"algorithms.html","id":"functions","chapter":"11 Scripts, algorithms and functions","heading":"11.4 Functions","text":"Like algorithms, functions take input return output.\nFunctions, however, refer implementation particular programming language, rather ‘recipe’ .\nR, functions objects right, can created joined together modular fashion.\ncan, example, create function undertakes step 2 centroid generation algorithm follows:example demonstrates two key components functions:\n1) function body, code inside curly brackets define function inputs; 2) arguments, 
list arguments function works — x case (third key component, environment, beyond scope section).\ndefault, functions return last object calculated (coordinates centroid case t_centroid()).82The function now works inputs pass , illustrated command calculates area 1st triangle example polygon previous section (see Figure 11.3).can also create function calculate triangle’s area, name t_area():Note function’s creation, triangle’s area can calculated single line code, avoiding duplication verbose code:\nfunctions mechanism generalizing code.\nnewly created function t_area() takes object x, assumed dimensions ‘triangle matrix’ data structure ’ve using, returns area, illustrated T1 follows:can test generalizability function using find area new triangle matrix, height 1 base 3:useful feature functions modular.\nProvided know output , one function can used building block another.\nThus, functions t_centroid() t_area() can used sub-components larger function work script 11-centroid-alg.R: calculate area convex polygon.\ncode chunk creates function poly_centroid() mimic behavior sf::st_centroid() convex polygons.83Functions, poly_centroid(), can extended provide different types output.\nreturn result object class sfg, example, ‘wrapper’ function can used modify output poly_centroid() returning result:can verify output output sf::st_centroid() follows:","code":"\nt_centroid = function(x) {\n (x[1, ] + x[2, ] + x[3, ]) / 3\n}\nt_centroid(T1)\n#> x_coords y_coords \n#> 14.0 11.7\nt_area = function(x) {\n abs(\n x[1, 1] * (x[2, 2] - x[3, 2]) +\n x[2, 1] * (x[3, 2] - x[1, 2]) +\n x[3, 1] * (x[1, 2] - x[2, 2])\n ) / 2\n}\nt_area(T1)\n#> [1] 85\nt_new = cbind(x = c(0, 3, 3, 0),\n y = c(0, 0, 1, 0))\nt_area(t_new)\n#> x \n#> 1.5\npoly_centroid = function(poly_mat) {\n Origin = poly_mat[1, ] # create a point representing the origin\n i = 2:(nrow(poly_mat) - 2)\n T_all = lapply(i, function(x) {rbind(Origin, poly_mat[x:(x + 1), ], Origin)})\n C_list = lapply(T_all, t_centroid)\n C = 
do.call(rbind, C_list)\n A = vapply(T_all, t_area, FUN.VALUE = double(1))\n c(weighted.mean(C[, 1], A), weighted.mean(C[, 2], A))\n}\npoly_centroid(poly_mat)\n#> [1] 8.83 9.22\npoly_centroid_sfg = function(x) {\n centroid_coords = poly_centroid(x)\n sf::st_point(centroid_coords)\n}\npoly_sfc = sf::st_polygon(list(poly_mat))\nidentical(poly_centroid_sfg(poly_mat), sf::st_centroid(poly_sfc))\n#> [1] TRUE"},{"path":"algorithms.html","id":"programming","chapter":"11 Scripts, algorithms and functions","heading":"11.5 Programming","text":"chapter moved quickly, scripts functions via tricky topic algorithms.\ndiscussed abstract, also created working examples solve specific problem:script 11-centroid-alg.R introduced demonstrated ‘polygon matrix’individual steps allowed script work described algorithm, computational recipeTo generalize algorithm converted modular functions eventually combined create function poly_centroid() previous sectionEach may seem straightforward.\nHowever, skillful programming complex involves combining element — scripts, algorithms functions — system, efficiency style.\noutcome robust user-friendly tools people can use.\nnew programming, expect people reading book , able follow reproduce results preceding sections major achievement.\nProgramming takes many hours dedicated study practice become proficient.challenge facing developers aiming implement new algorithms efficient way put perspective considering amount work gone creating simple function intended use production: current state, poly_centroid() fails (non-convex) polygons!\nraises question: generalize function?\nTwo options (1) find ways triangulate non-convex polygons (topic covered online Algorithms Extended article hosted geocompx.github.io/geocompkg/articles/) (2) explore centroid algorithms rely triangular meshes.wider question : worth programming solution high performance algorithms already implemented packaged functions st_centroid()?\nreductionist answer specific case ‘’.\nwider 
context, and considering the benefits of learning to program, the answer is ‘it depends’.\nWith programming, it’s easy to waste hours trying to implement a method, only to find that someone has already done the hard work.\nSo you can understand this chapter as a stepping stone towards geometric algorithm programming wizardry.\nHowever, it can also be seen as a lesson in when to try to program a generalized solution, and when to use existing higher-level solutions.\nThere will surely be occasions when writing new functions is the best way forward, but there will also be times when using functions that already exist is the best way forward.“Do not reinvent the wheel” applies as much, if not more, to programming as it does to other walks of life.\nA bit of research and thinking at the outset of a project can help decide where programming time is best spent.\nThree principles can also help maximize the use of effort when writing code, whether it’s a simple script or a package composed of hundreds of functions:DRY (don’t repeat yourself): minimize repetition of code and aim to use fewer lines of code to solve a particular problem.\nThis principle is explained with reference to the use of functions to reduce code repetition in the Functions chapter of R for Data Science (Grolemund and Wickham 2016).KISS (keep it simple stupid): this principle suggests that simple solutions should be tried first and preferred over complex solutions, using dependencies only where needed, and aiming to keep scripts concise.\nThis principle has a computing analogy in the quote “things should be made as simple as possible, but no simpler”.Modularity: code is easier to maintain if it’s divided into well-defined pieces.\nA function should do one thing but do it really well.\nIf a function is becoming too long, think about splitting it into multiple small functions, each of which could be re-used for other purposes, supporting the DRY and KISS principles.We cannot guarantee that this chapter will instantly enable you to create perfectly formed functions that work.\nWe are, however, confident that its contents will help you decide when is an appropriate time to try (when no existing functions solve the problem, when the programming task is within your capabilities, and when the benefits of the solution are likely to outweigh the time costs of developing it).\nUsing the principles above, in combination with practical experience of working through the examples above, will help you build your scripting, package-writing and programming skills.\nFirst steps towards programming can be slow (the exercises below should not be rushed) but the long-term rewards can be large.","code":""},{"path":"algorithms.html","id":"ex-algorithms","chapter":"11 Scripts, algorithms and 
functions","heading":"11.6 Exercises","text":"E1. Read the script 11-centroid-alg.R in the code folder of the book’s GitHub repository.Which of the best practices covered in Section 11.2 does it follow?Create a version of the script on your computer in an IDE such as RStudio (preferably by typing out the script line-by-line, in your own coding style and with your own comments, rather than copy-pasting, which will help you learn how to type scripts). Using the example of a square polygon (e.g., created with poly_mat = cbind(x = c(0, 9, 9, 0, 0), y = c(0, 0, 9, 9, 0))), execute the script line-by-line.What changes could be made to the script to make it more reproducible?How could the documentation be improved?E2. In the geometric algorithms section we calculated that the area and geographic centroid of the polygon represented by poly_mat were 245 and 8.8, 9.2, respectively.Reproduce the results on your own computer with reference to the script 11-centroid-alg.R, an implementation of this algorithm (bonus: type out the commands - try to avoid copy-pasting).Are the results correct? Verify them by converting poly_mat into an sfc object (named poly_sfc) with st_polygon() (hint: this function takes objects of class list()) and then using st_area() and st_centroid().E3. It was stated that the algorithm we created only works for convex hulls. Define convex hulls (see the geometry operations chapter) and test the algorithm on a polygon that is not a convex hull.Bonus 1: Think about why the method only works for convex hulls and note the changes that would need to be made to the algorithm to make it work for other types of polygon.Bonus 2: Building on the contents of 11-centroid-alg.R, write an algorithm using only base R functions that can find the total length of linestrings represented in matrix form.E4. In the functions section we created different versions of the poly_centroid() function that generated outputs of class sfg (poly_centroid_sfg()) and type-stable matrix outputs (poly_centroid_type_stable()).\nFurther extend the function by creating a version (e.g., called poly_centroid_sf()) that is type stable (accepts only inputs of class sf) and returns sf objects (hint: you may need to convert the object x into a matrix with the command sf::st_coordinates(x)).Verify that it works by running poly_centroid_sf(sf::st_sf(sf::st_sfc(poly_sfc)))What error message do you get when you try to run poly_centroid_sf(poly_mat)?","code":""},{"path":"spatial-cv.html","id":"spatial-cv","chapter":"12 Statistical learning","heading":"12 Statistical learning","text":"","code":""},{"path":"spatial-cv.html","id":"prerequisites-10","chapter":"12 Statistical learning","heading":"Prerequisites","text":"This chapter assumes proficiency with geographic data analysis, for example gained by studying the contents and working through the exercises of Chapters 2 to 7.\nFamiliarity with generalized linear models (GLM) and machine learning is highly recommended (for example A. Zuur et al. 2009; James et al. 
2013).chapter uses following packages:84Required data attached due course.","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(future) # parallel processing\nlibrary(lgr) # logging framework for R\nlibrary(mlr3) # unified interface to machine learning algorithms\nlibrary(mlr3learners) # most important machine learning algorithms\nlibrary(mlr3extralearners) # access to even more learning algorithms\nlibrary(mlr3proba) # make probabilistic predictions, here only needed for mlr3extralearners::list_learners()\nlibrary(mlr3spatiotempcv) # spatio-temporal resampling strategies\nlibrary(mlr3tuning) # hyperparameter tuning\nlibrary(mlr3viz) # plotting functions for mlr3 objects\nlibrary(progressr) # report progress updates\nlibrary(pROC) # compute roc values"},{"path":"spatial-cv.html","id":"intro-cv1","chapter":"12 Statistical learning","heading":"12.1 Introduction","text":"Statistical learning concerned use statistical computational models identifying patterns data predicting patterns.\nDue origins, statistical learning one R’s great strengths (see Section 1.4).85\nStatistical learning combines methods statistics machine learning can categorized supervised unsupervised techniques.\nincreasingly used disciplines ranging physics, biology ecology geography economics (James et al. 
2013).This chapter focuses on supervised techniques in which there is a training dataset, as opposed to unsupervised techniques such as clustering.\nResponse variables can be binary (such as landslide occurrence), categorical (land use), integer (species richness count) or numeric (soil acidity measured in pH).\nSupervised techniques model the relationship between such responses — which are known for a sample of observations — and one or more predictors.The primary aim of much machine learning research is to make good predictions.\nMachine learning thrives in the age of ‘big data’ because its methods make few assumptions about the input variables and can handle huge datasets.\nMachine learning is conducive to tasks such as the prediction of future customer behavior, recommendation services (music, movies, what to buy next), face recognition, autonomous driving, text classification and predictive maintenance (infrastructure, industry).This chapter is based on a case study: modeling the occurrence of landslides.\nThis application links to the applied nature of geocomputation, defined in Chapter 1, and illustrates how machine learning borrows from the field of statistics when the sole aim is prediction.\nTherefore, this chapter first introduces modeling and cross-validation concepts with the help of a Generalized Linear Model (A. Zuur et al. 2009).\nBuilding on this, the chapter implements a more typical machine learning algorithm, namely a Support Vector Machine (SVM).\nThe models’ predictive performance will be assessed using spatial cross-validation (CV), which accounts for the fact that geographic data is special.CV determines a model’s ability to generalize to new data, by splitting a dataset (repeatedly) into training and test sets.\nIt uses the training data to fit the model, and checks its performance when predicting against the test data.\nCV helps to detect overfitting, since models that predict the training data too closely (noise) will tend to perform poorly on the test data.Randomly splitting spatial data can lead to training points that are neighbors in space with test points.\nDue to spatial autocorrelation, the test and training datasets would not be independent in this scenario, with the consequence that CV fails to detect possible overfitting.\nSpatial CV alleviates this problem and is the central theme of this chapter.Hence, we would like to emphasize again that this chapter is focusing on the predictive performance of models.\nIt does not teach how to do predictive mapping.\nThat is the topic of Chapter 15.","code":""},{"path":"spatial-cv.html","id":"case-landslide","chapter":"12 Statistical learning","heading":"12.2 Case study: Landslide susceptibility","text":"This case study is based on a dataset of landslide locations in Southern Ecuador, illustrated in Figure 12.1 and described in detail in Muenchow, Brenning, and Richter (2012).\nA subset of the dataset used in that paper is provided in the spDataLarge package, which can be loaded as follows:The above code loads three objects: a data.frame named lsl, an sf object named study_mask and a SpatRaster (see Section 2.3.4) named ta containing terrain attribute rasters.\nlsl contains a factor column lslpts where TRUE corresponds to an observed landslide ‘initiation point’, with the coordinates stored in columns x and y.86\nThere are 175 landslide and 175 non-landslide points, as shown by summary(lsl$lslpts).\nThe 175 non-landslide points were sampled randomly from the study area, with the restriction that they must fall outside a small buffer around the landslide polygons.\nFIGURE 12.1: Landslide initiation points (red) and points unaffected by landsliding (blue) in Southern Ecuador.\nThe first three rows of lsl, rounded to two significant digits, can be found in Table 12.1.\nTABLE 12.1: Structure of the lsl dataset.\nTo model landslide susceptibility, we need 
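The chapter performs spatial CV in R with mlr3spatiotempcv; as a language-neutral sketch of why spatial partitioning differs from random partitioning, the following Python fragment contrasts random fold assignment with a simple coordinate-based blocking (here, binning points by their x-coordinate so that spatial neighbors share a fold). This is a deliberately simplified illustration with hypothetical coordinates, not the book's method.

```python
# Random k-fold vs. a simple spatial-block fold assignment.
import random

def random_folds(n, k, seed=0):
    """Shuffle indices and deal them into k folds round-robin."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]

def spatial_block_folds(coords, k):
    """Assign each point to a fold by binning its x-coordinate into k blocks,
    so points in the same vertical strip stay together in one test fold."""
    xs = sorted(x for x, _ in coords)
    bounds = [xs[(i * len(xs)) // k] for i in range(1, k)]  # quantile cuts
    folds = [[] for _ in range(k)]
    for i, (x, _) in enumerate(coords):
        folds[sum(x >= b for b in bounds)].append(i)
    return folds

# Twelve hypothetical points along a west-east line
coords = [(float(i), 0.0) for i in range(12)]
rand = random_folds(len(coords), 3)      # neighbors scattered across folds
spatial = spatial_block_folds(coords, 3) # contiguous strips per fold
```

With random folds, a test point's immediate neighbors usually sit in the training set, so spatial autocorrelation leaks information; the blocked variant removes most of that leakage.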
predictors.\nSince terrain attributes are frequently associated with landsliding (Muenchow, Brenning, and Richter 2012), we have already extracted the following terrain attributes from ta to lsl:slope: slope angle (°)cplan: plan curvature (rad m−1) expressing the convergence or divergence of a slope and thus water flowcprof: profile curvature (rad m−1) as a measure of flow acceleration, also known as downslope change in slope angleelev: elevation (m a.s.l.) as the representation of different altitudinal zones of vegetation and precipitation in the study arealog10_carea: the decadic logarithm of the catchment area (log10 m2) representing the amount of water flowing towards a locationIt might be a worthwhile exercise to compute these terrain attributes with the help of R-GIS bridges (see Chapter 10) and extract them to the landslide points (see the Exercise section at the end of this chapter).","code":"\ndata(\"lsl\", \"study_mask\", package = \"spDataLarge\")\nta = terra::rast(system.file(\"raster/ta.tif\", package = \"spDataLarge\"))"},{"path":"spatial-cv.html","id":"conventional-model","chapter":"12 Statistical learning","heading":"12.3 Conventional modeling approach in R","text":"Before introducing the mlr3 package, an umbrella-package providing a unified interface to dozens of learning algorithms (Section 12.5), it is worth taking a look at the conventional modeling interface in R.\nThis introduction to supervised statistical learning provides the basis for doing spatial CV, and contributes to a better grasp of the mlr3 approach presented subsequently.Supervised learning involves predicting a response variable as a function of predictors (Section 12.4).\nIn R, modeling functions are usually specified using formulas (see ?formula for details on R formulas).\nThe following command specifies and runs a generalized linear model:It is worth understanding each of the three input arguments:A formula, which specifies landslide occurrence (lslpts) as a function of the predictorsA family, which specifies the type of model, in this case binomial because the response is binary (see ?family)A data frame which contains the response and the predictors (as columns)The results of the model can be printed as follows (summary(fit) provides a more detailed account of the results):The model object fit, of class glm, contains the coefficients defining the fitted relationship between the response and 
predictors.\nIt can also be used for prediction.\nThis is done with the generic predict() method, which in this case calls the function predict.glm().\nSetting type to response returns the predicted probabilities (of landslide occurrence) for each observation in lsl, as illustrated below (see ?predict.glm).\nSpatial distribution maps can be made by applying the coefficients to the predictor rasters.\nThis can be done manually or with terra::predict().\nIn addition to a model object (fit), the latter function also expects a SpatRaster with the predictors (raster layers) named as in the model’s input data frame (Figure 12.2).\nFIGURE 12.2: Spatial distribution mapping of landslide susceptibility using a GLM.\nHere, when making predictions we neglect spatial autocorrelation, since we assume that on average the predictive accuracy remains the same with or without spatial autocorrelation structures.\nHowever, it is possible to include spatial autocorrelation structures into models as well as into predictions.\nThough this is beyond the scope of this book, we give the interested reader some pointers on where to look:\nThe predictions of regression kriging combine the predictions of a regression with the kriging of the regression’s residuals (Goovaerts 1997; Hengl 2007; Bivand, Pebesma, and Gómez-Rubio 2013).\nOne can also add a spatial correlation (dependency) structure to a generalized least squares model (nlme::gls(); A. Zuur et al. (2009); A. F. Zuur et al. (2017)).\nOne can also use mixed-effect modeling approaches.\nBasically, a random effect imposes a dependency structure on the response variable which in turn allows for observations of one class to be more similar to each other than to those of another class (A. Zuur et al. 2009).\nClasses can be, for example, bee hives, owl nests, vegetation transects or an altitudinal stratification.\nThis mixed modeling approach assumes normal and independently distributed random intercepts.\nThis can even be extended by using a random intercept that is normal and spatially dependent.\nFor this, however, you will likely have to resort to Bayesian modeling approaches, since frequentist software tools are rather limited in this respect, especially for more complex models (Blangiardo and Cameletti 2015; A. F. Zuur et al.
2017).\nSpatial distribution mapping is one very important outcome of a model (Figure 12.2).\nEven more important is how good the underlying model is at making these predictions, since a prediction map is useless if the model’s predictive performance is bad.\nOne of the most popular measures to assess the predictive performance of a binomial model is the Area Under the Receiver Operator Characteristic Curve (AUROC).\nThis is a value between 0.5 and 1.0, with 0.5 indicating a model that is no better than random and 1.0 indicating perfect prediction of the two classes.\nThus, the higher the AUROC, the better the model’s predictive power.\nThe following code chunk computes the AUROC value of the model with roc(), which takes the response and the predicted values as inputs.\nauc() returns the area under the curve.\nAn AUROC value of 0.82 represents a good fit.\nHowever, this is an overoptimistic estimation, since we have computed it on the complete dataset.\nTo derive a bias-reduced assessment, we have to use cross-validation and, in the case of spatial data, make use of spatial CV.","code":"\nfit = glm(lslpts ~ slope + cplan + cprof + elev + log10_carea,\n family = binomial(),\n data = lsl)\nclass(fit)\n#> [1] \"glm\" \"lm\"\nfit\n#> \n#> Call: glm(formula = lslpts ~ slope + cplan + cprof + elev + log10_carea, \n#> family = binomial(), data = lsl)\n#> \n#> Coefficients:\n#> (Intercept) slope cplan cprof elev log10_carea \n#> 2.51e+00 7.90e-02 -2.89e+01 -1.76e+01 1.79e-04 -2.27e+00 \n#> \n#> Degrees of Freedom: 349 Total (i.e. Null); 344 Residual\n#> Null Deviance: 485 \n#> Residual Deviance: 373 AIC: 385\npred_glm = predict(object = fit, type = \"response\")\nhead(pred_glm)\n#> 1 2 3 4 5 6 \n#> 0.1901 0.1172 0.0952 0.2503 0.3382 0.1575\n# making the prediction\npred = terra::predict(ta, model = fit, type = \"response\")\npROC::auc(pROC::roc(lsl$lslpts, fitted(fit)))\n#> Area under the curve: 0.8216"},{"path":"spatial-cv.html","id":"intro-cv","chapter":"12 Statistical learning","heading":"12.4 Introduction to (spatial) cross-validation","text":"Cross-validation belongs to the family of resampling methods (James et al.
2013).\nThe basic idea is to split (repeatedly) a dataset into training and test sets, whereby the training data is used to fit a model which then is applied to the test set.\nComparing the predicted values with the known response values from the test set (using a performance measure such as the AUROC in the binomial case) gives a bias-reduced assessment of the model’s capability to generalize the learned relationship to independent data.\nFor example, a 100-repeated 5-fold cross-validation means to randomly split the data into five partitions (folds) with each fold being used once as a test set (see upper row of Figure 12.3).\nThis guarantees that each observation is used exactly once in one of the test sets, and requires the fitting of five models.\nSubsequently, this procedure is repeated 100 times.\nOf course, the data splitting will differ in each repetition.\nOverall, this sums up to 500 models, whereas the mean performance measure (AUROC) of all models is the model’s overall predictive power.\nHowever, geographic data is special.\nAs we will see in Chapter 13, the ‘first law’ of geography states that points close to each other are, generally, more similar than points further away (Miller 2004).\nThis means these points are not statistically independent because training and test points in conventional CV are often too close to each other (see first row of Figure 12.3).\n‘Training’ observations near the ‘test’ observations can provide a kind of ‘sneak preview’:\ninformation that should be unavailable to the training dataset.\nTo alleviate this problem, ‘spatial partitioning’ is used to split the observations into spatially disjointed subsets (using the observations’ coordinates in a k-means clustering; Brenning (2012b); second row of Figure 12.3).\nThis partitioning strategy is the only difference between spatial and conventional CV.\nAs a result, spatial CV leads to a bias-reduced assessment of a model’s predictive performance, and hence helps to avoid overfitting.\nFIGURE 12.3: Spatial visualization of selected test and training observations for cross-validation of one repetition.
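The random vs. spatial partitioning contrast described above can be sketched in a few lines of base R. This is an illustrative toy example, not code from the book: the coordinates are simulated and all object names are hypothetical. The k-means clustering on the coordinates mimics the coordinate-based partitioning used by spatial CV, while sample() mimics conventional random fold assignment.

```r
set.seed(42)
# simulated point coordinates standing in for observation locations
n = 350
coords = data.frame(x = runif(n), y = runif(n))

# conventional CV: assign each observation to one of five folds at random
random_fold = sample(rep(1:5, length.out = n))

# spatial CV: k-means clustering on the coordinates yields five
# spatially disjoint partitions
spatial_fold = kmeans(coords, centers = 5)$cluster

# every observation belongs to exactly one fold under both schemes
table(random_fold)
table(spatial_fold)
```

Points sharing a spatial_fold are contiguous in space, so test folds are spatially separated from the training data; under random_fold, a test point typically has immediate spatial neighbors in the training set.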
Random (upper row) and spatial partitioning (lower row).\n","code":""},{"path":"spatial-cv.html","id":"spatial-cv-with-mlr3","chapter":"12 Statistical learning","heading":"12.5 Spatial CV with mlr3","text":"\nThere are dozens of packages for statistical learning, as described for example in the CRAN machine learning task view.\nGetting acquainted with each of these packages, including how to undertake cross-validation and hyperparameter tuning, can be a time-consuming process.\nComparing the model results of different packages can be even more laborious.\nThe mlr3 package and ecosystem was developed to address these issues.\nIt acts as a ‘meta-package’, providing a unified interface to popular supervised and unsupervised statistical learning techniques including classification, regression, survival analysis and clustering (Lang et al. 2019; Becker et al. 2022).\nThe standardized mlr3 interface is based on eight ‘building blocks’.\nAs illustrated in Figure 12.4, these have a clear order.\nFIGURE 12.4: Basic building blocks of the mlr3 package. Source: Becker et al. (2022). (Permission to reuse this figure was kindly granted.)\nThe mlr3 modeling process consists of three main stages.\nFirst, a task specifies the data (including response and predictor variables) and the model type (such as regression or classification).\nSecond, a learner defines the specific learning algorithm that is applied to the created task.\nThird, the resampling approach assesses the predictive performance of the model, i.e., its ability to generalize to new data (see also Section 12.4).","code":""},{"path":"spatial-cv.html","id":"glm","chapter":"12 Statistical learning","heading":"12.5.1 Generalized linear model","text":"To use a GLM in mlr3, we must create a task containing the landslide data.\nSince the response is binary (a two-category variable) and has a spatial dimension, we create a classification task with as_task_classif_st() of the mlr3spatiotempcv package (Schratz et al.
2021; for non-spatial tasks, use mlr3::as_task_classif() or, for regression tasks, mlr3::as_task_regr(); see ?Task for all other task types).87\nThe first essential argument of these as_task_ functions is x.\nx expects that the input data includes the response and predictor variables.\nThe target argument indicates the name of the response variable (in our case this is lslpts) and positive determines which of the two factor levels of the response variable indicates the landslide initiation point (in our case this is TRUE).\nAll other variables of the lsl dataset will serve as predictors.\nFor spatial CV, we need to provide a few extra arguments.\nThe coordinate_names argument expects the names of the coordinate columns (see Section 12.4 and Figure 12.3).\nAdditionally, we should indicate the used CRS (crs) and decide if we want to use the coordinates as predictors in the modeling (coords_as_features).\nNote that mlr3spatiotempcv::as_task_classif_st() also accepts an sf-object as input for the backend parameter.\nIn this case, you might only want to additionally specify the coords_as_features argument.\nWe did not convert lsl into an sf-object because as_task_classif_st() would just turn it back into a non-spatial data.table object in the background.\nFor a short data exploration, the autoplot() function of the mlr3viz package might come in handy, since it plots the response against all predictors and all predictors against all predictors (not shown).\nHaving created a task, we need to choose a learner that determines the statistical learning method to use.\nAll classification learners start with classif. and regression learners with regr.
(see ?Learner for details).\nmlr3extralearners::list_mlr3learners() lists all available learners and from which package mlr3 imports them (Table 12.3).\nTo find out about learners that are able to model a binary response variable, we can run:\nTABLE 12.3: Sample of available learners for binomial tasks in the mlr3 package.\nThis yields all learners able to model two-class problems (landslide yes or no).\nWe opt for the binomial classification method used in Section 12.3 and implemented as classif.log_reg in mlr3learners.\nAdditionally, we need to specify the predict.type which determines the type of the prediction, with prob resulting in the predicted probability for landslide occurrence between 0 and 1 (this corresponds to type = response in predict.glm()).\nTo access the help page of the learner and find out from which package it was taken, we can run:\nThe set-up steps for modeling with mlr3 may seem tedious.\nBut remember, this single interface provides access to the 130+ learners shown by mlr3extralearners::list_mlr3learners(); it would be far more tedious to learn the interface for each learner!\nFurther advantages are simple parallelization of resampling techniques and the ability to tune machine learning hyperparameters (see Section 12.5.2).\nMost importantly, (spatial) resampling in mlr3spatiotempcv (Schratz et al.
2021) is straightforward, requiring only two more steps: specifying a resampling method and running it.\nWe will use a 100-repeated 5-fold spatial CV: five partitions will be chosen based on the coordinates provided in the task, and the partitioning will be repeated 100 times:88\nTo execute the spatial resampling, we run resample() using the previously specified task, learner, and resampling strategy.\nThis takes some time (around 15 seconds on a modern laptop) because it computes 500 resampling partitions and 500 models.\nAs performance measure, we again choose the AUROC.\nTo retrieve it, we use the score() method of the resampling result output object (score_spcv_glm).\nThis returns a data.table object with 500 rows – one for each model.\nThe output of the preceding code chunk is a bias-reduced assessment of the model’s predictive performance.\nWe have saved it as extdata/12-bmr_score.rds in the book’s GitHub repository.\nIf required, you can read it in as follows:\nTo compute the mean AUROC over all 500 models, we run:\nTo put these results in perspective, let us compare them with AUROC values from a 100-repeated 5-fold non-spatial cross-validation (Figure 12.5; the code for the non-spatial cross-validation is not shown here but will be explored in the exercise section).\nAs expected (see Section 12.4), the spatially cross-validated result yields lower AUROC values on average than the conventional cross-validation approach, underlining the over-optimistic predictive performance of the latter due to spatial autocorrelation.\nFIGURE 12.5: Boxplot showing the difference in GLM AUROC values between spatial and conventional 100-repeated 5-fold cross-validation.\n","code":"\n# 1. create task\ntask = mlr3spatiotempcv::as_task_classif_st(\n mlr3::as_data_backend(lsl), \n target = \"lslpts\", \n id = \"ecuador_lsl\",\n positive = \"TRUE\",\n coordinate_names = c(\"x\", \"y\"),\n crs = \"EPSG:32717\",\n coords_as_features = FALSE\n )\n# plot response against each predictor\nmlr3viz::autoplot(task, type = \"duo\")\n# plot all variables against each other\nmlr3viz::autoplot(task, type = \"pairs\")\nmlr3extralearners::list_mlr3learners(\n filter = list(class = \"classif\", properties = \"twoclass\"), \n select = c(\"id\", \"mlr3_package\", \"required_packages\")) |>\n head()\n# 2. 
specify learner\nlearner = mlr3::lrn(\"classif.log_reg\", predict_type = \"prob\")\nlearner$help()\n# 3. specify resampling\nresampling = mlr3::rsmp(\"repeated_spcv_coords\", folds = 5, repeats = 100)\n# reduce verbosity\nlgr::get_logger(\"mlr3\")$set_threshold(\"warn\")\n# run spatial cross-validation and save it to resample result glm (rr_glm)\nrr_spcv_glm = mlr3::resample(task = task,\n learner = learner,\n resampling = resampling)\n# compute the AUROC as a data.table\nscore_spcv_glm = rr_spcv_glm$score(measure = mlr3::msr(\"classif.auc\"))\n# keep only the columns you need\nscore_spcv_glm = dplyr::select(score_spcv_glm, task_id, learner_id, \n resampling_id, classif.auc)\nscore = readRDS(\"extdata/12-bmr_score.rds\")\nscore_spcv_glm = dplyr::filter(score, learner_id == \"classif.log_reg\", \n resampling_id == \"repeated_spcv_coords\")\nmean(score_spcv_glm$classif.auc) |>\n round(2)\n#> [1] 0.77"},{"path":"spatial-cv.html","id":"svm","chapter":"12 Statistical learning","heading":"12.5.2 Spatial tuning of machine-learning hyperparameters","text":"Section 12.4 introduced machine learning as part of statistical learning.\nTo recap, we adhere to the following definition of machine learning by Jason Brownlee:\nMachine learning, more specifically the field of predictive modeling, is primarily concerned with minimizing the error of a model or making the most accurate predictions possible, at the expense of explainability.\nIn applied machine learning we will borrow, reuse and steal algorithms from many different fields, including statistics, and use them towards these ends.\nIn Section 12.5.1 a GLM was used to predict landslide susceptibility.\nThis section introduces support vector machines (SVMs) for the same purpose.\nRandom forest models might be more popular than SVMs; however, the positive effect of tuning hyperparameters on model performance is much more pronounced in the case of SVMs (Probst, Wright, and Boulesteix 2018).\nSince (spatial) hyperparameter tuning is the major aim of this section, we will use an SVM.\nFor those wishing to apply a random forest model, we recommend reading this chapter, and then proceeding to Chapter 15 in which we will apply the currently covered concepts and techniques to make spatial distribution maps based on
a random forest model.\nSVMs search for the best possible ‘hyperplanes’ to separate classes (in a classification case) and estimate ‘kernels’ with specific hyperparameters to create non-linear boundaries between classes (James et al. 2013).\nMachine learning algorithms often feature hyperparameters and parameters.\nParameters can be estimated from the data, while hyperparameters are set before the learning begins (see also the machine mastery blog and the hyperparameter optimization chapter of the mlr3 book).\nThe optimal hyperparameter configuration is usually found within a specific search space, determined with the help of cross-validation methods.\nThis is called hyperparameter tuning and is the main topic of this section.\nSome SVM implementations such as that provided by kernlab allow hyperparameters to be tuned automatically, usually based on random sampling (see upper row of Figure 12.3).\nThis works for non-spatial data but is of less use for spatial data where ‘spatial tuning’ should be undertaken.\nBefore defining spatial tuning, we will set up the mlr3 building blocks, introduced in Section 12.5.1, for the SVM.\nThe classification task remains the same, hence we can simply reuse the task object created in Section 12.5.1.\nLearners implementing SVM can be found using the list_mlr3learners() command of mlr3extralearners.\nOf the options, we will use ksvm() from the kernlab package (Karatzoglou et al.
2004).\nTo allow for non-linear relationships, we use the popular radial basis function (or Gaussian) kernel (\"rbfdot\") which is also the default of ksvm().\nSetting the type argument to \"C-svc\" makes sure that ksvm() is solving a classification task.\nTo make sure that the tuning does not stop because of one failing model, we additionally define a fallback learner (for more information please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback).\nThe next stage is to specify a resampling strategy.\nAgain we will use a 100-repeated 5-fold spatial CV.\nNote that this is the exact same code as used for the resampling of the GLM in Section 12.5.1; we have simply repeated it here as a reminder.\nSo far, the process has been identical to that described in Section 12.5.1.\nThe next step is new, however: to tune the hyperparameters.\nUsing the same data for the performance assessment and the tuning would potentially lead to overoptimistic results (Cawley and Talbot 2010).\nThis can be avoided using nested spatial CV.\nFIGURE 12.6: Schematic of hyperparameter tuning and performance estimation levels in CV. (Figure taken from Schratz et al. (2019). Permission to reuse it was kindly granted.)\nThis means that we split each fold again into five spatially disjoint subfolds which are used to determine the optimal hyperparameters (tune_level object in the code chunk below; see Figure 12.6 for a visual representation).\nThe random selection of values for C and Sigma is additionally restricted to a predefined tuning space (search_space object).\nThe range of the tuning space was chosen with values recommended in the literature (Schratz et al.
2019).\nTo find the optimal hyperparameter combination, we fit 50 models (terminator object in the code chunk below) in each of these subfolds with randomly selected values for the hyperparameters C and Sigma.\nThe next stage is to modify the learner lrn_ksvm in accordance with all the characteristics defining the hyperparameter tuning with auto_tuner().\nThe tuning is now set up to fit 250 models to determine the optimal hyperparameters for one fold.\nRepeating this for each fold, we end up with 1,250 (250 * 5) models for each repetition.\nRepeated 100 times means fitting a total of 125,000 models to identify optimal hyperparameters (Figure 12.3).\nThese are used in the performance estimation, which requires the fitting of another 500 models (5 folds * 100 repetitions; see Figure 12.3).\nTo make the performance estimation processing chain even clearer, let us write down the commands we have given to the computer:\nPerformance level (upper left part of Figure 12.6) - split the dataset into five spatially disjoint (outer) subfolds\nTuning level (lower left part of Figure 12.6) - use the first fold of the performance level and split it again spatially into five (inner) subfolds for the hyperparameter tuning.\nUse the 50 randomly selected hyperparameters in each of these inner subfolds, i.e., fit 250 models\nPerformance estimation - Use the best hyperparameter combination from the previous step (tuning level) and apply it to the first outer fold in the performance level to estimate the performance (AUROC)\nRepeat steps 2 and 3 for the remaining four outer folds\nRepeat steps 2 to 4, 100 times\nThe process of hyperparameter tuning and performance estimation is computationally intensive.\nTo decrease model runtime, mlr3 offers the possibility to use parallelization with the help of the future package.\nSince we are about to run a nested cross-validation, we can decide if we would like to parallelize the inner or the outer loop (see lower left part of Figure 12.6).\nSince the former will run 125,000 models, whereas the latter only runs 500, it is quite obvious that we should parallelize the inner loop.\nTo set up the parallelization of the inner loop, we run:\nAdditionally, we instructed future to only use half instead of all available cores (the default), a setting that allows other possible users to work on the same high performance computing cluster in case one is used.\nNow we are set up for computing the nested spatial CV.\nSpecifying the resample() parameters follows the exact same procedure as presented when using a GLM, the only difference being the store_models and encapsulate
arguments.\nSetting the former to TRUE would allow the extraction of the hyperparameter tuning results, which is important if we plan follow-up analyses on the tuning.\nThe latter ensures that the processing continues even if one of the models throws an error.\nThis avoids the process stopping just because of one failed model, which is desirable on large model runs.\nOnce the processing is completed, one can have a look at the failed models.\nAfter the processing, it is good practice to explicitly stop the parallelization with future:::ClusterRegistry(\"stop\").\nFinally, we save the output object (result) to disk in case we would like to use it in another R session.\nBefore running the subsequent code, be aware that it is time-consuming since it will run the spatial cross-validation with 125,500 models.\nIt can easily run for half a day on a modern laptop.\nNote that runtime depends on many aspects: CPU speed, the selected algorithm, the selected number of cores and the dataset.\nIn case you do not want to run the code locally, we have saved score_svm in the book’s GitHub repository.\nIt can be loaded as follows:\nLet us have a look at the final AUROC: the model’s ability to discriminate the two classes.\nIt appears that the GLM (aggregated AUROC of 0.77) is slightly better than the SVM in this specific case.\nTo guarantee an absolutely fair comparison, one should also make sure that the two models use the exact same partitions – something we have not shown here but have silently used in the background (see code/12_cv.R in the book’s GitHub repository for more information).\nFor this, mlr3 offers the functions benchmark_grid() and benchmark() (see also https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-benchmarking; Becker et al. 2022).\nWe will explore these functions in more detail in the Exercises.\nPlease note also that using more than 50 iterations in the random search of the SVM would probably yield hyperparameters that result in models with a better AUROC (Schratz et al.
2019).\nOn the other hand, increasing the number of random search iterations would also increase the total number of models and thus the runtime.\nSo far spatial CV has been used to assess the ability of learning algorithms to generalize to unseen data.\nFor predictive mapping purposes, one would tune the hyperparameters on the complete dataset.\nThis will be covered in Chapter 15.","code":"\nmlr3_learners = mlr3extralearners::list_mlr3learners()\n#> This will take a few seconds.\nmlr3_learners |>\n dplyr::filter(class == \"classif\" & grepl(\"svm\", id)) |>\n dplyr::select(id, class, mlr3_package, required_packages)\n#> id class mlr3_package required_packages\n#> \n#> 1: classif.ksvm classif mlr3extralearners mlr3,mlr3extralearners,kernlab\n#> 2: classif.lssvm classif mlr3extralearners mlr3,mlr3extralearners,kernlab\n#> 3: classif.svm classif mlr3learners mlr3,mlr3learners,e1071\nlrn_ksvm = mlr3::lrn(\"classif.ksvm\", predict_type = \"prob\", kernel = \"rbfdot\",\n type = \"C-svc\")\nlrn_ksvm$encapsulate(method = \"try\", \n fallback = lrn(\"classif.featureless\", \n predict_type = \"prob\"))\n# performance estimation level\nperf_level = mlr3::rsmp(\"repeated_spcv_coords\", folds = 5, repeats = 100)\n# five spatially disjoint partitions\ntune_level = mlr3::rsmp(\"spcv_coords\", folds = 5)\n# define the outer limits of the randomly selected hyperparameters\nsearch_space = paradox::ps(\n C = paradox::p_dbl(lower = -12, upper = 15, trafo = function(x) 2^x),\n sigma = paradox::p_dbl(lower = -15, upper = 6, trafo = function(x) 2^x)\n)\n# use 50 randomly selected hyperparameters\nterminator = mlr3tuning::trm(\"evals\", n_evals = 50)\ntuner = mlr3tuning::tnr(\"random_search\")\nat_ksvm = mlr3tuning::auto_tuner(\n learner = lrn_ksvm,\n resampling = tune_level,\n measure = mlr3::msr(\"classif.auc\"),\n search_space = search_space,\n terminator = terminator,\n tuner = tuner\n)\nlibrary(future)\n# execute the outer loop sequentially and parallelize the inner loop\nfuture::plan(list(\"sequential\", \"multisession\"), \n workers = floor(availableCores() /
2))\nprogressr::with_progress(expr = {\n rr_spcv_svm = mlr3::resample(task = task,\n learner = at_ksvm, \n # outer resampling (performance level)\n resampling = perf_level,\n store_models = FALSE,\n encapsulate = \"evaluate\")\n})\n# stop parallelization\nfuture:::ClusterRegistry(\"stop\")\n# compute the AUROC values\nscore_spcv_svm = rr_spcv_svm$score(measure = mlr3::msr(\"classif.auc\")) \n# keep only the columns you need\nscore_spcv_svm = dplyr::select(score_spcv_svm, task_id, learner_id, \n resampling_id, classif.auc)\nscore = readRDS(\"extdata/12-bmr_score.rds\")\nscore_spcv_svm = dplyr::filter(score, learner_id == \"classif.ksvm.tuned\", \n resampling_id == \"repeated_spcv_coords\")\n# final mean AUROC\nround(mean(score_spcv_svm$classif.auc), 2)\n#> [1] 0.74"},{"path":"spatial-cv.html","id":"conclusions","chapter":"12 Statistical learning","heading":"12.6 Conclusions","text":"Resampling methods are an important part of a data scientist’s toolbox (James et al. 2013).\nThis chapter used cross-validation to assess the predictive performance of various models.\nAs described in Section 12.4, observations with spatial coordinates may not be statistically independent due to spatial autocorrelation, violating a fundamental assumption of cross-validation.\nSpatial CV addresses this issue by reducing the bias introduced by spatial autocorrelation.\nThe mlr3 package facilitates (spatial) resampling techniques in combination with the most popular statistical learning techniques, including linear regression, semi-parametric models such as generalized additive models, and machine learning techniques such as random forests, SVMs, and boosted regression trees (Bischl et al. 2016; Schratz et al.
2019).\nMachine learning algorithms often require hyperparameter inputs, the optimal ‘tuning’ of which can require thousands of model runs which require large computational resources, consuming much time, RAM and/or cores.\nmlr3 tackles this issue by enabling parallelization.\nMachine learning overall, and its use to understand spatial data, is a large field and this chapter has provided the basics, but there is more to learn.\nWe recommend the following resources in this direction:\nThe mlr3 book (Becker et al. (2022); https://mlr3book.mlr-org.com/) and especially the chapter on the handling of spatiotemporal data\nAn academic paper about hyperparameter tuning (Schratz et al. 2019)\nAn academic paper about how to use mlr3spatiotempcv (Schratz et al. 2021)\nIn the case of spatiotemporal data, one should account for spatial and temporal autocorrelation when doing CV (Meyer et al. 2018)","code":""},{"path":"spatial-cv.html","id":"exercises-9","chapter":"12 Statistical learning","heading":"12.7 Exercises","text":"E1. Compute the following terrain attributes from the elev dataset loaded with terra::rast(system.file(\"raster/ta.tif\", package = \"spDataLarge\"))$elev with the help of R-GIS bridges (see the bridges to GIS software chapter):\nSlope\nPlan curvature\nProfile curvature\nCatchment area\nE2. Extract the values from the corresponding output rasters to the lsl data frame (data(\"lsl\", package = \"spDataLarge\")) by adding new variables called slope, cplan, cprof, elev and log_carea.\nE3. Use the derived terrain attribute rasters in combination with a GLM to make a spatial prediction map similar to that shown in Figure 12.2.\nRunning data(\"study_mask\", package = \"spDataLarge\") attaches a mask of the study area.\nE4. 
Compute a 100-repeated 5-fold non-spatial cross-validation and spatial CV based on the GLM learner and compare the AUROC values from both resampling strategies with the help of boxplots.\nHint: You need to specify a non-spatial resampling strategy.\nAnother hint: You might want to solve Exercises 4 to 6 in one go with the help of mlr3::benchmark() and mlr3::benchmark_grid() (for more information, please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback).\nWhen doing so, keep in mind that the computation can take very long, probably several days.\nThis, of course, depends on your system.\nComputation time will be shorter the more RAM and cores you have at your disposal.\nE5. Model landslide susceptibility using a quadratic discriminant analysis (QDA).\nAssess the predictive performance of the QDA.\nWhat is the difference between the spatially cross-validated mean AUROC value of the QDA and the GLM?\nE6. Run the SVM without tuning the hyperparameters.\nUse the rbfdot kernel with \\(\\sigma\\) = 1 and C = 1.\nLeaving the hyperparameters unspecified in kernlab’s ksvm() would otherwise initialize an automatic non-spatial hyperparameter tuning.","code":""},{"path":"transport.html","id":"transport","chapter":"13 Transportation","heading":"13 Transportation","text":"","code":""},{"path":"transport.html","id":"prerequisites-11","chapter":"13 Transportation","heading":"Prerequisites","text":"This chapter uses the following packages:89","code":"\nlibrary(sf)\nlibrary(dplyr)\nlibrary(spDataLarge)\nlibrary(stplanr) # for processing geographic transport data\nlibrary(tmap) # map-making (see Chapter 9)\nlibrary(ggplot2) # data visualization package\nlibrary(sfnetworks) # spatial network classes and functions "},{"path":"transport.html","id":"introduction-7","chapter":"13 Transportation","heading":"13.1 Introduction","text":"In few sectors of activity is geographic space more tangible than transportation.\nThe effort of moving (overcoming distance) is central to the ‘first law’ of geography, defined by Waldo Tobler in 1970 as follows (Waldo R.
Tobler 1970):\nEverything is related to everything else, but near things are more related than distant things.\nThis ‘law’ is the basis for spatial autocorrelation and other key geographic concepts.\nIt applies to phenomena as diverse as friendship networks and ecological diversity, and can be explained by the costs of transport — in terms of time, energy and money — which constitute the ‘friction of distance’.\nFrom this perspective, transport technologies are disruptive, changing spatial relationships between geographic entities including mobile humans and goods: “the purpose of transportation is to overcome space” (Rodrigue, Comtois, and Slack 2013).\nTransport is an inherently spatial activity, involving movement from an origin point ‘A’ to a destination point ‘B’, through infinite localities in between.\nIt is therefore unsurprising that transport researchers have long turned to geographic and computational methods to understand movement patterns, and how interventions can improve their performance (Lovelace 2021).\nThis chapter introduces the geographic analysis of transport systems at different geographic levels:\nAreal units: transport patterns can be understood with reference to zonal aggregates, such as the main mode of travel (by car, bike or foot, for example) and the average distance of trips made by people living in a particular zone, covered in Section 13.3\nDesire lines: straight lines that represent ‘origin-destination’ data that records how many people travel (or could travel) between places (points or zones) in geographic space, the topic of Section 13.4\nNodes: these are points in the transport system that can represent common origins and destinations and public transport stations such as bus stops and rail stations, the topic of Section 13.5\nRoutes: these are lines representing a path along the route network along desire lines and between nodes.\nRoutes (which can be represented as single linestrings or multiple short segments) and the routing engines that generate them are covered in Section 13.6\nRoute networks: these represent the system of roads, paths and other linear features in an area and are covered in Section 13.7.\nThey can be represented as geographic features (typically short segments of road that add up to create a full network) or structured as an interconnected graph, with the level of traffic on different segments referred to as ‘flow’ by transport modelers (Hollander 2016)\nAnother key level is agents, mobile entities like vehicles that enable us to move, such as bikes and buses.\nThese can be represented computationally
in software such as MATSim and A/B Street, which represent the dynamics of transport systems using an agent-based modeling (ABM) framework, usually at high levels of spatial and temporal resolution (Horni, Nagel, and Axhausen 2016).\nABM is a powerful approach to transport research with great potential for integration with R’s spatial classes (Thiele 2014; Lovelace and Dumont 2016), but it is outside the scope of this chapter.\nBeyond geographic levels and agents, the basic unit of analysis in many transport models is the trip, a single purpose journey from an origin ‘A’ to a destination ‘B’ (Hollander 2016).\nTrips join up the different levels of transport systems and can be represented simplistically as geographic desire lines connecting zone centroids (nodes) or as routes that follow the transport route network.\nIn this context, agents are usually point entities that move within the transport network.\nTransport systems are dynamic (Xie and Levinson 2011).\nWhile the focus of this chapter is the geographic analysis of transport systems, it provides insights into how the approach can be used to simulate scenarios of change, in Section 13.8.\nThe purpose of geographic transport modeling can be interpreted as simplifying the complexity of these spatiotemporal systems in ways that capture their essence.\nSelecting appropriate levels of geographic analysis can help simplify this complexity without losing its most important features and variables, enabling better decision-making and more effective interventions (Hollander 2016).\nTypically, models are designed to tackle a particular problem, such as how to improve the safety or environmental performance of transport systems.\nFor this reason, this chapter is based around a policy scenario, introduced in the next section, that asks: how can we increase cycling in the city of Bristol?\nChapter 14 demonstrates a related application of geocomputation: prioritizing the location of new bike shops.\nThere is a link between the chapters: new and effectively-located cycling infrastructure can get people cycling, boosting demand for bike shops and local economic activity.\nThis highlights an important feature of transport systems: they are closely linked to broader phenomena and land-use patterns.","code":""},{"path":"transport.html","id":"bris-case","chapter":"13 Transportation","heading":"13.2 A case study of Bristol","text":"The case study used in this chapter is located in Bristol, a city in the west of England, around 30 km
east of the Welsh capital Cardiff.\nAn overview of the region’s transport network is illustrated in Figure 13.1, which shows a diversity of transport infrastructure, for cycling, public transport, and private motor vehicles.\nFIGURE 13.1: Bristol’s transport network represented by colored lines for active (green), public (railways, blue) and private motor (red) modes of travel. Black border lines represent the inner city boundary (highlighted in yellow) and the larger Travel To Work Area (TTWA).\nBristol is the 10th largest city council in England, with a population of half a million people, although its travel catchment area is larger (see Section 13.3).\nIt has a vibrant economy with aerospace, media, financial service and tourism companies, alongside two major universities.\nBristol shows a high average income per person but also contains areas of severe deprivation (Bristol City Council 2015).\nIn terms of transport, Bristol is well served by rail and road links, and has a relatively high level of active travel.\n19% of its citizens cycle and 88% walk at least once per month according to the Active People Survey (the national averages are 15% and 81%, respectively).\n8% of the population said they cycled to work in the 2011 census, compared with only 3% nationwide.\nLike many cities, Bristol has major congestion, air quality and physical inactivity problems.\nCycling can tackle all of these issues efficiently: it has a greater potential to replace car trips than walking, with typical speeds of 15-20 km/h vs 4-6 km/h for walking.\nFor this reason Bristol’s Transport Strategy has ambitious plans for cycling.\nTo highlight the importance of policy considerations in transportation research, this chapter is guided by the need to provide evidence for people (transport planners, politicians and other stakeholders) tasked with getting people out of cars and onto more sustainable modes — walking and cycling in particular.\nThe broader aim is to demonstrate how geocomputation can support evidence-based transport planning.\nIn this chapter you will learn how to:\nDescribe the geographical patterns of transport behavior in cities\nIdentify key public transport nodes supporting multi-modal trips\nAnalyze travel ‘desire lines’ to find out how many people drive short distances\nIdentify cycle route locations that will encourage less car driving and more cycling\nTo get the wheels rolling on the practical aspects of this chapter, the next section begins
by loading zonal data on travel patterns.\nThese zone-level datasets are small but often vital for gaining a basic understanding of a settlement’s overall transport system.","code":""},{"path":"transport.html","id":"transport-zones","chapter":"13 Transportation","heading":"13.3 Transport zones","text":"Although transport systems are primarily based on linear features and nodes — including pathways and stations — it often makes sense to start with areal data, to break continuous space into tangible units (Hollander 2016).\nIn addition to the boundary defining the study area (Bristol in this case), two zone types are of particular interest to transport researchers: origin and destination zones.\nOften, the same geographic units are used for origins and destinations.\nHowever, different zoning systems, such as ‘Workplace Zones’, may be appropriate to represent the increased density of trip destinations in areas with many ‘trip attractors’ such as schools and shops (Office for National Statistics 2014).\nThe simplest way to define a study area is often the first matching boundary returned by OpenStreetMap.\nThis can be done with the command osmdata::getbb(\"Bristol\", format_out = \"sf_polygon\", limit = 1).\nThis returns an sf object (or a list of sf objects if limit = 1 is not specified) representing the bounds of the largest matching city region, either a rectangular polygon of the bounding box or a detailed polygonal boundary.90\nFor Bristol, a detailed polygon is returned, as represented by the bristol_region object in the spDataLarge package.\nSee the inner blue boundary in Figure 13.1: there are a couple of issues with this approach:\nThe first boundary returned by OSM may not be the official boundary used by local authorities\nEven if OSM returns the official boundary, this may be inappropriate for transport research because it may bear little relation to where people travel\nTravel To Work Areas (TTWAs) address these issues by creating a zoning system analogous to hydrological watersheds.\nTTWAs were first defined as contiguous zones within which 75% of the population travels to work (Coombes, Green, and Openshaw 1986), and this is the definition used in this chapter.\nBecause Bristol is a major employer attracting travel from surrounding towns, its TTWA is substantially larger than the city bounds (see Figure 13.1).\nThe polygon representing this transport-orientated boundary is stored in the object bristol_ttwa, provided by the spDataLarge package loaded
beginning of this chapter. The origin and destination zones used in this chapter are the same: officially defined zones of intermediate geographic resolution (their official name is Middle layer Super Output Areas, or MSOAs).\nEach houses around 8,000 people.\nSuch administrative zones can provide vital context for transport analysis, such as the type of people who might benefit most from particular interventions (e.g., Moreno-Monroy, Lovelace, and Ramos (2017)). The geographic resolution of these zones is important: small zones with high geographic resolution are usually preferable, but their high number in large regions can have consequences for processing (especially for origin-destination analysis, in which the number of possibilities increases as a non-linear function of the number of zones) (Hollander 2016).\nAnother issue with small zones is related to anonymity rules. To make it impossible to infer the identity of individuals in zones, detailed socio-demographic variables are often only available at low geographic resolution. Breakdowns of travel mode by age and sex, for example, are available at the Local Authority level in the UK, but not at the much higher Output Area level, each of which contains around 100 households.\nFor more details, see www.ons.gov.uk/methodology/geography.\nThe 102 zones used in this chapter are stored in bristol_zones, illustrated in Figure 13.2.\nNote the zones get smaller in densely populated areas: each houses a similar number of people.\nbristol_zones contains no attribute data on transport, however, only the name and code of each zone: To add travel data, we will perform an attribute join, a common task described in Section 3.2.4.\nWe will use travel data from the UK’s 2011 census question on travel to work, data stored in bristol_od, which was provided by the ons.gov.uk data portal.\nbristol_od is an origin-destination (OD) dataset on travel to work between zones from the UK’s 2011 Census (see Section 13.4).\nThe first column is the ID of the zone of origin and the second column is the zone of destination.\nbristol_od has more rows than bristol_zones, representing travel between zones rather than the zones themselves: The results of the previous code chunk show that there are more than 10 OD pairs for every zone, meaning we need to aggregate the origin-destination data before it can be joined with bristol_zones, as illustrated below (origin-destination data is described in Section 13.4). The preceding chunk: 1) grouped the data by zone of origin (contained in the column o); 2) aggregated the variables in the bristol_od dataset if they were numeric, to find the total number of people living in each
zone by mode of transport;91 and 3) renamed the grouping variable o so that it matches the ID column geo_code in the bristol_zones object. The resulting object zones_attr is a data frame with rows representing zones and an ID variable.\nWe can verify that the IDs match those in the zones dataset using the %in% operator as follows: The results show that all 102 zones are present in the new object and that zone_attr is in a form that can be joined onto the zones.92\nThis is done using the joining function left_join() (note that inner_join() would produce the same result): The result is zones_joined, which contains new columns representing the total number of trips originating in each zone in the study area (almost 1/4 of a million) and their mode of travel (by bicycle, foot, car and train).\nThe geographic distribution of trip origins is illustrated in the left-hand map in Figure 13.2.\nThis shows that most zones have between 0 and 4,000 trips originating from them in the study area.\nMore trips are made by people living near the center of Bristol and fewer on the outskirts.\nWhy is this?\nRemember that we are only dealing with trips within the study region: low trip numbers on the outskirts of the region can be explained by the fact that many people in these peripheral zones travel to other regions outside the study area.\nTrips outside the study region can be included in regional models by a special destination ID covering trips that go to a zone not represented in the model (Hollander 2016).\nThe data in bristol_od, however, simply ignores such trips: it is an ‘intra-zonal’ model. In the same way that OD datasets can be aggregated by zone of origin, they can also be aggregated to provide information about destination zones.\nPeople tend to gravitate towards central places.\nThis explains why the spatial distribution represented in the right panel of Figure 13.2 is relatively uneven, with the most common destination zones concentrated in Bristol city center.\nThe result is zones_od, which contains a new column reporting the number of trip destinations by any mode, and was created as follows: A simplified version of Figure 13.2 is created with the code below (see 13-zones.R in the code folder of the book’s GitHub repository to reproduce the figure and Section 9.2.7 for details on faceted maps with tmap):\nFIGURE 13.2: Number of trips (commuters) living and working in the region.
The left map shows the zone of origin of commute trips; the right map shows the zone of destination (generated with the script 13-zones.R).\n","code":"\nnames(bristol_zones)\n#> [1] \"geo_code\" \"name\" \"geometry\"\nnrow(bristol_od)\n#> [1] 2910\nnrow(bristol_zones)\n#> [1] 102\nzones_attr = bristol_od |> \n group_by(o) |> \n summarize(across(where(is.numeric), sum)) |> \n dplyr::rename(geo_code = o)\nsummary(zones_attr$geo_code %in% bristol_zones$geo_code)\n#> Mode TRUE \n#> logical 102\nzones_joined = left_join(bristol_zones, zones_attr, by = \"geo_code\")\nsum(zones_joined$all)\n#> [1] 238805\nnames(zones_joined)\n#> [1] \"geo_code\" \"name\" \"all\" \"bicycle\" \"foot\" \n#> [6] \"car_driver\" \"train\" \"geometry\"\nzones_destinations = bristol_od |> \n group_by(d) |> \n summarize(across(where(is.numeric), sum)) |> \n select(geo_code = d, all_dest = all)\nzones_od = inner_join(zones_joined, zones_destinations, by = \"geo_code\")\nqtm(zones_od, c(\"all\", \"all_dest\")) +\n tm_layout(panel.labels = c(\"Origin\", \"Destination\"))"},{"path":"transport.html","id":"desire-lines","chapter":"13 Transportation","heading":"13.4 Desire lines","text":"Desire lines connect origins and destinations, representing where people desire to go, typically between zones.\nThey represent the quickest ‘bee line’ or ‘crow flies’ route from A to B that would be taken, if it were not for obstacles such as buildings and windy roads getting in the way (we will see how to convert desire lines into routes in the next section).\nTypically, desire lines are represented geographically as starting and ending at the geographic (or population weighted) centroid of each zone.\nThis is the type of desire line that we will create and use in this section, although it is worth being aware of ‘jittering’ techniques that enable multiple start and end points, to increase the spatial coverage and accuracy of analyses building on OD data (Lovelace, Félix, and Carlino 2022). We have already loaded data representing desire lines in the dataset bristol_od.\nThis origin-destination (OD) data frame object represents the number of people traveling between the zones represented in o and d, as illustrated in Table 13.1.\nTo arrange the OD data by all trips and then filter out the top 5, type (please refer to Chapter 3 for a detailed description of non-spatial attribute
operations):TABLE 13.1: Sample of the top 5 origin-destination pairs in the Bristol OD data frame, representing travel desire lines between zones in the study area. The resulting table provides a snapshot of Bristolian travel patterns in terms of commuting (travel to work).\nIt demonstrates that walking is the most popular mode of transport among the top 5 origin-destination pairs, that zone E02003043 is a popular destination (Bristol city center, the destination of all the top 5 OD pairs), and that intrazonal trips, from one part of zone E02003043 to another (the first row of Table 13.1), constitute the most traveled OD pair in the dataset.\nFrom a policy perspective, however, the raw data presented in Table 13.1 is of limited use: aside from the fact that it contains only a tiny portion of the 2,910 OD pairs, it tells us little about where policy measures are needed, or what proportion of trips are made by walking and cycling.\nThe following command calculates the percentage of each desire line that is made by these active modes: There are two main types of OD pair: interzonal and intrazonal.\nInterzonal OD pairs represent travel between zones in which the destination is different from the origin.\nIntrazonal OD pairs represent travel within the same zone (see the top row of Table 13.1).\nThe following code chunk splits bristol_od into these two types: The next step is to convert the interzonal OD pairs into an sf object representing desire lines that can be plotted on a map with the stplanr function od2line().93 An illustration of the results is presented in Figure 13.3, a simplified version of which is created with the following command (see the code in 13-desire.R to reproduce the figure exactly and Chapter 9 for details on visualization with tmap):\nFIGURE 13.3: Desire lines representing trip patterns in Bristol, with width representing the number of trips and color representing the percentage of trips made by active modes (walking and cycling).
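The ‘crow flies’ geometry underlying such desire lines can be illustrated with a minimal sf sketch; note that the coordinates below are hypothetical stand-ins for population weighted zone centroids, not values taken from the Bristol datasets:

```r
library(sf)
# Two hypothetical zone centroids in lon/lat (EPSG:4326), standing in
# for the population weighted centroids discussed above
origin = c(-2.60, 51.45)
destination = c(-2.58, 51.47)
# A desire line is simply the straight segment joining the two centroids
desire_line = st_sfc(st_linestring(rbind(origin, destination)),
                     crs = "EPSG:4326")
# st_length() returns the geodesic 'crow flies' distance (meters here)
as.numeric(st_length(desire_line))
```

The same st_length() call, applied to a full set of desire lines, yields the Euclidean trip distances used later in the chapter to filter short car-dependent trips.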
The four black lines represent the interzonal OD pairs in Table 13.1.\nThe map shows that the city center dominates transport patterns in the region, suggesting policies should be prioritized there, although a number of peripheral sub-centers can also be seen.\nDesire lines are important generalized components of transport systems.\nMore concrete components include nodes, which have specific destinations (rather than the hypothetical straight lines represented by desire lines).\nNodes are covered in the next section.","code":"\nod_top5 = bristol_od |> \n slice_max(all, n = 5)\nbristol_od$Active = (bristol_od$bicycle + bristol_od$foot) /\n bristol_od$all * 100\nod_intra = filter(bristol_od, o == d)\nod_inter = filter(bristol_od, o != d)\ndesire_lines = od2line(od_inter, zones_od)\n#> Creating centroids representing desire line start and end points.\nqtm(desire_lines, lines.lwd = \"all\")"},{"path":"transport.html","id":"nodes","chapter":"13 Transportation","heading":"13.5 Nodes","text":"Nodes in geographic transport datasets are points among the predominantly linear features that comprise transport networks.\nBroadly there are two main types of transport nodes: 1) nodes not directly on the network, such as zone centroids or individual origins and destinations such as houses and workplaces; and 2) nodes that are part of transport networks.\nTechnically, a node can be located at any point on a transport network, but in practice they are often special kinds of vertex, such as intersections between pathways (junctions) and points for entering or exiting the transport network, such as bus stops and train stations.94 Transport networks can be represented as graphs, in which each segment is connected (via edges representing geographic lines) to one or more other edges in the network.\nNodes outside the network can be added with “centroid connectors”, new route segments joining them to nearby nodes on the network (Hollander 2016).95\nEvery node in the network is then connected by one or more ‘edges’ that represent individual segments of the network.\nWe will see how transport networks can be represented as graphs in Section 13.7. Public transport stops are particularly important nodes that can be represented as either type of node: a bus stop that is part of a road, or a large rail station that is represented by its pedestrian entry point hundreds of meters from the railway tracks.\nWe will use railway stations to illustrate public transport nodes, in relation to the
research question of increasing cycling in Bristol.\nThese stations are provided by spDataLarge in bristol_stations. A common barrier preventing people from switching away from cars for commuting to work is that the distance from home to work is too far to walk or cycle.\nPublic transport can reduce this barrier by providing a fast and high-volume option for common routes into cities.\nFrom an active travel perspective, public transport ‘legs’ of longer journeys divide trips into three: 1) the origin leg, typically from residential areas to public transport stations; 2) the public transport leg, which typically goes from the station nearest the trip’s origin to the station nearest its destination; and 3) the destination leg, from the station of alighting to the destination. Building on the analysis conducted in Section 13.4, public transport nodes can be used to construct three-part desire lines for trips that can be taken by bus and (the mode used in this example) rail.\nThe first stage is to identify the desire lines with the most public transport travel, which in our case is easy because our previously created dataset desire_lines already contains a variable describing the number of trips by train (the public transport potential could also be estimated using public transport routing services such as OpenTripPlanner).\nTo make the approach easier to follow, we will select only the top three desire lines in terms of rail use: The challenge now is to ‘break up’ these lines into three pieces, representing travel via public transport nodes.\nThis can be done by converting each desire line into a multilinestring object consisting of three line geometries representing the origin, public transport and destination legs of the trip.\nThis operation can be divided into three stages: matrix creation (of origins, destinations and the ‘via’ points representing rail stations), identification of nearest neighbors and conversion to multilinestrings.\nThese are undertaken by line_via().\nThis stplanr function takes input lines and points and returns a copy of the desire lines (see ?line_via() for details on how this works).\nThe output is the same as the input line, except it has new geometry columns representing the journey via public transport nodes, as demonstrated below: As illustrated in Figure 13.4, the initial desire_rail lines now have three additional geometry list columns representing travel from home to the origin station, from there to the destination station, and finally from the destination station to the destination.\nIn this case the destination legs are very short (walking distance) but the origin legs may be
sufficiently far to justify investment in cycling infrastructure, to encourage people to cycle to the stations on the outward leg of peoples’ journeys to work in the residential areas surrounding the three origin stations in Figure 13.4.\nFIGURE 13.4: Station nodes (red dots) used as intermediary points to convert straight desire lines with high rail usage (thin green lines) into three legs: to the origin station (orange), via public transport (blue) and to the destination (pink, not visible because it is so short).\n","code":"\ndesire_rail = top_n(desire_lines, n = 3, wt = train)\nncol(desire_rail)\n#> [1] 9\ndesire_rail = line_via(desire_rail, bristol_stations)\nncol(desire_rail)\n#> [1] 12"},{"path":"transport.html","id":"routes","chapter":"13 Transportation","heading":"13.6 Routes","text":"\nFrom a geographical perspective, routes are desire lines that are no longer straight: the origin and destination points are the same as in the desire line representation of travel, but the pathway to get from A to B is more complex.\nThe geometries of routes are typically (but not always) determined by the transport network. While desire lines contain only two vertices (their beginning and end points), routes can contain any number of vertices, representing points between A and B joined by straight lines: the definition of a linestring geometry.\nRoutes covering large distances or following intricate networks can have many hundreds of vertices; routes on grid-based or simplified road networks tend to have fewer. Routes are generated from desire lines or, more commonly, from matrices containing coordinate pairs representing desire lines.\nThis routing process is done by a range of broadly-defined routing engines: software and web services that return geometries and attributes describing how to get from origins to destinations.\nRouting engines can be classified based on where they run relative to R: 1) in-memory routing using R packages that enable route calculation (described in Section 13.6.2); 2) locally hosted routing engines external to R that can be called from R (Section 13.6.3); and 3) remotely hosted routing engines run by external entities that provide a web API that can be called from R (Section 13.6.4). Before describing each of them, it is worth outlining other ways of categorizing routing engines.\nRouting engines can be multi-modal, meaning they can calculate trips composed of more than one mode of transport, or not.\nMulti-modal routing engines can return results
consisting of multiple legs, each one made by a different mode of transport.\nThe optimal route from a residential area to a commercial area could involve 1) walking to the nearest bus stop, 2) catching a bus to the node nearest the destination, and 3) walking to the destination, given a set of input parameters.\nThe transition points between these three legs are commonly referred to as ‘ingress’ and ‘egress’, meaning getting on or off a public transport vehicle.\nMulti-modal routing engines such as R5 are more sophisticated and have larger input data requirements than ‘uni-modal’ routing engines such as OSRM (described in Section 13.6.3). A major strength of multi-modal engines is their ability to represent ‘transit’ (public transport) trips by trains, buses etc.\nMulti-modal routing engines require input datasets representing public transport networks, typically General Transit Feed Specification (GTFS) files, which can be processed with functions from the tidytransit and gtfstools packages (other packages and tools for working with GTFS files are available).\nSingle-mode routing engines may be sufficient for projects focused on specific (non-public) modes of transport.\nAnother way of classifying routing engines (or their settings) is by the geographic level of their outputs: routes, legs and segments.","code":""},{"path":"transport.html","id":"route-legs-segments","chapter":"13 Transportation","heading":"13.6.1 Routes, legs and segments","text":"Routing engines can generate outputs at three geographic levels: routes, legs and segments: Route level outputs contain a single feature (typically a multilinestring and an associated row in the data frame representation) per origin-destination pair, meaning a single row of data per trip. Leg level outputs contain a single feature and associated attributes for each mode within each origin-destination pair, as described in Section 13.5. For trips involving only one mode (for example driving from home to work, ignoring the short walk to the car) the leg is the same as the route: the car journey. For trips involving public transport, legs provide key information. The r5r function detailed_itineraries() returns legs which, confusingly, are sometimes referred to as ‘segments’. Segment level outputs provide the most detailed information about routes, with records for each small section of the transport network.
Typically segments are of similar length, or identical, to ways in OpenStreetMap. The cyclestreets function journey() returns data at the segment level, which can be aggregated, by grouping by origin and destination, to the level of data returned by the route() function in stplanr. Most routing engines return route level results by default, although multi-modal engines generally provide outputs at the leg level (one feature per continuous movement by a single mode of transport).\nSegment level outputs have the advantage of providing more detail.\nThe cyclestreets package returns multiple ‘quietness’ levels per route, enabling identification of the ‘weakest link’ in cycle networks.\nDisadvantages of segment level outputs include increased file sizes and the complexities associated with the extra detail. Route level results can be converted into segment level results using the function stplanr::overline() (Morgan and Lovelace 2020).\nWhen working with segment or leg-level data, route-level statistics can be returned by grouping by columns representing trip start and end points and summarizing/aggregating the columns containing segment-level data.","code":""},{"path":"transport.html","id":"memengine","chapter":"13 Transportation","heading":"13.6.2 In-memory routing with R","text":"Routing engines in R enable route networks stored as R objects in memory to be used as the basis of route calculation.\nOptions include the sfnetworks, dodgr and cppRouting packages, each of which provides its own class system to represent route networks, the topic of the next section. While fast and flexible, native R routing options are generally harder to set up than dedicated routing engines for realistic route calculation.\nRouting is a hard problem, and many hundreds of hours have been put into the open source routing engines that can be downloaded and hosted locally.\nOn the other hand, R-based routing engines may be well suited to model experiments and statistical analysis of the impacts of changes on the network.\nChanging route network characteristics (or the weights associated with different route segment types), re-calculating routes, and analyzing the results under many scenarios in a single language has benefits for research applications.","code":""},{"path":"transport.html","id":"localengine","chapter":"13 Transportation","heading":"13.6.3 Locally hosted dedicated routing
engines","text":"Locally hosted routing engines include OpenTripPlanner, Valhalla and R5 (which are multi-modal), and the OpenStreetMap Routing Machine (OSRM) (which is ‘uni-modal’).\nThese can be accessed from R with the packages opentripplanner, valhallr, r5r and osrm (Morgan et al. 2019; Pereira et al. 2021).\nLocally hosted routing engines run on the user’s computer but in a process separate from R.\nThey benefit from speed of execution and control over the weighting profiles for different modes of transport.\nDisadvantages include the difficulty of representing complex networks locally; temporal dynamics (primarily due to traffic); and the need for specialized external software.","code":""},{"path":"transport.html","id":"remoteengine","chapter":"13 Transportation","heading":"13.6.4 Remotely hosted dedicated routing engines","text":"Remotely hosted routing engines use a web API to which queries about origins and destinations are sent and from which results are returned.\nRouting services based on open source routing engines, such as OSRM’s publicly available service, work the same when called from R as locally hosted instances, simply requiring the arguments specifying ‘base URLs’ to be updated.\nHowever, the fact that external routing services are hosted on dedicated machines (usually funded by commercial companies with incentives to generate accurate routes) can give them advantages, including: provision of routing services worldwide (or usually at least over a large region); established routing services are usually updated regularly and can often respond to traffic levels; and routing services usually run on dedicated hardware and software, including systems such as load balancers, to ensure consistent performance. Disadvantages of remote routing services include speed when batch jobs are not possible (they often rely on data transfer over the internet on a route-by-route basis), price (the Google routing API, for example, limits the number of free queries) and licensing issues.\nThe googleway and mapbox packages demonstrate this approach by providing access to routing services from Google and Mapbox, respectively.\nFree (but rate limited) routing services include OSRM and openrouteservice.org and can be accessed from R with the osrm and openrouteservice packages, the latter of which is not on CRAN.\nThere are also more specific routing services, such as that provided by CycleStreets.net, a cycle journey planner and not-for-profit transport technology
company “for cyclists, by cyclists”.\nR users can access CycleStreets routes via the package cyclestreets, but many routing services lack R interfaces, representing a substantial opportunity for package development: building an R package to provide an interface to a web API can be a rewarding experience.","code":""},{"path":"transport.html","id":"contraction-hierarchies-and-traffic-assigment","chapter":"13 Transportation","heading":"13.6.5 Contraction hierarchies and traffic assignment","text":"Contraction hierarchies and traffic assignment are advanced but important topics in transport modeling that are worth being aware of, especially if you want your code to scale to large networks.\nCalculating many routes is computationally resource intensive and can take hours, which has led to the development of several algorithms to speed up routing calculations.\nContraction hierarchies are a well-known algorithm that can lead to a substantial (1000x+ in some cases) speed-up in routing tasks, depending on the network size.\nContraction hierarchies are used behind the scenes in the routing engines mentioned in the previous sections. Traffic assignment is a problem closely related to routing: in practice, the shortest path between two points is not always the fastest, especially if there is congestion.\nThis process takes OD datasets, of the kind described in Section 13.4, and assigns traffic to each segment of the network, generating route networks of the kind described in Section 13.7.\nAn established solution is Wardrop’s principle of user equilibrium, which shows that, to be realistic, congestion should be considered when estimating flows on a network, with reference to a mathematically defined relationship between cost and flow (Wardrop 1952).\nThis optimization problem can be solved with iterative algorithms implemented in the cppRouting package, which also implements contraction hierarchies for fast routing.","code":""},{"path":"transport.html","id":"routing-a-worked-example","chapter":"13 Transportation","heading":"13.6.6 Routing: A worked example","text":"Instead of routing all the desire lines generated in Section 13.4, we will focus on a subset that is highly policy relevant.\nRunning a computationally intensive operation on a subset before trying to process the whole dataset is often sensible, and this applies to routing.\nRouting can be time- and memory-consuming, resulting in large objects, due to the
detailed geometries and extra attributes of route objects.\nWe will therefore filter the desire lines before calculating routes in this section. Cycling is most beneficial when it replaces car trips.\nShort trips (of around 5 km, which can be cycled in 15 minutes at a speed of 20 km/hr) have a relatively high probability of being cycled, and the maximum distance increases when trips are made by electric bike (Lovelace et al. 2017).\nThese considerations inform the following code chunk, which filters the desire lines and returns the object desire_lines_short representing OD pairs between which many (100+) short (2.5 to 5 km Euclidean distance) trips are driven: In the code above, st_length() calculated the length of each desire line, as described in Section 4.2.3.\nThe filter() function from dplyr filtered the desire_lines dataset based on the criteria outlined above, as described in Section 3.2.1.\nThe next stage is to convert these desire lines into routes.\nThis is done using the publicly available OSRM service with the stplanr functions route() and route_osrm() in the code chunk below: The output is routes_short, an sf object representing routes on the transport network that are suitable for cycling (according to the OSRM routing engine at least), one for each desire line.\nNote: calls to external routing engines such as in the command above only work with an internet connection (and sometimes an API key stored in an environment variable, although not in this case).\nIn addition to the columns contained in the desire_lines object, the new route dataset contains distance (this time referring to route distance, not straight-line distance) and duration columns (in seconds), which provide potentially useful extra information on the nature of each route.\nWe will plot the desire lines along which many short car journeys take place alongside the cycling routes.\nMaking the width of the routes proportional to the number of car journeys that could potentially be replaced provides an effective way to prioritize interventions on the road network (Lovelace et al.
2017).\nFigure 13.5 shows the routes along which many people drive short distances (see github.com/geocompx for the source code).96\nFIGURE 13.5: Routes along which many (100+) short (<5 km Euclidean distance) car journeys are made (red) overlaying desire lines representing the same trips (black) and zone centroids (dots).\nVisualizing the results on an interactive map shows that many short car trips take place in and around Bradley Stoke, around 10 km north of central Bristol.\nIt is easy to find explanations for the area’s high level of car dependency: according to Wikipedia, Bradley Stoke is “Europe’s largest new town built with private investment”, suggesting limited public transport provision.\nFurthermore, the town is surrounded by large (and cycling unfriendly) road structures, including the M4 and M5 motorways (Tallon 2007). There are many benefits of converting travel desire lines into routes.\nIt is important to remember that we cannot be sure how many (if any) trips will follow the exact routes calculated by routing engines.\nHowever, route and street/way/segment level results can be highly policy relevant.\nRoute segment results can enable the prioritization of investment where it is most needed, according to available data (Lovelace et al.
2017).","code":"\ndesire_lines$distance_km = as.numeric(st_length(desire_lines)) / 1000\ndesire_lines_short = desire_lines |> \n filter(car_driver >= 100, distance_km <= 5, distance_km >= 2.5)\nroutes_short = route(l = desire_lines_short, route_fun = route_osrm,\n osrm.profile = \"bike\")\n#> "},{"path":"transport.html","id":"route-networks","chapter":"13 Transportation","heading":"13.7 Route networks","text":"\nWhile routes generally contain data on travel behavior, at the same geographic level as desire lines and OD pairs, route network datasets usually represent the physical transport network.\nEach segment in a route network roughly corresponds to a continuous section of street between junctions and appears only once, although the average length of segments depends on the data source (the segments in the OSM-derived bristol_ways dataset used in this section have an average length of just over 200 m, with a standard deviation of nearly 500 m).\nThe variability in segment lengths can be explained by the fact that in some rural locations junctions are far apart, while in dense urban areas there are crossings and segment breaks every few meters. Route networks can be an input into, or an output of, transport data analysis projects, or both.\nAny transport research that involves route calculation requires a route network dataset to be used by internal or external routing engines (in the latter case the route network data is not necessarily imported into R).\nHowever, route networks are also important outputs in many transport research projects: summarizing data such as the potential number of trips made on particular segments, represented as a route network, can help prioritize investment where it is most needed. To demonstrate how to create route networks as an output derived from route level data, imagine a simple scenario of mode shift.\nImagine that 50% of car trips between 0 and 3 km in route distance are replaced by cycling, a percentage that drops by 10 percentage points for every additional km of route distance, so that 20% of car trips of 6 km are replaced by cycling and no car trips of 8 km or longer are replaced by cycling.\nThis is of course an unrealistic scenario (Lovelace et al.
2017), but it is a useful starting point.\nIn this case, we can model the mode shift from cars to bikes as follows: Having created a scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated cycling activity will take place.\nFor this, we use the function overline() from the stplanr package.\nThe function breaks linestrings at junctions (where two or more linestring geometries meet), and calculates aggregate statistics for each unique route segment (Morgan and Lovelace 2020), taking an object containing routes and the names of the attributes to summarize as the first and second arguments: The outputs of the two preceding code chunks are summarized in Figure 13.6 below.\nFIGURE 13.6: Illustration of the percentage of car trips switching to cycling as a function of distance (left) and the route network level results of this function (right).\nTransport networks with records at the segment level, typically with attributes such as road type and width, constitute a common type of route network.\nSuch route network datasets are available worldwide from OpenStreetMap, and can be downloaded with packages such as osmdata and osmextract.\nTo save time downloading and preparing OSM data, we will use the bristol_ways object from the spDataLarge package, an sf object with LINESTRING geometries and attributes representing a sample of the transport network in the case study region (see ?bristol_ways for details), as shown in the output below: The output shows that bristol_ways represents just over 6 thousand segments of the transport network.\nSuch geographic networks can be represented as mathematical graphs, with nodes on the network connected by edges.\nA number of R packages have been developed for dealing with such graphs, notably igraph.\nOne can manually convert a route network into an igraph object, but the geographic attributes will be lost.\nTo overcome this limitation of igraph, the sfnetworks package (van der Meer et al.
2023), which can represent route networks simultaneously as graphs and geographic lines, was developed.\nWe will demonstrate sfnetworks functionality on the bristol_ways object. The output of the previous code chunk (with the final output shortened to contain only the most important 8 lines due to space considerations) shows that ways_sfn is a composite object, containing both the nodes and edges of the graph in spatial form.\nways_sfn is of class sfnetwork, which builds on the igraph class from the igraph package.\nIn the example below, the ‘edge betweenness’, meaning the number of shortest paths passing through each edge, is calculated (see ?igraph::betweenness for details).\nThe output of the edge betweenness calculation is shown in Figure 13.7, which has the cycle route network dataset calculated with the overline() function as an overlay for comparison.\nThe results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the highest betweenness values, whereas the segments closer to central Bristol have the highest cycling potential, based on these simplistic datasets.\nFIGURE 13.7: Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.\nOne can also find the shortest route between origins and destinations using this graph representation of the route network with the sfnetworks package.\n\nThe methods presented in this section are relatively simple compared with what is possible.\nThe dual graph/spatial capabilities of sfnetworks enable many new and powerful techniques that cannot be fully covered in this section.\nThe section does, however, provide a strong starting point for exploration and research in this area.\nA final point is that the example dataset used above is relatively small.\nIt may also be worth considering how the work could adapt to larger networks: testing methods on a subset of the data, and ensuring you have enough RAM, will help, although it’s also worth exploring other tools that can undertake transport network analysis and are optimized for large networks, such as R5 (Alessandretti et al.
2022).","code":"\nuptake = function(x) {\n case_when(\n x <= 3 ~ 0.5,\n x >= 8 ~ 0,\n TRUE ~ (8 - x) / (8 - 3) * 0.5\n )\n}\nroutes_short_scenario = routes_short |> \n mutate(uptake = uptake(distance / 1000)) |> \n mutate(bicycle = bicycle + car_driver * uptake,\n car_driver = car_driver * (1 - uptake))\nsum(routes_short_scenario$bicycle) - sum(routes_short$bicycle)\n#> [1] 598\nroute_network_scenario = overline(routes_short_scenario, attrib = \"bicycle\")\nsummary(bristol_ways)\n#> highway maxspeed ref geometry \n#> cycleway:1721 Length:6160 Length:6160 LINESTRING :6160 \n#> rail :1017 Class :character Class :character epsg:4326 : 0 \n#> road :3422 Mode :character Mode :character +proj=long...: 0\nbristol_ways$lengths = st_length(bristol_ways)\nways_sfn = as_sfnetwork(bristol_ways)\nclass(ways_sfn)\n#> [1] \"sfnetwork\" \"tbl_graph\" \"igraph\"\nways_sfn\n#> # A sfnetwork with 5728 nodes and 4915 edges\n#> # A directed multigraph with 1013 components with spatially explicit edges\n#> # Node Data: 5,728 × 1 (active)\n#> # Edge Data: 4,915 × 7\n#> from to highway maxspeed ref geometry lengths\n#> [m]\n#> 1 1 2 road B3130 (-2.61 51.4, -2.61 51.4, -2.61 51.… 218.\n#> # … \nways_centrality = ways_sfn |> \n activate(\"edges\") |> \n mutate(betweenness = tidygraph::centrality_edge_betweenness(lengths)) "},{"path":"transport.html","id":"prioritizing-new-infrastructure","chapter":"13 Transportation","heading":"13.8 Prioritizing new infrastructure","text":"This section demonstrates how geocomputation can create policy relevant outcomes in the field of transport planning.\nWe will identify promising locations for investment in sustainable transport infrastructure, using a simple approach for educational purposes. An advantage of the data driven approach outlined in this chapter is its modularity: each aspect can be useful on its own, and feed into wider analyses.\nThe steps that got us to this stage included identifying short car-dependent commuting routes (generated from desire lines) in Section 13.6 and the analysis of route network characteristics with the sfnetworks package in Section 13.7.\nThe final code chunk of this
chapter combines these strands of analysis, overlaying estimates of cycling potential from the previous section on top of a new dataset representing areas within a short distance of cycling infrastructure.\nThis new dataset is created in the code chunk below, which: 1) filters out the cycleway entities from the bristol_ways object representing the transport network; 2) ‘unions’ the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering); and 3) creates a 100 m buffer around them, to create a polygon. The next stage is to create a dataset representing points on the network with high cycling potential but little provision for cycling. The results of the preceding code chunks are shown in Figure 13.8, which shows routes with high levels of car dependency and high cycling potential, but no cycleways.\nFIGURE 13.8: Potential routes along which to prioritise cycle infrastructure in Bristol, to reduce car dependency. The static map provides an overview of the overlay between existing infrastructure and routes with high car-bike switching potential (left). The screenshot of the interactive map generated with the qtm() function highlights Whiteladies Road as somewhere that would benefit from a new cycleway (right).\nThe method has limitations: in reality, people do not travel to zone centroids or always use the shortest route algorithm for a particular mode.\nHowever, the results demonstrate how geographic data analysis can be used to highlight places where new investment in cycleways could be particularly beneficial, despite the simplicity of the approach.\nThe analysis would need to be substantially expanded, including with larger input datasets, to inform transport planning and design in practice.","code":"\nexisting_cycleways_buffer = bristol_ways |> \n filter(highway == \"cycleway\") |> # 1) filter out cycleways\n st_union() |> # 2) unite geometries\n st_buffer(dist = 100) # 3) create buffer\nroute_network_no_infra = st_difference(\n route_network_scenario |> st_set_crs(st_crs(existing_cycleways_buffer)),\n existing_cycleways_buffer\n)\ntmap_mode(\"view\")\nqtm(route_network_no_infra, basemaps = leaflet::providers$Esri.WorldTopoMap,\n lines.lwd = 5)"},{"path":"transport.html","id":"future-directions-of-travel","chapter":"13 Transportation","heading":"13.9 Future directions of
travel","text":"chapter provided taste possibilities using geocomputation transport research, explored key geographic elements make-city’s transport system open data reproducible code.\nresults help plan investment needed.Transport systems operate multiple interacting levels, meaning geocomputational methods great potential generate insights work, likely impacts different interventions.\nmuch done area: possible build foundations presented chapter many directions.\nTransport fastest growing source greenhouse gas emissions many countries, set become “largest GHG emitting sector, especially developed countries” (see EURACTIV.com).\nTransport-related emissions unequally distributed across society (unlike food heating) essential well-.\ngreat potential sector rapidly decarbonize demand reduction, electrification vehicle fleet uptake active travel modes walking cycling.\nNew technologies can reduce car dependency enabling car sharing.\n‘Micro-mobility’ systems dockless bike e-scooter schemes also emerging, creating valuable datasets General Bikeshare Feed Specification (GBFS) format, can imported processed gbfs package.\nchanges large impacts accessibility, ability people reach employment service locations need, something can quantified currently scenarios change packages accessibility packages.\nexploration ‘transport futures’ local, regional national levels yield important new insights.Methodologically, foundations presented chapter extended including variables analysis.\nCharacteristics route speed limits, busyness provision protected cycling walking paths linked ‘mode-split’ (proportion trips made different modes transport).\naggregating OpenStreetMap data using buffers geographic data methods presented Chapters 3 4, example, possible detect presence green space close proximity transport routes.\nUsing R’s statistical modeling capabilities, used predict current future levels cycling, example.type analysis underlies Propensity Cycle Tool (PCT), publicly accessible 
(see www.pct.bike) mapping tool developed R used prioritize investment cycling across England (Lovelace et al. 2017).\nSimilar tools used encourage evidence-based transport policies related topics air pollution public transport access around world.","code":""},{"path":"transport.html","id":"ex-transport","chapter":"13 Transportation","heading":"13.10 Exercises","text":"E1. much analysis presented chapter focused active modes, driving trips?proportion trips desire_lines object made driving?proportion desire_lines straight line length 5 km distance?proportion trips desire lines longer 5 km length made driving?Plot desire lines less 5 km length along 50% trips made car.notice location car dependent yet short desire lines?E2. additional length cycleways result routes presented last Figure, sections beyond 100 m existing cycleways, constructed?E3. proportion trips represented desire_lines accounted routes_short_scenario object?Bonus: proportion trips happen desire lines cross routes_short_scenario?E4. analysis presented chapter designed teaching geocomputation methods can applied transport research.\nreal, government transport consultancy, top 3 things differently?E5. Clearly, routes identified last Figure provide part picture.\nextend analysis?E6. 
Imagine you want to extend the scenario by creating key areas (not routes) for investment in place-based cycling policies such as car-free zones, cycle parking points and a reduced car parking strategy. How could raster datasets assist with this work?

- Bonus: develop a raster layer that divides the Bristol region into 100 cells (10 by 10) and estimate the average speed limit of roads in each cell, based on the `bristol_ways` dataset (see Chapter 14).

# 14 Geomarketing

## Prerequisites

This chapter requires the following packages (**tmaptools** must also be installed):

```r
library(sf)
library(dplyr)
library(purrr)
library(terra)
library(osmdata)
library(spDataLarge)
```

Required data will be downloaded in due course. As a convenience to the reader and to ensure easy reproducibility, we have made the downloaded data available in the **spDataLarge** package.

## 14.1 Introduction

This chapter demonstrates how the skills learned in Parts I and II can be applied to a particular domain: geomarketing (sometimes also referred to as location analysis or location intelligence). This is a broad field of research and commercial application. A typical example of geomarketing is deciding where to locate a new shop. The aim here is to attract most visitors and, ultimately, make the most profit. There are also many non-commercial applications that can use the technique for public benefit, for example deciding where to locate new health services (Tomintz, Clarke, and Rigby 2008).

People are fundamental to location analysis, in particular where they are likely to spend their time and other resources. Interestingly, ecological concepts and models are quite similar to those used for store location analysis. Animals and plants can best meet their needs in certain 'optimal' locations, based on variables that change over space (Muenchow et al. (2018); see also Chapter 15). This is one of the great strengths of geocomputation, and GIScience in general: concepts and methods are transferable to other fields. Polar bears, for example, prefer northern latitudes where temperatures are lower and food (seals and sea lions) is plentiful. Similarly, humans tend to congregate in certain places, creating economic niches (and high land prices) analogous to the ecological niche of the Arctic. The main task of location analysis is to find out where such 'optimal locations' are for specific services, based on available data. Typical research questions include:

- Where do target groups live and which areas do they frequent?
- Where are competing stores or services located?
- How many people can easily reach specific stores?
- Do existing services over- or under-utilize the market potential?
- What is the market share of a company in a specific area?

This chapter demonstrates how geocomputation can answer such questions based on a hypothetical case study based on real data.

## 14.2 Case study: bike shops in Germany

Imagine you are starting a chain of bike shops in Germany. The stores should be placed in urban areas with as many potential customers as possible. Additionally, a hypothetical survey (invented for this chapter, not for commercial use!) suggests that single young males (aged 20 to 40) are most likely to buy your products: this is the target audience. You are in the lucky position to have sufficient capital to open a number of shops. But where should they be placed? Consulting companies (employing geomarketing analysts) would happily charge high rates to answer such questions. Luckily, we can do so ourselves with the help of open data and open source software. The following sections will demonstrate how the techniques learned during the first chapters of the book can be applied to undertake common steps in service location analysis:

- Tidy the input data from the German census (Section 14.3)
- Convert the tabulated census data into raster objects (Section 14.4)
- Identify metropolitan areas with high population densities (Section 14.5)
- Download detailed geographic data (from OpenStreetMap, with **osmdata**) for these areas (Section 14.6)
- Create rasters for scoring the relative desirability of different locations using map algebra (Section 14.7)

Although we have applied these steps to a specific case study, they could be generalized to many scenarios of store location or public service provision.

## 14.3 Tidy the input data

The German government provides gridded census data at either 1 km or 100 m resolution. The following code chunk downloads, unzips and reads in the 1 km data.

Please note that `census_de` is also available from the **spDataLarge** package.

The `census_de` object is a data frame containing 13 variables for more than 360,000 grid cells across Germany. For our work, we only need a subset of these: Easting (x) and Northing (y), the number of inhabitants (population; pop), the mean average age (mean_age), the proportion of women (women) and the average household size (hh_size). These variables are selected and renamed from German into English in the code chunk below and summarized in Table 14.1. Further, `mutate()` is used to convert values -1 and -9 (meaning "unknown") to NA.

TABLE 14.1: Categories for each variable in census data from Datensatzbeschreibung…xlsx located in the downloaded file census.zip (see Figure 14.1 for their spatial distribution).

```r
download.file("https://tinyurl.com/ybtpkwxz", 
              destfile = "census.zip", mode = "wb")
unzip("census.zip") # unzip the files
census_de = readr::read_csv2(list.files(pattern = "Gitter.csv"))
data("census_de", package = "spDataLarge")
# pop = population, hh_size = household size
input = select(census_de, x = x_mp_1km, y = y_mp_1km, pop = Einwohner,
               women = Frauen_A, mean_age = Alter_D, hh_size = HHGroesse_D)
# set -1 and -9 to NA
input_tidy = mutate(input, across(.cols = c(pop, women, mean_age, hh_size), 
                                  .fns = ~ifelse(.x %in% c(-1, -9), NA, .x)))
```

## 14.4 Create census rasters

After the preprocessing, the data can be converted into a SpatRaster object (see Sections 2.3.4 and 3.3.1) with the help of the `rast()` function. When setting its type argument to xyz, the x and y columns of the input data frame correspond to coordinates on a regular grid. All the remaining columns (here: pop, women, mean_age, hh_size) serve as values of the raster layers (Figure 14.1; see also code/14-location-figures.R in our GitHub repository).

FIGURE 14.1: Gridded German census data of 2011 (see Table 14.1 for a description of the classes).

The next stage is to reclassify the values of the rasters stored in `input_ras` in accordance with the survey mentioned in Section 14.2, using the **terra** function `classify()`, which was introduced in Section 4.3.3. In the case of the population data, we convert the classes into a numeric data type using class means. Raster cells are assumed to have a population of 127 if they have a value of 1 (cells in 'class 1' contain 3 to 250 inhabitants), 375 if they have a value of 2 (containing 250 to 500 inhabitants), and so on (see Table 14.1). A cell value of 8000 inhabitants was chosen for 'class 6' because these cells contain more than 8000 people. Of course, these are approximations of the true population, not precise values. However, the level of detail is sufficient to delineate metropolitan areas (see the next section).

In contrast to the pop variable, which represents absolute estimates of the total population, the remaining variables were re-classified as weights corresponding with the weights used in the survey. Class 1 in the variable women, for instance, represents areas in which 0 to 40% of the population is female; these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male. Similarly, the classes containing the youngest people and the highest proportion of single households are reclassified with high weights.

Note that we have made sure that the order of the reclassification matrices in the list is the same as for the elements of `input_ras`. For instance, the first element corresponds in both cases to the population. Subsequently, the for-loop applies each reclassification matrix to the corresponding raster layer. Finally, the code chunk below ensures the reclass layers have the same names as the layers of `input_ras`.

```r
input_ras = rast(input_tidy, type = "xyz", crs = "EPSG:3035")
input_ras
#> class       : SpatRaster 
#> dimensions  : 868, 642, 4  (nrow, ncol, nlyr)
#> resolution  : 1000, 1000  (x, y)
#> extent      : 4031000, 4673000, 2684000, 3552000  (xmin, xmax, ymin, ymax)
#> coord. ref. : ETRS89-extended / LAEA Europe (EPSG:3035) 
#> source(s)   : memory
#> names       : pop, women, mean_age, hh_size 
#> min values  :   1,     1,        1,       1 
#> max values  :   6,     5,        5,       5
rcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250, 
                   4, 4, 3000, 5, 5, 6000, 6, 6, 8000), 
                 ncol = 3, byrow = TRUE)
rcl_women = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0), 
                   ncol = 3, byrow = TRUE)
rcl_age = matrix(c(1, 1, 3, 2, 2, 0, 3, 5, 0),
                 ncol = 3, byrow = TRUE)
rcl_hh = rcl_women
rcl = list(rcl_pop, rcl_women, rcl_age, rcl_hh)
reclass = input_ras
for (i in seq_len(nlyr(reclass))) {
  reclass[[i]] = classify(x = reclass[[i]], rcl = rcl[[i]], right = NA)
}
names(reclass) = names(input_ras)
reclass # full output not shown
#> ... 
#> names       :  pop, women, mean_age, hh_size 
#> min values  :  127,     0,        0,       0 
#> max values  : 8000,     3,        3,       3
```

## 14.5 Define metropolitan areas

We deliberately define metropolitan areas as pixels of 20 km2 inhabited by more than 500,000 people. Pixels at this coarse resolution can rapidly be created using `aggregate()`, as introduced in Section 5.3.3. The command below uses the argument `fact = 20` to reduce the resolution of the result twenty-fold (recall the original raster resolution was 1 km2).

The next stage is to keep only cells with more than half a million people.

Plotting this reveals eight metropolitan regions (Figure 14.2). Each region consists of one or more raster cells. It would be nice if we could join all cells belonging to one region. **terra**'s `patches()` command does exactly that. Subsequently, `as.polygons()` converts the raster object into spatial polygons, and `st_as_sf()` converts it into an sf object.

FIGURE 14.2: The aggregated population raster (resolution: 20 km) with the identified metropolitan areas (golden polygons) and the corresponding names.

The resulting eight metropolitan areas suitable for bike shops (Figure 14.2; see also code/14-location-figures.R for creating the figure) are still missing a name. A reverse geocoding approach can settle this problem: given a coordinate, it finds the corresponding address. Consequently, extracting the centroid coordinate of each metropolitan area can serve as an input for a reverse geocoding API. This is exactly what the `rev_geocode_OSM()` function of the **tmaptools** package expects. Setting additionally `as.data.frame` to TRUE will give back a data.frame with several columns referring to the location, including the street name, house number and city. However, here we are only interested in the name of the city.

To make sure that the reader uses the exact same results, we have put them into the **spDataLarge** object `metro_names`.

TABLE 14.2: Result of the reverse geocoding.

Overall, we are satisfied with the city column serving as metropolitan names (Table 14.2) apart from one exception, namely Velbert, which belongs to the greater region of Düsseldorf. Hence, we replace Velbert with Düsseldorf (Figure 14.2). Umlauts like ü might lead to trouble further on, for example when determining the bounding box of a metropolitan area with `opq()` (see below), which is why we avoid them.

```r
pop_agg = aggregate(reclass$pop, fact = 20, fun = sum, na.rm = TRUE)
summary(pop_agg)
#>       pop        
#>  Min.   :    127 
#>  1st Qu.:  39886 
#>  Median :  66008 
#>  Mean   :  99503 
#>  3rd Qu.: 105696 
#>  Max.   :1204870 
#>  NA's   :447
pop_agg = pop_agg[pop_agg > 500000, drop = FALSE] 
metros = pop_agg |> 
  patches(directions = 8) |>
  as.polygons() |>
  st_as_sf()
metro_names = sf::st_centroid(metros, of_largest_polygon = TRUE) |>
  tmaptools::rev_geocode_OSM(as.data.frame = TRUE) |>
  select(city, town, state)
# smaller cities are returned in column town. To have all names in one column,
# we move the town name to the city column in case it is NA
metro_names = dplyr::mutate(metro_names, city = ifelse(is.na(city), town, city))
metro_names = metro_names$city |> 
  as.character() |>
  {\(x) ifelse(x == "Velbert", "Düsseldorf", x)}() |>
  {\(x) gsub("ü", "ue", x)}()
```

## 14.6 Points of interest

The **osmdata** package provides easy-to-use access to OSM data (see also Section 8.5). Instead of downloading shops for the whole of Germany, we restrict the query to the defined metropolitan areas, reducing computational load and providing shop locations only in areas of interest. The subsequent code chunk does this using a number of functions including:

- `map()` (the **tidyverse** equivalent of `lapply()`), which iterates through all eight metropolitan names which subsequently define the bounding box in the OSM query function `opq()` (see Section 8.5)
- `add_osm_feature()` to specify OSM elements with a key value of shop (see wiki.openstreetmap.org for a list of common key:value pairs)
- `osmdata_sf()`, which converts the OSM data into spatial objects (of class sf)
- `while()`, which tries two more times to download the data if the download failed the first time

Before running this code: please consider that it will download almost 2GB of data. To save time and resources, we have put the output named `shops` into **spDataLarge**. To make it available in your environment run `data("shops", package = "spDataLarge")`.

It is highly unlikely that there are no shops in any of the defined metropolitan areas. The following condition simply checks if there is at least one shop for each region. If not, we recommend trying to download the shops again for this/these specific region/s.

To make sure that each list element (an sf data frame) comes with the same columns, we only keep the osm_id and the shop columns with the help of the `map_dfr` loop which additionally combines all shops into one large sf object.

Note: `shops` is provided in **spDataLarge** and can be accessed as shown below.

The only thing left to do is to convert the spatial point object into a raster (see Section 6.4). The sf object, `shops`, is converted into a raster having the same parameters (dimensions, resolution, CRS) as the `reclass` object. Importantly, the `length()` function is used here to count the number of shops in each cell. The result of the subsequent code chunk is therefore an estimate of shop density (shops/km2). `st_transform()` is used before `rasterize()` to ensure the CRS of both inputs match.

As with the other raster layers (population, women, mean age, household size), the poi raster is reclassified into four classes (see Section 14.4). Defining class intervals is an arbitrary undertaking to a certain degree. One can use equal breaks, quantile breaks, fixed values or others. Here, we choose the Fisher-Jenks natural breaks approach which minimizes within-class variance, the result of which provides an input for the reclassification matrix.

```r
shops = purrr::map(metro_names, function(x) {
  message("Downloading shops of: ", x, "\n")
  # give the server a bit of time
  Sys.sleep(sample(seq(5, 10, 0.1), 1))
  query = osmdata::opq(x) |>
    osmdata::add_osm_feature(key = "shop")
  points = osmdata::osmdata_sf(query)
  # request the same data again if nothing has been downloaded
  iter = 2
  while (nrow(points$osm_points) == 0 && iter > 0) {
    points = osmdata_sf(query)
    iter = iter - 1
  }
  # return only the point features
  points$osm_points
})
# checking if we have downloaded shops for each metropolitan area
ind = purrr::map_dbl(shops, nrow) == 0
if (any(ind)) {
  message("There are/is still (a) metropolitan area/s without any features:\n",
          paste(metro_names[ind], collapse = ", "), "\nPlease fix it!")
}
# select only specific columns
shops = purrr::map_dfr(shops, select, osm_id, shop)
data("shops", package = "spDataLarge")
shops = sf::st_transform(shops, st_crs(reclass))
# create poi raster
poi = rasterize(x = shops, y = reclass, field = "osm_id", fun = "length")
# construct reclassification matrix
int = classInt::classIntervals(values(poi), n = 4, style = "fisher")
int = round(int$brks)
rcl_poi = matrix(c(int[1], rep(int[-c(1, length(int))], each = 2), 
                   int[length(int)] + 1), ncol = 2, byrow = TRUE)
rcl_poi = cbind(rcl_poi, 0:3) 
# reclassify
poi = classify(poi, rcl = rcl_poi, right = NA) 
names(poi) = "poi"
```

## 14.7 Identifying suitable locations

The only steps that remain before combining all the layers are to add poi to the reclass raster stack and remove the population layer from it. The reasoning for the latter is twofold. First of all, we have already delineated metropolitan areas, that is areas where the population density is above average compared to the rest of Germany. Second, though it is advantageous to have many potential customers within a specific catchment area, the sheer number alone might not actually represent the desired target group. For instance, residential tower blocks are areas with a high population density but not necessarily with a high purchasing power for expensive cycle components.

As is common with data science projects, data retrieval and 'tidying' have consumed much of the overall workload so far. With clean data, the final step, calculating a final score by summing all raster layers, can be accomplished in a single line of code.

For instance, a score greater than 9 might be a suitable threshold indicating raster cells where a bike shop could be placed (Figure 14.3; see also code/14-location-figures.R).

FIGURE 14.3: Suitable areas (i.e., raster cells with a score > 9) in accordance with our hypothetical survey for bike stores in Berlin.

```r
# remove population raster and add poi raster
reclass = reclass[[names(reclass) != "pop"]] |>
  c(poi)
# calculate the total score
result = sum(reclass)
```

## 14.8 Discussion and next steps

The presented approach is a typical example of the normative usage of a GIS (Longley 2015). We combined survey data with expert-based knowledge and assumptions (the definition of metropolitan areas, the class intervals, the definition of a final score threshold). While this approach is less suitable for scientific research, it is an applied analysis that provides an evidence-based indication of areas suitable for bike shops, to be compared with other sources of information. A number of changes to the approach could improve the analysis:

- We used equal weights when calculating the final scores, but other factors, such as the household size, could be as important as the portion of women or the mean age
- We used only points of interest related to bike shops; including others, such as do-it-yourself, hardware, bicycle, fishing, hunting, motorcycle and outdoor sports shops (see the range of shop values available on the OSM Wiki), may have yielded more refined results
- Data at a higher resolution may improve the output (see exercises)
- We used only a limited set of variables; data from other sources, such as the INSPIRE geoportal or data on cycle paths from OpenStreetMap, may enrich the analysis (see also Section 8.5)
- Interactions remained unconsidered, such as a possible relationship between the portion of men and single households

In short, the analysis could be extended in multiple directions. Nevertheless, it should have given you a first impression and understanding of how to obtain and deal with spatial data in R within a geomarketing context.

Finally, we have to point out that the presented analysis would be merely the first step of finding suitable locations. So far we have identified areas, 1 by 1 km in size, representing potentially suitable locations for a bike shop in accordance with our survey. Subsequent steps in the analysis could be taken:

- Find an optimal location based on the number of inhabitants within a specific catchment area. For example, the shop should be reachable by as many people as possible within 15 minutes of traveling bike distance (catchment area routing). Thereby, we should account for the fact that the farther away people are from the shop, the more unlikely it becomes that they actually visit it (distance decay function)
- It would also be a good idea to take competitors into account. That is, if there is already a bike shop in the vicinity of the chosen location, possible customers (or the sales potential) should be distributed between the competitors (Huff 1963; Wieland 2017)
- Further, we need to find suitable and affordable real estate, e.g., in terms of accessibility, availability of parking spots, desired frequency of passers-by, having big windows, etc.

## 14.9 Exercises

E1.
Download csv file containing inhabitant information 100 m cell resolution (https://www.zensus2011.de/SharedDocs/Downloads/DE/Pressemitteilung/DemografischeGrunddaten/csv_Bevoelkerung_100m_Gitter.zip?__blob=publicationFile&v=3).\nPlease note unzipped file size 1.23 GB.\nread R can use readr::read_csv.\ntakes 30 seconds machine 16 GB RAM.\ndata.table::fread() might even faster, returns object class data.table().\nUse dplyr::as_tibble() convert tibble.\nBuild inhabitant raster, aggregate cell resolution 1 km, compare difference inhabitant raster (inh) created using class mean values.E2. Suppose bike shop predominantly sold electric bikes older people.\nChange age raster accordingly, repeat remaining analyses compare changes original result.","code":""},{"path":"eco.html","id":"eco","chapter":"15 Ecology","heading":"15 Ecology","text":"","code":""},{"path":"eco.html","id":"prerequisites-13","chapter":"15 Ecology","heading":"Prerequisites","text":"chapter assumes strong grasp geographic data analysis processing, covered Chapters 2 5.\nchapter makes use bridges GIS software, spatial cross-validation, covered Chapters 10 12 respectively.chapter uses following packages:","code":"\nlibrary(sf)\nlibrary(terra)\nlibrary(dplyr)\nlibrary(data.table) # fast data frame manipulation (used by mlr3)\nlibrary(mlr3) # machine learning (see Chapter 12)\nlibrary(mlr3spatiotempcv) # spatiotemporal resampling \nlibrary(mlr3tuning) # hyperparameter tuning package\nlibrary(mlr3learners) # interface to most important machine learning pkgs\nlibrary(paradox) # defining hyperparameter spaces\nlibrary(ranger) # random forest package\nlibrary(qgisprocess) # bridge to QGIS (Chapter 10)\nlibrary(tree) # decision tree package\nlibrary(vegan) # community ecology package"},{"path":"eco.html","id":"introduction-9","chapter":"15 Ecology","heading":"15.1 Introduction","text":"chapter models floristic gradient fog oases reveal distinctive vegetation belts clearly controlled water availability.\ncase study 
provides opportunity bring together extend concepts presented previous chapters enhance skills using R geocomputation.Fog oases, locally called lomas, vegetation formations found mountains along coastal deserts Peru Chile.\nSimilar ecosystems can found elsewhere, including deserts Namibia along coasts Yemen Oman (Galletti, Turner, Myint 2016).\nDespite arid conditions low levels precipitation around 30-50 mm per year average, fog deposition increases amount water available plants austral winter, resulting green southern-facing mountain slopes along coastal strip Peru.\nfog, develops temperature inversion caused cold Humboldt current austral winter, provides name habitat.\nEvery years, El Niño phenomenon brings torrential rainfall sun-baked environment, providing tree seedlings chance develop roots long enough survive following arid conditions (Dillon, Nakazawa, Leiva 2003).Unfortunately, fog oases heavily endangered, primarily due agriculture anthropogenic climate change.\nEvidence composition spatial distribution native flora can support efforts protect remaining fragments fog oases (Muenchow, Bräuning, et al. 2013; Muenchow, Hauenstein, et al. 2013).chapter analyze composition spatial distribution vascular plants (referring mostly flowering plants) southern slope Mt. Mongón, lomas mountain near Casma central northern coast Peru (Figure 15.1).\nfield study Mt. Mongón, vascular plants living 100 randomly sampled 4x4 m2 plots austral winter 2011 recorded (Muenchow, Bräuning, et al. 2013).\nsampling coincided strong La Niña event year, shown data published National Oceanic Atmospheric Administration (NOAA).\nled even higher levels aridity usual coastal desert increased fog activity southern slopes Peruvian lomas mountains.\nFIGURE 15.1: Mt. 
Mongón study area, Muenchow, Schratz, Brenning (2017).\nchapter also demonstrates apply techniques covered previous chapters important applied field: ecology.\nSpecifically, :Load needed data compute environmental predictors (Section 15.2)Extract main floristic gradient species composition matrix help dimension-reducing technique (ordinations; Section 15.3)Model first ordination axis, .e., floristic gradient, function environmental predictors altitude, slope, catchment area NDVI (Section 15.4).\n, make use random forest model — popular machine learning algorithm (Breiman 2001). guarantee optimal prediction, advisable tune beforehand hyperparameters help spatial cross-validation (see Section 12.5.2)Make spatial distribution map floristic composition anywhere study area (Section 15.4.2)","code":""},{"path":"eco.html","id":"data-and-data-preparation","chapter":"15 Ecology","heading":"15.2 Data and data preparation","text":"data needed subsequent analyses available via spDataLarge package.study_area polygon representing outline study area, random_points sf object containing 100 randomly chosen sites.\ncomm community matrix wide data format (Wickham 2014) rows represent visited sites field columns observed species.100The values represent species cover per site, recorded area covered species proportion site area (%; please note one site can >100% due overlapping cover individual plants).\nrownames comm correspond id column random_points.\ndem digital elevation model (DEM) study area, ndvi Normalized Difference Vegetation Index (NDVI) computed red near-infrared channels Landsat scene (see Section 4.3.3 ?spDataLarge::ndvi.tif).\nVisualizing data helps get familiar , shown Figure 15.2 dem overplotted random_points study_area.\nFIGURE 15.2: Study mask (polygon), location sampling sites (black points) DEM background.\nnext step compute variables needed modeling predictive mapping (see Section 15.4.2) also aligning non-metric multidimensional scaling (NMDS) axes main gradient 
study area, altitude humidity, respectively (see Section 15.3).Specifically, compute catchment slope catchment area digital elevation model using R-GIS bridges (see Chapter 10).\nCurvatures might also represent valuable predictors, exercise section can find impact modeling result.compute catchment area catchment slope, can make use sagang:sagawetnessindex function.101\nqgis_show_help() returns function parameters default values specific geoalgorithm.\n, present selection complete output.Subsequently, can specify needed parameters using R named arguments (see Section 10.2).\nRemember can use path file disk SpatRaster living R’s global environment specify input raster DEM (see Section 10.2).\nSpecifying 1 SLOPE_TYPE makes sure algorithm return catchment slope.\nresulting rasters saved temporary files .sdat extension native SAGA raster format.returns list named ep containing paths computed output rasters.\nLet’s read catchment area well catchment slope multilayer SpatRaster object (see Section 2.3.4).\nAdditionally, add two raster objects , namely dem ndvi.Additionally, catchment area values highly skewed right (hist(ep$carea)).\nlog10-transformation makes distribution normal.convenience reader, added ep spDataLarge:Finally, can extract terrain attributes field observations (see also Section 6.3).","code":"\ndata(\"study_area\", \"random_points\", \"comm\", package = \"spDataLarge\")\ndem = rast(system.file(\"raster/dem.tif\", package = \"spDataLarge\"))\nndvi = rast(system.file(\"raster/ndvi.tif\", package = \"spDataLarge\"))\n# sites 35 to 40 and corresponding occurrences of the first five species in the\n# community matrix\ncomm[35:40, 1:5]\n#> Alon_meri Alst_line Alte_hali Alte_porr Anth_eccr\n#> 35 0 0 0 0.0 1.000\n#> 36 0 0 1 0.0 0.500\n#> 37 0 0 0 0.0 0.125\n#> 38 0 0 0 0.0 3.000\n#> 39 0 0 0 0.0 2.000\n#> 40 0 0 0 0.2 0.125\n# if not already done, enable the saga next generation plugin\nqgisprocess::qgis_enable_plugins(\"processing_saga_nextgen\")\n# show 
help\nqgisprocess::qgis_show_help(\"sagang:sagawetnessindex\")\n#> Saga wetness index (sagang:sagawetnessindex)\n#> ...\n#> ----------------\n#> Arguments\n#> ----------------\n#> \n#> DEM: Elevation\n#> Argument type: raster\n#> Acceptable values:\n#> - Path to a raster layer\n#> ...\n#> SLOPE_TYPE: Type of Slope\n#> Argument type: enum\n#> Available values:\n#> - 0: [0] local slope\n#> - 1: [1] catchment slope\n#> ...\n#> AREA: Catchment area\n#> Argument type: rasterDestination\n#> Acceptable values:\n#> - Path for new raster layer\n#>... \n#> ----------------\n#> Outputs\n#> ----------------\n#> \n#> AREA: \n#> Catchment area\n#> SLOPE: \n#> Catchment slope\n#> ...\n# environmental predictors: catchment slope and catchment area\nep = qgisprocess::qgis_run_algorithm(\n alg = \"sagang:sagawetnessindex\",\n DEM = dem,\n SLOPE_TYPE = 1, \n SLOPE = tempfile(fileext = \".sdat\"),\n AREA = tempfile(fileext = \".sdat\"),\n .quiet = TRUE)\n# read in catchment area and catchment slope\nep = ep[c(\"AREA\", \"SLOPE\")] |>\n unlist() |>\n rast()\nnames(ep) = c(\"carea\", \"cslope\") # assign better names \norigin(ep) = origin(dem) # make sure rasters have the same origin\nep = c(dem, ndvi, ep) # add dem and ndvi to the multilayer SpatRaster object\nep$carea = log10(ep$carea)\nep = rast(system.file(\"raster/ep.tif\", package = \"spDataLarge\"))\n# terra::extract adds automatically a for our purposes unnecessary ID column\nep_rp = terra::extract(ep, random_points, ID = FALSE)\nrandom_points = cbind(random_points, ep_rp)"},{"path":"eco.html","id":"nmds","chapter":"15 Ecology","heading":"15.3 Reducing dimensionality","text":"Ordinations popular tool vegetation science extract main information, frequently corresponding ecological gradients, large species-plot matrices mostly filled 0s.\nHowever, also used remote sensing, soil sciences, geomarketing many fields.\nunfamiliar ordination techniques need refresher, look Michael W. 
Palmer’s web page short introduction popular ordination techniques ecology Borcard, Gillet, Legendre (2011) deeper look apply techniques R.\nvegan’s package documentation also helpful resource (vignette(package = \"vegan\")).Principal component analysis (PCA) probably famous ordination technique.\ngreat tool reduce dimensionality one can expect linear relationships variables, joint absence variable two plots (observations) can considered similarity.\nbarely case vegetation data.one, presence plant often follows unimodal, .e. non-linear, relationship along gradient (e.g., humidity, temperature salinity) peak favorable conditions declining ends towards unfavorable conditions.Secondly, joint absence species two plots hardly indication similarity.\nSuppose plant species absent driest (e.g., extreme desert) moistest locations (e.g., tree savanna) sampling.\nreally refrain counting similarity likely thing two completely different environmental settings common terms floristic composition shared absence species (except rare ubiquitous species).Non-metric multidimensional scaling (NMDS) one popular dimension-reducing technique used ecology (von Wehrden et al. 
2009).
NMDS reduces the rank-based differences between the distances between objects in the original matrix and the distances between the ordinated objects.
The difference is expressed as stress: the lower the stress value, the better the ordination, i.e., the better the low-dimensional representation of the original matrix.
Stress values lower than 10 represent an excellent fit, stress values of around 15 are still good, and values greater than 20 represent a poor fit (McCune, Grace, and Urban 2002).
In R, `metaMDS()` of the vegan package can execute a NMDS.
As input, it expects a community matrix with the sites as rows and the species as columns.
Often, ordinations using presence-absence data yield better results (in terms of explained variance), though the price for this is, of course, a less informative input matrix (see also the Exercises).
`decostand()` converts numerical observations into presences and absences, with 1 indicating the occurrence of a species and 0 the absence of a species.
Ordination techniques such as NMDS require at least one observation per site.
Hence, we need to dismiss all sites in which no species were found.

The resulting matrix serves as input for the NMDS.
`k` specifies the number of output axes, here set to 4.
NMDS is an iterative procedure trying to make the ordinated space more similar to the input matrix in each step.
To make sure that the algorithm converges, we set the number of steps to 500 using the `try` parameter.

A stress value of 9 represents a very good result, which means that the reduced ordination space represents the large majority of the variance of the input matrix.
Overall, NMDS puts objects that are more similar (in terms of species composition) closer together in ordination space.
However, as opposed to most other ordination techniques, the axes are arbitrary and not necessarily ordered by importance (Borcard, Gillet, and Legendre 2011).
However, we already know that humidity represents the main gradient in the study area (Muenchow, Bräuning, et al. 2013; Muenchow, Schratz, and Brenning 2017).
Since humidity is highly correlated with elevation, we rotate the NMDS axes in accordance with elevation (see also `?MDSrotate` for more details on rotating NMDS axes).
Plotting the result reveals that the first axis is, as intended, clearly associated with altitude (Figure 15.3).

FIGURE 15.3: Plotting the first NMDS axis against altitude.

The scores of the first NMDS axis represent the different vegetation formations, i.e., the floristic gradient, appearing along the slope of Mt. Mongón.
To visualize them spatially, we can model the NMDS scores with the previously created predictors (Section 15.2), and use the resulting model for predictive mapping (see the next section).

```r
# presence-absence matrix
pa = vegan::decostand(comm, "pa") # 100 rows (sites), 69 columns (species)
# keep only sites in which at least one species was found
pa = pa[rowSums(pa) != 0, ] # 84 rows, 69 columns
set.seed(25072018)
nmds = vegan::metaMDS(comm = pa, k = 4, try = 500)
nmds$stress
#> ...
#> Run 498 stress 0.08834745
#> ... Procrustes: rmse 0.004100446  max resid 0.03041186
#> Run 499 stress 0.08874805
#> ... Procrustes: rmse 0.01822361  max resid 0.08054538
#> Run 500 stress 0.08863627
#> ... Procrustes: rmse 0.01421176  max resid 0.04985418
#> *** Solution reached
#> 0.08831395
elev = dplyr::filter(random_points, id %in% rownames(pa)) |>
  dplyr::pull(dem)
# rotating NMDS in accordance with altitude (proxy for humidity)
rotnmds = vegan::MDSrotate(nmds, elev)
# extracting the first two axes
sc = vegan::scores(rotnmds, choices = 1:2, display = "sites")
# plotting the first axis against altitude
plot(y = sc[, 1], x = elev, xlab = "elevation in m",
     ylab = "First NMDS axis", cex.lab = 0.8, cex.axis = 0.8)
```

## 15.4 Modeling the floristic gradient

To predict the floristic gradient spatially, we use a random forest model.
Random forest models are frequently applied in environmental and ecological modeling, and often provide the best results in terms of predictive performance (Hengl et al. 2018; Schratz et al. 2019).
Here, we shortly introduce decision trees and bagging, since they form the basis of random forests.
We refer the reader to James et al. (2013) for a more detailed description of random forests and related techniques.

To introduce decision trees by example, we first construct a response-predictor matrix by joining the rotated NMDS scores to the field observations (`random_points`).
We will also use the resulting data frame for the mlr3 modeling later on.

Decision trees split the predictor space into a number of regions.
To illustrate this, we apply a decision tree to our data using the scores of the first NMDS axis as the response (`sc`) and altitude (`dem`) as the only predictor.

FIGURE 15.4: Simple example of a decision tree with three internal nodes and four terminal nodes.

The resulting tree consists of three internal nodes and four terminal nodes (Figure 15.4).
The first internal node at the top of the tree assigns all observations below 328.5 m to the left branch and all other observations to the right branch.
The observations falling into the left branch have a mean NMDS score of -1.198.
Overall, we can interpret the tree as follows: the higher the elevation, the higher the NMDS score becomes.
This means that the simple decision tree has already revealed four distinct floristic assemblages.
For a more in-depth interpretation please refer to Section 15.4.2.
Decision trees have a tendency to overfit, that is, they mirror too closely the input data, including its noise, which in turn leads to bad predictive performances (Section 12.4; James et al. (2013)).
Bootstrap aggregation (bagging) is an ensemble technique that can help to overcome this problem.
Ensemble techniques simply combine the predictions of multiple models.
Thus, bagging takes repeated samples of the same input data and averages the predictions.
This reduces the variance and overfitting, with the result of a much better predictive accuracy compared to decision trees.
Finally, random forests extend and improve bagging by decorrelating the trees, which is desirable since averaging the predictions of highly correlated trees shows a higher variance, and thus lower reliability, than averaging the predictions of decorrelated trees (James et al. 2013).
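The resampling-and-averaging idea behind bagging can be made concrete with a short base R sketch. Note that this is an illustrative toy, not part of the chapter's analysis: it uses simulated data and a flexible polynomial regression as a stand-in for a tree learner, so that no extra packages are needed.

```r
# Toy illustration of bagging: fit one model per bootstrap resample of the
# data and average the predictions (simulated data; a polynomial fit acts
# as a stand-in for a tree learner)
set.seed(1)
n = 100
x = runif(n, 0, 10)
y = sin(x) + rnorm(n, sd = 0.3)
b = 25 # number of bootstrap resamples
preds = sapply(seq_len(b), function(i) {
  idx = sample(n, replace = TRUE) # bootstrap sample of the observations
  fit = lm(y ~ poly(x, 5), data = data.frame(x = x, y = y)[idx, ])
  predict(fit, newdata = data.frame(x = x)) # predict for all observations
})
bagged = rowMeans(preds) # the bagged prediction averages over all resamples
```

Averaging across the `b` fits smooths out the noise that any single resample picks up; random forests add to this scheme a random subset of predictors per tree, controlled by `mtry`.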
To achieve this, random forests use bagging, but in contrast to traditional bagging, where each tree is allowed to use all available predictors, random forests only use a random sample of all available predictors.

```r
# construct response-predictor matrix
# id- and response variable
rp = data.frame(id = as.numeric(rownames(sc)), sc = sc[, 1])
# join the predictors (dem, ndvi and terrain attributes)
rp = inner_join(random_points, rp, by = "id")
tree_mo = tree::tree(sc ~ dem, data = rp)
plot(tree_mo)
text(tree_mo, pretty = 0)
```

### 15.4.1 mlr3 building blocks

The code in this section largely follows the steps introduced in Section 12.5.2.
The differences are the following:

- The response variable is numeric, hence a regression task replaces the classification task of Section 12.5.2
- Instead of the AUROC, which can only be used for categorical response variables, we will use the root mean squared error (RMSE) as the performance measure
- We use a random forest model instead of a support vector machine, which naturally goes along with different hyperparameters
- We are leaving the assessment of a bias-reduced performance measure as an exercise to the reader (see Exercises). Instead we show how to tune hyperparameters for (spatial) predictions

Remember that 125,500 models were necessary to retrieve bias-reduced performance estimates when using 100-repeated 5-fold spatial cross-validation and a random search of 50 iterations in Section 12.5.2.
In the hyperparameter tuning level, we found the best hyperparameter combination which in turn was used in the outer performance level for predicting the test data of a specific spatial partition (see also Figure 12.6).
This was done for five spatial partitions, and repeated 100 times, yielding in total 500 optimal hyperparameter combinations.
Which one should we use for making spatial distribution maps?
The answer is simple: none at all.
Remember, the tuning was done to retrieve a bias-reduced performance estimate, not to do the best possible spatial prediction.
For the latter, one estimates the best hyperparameter combination from the complete dataset.
This means, the inner hyperparameter tuning level is no longer needed, which makes perfect sense since we are applying the model to new data (unvisited field observations) for which the true outcomes are unavailable, hence testing is impossible in this case.
Therefore, we tune the hyperparameters for a good spatial prediction on the complete dataset via a 5-fold spatial CV with one repetition.

Having already constructed the input variables (`rp`), we are all set for specifying the mlr3 building blocks (task, learner, and resampling).
For specifying a spatial task, we use again the mlr3spatiotempcv package (Schratz et al. 2021 & Section 12.5), and since our response (`sc`) is numeric, we use a regression task.
Using an sf object as the backend automatically provides the geometry information needed for the spatial partitioning later on.
Additionally, we got rid of the columns `id` and `spri` since these variables should not be used as predictors in the modeling.
Next, we go on to construct a random forest learner from the ranger package (Wright and Ziegler 2017).

As opposed to, for example, support vector machines (see Section 12.5.2), random forests often already show good performances when used with the default values of their hyperparameters (which may be one reason for their popularity).
Still, tuning often moderately improves model results, and is thus worth the effort (Probst, Wright, and Boulesteix 2018).
In random forests, the hyperparameters `mtry`, `min.node.size` and `sample.fraction` determine the degree of randomness, and should be tuned (Probst, Wright, and Boulesteix 2018).
`mtry` indicates how many predictor variables should be used in each tree.
If all predictors are used, then this corresponds in fact to bagging (see the beginning of Section 15.4).
The `sample.fraction` parameter specifies the fraction of observations to be used in each tree.
Smaller fractions lead to greater diversity, and thus less correlated trees, which often is desirable (see above).
The `min.node.size` parameter indicates the number of observations a terminal node should at least have (see also Figure 15.4).
Naturally, trees and computing time become larger the lower the `min.node.size`.

Hyperparameter combinations will be selected randomly but should fall inside specific tuning limits (created with `paradox::ps()`).
`mtry` should range between 1 and the number of predictors (4), `sample.fraction` should range between 0.2 and 0.9, and `min.node.size` should range between 1 and 10 (Probst, Wright, and Boulesteix 2018).

Having defined the search space, we are all set for specifying our tuning via the `AutoTuner()` function.
Since we deal with geographic data, we will again make use of spatial cross-validation to tune the hyperparameters (see Sections 12.4 and 12.5).
Specifically, we will use a five-fold spatial partitioning with only one repetition (`rsmp()`).
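For reference, the root mean squared error used as the performance measure in this tuning is defined as follows, where $y_i$ are the observed and $\hat{y}_i$ the predicted NMDS scores of the $n$ observations:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
$$

Lower values are better, and the measure is expressed on the same scale as the response.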
In each of these spatial partitions, we run 50 models (`trm()`) while using randomly selected hyperparameter configurations (`tnr()`) within predefined limits (`search_space`) to find the optimal hyperparameter combination (see also Section 12.5.2 and https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html#sec-autotuner, Becker et al. 2022).
The performance measure is the root mean squared error (RMSE).

Calling the `train()`-method of the AutoTuner-object finally runs the hyperparameter tuning, and will find the optimal hyperparameter combination for the specified parameters.

```r
# create task
task = mlr3spatiotempcv::as_task_regr_st(
  select(rp, -id, -spri),
  target = "sc",
  id = "mongon"
)
lrn_rf = lrn("regr.ranger", predict_type = "response")
# specifying the search space
search_space = paradox::ps(
  mtry = paradox::p_int(lower = 1, upper = ncol(task$data()) - 1),
  sample.fraction = paradox::p_dbl(lower = 0.2, upper = 0.9),
  min.node.size = paradox::p_int(lower = 1, upper = 10)
)
autotuner_rf = mlr3tuning::auto_tuner(
  learner = lrn_rf,
  resampling = mlr3::rsmp("spcv_coords", folds = 5), # spatial partitioning
  measure = mlr3::msr("regr.rmse"), # performance measure
  terminator = mlr3tuning::trm("evals", n_evals = 50), # specify 50 iterations
  search_space = search_space, # predefined hyperparameter search space
  tuner = mlr3tuning::tnr("random_search") # specify random search
)
# hyperparameter tuning
set.seed(24092024)
autotuner_rf$train(task)
autotuner_rf$tuning_result
#>    mtry sample.fraction min.node.size learner_param_vals  x_domain regr.rmse
#>
#> 1:    4           0.784            10                                  0.382
```

### 15.4.2 Predictive mapping

The tuned hyperparameters can now be used for the prediction.
To do so, we only need to run the predict method of our fitted AutoTuner object.
The predict method will apply the model to all observations used in the modeling.
Given a multilayer SpatRaster containing rasters named as the predictors used in the modeling, `terra::predict()` will also make spatial distribution maps, i.e., predict to new data.

FIGURE 15.5: Predictive mapping of the floristic gradient clearly revealing distinct vegetation belts.

In case `terra::predict()` does not support a model algorithm, you can still make the predictions manually.

The predictive mapping clearly reveals distinct vegetation belts (Figure 15.5).
Please refer to Muenchow, Hauenstein, et al. (2013) for a detailed description of vegetation belts on lomas mountains.
The blue color tones represent the so-called Tillandsia-belt.
Tillandsia is a highly adapted genus, especially found in high quantities at the sandy and quite desertic foot of lomas mountains.
The yellow color tones refer to a herbaceous vegetation belt with a much higher plant cover compared to the Tillandsia-belt.
The orange colors represent the bromeliad belt, which features the highest species richness and plant cover.
It can be found directly beneath the temperature inversion (ca. 750-850 m asl), where humidity due to fog is highest.
Water availability naturally decreases above the temperature inversion, and the landscape becomes desertic again with only a few succulent species (succulent belt; red colors).
Interestingly, the spatial prediction clearly reveals that the bromeliad belt is interrupted, which is a very interesting finding we would not have detected without the predictive mapping.

```r
# predicting using the best hyperparameter combination
autotuner_rf$predict(task)
#> <PredictionRegr> for 84 observations:
#>  row_ids  truth response
#>        1 -1.084   -1.176
#>        2 -0.975   -1.176
#>        3 -0.912   -1.168
#>      ---    ---      ---
#>       82  0.814    0.594
#>       83  0.814    0.746
#>       84  0.808    0.807
pred = terra::predict(ep, model = autotuner_rf, fun = predict)
newdata = as.data.frame(as.matrix(ep))
colSums(is.na(newdata)) # 0 NAs
# but assuming there were 0s results in a more generic approach
ind = rowSums(is.na(newdata)) == 0
tmp = autotuner_rf$predict_newdata(newdata = newdata[ind, ], task = task)
newdata[ind, "pred"] = data.table::as.data.table(tmp)[["response"]]
pred_2 = ep$dem
# now fill the raster with the predicted values
pred_2[] = newdata$pred
# check if the terra prediction and our manual prediction are the same
all(values(pred - pred_2) == 0)
```
## 15.5 Conclusions

In this chapter we have ordinated the community matrix of the lomas of Mt. Mongón with the help of a NMDS (Section 15.3).
The first axis, representing the main floristic gradient in the study area, was modeled as a function of environmental predictors which were partly derived with the help of R-GIS bridges (Section 15.2).
The mlr3 package provided the building blocks to spatially tune the hyperparameters `mtry`, `sample.fraction` and `min.node.size` (Section 15.4.1).
The tuned hyperparameters served as input for the final model which in turn was applied to the environmental predictors for a spatial representation of the floristic gradient (Section 15.4.2).
The result demonstrates spatially the astounding biodiversity in the middle of the desert.
Since lomas mountains are heavily endangered, the prediction map can serve as a basis for informed decision-making on delineating protection zones, and making the local population aware of the uniqueness found in their immediate neighborhood.

In terms of methodology, a few additional points could be addressed:

- It would be interesting to also model the second ordination axis, and to subsequently find an innovative way of visualizing jointly the modeled scores of the two axes in one prediction map
- If we were interested in interpreting the model in an ecologically meaningful way, we should probably use (semi-)parametric models (Muenchow, Bräuning, et al. 2013; A. Zuur et al. 2009; A. F. Zuur et al. 2017).
However, there are at least approaches that help to interpret machine learning models such as random forests (see, e.g., https://mlr-org.com/posts/2018-04-30-interpretable-machine-learning-iml--mlr/)
- A sequential model-based optimization (SMBO) might be preferable to the random search for hyperparameter optimization used in this chapter (Probst, Wright, and Boulesteix 2018)

Finally, please note that random forest and other machine learning models are frequently used in a setting with lots of observations and many predictors, much more than used in this chapter, and where it is unclear which variables and variable interactions contribute to explaining the response.
Additionally, the relationships might be highly non-linear.
In our use case, the relationship between response and predictors is pretty clear, there is only a slight amount of non-linearity and the number of observations and predictors is low.
Hence, it might be worth trying a linear model.
A linear model is much easier to explain and understand than a random forest model, and therefore to be preferred (law of parsimony); additionally it is computationally less demanding (see Exercises).
If the linear model cannot cope with the degree of non-linearity present in the data, one could also try a generalized additive model (GAM).
The point here is that the toolbox of a data scientist consists of more than one tool, and it is your responsibility to select the tool best suited for the task and purpose at hand.
Here, we wanted to introduce the reader to random forest modeling and how to use the corresponding results for predictive mapping purposes.
For this purpose, a well-studied dataset with known relationships between response and predictors is appropriate.
However, this does not imply that the random forest model has returned the best result in terms of predictive performance.

## 15.6 Exercises

The solutions assume the following packages are attached (other packages will be attached when needed):

E1. Run a NMDS using the percentage data of the community matrix.
Report the stress value and compare it to the stress value as retrieved from the NMDS using presence-absence data.
What might explain the observed difference?

E2.
Compute all the predictor rasters we have used in this chapter (catchment slope, catchment area), and put them into a SpatRaster-object.
Add `dem` and `ndvi` to it.
Next, compute profile and tangential curvature and add them as additional predictor rasters (hint: `grass7:r.slope.aspect`).
Finally, construct a response-predictor matrix.
The scores of the first NMDS axis (which were the result when using the presence-absence community matrix) rotated in accordance with elevation represent the response variable, and should be joined to `random_points` (use an inner join).
To complete the response-predictor matrix, extract the values of the environmental predictor raster object to `random_points`.

E3. Retrieve the bias-reduced RMSE of a random forest and a linear model using spatial cross-validation.
The random forest modeling should include the estimation of optimal hyperparameter combinations (random search with 50 iterations) in an inner tuning loop.
Parallelize the tuning level.
Report the mean RMSE and use a boxplot to visualize the retrieved RMSEs.
Please note that this exercise is best solved using the mlr3 functions `benchmark_grid()` and `benchmark()` (see https://mlr3book.mlr-org.com/perf-eval-cmp.html#benchmarking for more information).

# 16 Conclusion

## 16.1 Introduction

Like the introduction, this concluding chapter contains few code chunks.
Its aim is to synthesize the contents of the book, with reference to recurring themes/concepts, and to inspire future directions of application and development.
The chapter has no prerequisites.
However, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), tried more advanced approaches in Part II (Extensions), and considered how geocomputation can help you solve work, research or other problems, with reference to the chapters in Part III (Applications).

The chapter is organized as follows.
Section 16.2 discusses the wide range of options for handling geographic data in R.
Choice is a key feature of open source software; the section provides guidance on choosing between the various options.
Section 16.3 describes gaps in the book's contents and explains why some areas of research were deliberately omitted, while others were emphasized.
Next, Section 16.4 provides advice on how to ask good questions when you get stuck, and how to search for solutions online.
Section 16.5 answers the following question: having read this book, where to go next?
Section 16.6 returns to the wider issues raised in Chapter 1.
In it we consider geocomputation as part of a wider 'open source approach', which ensures that methods are publicly accessible, reproducible and supported by collaborative communities.
This final section of the book also provides some pointers on how to get involved.

## 16.2 Package choice

A feature of R, and of open source software in general, is that there are often multiple ways to achieve the same result.
The code chunk below illustrates this by using three functions, covered in Chapters 3 and 5, to combine the 16 regions of New Zealand into a single geometry:

Although the classes, attributes and column names of the resulting objects `nz_u1` to `nz_u3` differ, their geometries are identical, as verified using the base R function `identical()`.
Which to use?
It depends: while the former only processes the geometry data contained in `nz` and so is faster, the other options performed attribute operations, which may be useful for subsequent steps.
Whether to use the base R function `aggregate()` or the dplyr function `summarise()` is a matter of preference, with the latter being more readable for many.

The wider point is that there are often multiple options to choose from when working with geographic data in R, even within a single package.
The range of options grows further when more R packages are considered: you could achieve the same result using the older sp package, for example.
However, based on our goal of providing good advice, we recommend using the more recent, more performant and future-proof sf package.
The same applies to all packages showcased in this book, although it can be helpful (when not distracting) to be aware of alternatives and to be able to justify your choice of software.

A common choice, for which there is no simple answer, is between tidyverse and base R for geocomputation.
The following code chunk, for example, shows tidyverse and base R ways to extract the `Name` column from the `nz` object, as described in Chapter 3:

This raises the question: which to use?
The answer is: it depends.
Each approach has advantages: base R tends to be stable, well-known, and has minimal dependencies, which is why it is often preferred for software (package) development.
The tidyverse approach, on the other hand, is often preferred for interactive programming.
Choosing between the two approaches is therefore a matter of preference and application.

The book covers the most commonly
needed functions — the base R `[` subsetting operator and the dplyr function `select()` demonstrated in the code chunk above, for example — but there are many more functions for working with geographic data, from other packages, that have not been mentioned.
Chapter 1 mentions 20+ influential packages for working with geographic data, and only a handful of these are covered in the book.
Hundreds of other packages are available for working with geographic data in R, and many more are developed each year.
In 2024, there were more than 160 packages mentioned in the Spatial Task View, and countless functions for geographic data analysis are developed each year.

The rate of evolution in R's spatial ecosystem may be fast, but there are strategies to deal with the wide range of options.
Our advice is to start by learning one approach in depth but to have a general understanding of the breadth of available options.
This advice applies equally to solving geographic problems with R as it does to other fields of knowledge and application.
Section 16.5 covers developments in other languages.

Of course, some packages perform better than others for the same task, in which case it's important to know which to use.
In the book we have aimed to focus on packages that are future-proof (they will work long into the future), high performance (relative to other R packages), well maintained (with user and developer communities surrounding them) and complementary.
But there are still overlaps in the packages we have used, as illustrated by the diversity of packages for making maps, as highlighted in Chapter 9, for example.

Overlapping functionality can be good.
A new package with similar (but not identical) functionality compared to an existing package can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, all key benefits of doing geocomputation with open source software.
In this context, deciding which combination of sf, tidyverse, terra and other packages to use should be made with knowledge of the alternatives.
The sp ecosystem that sf superseded, for example, can do many of the things covered in this book and, due to its age, is built upon by many other packages.
At the time of writing in 2024, 463 packages Depend on or Import sp, only slightly down from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.
The equivalent numbers for sf were 69 in 2018 and 431 in 2024, highlighting that the package is future-proof and has a growing user base and developer community (Bivand 2021).
Although best known for point pattern analysis, the spatstat package also supports raster and other vector geometries and provides powerful functionality for spatial statistics (Baddeley and Turner 2005).
It may also be worth researching new alternatives that are under development if your needs are not met by established packages.

```r
library(spData)
nz_u1 = sf::st_union(nz)
nz_u2 = aggregate(nz["Population"], list(rep(1, nrow(nz))), sum)
nz_u3 = dplyr::summarise(nz, t = sum(Population))
identical(nz_u1, nz_u2$geometry)
#> [1] TRUE
identical(nz_u1, nz_u3$geom)
#> [1] TRUE
library(dplyr) # attach a tidyverse package
nz_name1 = nz["Name"] # base R approach
nz_name2 = nz |> # tidyverse approach
  select(Name)
identical(nz_name1$Name, nz_name2$Name) # check results
#> [1] TRUE
```

## 16.3 Gaps and overlaps

Geocomputation is a big area, so there are inevitably gaps in this book.
We have been selective, deliberately highlighting certain topics, techniques and packages, while omitting others.
We have tried to emphasize the topics most commonly needed in real-world applications such as geographic data operations, the basics of coordinate reference systems, read/write data operations and visualization techniques.
These topics and themes appear repeatedly, with the aim of building the essential skills for geocomputation, and showing you where to go for more advanced topics and specific applications.

We deliberately omitted some topics that are covered in-depth elsewhere.
Statistical modeling of spatial data such as point pattern analysis, spatial interpolation (e.g., kriging) and spatial regression, for example, are mentioned in the context of machine learning in Chapter 12 but not covered in detail.
There are already excellent resources on these methods, including statistically orientated chapters in Pebesma and Bivand (2023c) and books on point pattern analysis (Baddeley, Rubak, and Turner 2015), Bayesian techniques applied to spatial data (Gómez-Rubio 2020; Moraga 2023), and books focused on particular applications such as health (Moraga 2019) and wildfire severity analysis (Wimberly 2023).
Other topics which received limited attention were remote sensing and using R alongside (rather than as a bridge to) dedicated GIS software.
There are many resources on these topics, including a discussion of remote sensing in R, Wegmann, Leutner, and Dech (2016) and the GIS-related teaching materials available from Marburg University.

We focused on machine learning rather than
spatial statistical inference in Chapters 12 and 15 due to the abundance of quality resources on the topic.
These resources include A. Zuur et al. (2009) and A. F. Zuur et al. (2017), which focus on ecological use cases, and the freely available teaching material and code on Geostatistics & Open-source Statistical Computing hosted at css.cornell.edu/faculty/dgr2.
R for Geographic Data Science provides an introduction to R for geographic data science and modeling.

We have largely omitted geocomputation on 'big data', by which we mean datasets that do not fit on a high-spec laptop.
This decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications do fit on consumer hardware, with large high-resolution remote sensing datasets being a notable exception (see Section 10.8).
It is possible to get more RAM in your computer or to temporarily 'rent' compute power from platforms such as GitHub Codespaces, which can be used to run the code in this book.
Furthermore, learning to solve problems with small datasets is a prerequisite to solving problems with huge datasets; the emphasis in this book is on getting started, and the skills you learn here will be useful when you move on to bigger datasets.
Analysis of 'big data' often involves extracting a small amount of data from a database for a specific statistical analysis.
Spatial databases, covered in Chapter 10, can help with the analysis of datasets that do not fit in memory.
'Earth observation cloud back-ends' can be accessed from R with the openeo package (Section 10.8.2).
If you need to work with big geographic datasets, we also recommend exploring projects such as Apache Sedona and emerging file formats such as GeoParquet.

## 16.4 Getting help

Geocomputation is a large and challenging field, making issues and temporary blockers to work near inevitable.
In many cases you may just 'get stuck' at a particular point in your data analysis workflow, facing cryptic error messages that are hard to debug.
Or you may get unexpected results with few clues about what is going on.
This section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches do not solve the problem, through the art of asking good questions.

When you get stuck at a particular point, it is worth first taking a step back and working out which approach is most likely to solve the issue.
Trying each of the following steps — skipping those steps already taken — provides a structured approach to problem-solving:

1. Define exactly what you are trying to achieve, starting from first principles (and often a sketch, as outlined below)
2. Diagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)
3. Read the documentation of the function that has been diagnosed as the 'point of failure' in the previous step. Simply understanding the required inputs to functions, and running the examples that are often provided at the bottom of help pages, can help to solve a surprisingly large proportion of issues (run the command `?terra::rast` and scroll down to the examples that are worth reproducing when getting started with the function, for example)
4. If reading R's built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you're seeing. See the list of places to search for help below
5. If all the previous steps fail, and you cannot find a solution from your online searches, it may be time to compose a question with a reproducible example and post it in an appropriate place

Steps 1 to 3 outlined above are fairly self-explanatory but, due to the vastness of the internet and the multitude of search options, it is worth considering effective search strategies before deciding to compose a question.

### 16.4.1 Searching for solutions online

Search engines are a logical place to start for many issues.
'Googling it' can in some cases result in the discovery of blog posts, forum messages and other online content about the precise issue you're having.
Simply typing in a clear description of the problem/question is a valid approach here, but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset-specific).
You can also make online searches more effective by including additional detail.
Use quotation marks to maximize the chances that 'hits' relate to the exact issue you're having, by reducing the number of results returned.
For example, if you try and fail to save a GeoJSON file to a location that already exists, you could get an error containing the message "GDAL Error 6: DeleteLayer() not supported by this dataset".
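An error of this family can be reproduced (and resolved) with sf in a few lines. This is a sketch rather than the exact recipe behind the message above, and the precise wording of the error depends on the driver and GDAL version:

```r
library(sf)
# read a demo dataset shipped with sf and write it to a GeoJSON file twice:
# the second write fails because the dataset already exists
nc = st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
f = file.path(tempdir(), "nc.geojson")
st_write(nc, f, quiet = TRUE)      # first write succeeds
try(st_write(nc, f, quiet = TRUE)) # errors: dataset already exists
# delete the existing file first, then rewrite it
st_write(nc, f, delete_dsn = TRUE, quiet = TRUE)
```

Quoting the distinctive part of such a message verbatim, together with the package name, is usually the fastest route to an existing answer.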
A more specific search query such as "GDAL Error 6" sf is more likely to yield a solution than searching for GDAL Error 6 without quotation marks.
Setting time restraints, for example returning only content created within the last year, can be useful when searching for help with an evolving package.
Finally, make use of additional search engine features, for example restricting searches to content hosted on CRAN with site:r-project.org.

### 16.4.2 Places to search for (and ask) for help

In cases where online searches do not yield a solution, it is worth asking for help.
There are many forums where you can do this, including:

- R's Special Interest Group on Geographic data email list (R-SIG-GEO)
- The GIS Stackexchange website at gis.stackexchange.com
- The large and general purpose programming Q&A site stackoverflow.com
- Online forums associated with a particular entity, such as the Posit Community, the rOpenSci Discuss web forum and forums associated with particular software tools such as the Stan forum
- Software development platforms such as GitHub, which hosts issue trackers for the majority of R-spatial packages and also, increasingly, built-in discussion pages such as that created to encourage discussion (not just bug reporting) around the sfnetworks package (see luukvdmeer/sfnetworks/discussions)
- Online chat rooms and forums associated with communities such as rOpenSci and the geocompx community (which has a Discord server where you can ask questions), of which this book is a part

### 16.4.3 Reproducible examples with reprex

In terms of asking a good question, a clearly stated question supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).
It is also helpful, after showing the code that does not 'work' from the user's perspective, to explain what you would like to see.
A very useful tool for creating reproducible examples is the reprex package.
To highlight unexpected behavior, you can write completely reproducible code that demonstrates the issue and use the `reprex()` function to create a copy of your code that can be pasted into a forum or other online space.

Imagine you are trying to create a map of the world with blue sea and green land.
You could simply ask how to do this in one of the places outlined in the previous section.
However, it is likely that you will get a better response if you provide a reproducible example of what you have tried so far.
The following code creates a map of the world with blue sea and green land, but only the land is filled in:

If you post this code in a forum, it is likely that you will get a more specific and useful response.
For example, someone might respond with the following code, which demonstrably solves the problem, as illustrated in Figure 16.1:

FIGURE 16.1: A map of the world with green land, illustrating a question with a reproducible example (left) and its solution (right).

Exercise for the reader: copy this code, run the command `reprex::reprex()` (or paste the command into a `reprex()` function call) and paste the output into a forum or other online space.

The strength of open source and collaborative approaches to geocomputation is that they generate a vast and ever evolving body of knowledge, of which this book is a part.
Demonstrating your own efforts to solve a problem, and providing a reproducible example of the problem, is a way of contributing to this body of knowledge.

```r
library(sf)
library(spData)
plot(st_geometry(world), col = "green")
```

```r
library(sf)
library(spData)
# use the bg argument to fill in the sea
plot(st_geometry(world), col = "green", bg = "lightblue")
```

### 16.4.4 Defining and sketching the problem

In some cases, you may not be able to find a solution to your problem online, or you may not be able to formulate a question about it that can be answered by a search engine.
The best starting point in such cases, or when developing a new geocomputational methodology, may be pen and paper (or equivalent digital sketching tools such as Excalidraw and tldraw, which allow collaborative sketching and rapid sharing of ideas).
During the most creative early stages of methodological development work, software of any kind can slow down your thoughts and direct them away from important abstract thoughts.
Framing the question with mathematics is also highly recommended, with reference to a minimal example that you can sketch 'before' and 'after' versions of numerically.
If you have the skills and if the problem warrants it, describing the approach algebraically can in some cases help to develop effective implementations.

## 16.5 Where to go next?

As indicated in Section 16.3, the book has covered only a fraction of R's geographic ecosystem, and there is much more to discover.
We have progressed quickly, from geographic data models in Chapter 2, to
advanced applications Chapter 15.\nConsolidation skills learned, discovery new packages approaches handling geographic data, application methods new datasets domains suggested future directions.\nsection expands general advice suggesting specific ‘next steps’, highlighted bold .addition learning geographic methods applications R, example reference work cited previous section, deepening understanding R logical next step.\nR’s fundamental classes data.frame matrix foundation sf terra classes, studying improve understanding geographic data.\ncan done reference documents part R, can found command help.start() additional resources subject Wickham (2019) Chambers (2016).Another software-related direction future learning discovering geocomputation languages.\ngood reasons learning R language geocomputation, described Chapter 1, option.104\npossible study Geocomputation : Python, C++, JavaScript, Scala Rust equal depth.\nevolving geospatial capabilities.\nrasterio, example, Python package similar functionality terra package used book.\nSee Geocomputation Python, introduction geocomputation Python.Dozens geospatial libraries developed C++, including well-known libraries GDAL GEOS, less well-known libraries Orfeo Toolbox processing remote sensing (raster) data.\nTurf.js example potential geocomputation JavaScript.\nGeoTrellis provides functions working raster vector data Java-based language Scala.\nWhiteBoxTools provides example rapidly evolving command line GIS implemented Rust.\npackages/libraries/languages advantages geocomputation many discover, documented curated list open source geospatial resources Awesome-Geospatial.geocomputation software, however.\ncan recommend exploring learning new research topics methods academic theoretical perspectives.\nMany methods written yet implemented.\nLearning geographic methods potential applications can therefore rewarding, writing code.\nexample geographic methods increasingly implemented R sampling strategies scientific 
applications.\nnext step case read-relevant articles area Brus (2018), accompanied reproducible code tutorial content hosted github.com/DickBrus/TutorialSampling4DSM.","code":""},{"path":"conclusion.html","id":"benefit","chapter":"16 Conclusion","heading":"16.6 The open source approach","text":"technical book, makes sense next steps, outlined previous section, also technical.\nHowever, wider issues worth considering final section, returns definition geocomputation.\nOne elements term introduced Chapter 1 geographic methods positive impact.\ncourse, define measure ‘positive’ subjective, philosophical question beyond scope book.\nRegardless worldview, consideration impacts geocomputational work useful exercise:\npotential positive impacts can provide powerful motivation future learning , conversely, new methods can open-many possible fields application.\nconsiderations lead conclusion geocomputation part wider ‘open source approach’.Section 1.1 presented terms mean roughly thing geocomputation, including geographic data science (GDS) ‘GIScience’.\ncapture essence working geographic data, geocomputation advantages: concisely captures ‘computational’ way working geographic data advocated book — implemented code therefore encouraging reproducibility — builds desirable ingredients early definition (Openshaw Abrahart 2000):creative use geographic dataApplication real-world problemsBuilding ‘scientific’ toolsReproducibilityWe added final ingredient: reproducibility barely mentioned early work geocomputation, yet strong case can made vital component first two ingredients.Reproducibility:Encourages creativity shifting focus away basics (readily available shared code) toward applicationsDiscourages people ‘reinventing wheel’: need redo others done methods can used othersMakes research conducive real-world applications, enabling anyone sector apply one’s methods new areasIf reproducibility defining asset geocomputation (command line GIS), worth considering makes 
reproducible.\nbrings us ‘open source approach’, three main components:command line interface (CLI), encouraging scripts recording geographic work shared reproducedOpen source software, can inspected potentially improved anyone worldAn active user developer community, collaborates self-organizes build complementary modular toolsLike term geocomputation, open source approach technical entity.\ncommunity composed people interacting daily shared aims: produce high-performance tools, free commercial legal restrictions, accessible anyone use.\nopen source approach working geographic data advantages transcend technicalities software works, encouraging learning, collaboration efficient division labor.many ways engage community, especially emergence code hosting sites, GitHub, encourage communication collaboration.\ngood place start simply browsing source code, ‘issues’ ‘commits’ geographic package interest.\nquick glance r-spatial/sf GitHub repository, hosts code underlying sf package, shows 100+ people contributed codebase documentation.\nDozens people contributed asking questions contributing ‘upstream’ packages sf uses.\n1,500 issues closed issue tracker, representing huge amount work make sf faster, stable user-friendly.\nexample, just one package dozens, shows scale intellectual operation underway make R highly effective continuously evolving language geocomputation.instructive watch incessant development activity happen public fora GitHub, even rewarding become active participant.\none greatest features open source approach: encourages people get involved.\nbook result open source approach:\nmotivated amazing developments R’s geographic capabilities last two decades, made practically possible dialogue code-sharing platforms collaboration.\nhope addition disseminating useful methods working geographic data, book inspires take open source 
approach.","code":""},{"path":"references.html","id":"references","chapter":"References","heading":"References","text":"","code":""}] diff --git a/transport.html b/transport.html index 03c4d33e6..ba2cea86e 100644 --- a/transport.html +++ b/transport.html @@ -652,14 +652,13 @@

    mutate(bicycle = bicycle + car_driver * uptake,
           car_driver = car_driver * (1 - uptake))
sum(routes_short_scenario$bicycle) - sum(routes_short$bicycle)
-#> [1] 692
+#> [1] 598

    Having created a scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated modeled cycling activity will take place. For this, we will use the function overline() from the stplanr package. The function breaks linestrings at junctions (where two or more linestring geometries meet) and calculates aggregate statistics for each unique route segment (Morgan and Lovelace 2020), taking an object containing routes and the names of the attributes to summarize as its first and second arguments:

     route_network_scenario = overline(routes_short_scenario, attrib = "bicycle")

    The outputs of the two preceding code chunks are summarized in Figure 13.6 below.

    -
    #> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling.
    Illustration of the percentage of car trips switching to cycling as a function of distance (left) and route network level results of this function (right).

    @@ -669,7 +668,7 @@

    Transport networks with records at the segment level, typically with attributes such as road type and width, constitute a common type of route network. Such route network datasets are available worldwide from OpenStreetMap, and can be downloaded with packages such as osmdata and osmextract. To save time downloading and preparing OSM, we will use the bristol_ways object from the spDataLarge package, an sf object with LINESTRING geometries and attributes representing a sample of the transport network in the case study region (see ?bristol_ways for details), as shown in the output below:

    -
    +
     summary(bristol_ways)
     #>      highway       maxspeed             ref                     geometry   
     #>  cycleway:1721   Length:6160        Length:6160        LINESTRING   :6160  
    @@ -681,12 +680,12 @@ 

    You can manually convert a route network into an igraph object, but the geographic attributes will be lost. To overcome this limitation of igraph, the sfnetworks package (van der Meer et al. 2023) was developed to represent route networks simultaneously as graphs and as geographic lines. We will demonstrate sfnetworks functionality on the bristol_ways object.
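To make the limitation concrete, the following minimal sketch (not from the book; the two toy segments and their coordinates are invented for illustration) shows that a plain igraph object stores only topology, so any LINESTRING geometries must be tracked separately:

```r
library(sf)
library(igraph)
# Two touching toy segments: (0,0)-(1,0) and (1,0)-(1,1)
l1 = st_linestring(rbind(c(0, 0), c(1, 0)))
l2 = st_linestring(rbind(c(1, 0), c(1, 1)))
lines_sf = st_sf(geometry = st_sfc(l1, l2))
# An igraph object records only nodes and edges (the topology)
g = graph_from_data_frame(
  data.frame(from = c(1, 2), to = c(2, 3)),
  directed = FALSE
)
# 'g' has no slot for the LINESTRING geometries in lines_sf; they must be
# carried alongside the graph by hand, which is the gap sfnetworks fills
# by keeping the graph and the sf geometries in sync
```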

    -
    +
     bristol_ways$lengths = st_length(bristol_ways)
     ways_sfn = as_sfnetwork(bristol_ways)
     class(ways_sfn)
     #> [1] "sfnetwork" "tbl_graph" "igraph"
    -
    +
     ways_sfn
     #> # A sfnetwork with 5728 nodes and 4915 edges
     #> # A directed multigraph with 1013 components with spatially explicit edges
    @@ -701,11 +700,10 @@ 

    In the example below, the ‘edge betweenness’, meaning the number of shortest paths passing through each edge, is calculated (see ?igraph::betweenness for further details). The output of the edge betweenness calculation is shown in Figure 13.7, which has the cycle route network dataset calculated with the overline() function as an overlay for comparison. The results demonstrate that each graph edge represents a segment: the segments near the center of the road network have the highest betweenness values, whereas segments closer to central Bristol have higher cycling potential, based on these simplistic datasets.

    -
    +
     ways_centrality = ways_sfn |> 
       activate("edges") |>  
       mutate(betweenness = tidygraph::centrality_edge_betweenness(lengths)) 
    -
    #> [plot mode] fit legend/component: Some legend items or map compoments do not fit well, and are therefore rescaled. Set the tmap option 'component.autoscale' to FALSE to disable rescaling.
    Illustration of route network datasets. The grey lines represent a simplified road network, with segment thickness proportional to betweenness. The green lines represent potential cycling flows (one way) calculated with the code above.

    @@ -730,20 +728,20 @@

    The steps that got us to this stage included identifying short but car-dependent commuting routes (generated from desire lines) in Section 13.6 and analysis of route network characteristics with the sfnetworks package in Section 13.7. The final code chunk of this chapter combines these strands of analysis, by overlaying estimates of cycling potential from the previous section on top of a new dataset representing areas within a short distance of cycling infrastructure. This new dataset is created in the code chunk below, which: 1) filters out the cycleway entities from the bristol_ways object representing the transport network; 2) ‘unions’ the individual LINESTRING entities of the cycleways into a single multilinestring object (for speed of buffering); and 3) creates a 100 m buffer around them to create a polygon.

    -
    +
     existing_cycleways_buffer = bristol_ways |> 
       filter(highway == "cycleway") |>    # 1) filter out cycleways
       st_union() |>                       # 2) unite geometries
       st_buffer(dist = 100)               # 3) create buffer

    The next stage is to create a dataset representing points on the network where there is high cycling potential but little provision for cycling.

    -
    +
 route_network_no_infra = st_difference(
-  route_network_scenario,
+  route_network_scenario |> st_set_crs(st_crs(existing_cycleways_buffer)),
   existing_cycleways_buffer
 )

    The results of the preceding code chunks are shown in Figure 13.8, which shows routes with high levels of car dependency and high cycling potential but no cycleways.

    -
    +
     tmap_mode("view")
     qtm(route_network_no_infra, basemaps = leaflet::providers$Esri.WorldTopoMap,
         lines.lwd = 5)