diff --git a/12-spatial-cv.Rmd b/12-spatial-cv.Rmd
index 2788b5a3d..bb35c488d 100644
--- a/12-spatial-cv.Rmd
+++ b/12-spatial-cv.Rmd
@@ -297,7 +297,7 @@ For spatial CV, we need to provide a few extra arguments.
 The `coordinate_names` argument expects the names of the coordinate columns (see Section \@ref(intro-cv) and Figure \@ref(fig:partitioning)).
 Additionally, we should indicate the used CRS (`crs`) and decide if we want to use the coordinates as predictors in the modeling (`coords_as_features`).
 
-```{r 12-spatial-cv-11, eval=FALSE}
+```{r 12-spatial-cv-11, eval=TRUE}
 # 1. create task
 task = mlr3spatiotempcv::as_task_classif_st(
   mlr3::as_data_backend(lsl),
@@ -358,7 +358,7 @@ This yields all learners able to model two-class problems (landslide yes or no).
 We opt for the binomial classification\index{classification} method used in Section \@ref(conventional-model) and implemented as `classif.log_reg` in **mlr3learners**.
 Additionally, we need to specify the `predict.type` which determines the type of the prediction with `prob` resulting in the predicted probability for landslide occurrence between 0 and 1 (this corresponds to `type = response` in `predict.glm()`).
 
-```{r 12-spatial-cv-13, eval=FALSE}
+```{r 12-spatial-cv-13, eval=TRUE}
 # 2. specify learner
 learner = mlr3::lrn("classif.log_reg", predict_type = "prob")
 ```
@@ -402,7 +402,7 @@ We will use a 100-repeated 5-fold spatial CV\index{cross-validation!spatial CV}:
 In the meantime, its functionality was integrated into the **mlr3** ecosystem which is the reason why we are using **mlr3** [@schratz_hyperparameter_2019].
 The **tidymodels** framework is another umbrella-package for streamlined modeling in R; however, it only recently integrated support for spatial cross validation via **spatialsample** which so far only supports one spatial resampling method.
 
-```{r 12-spatial-cv-18, eval=FALSE}
+```{r 12-spatial-cv-18, eval=TRUE}
 # 3. specify resampling
 resampling = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
 ```
@@ -489,7 +489,7 @@ Before defining spatial tuning, we will set up the **mlr3**\index{mlr3 (package)
 The classification\index{classification} task remains the same, hence we can simply reuse the `task` object created in Section \@ref(glm).
 Learners implementing SVM can be found using the `list_mlr3learners()` command of the **mlr3extralearners**.
 
-```{r 12-spatial-cv-23, eval=FALSE, echo=FALSE}
+```{r 12-spatial-cv-23, eval=TRUE, echo=TRUE}
 mlr3_learners = mlr3extralearners::list_mlr3learners()
 mlr3_learners |>
   dplyr::filter(class == "classif" & grepl("svm", id)) |>
@@ -501,16 +501,18 @@ To allow for non-linear relationships, we use the popular radial basis function
 Setting the `type` argument to `"C-svc"` makes sure that `ksvm()` is solving a classification task.
 To make sure that the tuning does not stop because of one failing model, we additionally define a fallback learner (for more information please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback).
 
-```{r 12-spatial-cv-24, eval=FALSE}
+```{r 12-spatial-cv-24}
 lrn_ksvm = mlr3::lrn("classif.ksvm", predict_type = "prob",
                      kernel = "rbfdot", type = "C-svc")
-lrn_ksvm$fallback = lrn("classif.featureless", predict_type = "prob")
+lrn_ksvm$encapsulate(method = "try",
+                     fallback = lrn("classif.featureless",
+                                    predict_type = "prob"))
 ```
 
 The next stage is to specify a resampling strategy.
 Again we will use a 100-repeated 5-fold spatial CV\index{cross-validation!spatial CV}.
 
-```{r 12-spatial-cv-25, eval=FALSE}
+```{r 12-spatial-cv-25}
 # performance estimation level
 perf_level = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
 ```
@@ -531,7 +533,7 @@ The random selection of values C and Sigma is additionally restricted to a prede
 The range of the tuning space was chosen with values recommended in the literature [@schratz_hyperparameter_2019].
 To find the optimal hyperparameter combination, we fit 50 models (`terminator` object in the code chunk below) in each of these subfolds with randomly selected values for the hyperparameters C and Sigma.
 
-```{r 12-spatial-cv-26, eval=FALSE}
+```{r 12-spatial-cv-26, eval=TRUE}
 # five spatially disjoint partitions
 tune_level = mlr3::rsmp("spcv_coords", folds = 5)
 # define the outer limits of the randomly selected hyperparameters
@@ -546,7 +548,7 @@ tuner = mlr3tuning::tnr("random_search")
 
 The next stage is to modify the learner `lrn_ksvm` in accordance with all the characteristics defining the hyperparameter tuning with `auto_tuner()`.
 
-```{r 12-spatial-cv-27, eval=FALSE}
+```{r 12-spatial-cv-27, eval=TRUE}
 at_ksvm = mlr3tuning::auto_tuner(
   learner = lrn_ksvm,
   resampling = tune_level,
diff --git a/15-eco.Rmd b/15-eco.Rmd
index ef74c296e..af07ad5e7 100644
--- a/15-eco.Rmd
+++ b/15-eco.Rmd
@@ -507,7 +507,7 @@ Calling the `train()`-method of the `AutoTuner`-object finally runs the hyperpar
 
 ```{r 15-eco-24, eval=FALSE, cache=TRUE, cache.lazy=FALSE}
 # hyperparameter tuning
-set.seed(08012024)
+set.seed(24092024)
 autotuner_rf$train(task)
 ```
 
@@ -515,17 +515,12 @@ autotuner_rf$train(task)
 saveRDS(autotuner_rf, "extdata/15-tune.rds")
 ```
 
-```{r 15-eco-26, echo=FALSE, eval=FALSE}
+```{r 15-eco-26, echo=FALSE, cache=TRUE, cache.lazy=FALSE}
 autotuner_rf = readRDS("extdata/15-tune.rds")
 ```
-
-
-```{r tuning-result, eval=FALSE}
+```{r tuning-result, cache=TRUE, cache.lazy=FALSE}
 autotuner_rf$tuning_result
-#>    mtry sample.fraction min.node.size learner_param_vals x_domain regr.rmse
-#>
-#> 1:    4           0.878             7                                0.368
 ```
 
 ### Predictive mapping
@@ -533,7 +528,7 @@
 The tuned hyperparameters\index{hyperparameter} can now be used for the prediction.
 To do so, we only need to run the `predict` method of our fitted `AutoTuner` object.
 
-```{r 15-eco-27, cache=TRUE, cache.lazy=FALSE, warning=FALSE, eval=FALSE}
+```{r 15-eco-27, cache=TRUE, cache.lazy=FALSE, warning=FALSE}
 # predicting using the best hyperparameter combination
 autotuner_rf$predict(task)
 ```
diff --git a/extdata/15-tune.rds b/extdata/15-tune.rds
index 0662c5773..b3b592461 100644
Binary files a/extdata/15-tune.rds and b/extdata/15-tune.rds differ
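
The one behavioral change in the patch (beyond chunk options and the seed) is the switch from assigning the `Learner$fallback` field to calling the `$encapsulate()` method, which newer mlr3 releases use to set the encapsulation method and the fallback learner in a single step. A minimal standalone sketch of that pattern, assuming a recent mlr3 is installed (`classif.rpart` is only a stand-in for illustration; the patch itself configures `classif.ksvm`):

```r
library(mlr3)

# primary learner; classif.rpart stands in for the patch's classif.ksvm
learner = lrn("classif.rpart", predict_type = "prob")

# current API: the encapsulation method ("try" catches errors without
# spawning a separate session) and the fallback learner are set together;
# if the primary learner errors, the featureless baseline predicts instead
learner$encapsulate(method = "try",
                    fallback = lrn("classif.featureless",
                                   predict_type = "prob"))
```

With this in place, a failing model fit in one fold no longer aborts an entire resampling or tuning run; that fold is scored with the fallback learner's predictions instead.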