
Commit

Revert ml stuff (#1124)
* Revert "eval=FALSE in c15"

This reverts commit 80872c3.

* Revert "store_backends"

This reverts commit e6f74bd.

* Revert "Remove cache from chunk 15-eco-26"

This reverts commit 7217843.

* Revert "Try evaluating train function"

This reverts commit 6302bc7.

* Revert "Update 15-tune.rds, fix actions (hopefully)"

This reverts commit 737cbfc.

* Revert "Do not evaluate failing mlr3::lrn chunk (#1111 hotfix)"

This reverts commit b402780.

* set eval=TRUE and save 15-tune.rds anew

* use new method of specifying a fallback learner (breaking change, see https://github.com/mlr-org/mlr3/releases/tag/v0.21.0)

* rebuild the autotuner
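
The breaking change referenced above can be illustrated as follows. Since **mlr3** v0.21.0, assigning a fallback learner via the `$fallback` field was replaced by the `$encapsulate()` method, which sets the encapsulation method and the fallback learner in one call. A minimal sketch (the learner choice here is illustrative, not part of the commit):

```r
library(mlr3)

lrn_rpart = lrn("classif.rpart", predict_type = "prob")

# old API (mlr3 < 0.21.0), no longer available:
# lrn_rpart$fallback = lrn("classif.featureless", predict_type = "prob")

# new API (mlr3 >= 0.21.0): encapsulation method and fallback learner
# are now set together via $encapsulate()
lrn_rpart$encapsulate(
  method = "try",  # wrap train/predict in try() so a failing model does not abort
  fallback = lrn("classif.featureless", predict_type = "prob")
)
```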
jannes-m committed Sep 25, 2024
1 parent 5e2dd84 commit bfb40b7
Showing 3 changed files with 15 additions and 18 deletions.
20 changes: 11 additions & 9 deletions 12-spatial-cv.Rmd
@@ -297,7 +297,7 @@
For spatial CV, we need to provide a few extra arguments.
The `coordinate_names` argument expects the names of the coordinate columns (see Section \@ref(intro-cv) and Figure \@ref(fig:partitioning)).
Additionally, we should indicate the CRS used (`crs`) and decide whether we want to use the coordinates as predictors in the modeling (`coords_as_features`).

-```{r 12-spatial-cv-11, eval=FALSE}
+```{r 12-spatial-cv-11, eval=TRUE}
# 1. create task
task = mlr3spatiotempcv::as_task_classif_st(
mlr3::as_data_backend(lsl),
@@ -358,7 +358,7 @@
This yields all learners able to model two-class problems (landslide yes or no).
We opt for the binomial classification\index{classification} method used in Section \@ref(conventional-model) and implemented as `classif.log_reg` in **mlr3learners**.
Additionally, we need to specify the `predict_type`, which determines the type of the prediction: `prob` results in the predicted probability of landslide occurrence between 0 and 1 (this corresponds to `type = "response"` in `predict.glm()`).

-```{r 12-spatial-cv-13, eval=FALSE}
+```{r 12-spatial-cv-13, eval=TRUE}
# 2. specify learner
learner = mlr3::lrn("classif.log_reg", predict_type = "prob")
```
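
The correspondence between `predict_type = "prob"` and `type = "response"` noted above can be sketched as follows. This is a minimal illustration assuming the `lsl` data frame (with a logical `lslpts` response) and the `task` and `learner` objects created above; the predictor names are illustrative:

```r
# base R: a binomial GLM returns probabilities when type = "response"
fit = glm(lslpts ~ slope + cplan, family = binomial(), data = lsl)
head(predict(fit, type = "response"))  # values between 0 and 1

# mlr3: the same kind of output via predict_type = "prob"
learner$train(task)                    # returns the learner invisibly
pred = learner$predict(task)
head(pred$prob[, "TRUE"])              # predicted landslide probabilities
```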
@@ -402,7 +402,7 @@
We will use a 100-repeated 5-fold spatial CV\index{cross-validation!spatial CV}:
In the meantime, its functionality was integrated into the **mlr3** ecosystem, which is why we are using **mlr3** [@schratz_hyperparameter_2019]. The **tidymodels** framework is another umbrella package for streamlined modeling in R; however, it only recently integrated support for spatial cross-validation via **spatialsample**, which so far supports only one spatial resampling method.


-```{r 12-spatial-cv-18, eval=FALSE}
+```{r 12-spatial-cv-18, eval=TRUE}
# 3. specify resampling
resampling = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
```
@@ -489,7 +489,7 @@
Before defining spatial tuning, we will set up the **mlr3**\index{mlr3 (package)} building blocks.
The classification\index{classification} task remains the same, hence we can simply reuse the `task` object created in Section \@ref(glm).
Learners implementing SVM can be found using the `list_mlr3learners()` command of the **mlr3extralearners** package.

-```{r 12-spatial-cv-23, eval=FALSE, echo=FALSE}
+```{r 12-spatial-cv-23, eval=TRUE, echo=TRUE}
mlr3_learners = mlr3extralearners::list_mlr3learners()
mlr3_learners |>
dplyr::filter(class == "classif" & grepl("svm", id)) |>
@@ -501,16 +501,18 @@
To allow for non-linear relationships, we use the popular radial basis function (RBF) kernel.
Setting the `type` argument to `"C-svc"` makes sure that `ksvm()` is solving a classification task.
To make sure that the tuning does not stop because of one failing model, we additionally define a fallback learner (for more information please refer to https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html#sec-fallback).

-```{r 12-spatial-cv-24, eval=FALSE}
+```{r 12-spatial-cv-24}
lrn_ksvm = mlr3::lrn("classif.ksvm", predict_type = "prob", kernel = "rbfdot",
  type = "C-svc")
-lrn_ksvm$fallback = lrn("classif.featureless", predict_type = "prob")
+lrn_ksvm$encapsulate(method = "try",
+                     fallback = lrn("classif.featureless",
+                                    predict_type = "prob"))
```

The next stage is to specify a resampling strategy.
Again we will use a 100-repeated 5-fold spatial CV\index{cross-validation!spatial CV}.

-```{r 12-spatial-cv-25, eval=FALSE}
+```{r 12-spatial-cv-25}
# performance estimation level
perf_level = mlr3::rsmp("repeated_spcv_coords", folds = 5, repeats = 100)
```
@@ -531,7 +533,7 @@
The random selection of values C and Sigma is additionally restricted to a predefined tuning space.
The range of the tuning space was chosen with values recommended in the literature [@schratz_hyperparameter_2019].
To find the optimal hyperparameter combination, we fit 50 models (`terminator` object in the code chunk below) in each of these subfolds with randomly selected values for the hyperparameters C and Sigma.

-```{r 12-spatial-cv-26, eval=FALSE}
+```{r 12-spatial-cv-26, eval=TRUE}
# five spatially disjoint partitions
tune_level = mlr3::rsmp("spcv_coords", folds = 5)
# define the outer limits of the randomly selected hyperparameters
@@ -546,7 +548,7 @@
tuner = mlr3tuning::tnr("random_search")

The next stage is to modify the learner `lrn_ksvm` in accordance with all the characteristics defining the hyperparameter tuning, using `auto_tuner()`.

-```{r 12-spatial-cv-27, eval=FALSE}
+```{r 12-spatial-cv-27, eval=TRUE}
at_ksvm = mlr3tuning::auto_tuner(
learner = lrn_ksvm,
resampling = tune_level,
13 changes: 4 additions & 9 deletions 15-eco.Rmd
@@ -507,33 +507,28 @@
Calling the `train()`-method of the `AutoTuner`-object finally runs the hyperparameter tuning.

```{r 15-eco-24, eval=FALSE, cache=TRUE, cache.lazy=FALSE}
# hyperparameter tuning
-set.seed(08012024)
+set.seed(24092024)
autotuner_rf$train(task)
```

```{r 15-eco-25, cache=TRUE, cache.lazy=FALSE, eval=FALSE, echo=FALSE}
saveRDS(autotuner_rf, "extdata/15-tune.rds")
```

-```{r 15-eco-26, echo=FALSE, eval=FALSE}
+```{r 15-eco-26, echo=FALSE, cache=TRUE, cache.lazy=FALSE}
autotuner_rf = readRDS("extdata/15-tune.rds")
```

-<!-- TODO: evaluate this when issue fixed upstream -->

-```{r tuning-result, eval=FALSE}
+```{r tuning-result, cache=TRUE, cache.lazy=FALSE}
autotuner_rf$tuning_result
#> mtry sample.fraction min.node.size learner_param_vals x_domain regr.rmse
#> <int> <num> <int> <list> <list> <num>
#> 1: 4 0.878 7 <list[4]> <list[3]> 0.368
```

### Predictive mapping

The tuned hyperparameters\index{hyperparameter} can now be used for prediction.
To do so, we only need to run the `predict` method of our fitted `AutoTuner` object.

-```{r 15-eco-27, cache=TRUE, cache.lazy=FALSE, warning=FALSE, eval=FALSE}
+```{r 15-eco-27, cache=TRUE, cache.lazy=FALSE, warning=FALSE}
# predicting using the best hyperparameter combination
autotuner_rf$predict(task)
```
Binary file modified extdata/15-tune.rds
Binary file not shown.
