
Feat/output chunk length regression model #761

Merged · 30 commits · Feb 5, 2022

Conversation

brunnedu
Contributor

Addresses #618

Summary

RegressionModels:

  • Implemented an output_chunk_length parameter for all RegressionModels, allowing much more flexible prediction with less covariate data
  • Deleted now-unused functions (_get_prediction_data, _shift_matrices, _update_min_max)
  • Turned lags into a dictionary to simplify the code
  • Updated the check of covariate input dimensions
  • Updated the checks for sufficient covariate data (the requirements are now minimal and can differ between covariate types)
  • Wrap the model with MultiOutputRegressor only when necessary, i.e. when the model does not support multi-output regression natively
  • No longer relies on torch datasets
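The core idea behind lags plus output_chunk_length can be sketched as follows. This is a simplified illustration, not the darts implementation: make_training_matrix, the fixed lag list, and the toy series are all made up for the example.

```python
import numpy as np

def make_training_matrix(series, lags, output_chunk_length):
    # builds (X, Y): each row of X holds the lagged past values of the
    # series, and the matching row of Y holds the next
    # `output_chunk_length` values, which the model predicts jointly
    max_lag = max(lags)
    X, Y = [], []
    for t in range(max_lag, len(series) - output_chunk_length + 1):
        X.append([series[t - lag] for lag in lags])
        Y.append(series[t : t + output_chunk_length])
    return np.array(X), np.array(Y)

series = np.arange(10.0)
X, Y = make_training_matrix(series, lags=[1, 2, 3], output_chunk_length=2)
print(X.shape, Y.shape)  # (6, 3) (6, 2)
```

With output_chunk_length > 1 the target Y has several columns, which is why a single-output estimator then needs a multi-output wrapper.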

RegressionModel Tests:

  • Deleted now unused tests (test_models_denoising_multi_input, test_models_denoising, test_shift_matrices)
  • Added new tests (test_models_accuracy_*, test_multioutput_wrapper)
  • Added checks for output_chunk_length=1 and output_chunk_length=5 in some already existing tests
  • Updated test_not_enough_covariates to check that only the minimal amount of covariate data is required

Other Information

  • Changed the __str__ of LinearRegressionModel, RandomForest and LightGBMModel
  • Should we consider deleting LinearRegressionModel, since it behaves the same as RegressionModel(model=None)?

codecov-commenter commented Jan 31, 2022

Codecov Report

Merging #761 (770b63a) into master (c2d91e0) will decrease coverage by 0.10%.
The diff coverage is 97.40%.


@@            Coverage Diff             @@
##           master     #761      +/-   ##
==========================================
- Coverage   90.77%   90.67%   -0.11%     
==========================================
  Files          67       67              
  Lines        6756     6711      -45     
==========================================
- Hits         6133     6085      -48     
- Misses        623      626       +3     
Impacted Files Coverage Δ
darts/models/forecasting/random_forest.py 100.00% <ø> (ø)
darts/models/forecasting/regression_model.py 96.61% <97.26%> (-1.14%) ⬇️
darts/models/forecasting/gradient_boosted_model.py 100.00% <100.00%> (ø)
...arts/models/forecasting/linear_regression_model.py 100.00% <100.00%> (ø)
...ts/models/forecasting/regression_ensemble_model.py 100.00% <100.00%> (ø)
darts/utils/data/inference_dataset.py 94.54% <0.00%> (-1.82%) ⬇️


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@hrzn (Contributor) left a comment:

This is really awesome work @brunnedu, thanks!
You are bringing our RegressionModels to the next step! 🚀

Snippet under review:

    f"`lags_future_covariates`: {regression_model.lags_future_covariates}."
    ),
    # check lags of the regression model
    raise_if_not(
👍

Snippet under review:

    and self.model._get_tags().get("multioutput")
    ):
        # if not, wrap model with MultiOutputRegressor
        self.model = MultiOutputRegressor(self.model, n_jobs=1)

I wonder whether there could be some cases where n_jobs > 1 might provide significant gains...
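For reference, the n_jobs knob in question can be exercised in isolation like this. This is a standalone sklearn sketch, not darts code; the random data, the target width of 6, and the estimator settings are made up for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 8))
Y = rng.normal(size=(200, 6))  # e.g. one row of 6 targets per sample, as with output_chunk_length=6

# MultiOutputRegressor fits one clone of the base estimator per output
# column; n_jobs=-1 fits them in parallel across all cores, which can
# pay off when the base estimator is expensive to train
model = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=20), n_jobs=-1)
model.fit(X, Y)
print(model.predict(X[:3]).shape)  # (3, 6)
```

Whether parallelism helps in practice depends on the base estimator's training cost relative to the process-spawning overhead.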

@brunnedu (Contributor, Author) replied:

right, I will test it out 👍

Snippet under review:

    input_chunk_length=max(0, -self.min_lag),
    output_chunk_length=1,
    )
    if cov.has_datetime_index:

Ouch, it hurts that we have to differentiate the two cases here 😱

@brunnedu (Contributor, Author) replied:

Yes, maybe it would be nice to have a slice_inclusive in TimeSeries that always includes the start and the stop for both datetime and integer indices? 🤔
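The suggested helper could look roughly like this on plain pandas objects. This is a hypothetical sketch, not the TimeSeries API; slice_inclusive is only the name proposed above.

```python
import pandas as pd

def slice_inclusive(series: pd.Series, start, stop):
    # hypothetical helper: label-based, end-inclusive slicing that behaves
    # the same for datetime and integer indices (pandas .loc is already
    # end-inclusive for label slices, so this just normalizes the call site)
    return series.loc[start:stop]

s_int = pd.Series(range(5), index=range(5))
s_dt = pd.Series(range(5), index=pd.date_range("2022-01-01", periods=5))
print(len(slice_inclusive(s_int, 1, 3)))                       # 3
print(len(slice_inclusive(s_dt, "2022-01-02", "2022-01-04")))  # 3
```

The underlying friction is that positional slicing on an integer index is end-exclusive while label-based slicing is end-inclusive, so a single inclusive entry point would hide that asymmetry.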

@hrzn replied:

Yes maybe. I think our slicing is at least consistent with Pandas, but it's not the first time we find this a bit frustrating.

@brunnedu brunnedu merged commit 0e5b5ad into master Feb 5, 2022
@madtoinou madtoinou deleted the feat/output-chunk-length-regression-model branch July 5, 2023 21:52