Feat/window transformer #1269

Merged
merged 55 commits into from
Nov 26, 2022
4cf16f7
first commit
adamkells Aug 26, 2022
bf05165
Add some template code
adamkells Sep 9, 2022
866cc88
remove print debugging
adamkells Sep 9, 2022
4de43a8
Merge branch 'master' into feat/window-features
hrzn Sep 12, 2022
221e4b1
Merge branch 'master' into feat/window-features
eliane-maalouf Sep 19, 2022
3a1c259
Merge branch 'master' into feat/window-features
hrzn Sep 19, 2022
3bb7554
Merge branch 'master' into feat/window-features
eliane-maalouf Sep 21, 2022
b034f0b
- ForecastingWindowTransformer implementation and tests
eliane-maalouf Oct 9, 2022
ce35be2
- added UserWarning to logging.py
eliane-maalouf Oct 9, 2022
b0cdec8
- formatting files
eliane-maalouf Oct 9, 2022
c1bf337
- cleanup and formatting
eliane-maalouf Oct 10, 2022
c7a4d37
- formatting
eliane-maalouf Oct 10, 2022
9a13883
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 10, 2022
c954627
- corrected behavior for user provided function (rolling or not rolli…
eliane-maalouf Oct 10, 2022
95dba04
- corrected lint errors
eliane-maalouf Oct 10, 2022
187412c
- corrected sorting imports
eliane-maalouf Oct 10, 2022
44b392c
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 11, 2022
c6a8d4d
Merge branch 'master' into feat/window-features
eliane-maalouf Oct 11, 2022
087b8dc
Merge branch 'feat/window-features' into feat/window_transformer
eliane-maalouf Oct 11, 2022
8bbaecc
- removed @adamkells modifications in regression_model.py after movin…
eliane-maalouf Oct 11, 2022
36f49a9
reset regression_model.py to master version
eliane-maalouf Oct 11, 2022
31af196
Update darts/dataprocessing/transformers/window_transformer.py
eliane-maalouf Oct 17, 2022
58b3c57
Update darts/dataprocessing/transformers/window_transformer.py
eliane-maalouf Oct 17, 2022
280aab2
Update darts/dataprocessing/transformers/window_transformer.py
eliane-maalouf Oct 17, 2022
85e0a04
Update darts/dataprocessing/transformers/window_transformer.py
eliane-maalouf Oct 17, 2022
6bac75f
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 17, 2022
5d3ea2d
added window_transform() function to TimeSeries class to allow direct…
eliane-maalouf Oct 21, 2022
faf7668
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 23, 2022
6312334
updated ForecastingWindowTransformer class
eliane-maalouf Oct 23, 2022
eaae122
- update untitests for window transformation from TimeSeries and from…
eliane-maalouf Oct 24, 2022
79eabc4
- updated how a target time series gets
eliane-maalouf Oct 25, 2022
d5e653f
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 28, 2022
4025bc9
Merge branch 'master' into feat/window_transformer
eliane-maalouf Oct 28, 2022
86200c1
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 3, 2022
50e6170
Notebook example // option to suppress warnings // init update
eliane-maalouf Nov 3, 2022
821b330
formatting
eliane-maalouf Nov 3, 2022
d0b8062
improve docstring for window_transform
hrzn Nov 6, 2022
712f907
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 9, 2022
aa1ac84
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 18, 2022
bc1f551
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 18, 2022
a7d232d
updated window_transform function as per review
eliane-maalouf Nov 18, 2022
ba554fc
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 20, 2022
11a2751
updated window_transformer.py and demo notebook DRAFT_window_transfor…
eliane-maalouf Nov 20, 2022
f5bdf58
updated unittests, corrected formatting
eliane-maalouf Nov 21, 2022
b4d4742
corrected formatting and documentation
eliane-maalouf Nov 21, 2022
236f249
sort imports
eliane-maalouf Nov 21, 2022
1539945
- removed previously added user warning function
eliane-maalouf Nov 24, 2022
db128be
remove incorrect import
eliane-maalouf Nov 24, 2022
e76692a
Merge branch 'master' into feat/window_transformer
hrzn Nov 25, 2022
49da756
add docstring header in window transformer
hrzn Nov 25, 2022
8c1a84f
updated as per review, removed draft notebook
eliane-maalouf Nov 25, 2022
47df61b
- import sorting
eliane-maalouf Nov 25, 2022
a778cc1
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 25, 2022
f0bbcec
Merge branch 'master' into feat/window_transformer
hrzn Nov 26, 2022
faac844
Merge branch 'master' into feat/window_transformer
eliane-maalouf Nov 26, 2022
1 change: 1 addition & 0 deletions darts/dataprocessing/transformers/__init__.py
@@ -17,3 +17,4 @@
)
from .scaler import Scaler
from .static_covariates_transformer import StaticCovariatesTransformer
from .window_transformer import WindowTransformer
155 changes: 155 additions & 0 deletions darts/dataprocessing/transformers/window_transformer.py
@@ -0,0 +1,155 @@
"""
Window Transformer
------------------
"""

from typing import Iterator, List, Optional, Sequence, Tuple, Union

from darts.dataprocessing.transformers import BaseDataTransformer
from darts.logging import get_logger
from darts.timeseries import TimeSeries
from darts.utils.utils import series2seq

logger = get_logger(__name__)


class WindowTransformer(BaseDataTransformer):
    def __init__(
        self,
        transforms: Union[dict, List[dict]],
        treat_na: Optional[Union[str, Union[int, float]]] = None,
        forecasting_safe: Optional[bool] = True,
        keep_non_transformed: Optional[bool] = False,
        name: str = "WindowTransformer",
        n_jobs: int = 1,
        verbose: bool = False,
    ):
        """
A transformer that applies window transformation to a TimeSeries or a Sequence of TimeSeries. It expects a
dictionary or a list of dictionaries specifying the window transformation(s) to be applied. All series in the
sequence will be transformed with the same transformations.

Parameters
----------
transforms
A dictionary or a list of dictionaries.
Each dictionary specifies a different window transform.

The dictionaries can contain the following keys:

:``"function"``: Mandatory. The name of one of the pandas builtin transformation functions,
or a callable function that can be applied to the input series.
Pandas' functions can be found in the
`documentation <https://pandas.pydata.org/docs/reference/window.html>`_.

:``"mode"``: Optional. The name of the pandas windowing mode on which the ``"function"`` is going to be
applied. The options are "rolling", "expanding" and "ewm".
If not provided, Darts defaults to "expanding".
User defined functions can use either "rolling" or "expanding" modes.
More information on pandas windowing operations can be found in the `documentation
<https://pandas.pydata.org/pandas-docs/stable/user_guide/window.html>`_.

:``"components"``: Optional. A string or list of strings specifying the TimeSeries components on which the
transformation should be applied. If not specified, the transformation will be
applied on all components.

All other dictionary items provided will be treated as keyword arguments for the windowing mode
(i.e., ``rolling/ewm/expanding``) or for the specific function
in that mode (i.e., ``pandas.DataFrame.rolling.mean/std/max/min...`` or
``pandas.DataFrame.ewm.mean/std/sum``).
This allows for more flexibility in configuring the transformation, by providing for
example:

* :``"window"``: Size of the moving window for the "rolling" mode.
If an integer, the fixed number of observations used for each window.
If an offset, the time period of each window.
* :``"min_periods"``: The minimum number of observations in the window required to have a value (otherwise
NaN). Darts reuses pandas defaults of 1 for "rolling" and "expanding" modes and of 0 for "ewm" mode.
* :``"win_type"``: The type of weighting to apply to the window elements.
If provided, it should be one of `scipy.signal.windows
<https://docs.scipy.org/doc/scipy/reference/signal.windows.html#module-scipy.signal.windows>`_.
* :``"center"``: ``True``/``False`` to set the observation at the current timestep at the center of the
window (when ``forecasting_safe`` is `True`, Darts enforces ``"center"`` to ``False``).
* :``"closed"``: ``"right"``/``"left"``/``"both"``/``"neither"`` to specify whether the right,
left or both ends of the window are included in the window, or neither of them.
Darts defaults to ``"both"``.

More information on the available functions and their parameters can be found in the
`Pandas documentation <https://pandas.pydata.org/docs/reference/window.html>`_.

For user-provided functions, extra keyword arguments in the transformation dictionary are passed to the
user-defined function.
By default, Darts expects user-defined functions to receive numpy arrays as input.
This can be modified by adding item ``"raw": False`` in the transformation dictionary.
It is expected that the function returns a single
value for each window. Other possible configurations can be found in the
`pandas.DataFrame.rolling().apply() documentation
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html>`_
and `pandas.DataFrame.expanding().apply() documentation
<https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.expanding.html>`_.

treat_na
Specifies how to treat missing values that were added by the window transformations
at the beginning of the resulting TimeSeries. By default, Darts will leave NaNs in the resulting TimeSeries.
This parameter can be one of the following:

* :``"dropna"``: to truncate the TimeSeries and drop time steps containing missing values.
If multiple columns contain different numbers of missing values, only the minimum number
of rows is dropped. This operation might reduce the length of the resulting TimeSeries.

* :``"bfill"`` or ``"backfill"``: to specify that NaNs should be filled with the last transformed
and valid observation. If the original TimeSeries starts with NaNs, those are kept.
When ``forecasting_safe`` is ``True``, this option raises an exception to prevent future observations
from contaminating the past.

* :an integer or float: in which case NaNs will be filled with this value.
All columns will be filled with the same provided value.

forecasting_safe
If True, Darts enforces that the resulting TimeSeries is safe to be used in forecasting models as target
or as feature. The window transformation will not allow future values to be included in the computations
at their corresponding current timestep. Default is ``True``.
"ewm" and "expanding" modes are forecasting safe by default.
"rolling" mode is forecasting safe if ``"center": False`` is guaranteed.

keep_non_transformed
``False`` to return the transformed components only, ``True`` to return all original components along
with the transformed ones. Default is ``False``.

name
A specific name for the transformer.

n_jobs
The number of jobs to run in parallel. Parallel jobs are created only when a ``Sequence[TimeSeries]`` is
passed as input to a method, parallelising operations regarding different ``TimeSeries``. Defaults to `1`.

verbose
Whether to print operations progress.
"""
        super().__init__(name, n_jobs, verbose)

        # dictionary checks are implemented in TimeSeries.window_transform()

        self.transforms = transforms
        self.keep_non_transformed = keep_non_transformed
        self.treat_na = treat_na
        self.forecasting_safe = forecasting_safe

    def _transform_iterator(
        self, series: Union[TimeSeries, Sequence[TimeSeries]]
    ) -> Iterator[Tuple]:

        series = series2seq(series)

        kwargs_dict = {
            "transforms": self.transforms,
            "keep_non_transformed": self.keep_non_transformed,
            "treat_na": self.treat_na,
            "forecasting_safe": self.forecasting_safe,
        }
        for s in series:
            yield (s, kwargs_dict)

    @staticmethod
    def ts_transform(series: TimeSeries, kwargs_dict) -> TimeSeries:
        return series.window_transform(**kwargs_dict)
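
The docstring above says that each transform dictionary holds a few reserved keys (``"function"``, ``"mode"``, ``"components"``) and that all remaining items are forwarded as keyword arguments to the windowing mode or function. The pure-Python sketch below illustrates that split and a trailing (non-centered) rolling mean, which is the forecasting-safe case. It is not the code from this PR: `split_transform_spec` and `trailing_mean` are hypothetical helpers written for illustration, they do not call pandas or Darts, and they ignore options such as ``"closed"`` and ``"win_type"``.

```python
from typing import Any, Dict, List, Optional, Tuple

# Keys the docstring reserves; every other item is treated as a keyword argument.
RESERVED_KEYS = ("function", "mode", "components")


def split_transform_spec(spec: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate reserved keys from pass-through kwargs."""
    if "function" not in spec:
        raise ValueError('each transform dictionary needs a "function" key')
    reserved = {k: spec[k] for k in RESERVED_KEYS if k in spec}
    # The docstring states that the mode defaults to "expanding" when omitted.
    reserved.setdefault("mode", "expanding")
    extra = {k: v for k, v in spec.items() if k not in RESERVED_KEYS}
    return reserved, extra


def trailing_mean(
    values: List[float], window: int, min_periods: int = 1
) -> List[Optional[float]]:
    """Mean over the current value and the (window - 1) values before it.

    No future value enters a window, which is the property that
    ``forecasting_safe`` is described as protecting. ``None`` stands in for NaN.
    """
    out: List[Optional[float]] = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1) : i + 1]
        out.append(sum(chunk) / len(chunk) if len(chunk) >= min_periods else None)
    return out


spec = {"function": "mean", "mode": "rolling", "window": 3, "min_periods": 3}
reserved, extra = split_transform_spec(spec)
result = trailing_mean([1.0, 2.0, 3.0, 4.0, 5.0], **extra)
print(reserved)  # {'function': 'mean', 'mode': 'rolling'}
print(result)    # [None, None, 2.0, 3.0, 4.0]
```

The two leading `None` entries correspond to the missing values at the start of the series that `treat_na` is meant to handle: ``"dropna"`` would truncate them, while a numeric value would fill them.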