Fix/split at when split_point is not in series #1415

Merged
merged 3 commits into from
Dec 16, 2022

Contributor

@DavidKleindienst DavidKleindienst commented Dec 7, 2022

Fixes #1312

Summary

This PR changes the behavior of TimeSeries._split_at, and therefore also of split_before and split_after, when split_point is a Timestamp that is not in the index of the TimeSeries but lies between two indices.
It also adds a corresponding test case to the `TimeSeriesTestCase.test_split` unit test.

Let's say we have a TimeSeries indexed by dates 2020-01-01, 2020-01-03 and 2020-01-05:

Current behavior:
TimeSeries._split_at(2020-01-02, after=True) produces two TimeSeries with indices [2020-01-01, 2020-01-03] for the first and [2020-01-05] for the second.
TimeSeries._split_at(2020-01-02, after=False) tries to produce two TimeSeries with indices [] for the first and [2020-01-01, 2020-01-03, 2020-01-05] for the second, but raises an error because the first one is empty.

Proposed behavior:
TimeSeries._split_at(2020-01-02) produces two TimeSeries with indices [2020-01-01] for the first and [2020-01-03, 2020-01-05] for the second, irrespective of whether after is True or False.

I believe this is the behavior most people would expect when splitting a TimeSeries at a point that lies between two dates.

This PR does not change the behavior of TimeSeries splits when split_point is not a Timestamp, or split_point is a Timestamp contained in the series.
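
To make the proposed semantics concrete, here is a minimal sketch on a plain sorted list of dates. It is not the darts implementation: `split_at` is a hypothetical helper written only for illustration, and it uses the standard-library `bisect` module instead of a TimeSeries. It assumes the rule described above: a split point contained in the index stays in the first part when after=True and goes to the second part when after=False, while a split point between two indices gives the same result either way.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def split_at(index, split_point, after=True):
    """Split a sorted list of dates into two lists around split_point.

    Hypothetical helper for illustration only (not part of darts).
    - split_point in index: it ends the first part if after=True,
      and starts the second part if after=False.
    - split_point between two entries: both bisect variants return the
      same position, so the result does not depend on `after`.
    """
    if after:
        cut = bisect_right(index, split_point)
    else:
        cut = bisect_left(index, split_point)
    return index[:cut], index[cut:]


index = [date(2020, 1, 1), date(2020, 1, 3), date(2020, 1, 5)]

# Split point between two entries: same result for both values of `after`.
between_after = split_at(index, date(2020, 1, 2), after=True)
between_before = split_at(index, date(2020, 1, 2), after=False)

# Split point contained in the index: `after` decides which side keeps it.
on_index_after = split_at(index, date(2020, 1, 3), after=True)
on_index_before = split_at(index, date(2020, 1, 3), after=False)
```

Here `between_after` and `between_before` are both ([2020-01-01], [2020-01-03, 2020-01-05]), matching the proposed behavior, while the two on-index calls differ only in which part receives 2020-01-03.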

Other Information

@DavidKleindienst DavidKleindienst changed the title Fix split at when split_point is not in series Fix/split at when split_point is not in series Dec 7, 2022

codecov-commenter commented Dec 8, 2022

Codecov Report

Base: 93.68% // Head: 93.68% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (5f49ed2) compared to base (bcf9abf).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1415   +/-   ##
=======================================
  Coverage   93.68%   93.68%           
=======================================
  Files          94       94           
  Lines        9408     9395   -13     
=======================================
- Hits         8814     8802   -12     
+ Misses        594      593    -1     
Impacted Files                                          Coverage Δ
darts/timeseries.py                                     91.85% <100.00%> (+0.02%) ⬆️
...arts/models/forecasting/torch_forecasting_model.py   89.50% <0.00%> (-0.05%) ⬇️
darts/models/forecasting/block_rnn_model.py             98.24% <0.00%> (-0.04%) ⬇️
darts/models/forecasting/nhits.py                       99.27% <0.00%> (-0.01%) ⬇️
darts/datasets/__init__.py                              100.00% <0.00%> (ø)


Contributor

@hrzn hrzn left a comment

Thanks @DavidKleindienst, this makes sense to me.

@hrzn hrzn merged commit 3e5b5af into unit8co:master Dec 16, 2022
@DavidKleindienst DavidKleindienst deleted the Fix/timeseries_split_(#1312) branch February 20, 2023 12:12
Development

Successfully merging this pull request may close these issues.

[BUG] Unexpected split_before/split_after result when split_point is between two time indices
3 participants