
Add py311, remove py37 #320

Merged: 11 commits into stanfordmlgroup:master, Jan 29, 2024
Conversation

jack-mcivor
Contributor

Fixes #319

It is easiest to implement NEP-29. All current versions of numpy, scipy, scikit-learn are built for 3.8-3.11 (and do not support Python 3.7)

Pandas can be removed from the requirements, as it is only a transitive dependency.

np.bool was deprecated in NumPy 1.20.
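
For reference, the fix is mechanical: `np.bool` was a deprecated alias for the builtin `bool` (deprecated in NumPy 1.20, removed in 1.24), so the builtin is the drop-in replacement. A minimal sketch, not the actual ngboost diff:

```python
import numpy as np

# np.bool was a deprecated alias for the builtin bool (removed in NumPy 1.24);
# use bool (or np.bool_) wherever np.bool appeared as a dtype.
flag = np.zeros(3, dtype=bool)  # instead of dtype=np.bool
assert flag.dtype == np.bool_
```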

@jack-mcivor
Contributor Author

I could do with a bit of help on this one. Two tests failing:

============================================================================================== short test summary info ==============================================================================================
FAILED tests/test_distns.py::test_dists_runs_on_examples_logscore[learner0-T] - ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
FAILED tests/test_distns.py::test_dists_runs_on_examples_logscore[learner1-T] - ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

It looks like this is caused by an upgrade of scipy, although I'm not clear why.

  • scipy==1.7.3 (passes)
  • scipy==1.8.1 (passes)
  • scipy==1.9.3 (fails)
  • scipy==1.10.1 (fails)

It doesn't seem to matter which versions of numpy and scikit-learn are used. The failure mode varies with the random seed, but it looks like it comes down to failures in calculating the gradient (either a singular-matrix error or NaNs in the result).

There are in general quite a few errors when running the tests, not sure what to make of them.

@tim-x-y-z

It doesn't seem to matter which versions of numpy and scikit-learn are used. The failure mode varies with the random seed, but it looks like it comes down to failures in calculating the gradient (either a singular-matrix error or NaNs in the result).

I've had that issue for a while with the multivariate normal distribution. In my case I turned natural_gradient off and the issue seemed to go away. Not sure if it's relevant at all, but I thought I'd share.

I can try to have a deeper look later today !

@ryan-wolbeck
Collaborator

I agree with removing support for 3.7, as it's end of life for Python as well. I'm working to get through some other PRs, and I'll help figure out the failures here. Thanks for the PR.

@ryan-wolbeck
Collaborator

ryan-wolbeck commented Sep 8, 2023

@jack-mcivor can you re-run checks here? Logs have expired, would like to see the errors. Thanks

@jack-mcivor
Contributor Author

@ryan-wolbeck those logs should be available now

@ryan-wolbeck
Collaborator

ryan-wolbeck commented Oct 18, 2023

Here's what I'm working on testing here for an update:

  • The gradient computation for the T distribution with LogScore is leading to a singular matrix error. I'm testing a change to add a regularization term to the diagonal of the matrix, ensuring it's non-singular.
  • The target values (y) contain NaN values in the tests. Working on a change to handle the null values, something like:

# Check for NaN values in y_train and drop the corresponding rows
nan_mask = np.isnan(y_train)
if np.any(nan_mask):
    X_train = X_train[~nan_mask]
    y_train = y_train[~nan_mask]

ngb.fit(X_train, y_train)
  • I'm also looking at the runtime overflow warnings, testing numerically stable functions to avoid instability. By constraining the input to the exponential function we can keep the output within a range that float64 can represent, making the computation "stable" in the sense that it avoids overflow.
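
The diagonal-regularization idea from the first bullet could look something like this; `solve_with_ridge` and the `eps` value are illustrative, not ngboost code:

```python
import numpy as np

def solve_with_ridge(A, b, eps=1e-8):
    """Solve A x = b, nudging the diagonal so a singular A still factors."""
    return np.linalg.solve(A + eps * np.eye(A.shape[0]), b)

# A rank-deficient matrix that makes plain np.linalg.solve raise LinAlgError
A = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 1.0])
x = solve_with_ridge(A, b)
assert np.all(np.isfinite(x))
```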

I'm planning to work through these updates this week and will hopefully push some code here before the end of the week.
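
The overflow point above can be sketched as a clipped exponential. This is a common trick, not necessarily what ngboost will use; the bound comes from float64's exp overflow threshold near 709:

```python
import numpy as np

def stable_exp(x, bound=700.0):
    # np.exp overflows float64 just above x ~ 709.78, so clip inputs first
    return np.exp(np.clip(x, -bound, bound))

vals = stable_exp(np.array([-1000.0, 0.0, 1000.0]))
assert np.all(np.isfinite(vals))
```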

@ryan-wolbeck
Collaborator

@jack-mcivor I've been working through tests of the SciPy and supported Python versions at https://github.com/stanfordmlgroup/ngboost/tree/pr-320, but I've not been successful: even pushing SciPy beyond 1.8.1 (which requires Python > 3.8), I still run into the same issues and seem to be stuck.

In order to keep moving forward, I think we should create a new PR that just fixes the np.bool issue, so we can push a usability fix first. Once that is merged into main, we can then work towards this removal of 3.7 (and maybe 3.8, though about 15% of downloads still come from 3.8). I'm worried that in this PR we are going to struggle to find the root cause, though I think you are likely right that it's a SciPy issue.

Would you mind creating another PR just to address the np.bool issue since it should already be within the code you added here? If you don't have time I can also work towards splitting it out myself as well.

@DumasCharles

Hello,

Thank you for your work on NGBoost, and answering my comment on #336 😊

As mentioned in this issue, the bug with np.bool prevents fitting a model with an Exponential distribution, which can be a major blocker... As you suggested, would it be possible to open a PR specific to this bug?
I'm happy to create the PR if that helps.

@jack-mcivor
Contributor Author

#337 is to fix the bool deprecation

@jack-mcivor
Contributor Author

@ryan-wolbeck it's been a while since I looked at this, but is there anything else we can do?

I think the failing test is not specific to Python versions at all, but to scipy. Under the current constraints, a user could install scipy>=1.9 and get failing tests. So I think we should either xfail this test or, if it is a real problem, constrain scipy<1.9.

What do you think?
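
An xfail along these lines is what I have in mind. This is a sketch with a stand-in test body, not the actual parametrization in tests/test_distns.py:

```python
import numpy as np
import pytest

# Sketch: expect the T/LogScore case to fail with a singular Fisher matrix
# under scipy>=1.9, without failing the whole suite.
@pytest.mark.xfail(
    reason="singular matrix in T-distribution gradient with scipy>=1.9",
    raises=np.linalg.LinAlgError,
)
def test_t_logscore_stub():
    # stand-in for fitting NGBRegressor(Dist=T, Score=LogScore)
    np.linalg.solve(np.ones((2, 2)), np.ones(2))
```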

@ryan-wolbeck
Collaborator

@jack-mcivor let's merge the latest changes from master so we can build the tests faster. From there I'll pull this again and test out the versions to dig in further.

@jack-mcivor
Contributor Author

@ryan-wolbeck done, still seems to exhibit the same behaviour. A minimal & fast reproducer is

pytest 'tests/test_distns.py::test_dists_runs_on_examples_logscore[learner0-T]'

@ryan-wolbeck
Collaborator

@jack-mcivor

The error message indicates that a numpy.linalg.LinAlgError: Singular matrix error occurred. This error is raised when a function in the numpy.linalg module, such as numpy.linalg.solve, is applied to a singular (or non-invertible) matrix.

In this case, the error occurred in the grad method of the Score class in ngboost/scores.py. This method tries to solve a linear system using the numpy.linalg.solve function, which fails if the input matrix is singular.

The test_dists_runs_on_examples_logscore test is failing because it's trying to fit an NGBRegressor with a T distribution, which seems to be causing the singular matrix error.

I think you are right that we should just allow the T case to fail here (skip/xfail the test). I tested an implementation at https://github.com/stanfordmlgroup/ngboost/tree/pr-320-2 and got things to build. I did play around with the versions a bit as well, but I don't think that's actually necessary: the xfail alone allows the failure to occur due to the singular-matrix error.

Can you push a similar implementation here?

@alejandroschuler do you have any concerns with us allowing this case to fail in order to support 3.11?

@alejandroschuler
Collaborator

@ryan-wolbeck sure let's let that test fail, nbd

@jack-mcivor
Contributor Author

@ryan-wolbeck thanks, that should be ready now

@ryan-wolbeck ryan-wolbeck merged commit c482aab into stanfordmlgroup:master Jan 29, 2024
4 checks passed
Linked issue: Python 3.11 support (#319). 5 participants.