⚠️ Nightly upstream-dev CI failed ⚠️ #8844

Closed
github-actions bot opened this issue Mar 16, 2024 · 21 comments · Fixed by #8946
Labels: CI Continuous Integration tools

Comments

github-actions bot (Contributor) commented Mar 16, 2024

Workflow Run URL

Python 3.12 Test Summary
xarray/tests/test_duck_array_ops.py::TestOps::test_where_type_promotion: AssertionError: assert dtype('float64') == <class 'numpy.float32'>
 +  where dtype('float64') = array([ 1., nan]).dtype
 +  and   <class 'numpy.float32'> = np.float32
xarray/tests/test_duck_array_ops.py::TestDaskOps::test_where_type_promotion: AssertionError: assert dtype('float64') == <class 'numpy.float32'>
 +  where dtype('float64') = array([ 1., nan]).dtype
 +  and   <class 'numpy.float32'> = np.float32
xarray/tests/test_rolling.py::TestDataArrayRolling::test_rolling_dask_dtype[float32]: AssertionError: assert dtype('float64') == dtype('float32')
 +  where dtype('float64') = <xarray.DataArray (x: 3)> Size: 24B\ndask.array<truediv, shape=(3,), dtype=float64, chunksize=(3,), chunktype=numpy.ndarray>\nCoordinates:\n  * x        (x) int64 24B 1 2 3.dtype
 +  and   dtype('float32') = <xarray.DataArray (x: 3)> Size: 12B\narray([1. , 1.5, 2. ], dtype=float32)\nCoordinates:\n  * x        (x) int64 24B 1 2 3.dtype
github-actions bot added the CI Continuous Integration tools label Mar 16, 2024
dcherian added a commit to dcherian/xarray that referenced this issue Mar 16, 2024
dcherian (Contributor) commented Mar 19, 2024

To avoid the great scientific python apocalypse of April 2024 (numpy 2 & pandas 3!), we'll need to fix the following. I'm cc'ing people hoping that many of these are easy and obvious fixes :)

  1. A number of array API failures: (cc @keewis, @TomNicholas)
xarray/tests/test_array_api.py::test_arithmetic: AttributeError: '_DType' object has no attribute 'type'
xarray/tests/test_namedarray.py::TestNamedArray::test_permute_dims[dims0-expected_sizes0]: ModuleNotFoundError: No module named 'numpy.array_api'
  2. Some casting errors in the coding pipeline: (cc @kmuehlbauer)
xarray/tests/test_backends.py::TestScipyFileObject::test_roundtrip_mask_and_scale[dtype1-create_masked_and_scaled_data-create_encoded_masked_and_scaled_data]: ValueError: Unable to avoid copy while creating an array from given array.
  3. Some copying errors in the coding pipeline (cc @kmuehlbauer)
xarray/tests/test_backends.py::TestScipyFileObject::test_roundtrip_test_data: ValueError: Failed to decode variable 'time': Unable to avoid copy while creating an array from given array.
  4. I bet a return value from pandas has changed to a scalar, leading to a lot of interpolation failures (upstream-dev CI: Fix interp and cumtrapz #8861)
xarray/tests/test_interp.py::test_interpolate_chunk_1d[1-1-0-True-linear]: ValueError: dimensions () must have the same length as the number of data dimensions, ndim=1
  5. Some datetime / timedelta casting errors: (cc @spencerkclark)
xarray/tests/test_backends.py::test_use_cftime_false_standard_calendar_in_range[gregorian]: pandas._libs.tslibs.np_datetime.OutOfBoundsTimedelta: Cannot cast 0 from D to 'ns' without overflow.
xarray/tests/test_backends.py::test_use_cftime_false_standard_calendar_in_range[proleptic_gregorian]: pandas._libs.tslibs.np_datetime.OutOfBoundsTimedelta: Cannot cast 0 from D to 'ns' without overflow.
xarray/tests/test_backends.py::test_use_cftime_false_standard_calendar_in_range[standard]: pandas._libs.tslibs.np_datetime.OutOfBoundsTimedelta: Cannot cast 0 from D to 'ns' without overflow.
  6. Some errors from pandas.MultiIndex.names now returning a tuple and not a list (pandas 3 MultiIndex fixes #8847); a minimal sketch follows this list.
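For illustration, a minimal sketch of item 6 (plain pandas, not xarray code; the tuple behaviour is as reported above and assumes a pandas 3 dev build):

```python
# MultiIndex.names: a list-like FrozenList on pandas<=2.x, reportedly a plain
# tuple on pandas 3 dev builds, so code relying on list behaviour needs adjusting.
import pandas as pd

midx = pd.MultiIndex.from_arrays([[1, 2], ["a", "b"]], names=["x", "y"])
names = midx.names
print(type(names))   # FrozenList on pandas<=2.x; tuple on pandas 3 dev (per the report)
print(list(names))   # converting explicitly works either way
```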

@kmuehlbauer (Contributor)

3. Some copying errors in the coding pipeline (cc @kmuehlbauer)

#8851

kmuehlbauer (Contributor) commented Mar 19, 2024

2. Some casting errors in the coding pipeline: (cc @kmuehlbauer )

#8852

@spencerkclark (Member)

5. Some datetime / timedelta casting errors: (cc @spencerkclark )

This is still #8623 (comment) — I'll try and look into pandas-dev/pandas#56996 some more this weekend (and at the very least will ping it again).

keewis (Collaborator) commented Apr 10, 2024

In addition to the string dtype failures (the old ones, U and S): numpy/numpy#26270

xarray/tests/test_accessor_str.py::test_case_str: AssertionError: assert dtype('<U26') == dtype('<U30')

we've also got a couple of failures related to TimedeltaIndex (#8938)

xarray/tests/test_missing.py::test_scipy_methods_function[barycentric]: TypeError: TimedeltaIndex.__new__() got an unexpected keyword argument 'unit'

As far as I can tell, that parameter has been renamed to freq?

keewis (Collaborator) commented Apr 10, 2024

we also have a failing strategy test (hidden behind the numpy.array_api change):

FAILED xarray/tests/test_strategies.py::TestVariablesStrategy::test_make_strategies_namespace - AssertionError: np.float32(-1.1754944e-38) of type <class 'numpy.float32'>

not sure if that's us or upstream in hypothesis (cc @Zac-HD). For context, this is using numpy>=2.0 from the scientific-python nightly wheels repository (see SPEC4 for more info on that). With that version of numpy, scalar objects appear to not be considered float values anymore: isinstance(np.float32(-1.1754944e-38), float) == False

Edit: or at least, all but float64 on my system... I assume that depends on the OS?

Zac-HD (Contributor) commented Apr 11, 2024

Yeah, that looks like Hypothesis needs some updates for compatibility - issue opened, we'll get to it... sometime, because volunteers 😅. FWIW I don't think it'll be OS-dependent, CPython float is 64-bit on all platforms.

seberg commented Apr 12, 2024

Just randomly coming here. The way scalars are considered a float/not a float should not have changed. However, promotion would have changed, so previously:

float32(3) + 0.0

for example would have returned a float64 (which is a float subclass).

If that doesn't make it easy to find, and you can narrow down a bit where it happens, you could try wrapping the suspect code by setting np._set_promotion_state('weak_and_warn') and then np._set_promotion_state('weak') again to undo it.
That will hopefully give you a warning from the place where the promotion changed; unfortunately, there will likely be a lot of unhelpful warnings/noise (so it would be good to apply it in a very targeted way, I think).


Please ping me if you get stuck tracking things down. I hope this comment is helpful, but I can also try to spend some time looking at it.
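A minimal sketch of that suggestion; note that np._set_promotion_state is a private numpy API from the NEP 50 transition period (present in numpy 1.25/1.26, and it may not be available in builds where weak promotion is already the default), so treat it purely as a temporary debugging aid:

```python
# Temporarily switch to "weak_and_warn" promotion around a suspect block so numpy
# warns at the spot where the promoted dtype differs from the old behaviour, then
# switch back. Keep the wrapped block small to limit the warning noise.
import numpy as np

np._set_promotion_state("weak_and_warn")
try:
    result = np.float32(3) + 0.0   # example: an operation whose result dtype changed
finally:
    np._set_promotion_state("weak")

print(result.dtype)
```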

@kmuehlbauer (Contributor)

In addition to the string dtype failures (the old ones, U and S):

xarray/tests/test_accessor_str.py::test_case_str: AssertionError: assert dtype('<U26') == dtype('<U30')

@keewis If you find the time, please have a look into #8932. I think I've identified the problem, but have no idea why this happens only for numpy 2 (I did not have a thorough look there).

keewis (Collaborator) commented Apr 13, 2024

@dcherian, did we decide what to do with the dtype casting / promotion issues?

@dcherian (Contributor)

I haven't looked at them yet and probably won't have time for a day at least

@spencerkclark (Member)

  1. Some datetime / timedelta casting errors: (cc @spencerkclark)

Things are a bit stuck at the moment on pandas-dev/pandas#56996 / pandas-dev/pandas#57984, so I may just xfail this for the time being (it is an upstream issue anyway).

keewis (Collaborator) commented May 2, 2024

We have another set of datetime failures. From what I can tell, pandas changed behavior for this:

pd.date_range("2001", "2001", freq="-1ME", inclusive="both")

where on pandas=2.2 this would return a DatetimeIndex containing just 2001-01-31, but on pandas=3.0 this returns an empty DatetimeIndex (in general, one entry fewer than what we're expecting).

As far as I can tell, this is intentional (see pandas-dev/pandas#56832 for the PR that changed it). Should we adapt cftime_range, or is pandas' new behavior too restrictive and we should raise an issue?
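For reference, a minimal reproduction of the difference (the commented outputs are as described above, not independently re-verified against every build):

```python
# date_range with a negative frequency and inclusive="both": pandas 2.2 returned a
# single-element DatetimeIndex, while pandas 3.0 dev builds return an empty one.
import pandas as pd

idx = pd.date_range("2001", "2001", freq="-1ME", inclusive="both")
print(idx)
# pandas 2.2:      DatetimeIndex(['2001-01-31'], dtype='datetime64[ns]', freq='-1ME')
# pandas 3.0 dev:  DatetimeIndex([], dtype='datetime64[ns]', freq='-1ME')
```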

@spencerkclark (Member)

Ah thanks for noting that @keewis—yes, I think we should port that bug fix over from pandas. I can try to do that later today along with the xfail.

@spencerkclark (Member)

#8996 takes care of the xfail. Porting the pandas negative frequency bug fix to cftime_range will take a little more care in terms of how we handle testing with older pandas versions, so I'll try and take a closer look at it over the weekend.

keewis (Collaborator) commented May 22, 2024

an update here: the release date of numpy=2.0 has been set to June 16th, which gives us 3-4 weeks to fix the remaining issues. To be safe I'd suggest we try to release a compatible version within the next two weeks (this is mostly aimed at myself, I guess).

@jakirkham

Thanks Justus! 🙏

What are the remaining issues?

keewis (Collaborator) commented May 22, 2024

The only remaining issue is #8946. As a summary, we're trying to support the Array API while simultaneously supporting python scalars in where. This is currently not supported by the Array API, but dropping support would be a breaking change for us – see data-apis/array-api#807 and data-apis/array-api#805 for discussion on whether or not the Array API could be changed to help us with that.

Either way we'll need to work around this, and so we need to be able to find a reasonable common dtype for strongly and weakly dtyped data.

In numpy<2.0 (i.e. before NEP 50) we would cast scalars to 0-d arrays, and since the dtype of those was mostly ignored, that gave the desired behaviour. I guess this is numpy-specific behaviour and would not have worked properly for Array API libraries.
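As a minimal sketch of that promotion problem (plain numpy, not xarray's actual where implementation):

```python
# Before NEP 50 (numpy<2), value-based casting mostly ignored the dtype of a 0-d
# array, so float32 data stayed float32; under numpy 2 the 0-d array's float64
# dtype participates fully and the result is promoted. This is what the
# test_where_type_promotion failures at the top of this issue show.
import numpy as np

cond = np.array([True, False])
data = np.array([1.0, 2.0], dtype="float32")

wrapped = np.where(cond, data, np.asarray(0.0))   # "other" as a 0-d float64 array
print(wrapped.dtype)   # numpy<2: float32; numpy>=2: float64

plain = np.where(cond, data, 0.0)                 # "other" as a plain Python scalar
print(plain.dtype)     # float32 (weak promotion keeps the array dtype)
```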

@jakirkham

^ @seberg would be interested in hearing your take on this question 🙂

keewis (Collaborator) commented Jun 7, 2024

we have two new unique issues:

  1. numpy.datetime64 scalars appear to have lost their component attributes (or rather, we're dispatching to numpy.datetime64 instead of cftime; see the snippet at the end of this comment):
xarray/tests/test_coding_times.py::test_infer_datetime_units_with_NaT[dates0-days since 1900-01-01 00:00:00]: AttributeError: 'numpy.datetime64' object has no attribute 'year'
  2. warnings about the conversion to datetime64[ns]:
xarray/tests/test_variable.py::test_datetime_conversion_warning[[datetime.datetime(2000, 1, 1, 0, 0)]-False]: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.

I don't get this with the release candidate, so I assume this is new in one of the upstream-dev versions, probably numpy or cftime (I can't tell for sure, though).
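For context, a trivial reproduction of the AttributeError from the first item (independent of how xarray ends up dispatching there):

```python
# numpy.datetime64 scalars expose no calendar component attributes such as .year;
# those exist on datetime.datetime / cftime objects, which is why dispatching to
# a raw numpy.datetime64 fails with the AttributeError shown above.
import numpy as np

ts = np.datetime64("1900-01-01")
print(hasattr(ts, "year"))   # False; accessing ts.year raises AttributeError
```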

@spencerkclark (Member)

Thanks @keewis—I'll take a look at these over the weekend. I wouldn't be surprised if they were pandas-related.
