fix: check for NaNs in emd loss matrix #623

bobluppes · 2024-05-17T10:38:22Z

Types of changes

This PR introduces an additional check for NaNs in the loss matrix of the emd computation. If NaNs are detected we raise an error in order to protect against segfaults in the C++ backend.

Motivation and context / Related issue

The motivation of this PR is to fail more gracefully in cases of NaN costs.
Closes #469

How has this been tested (if it applies)

Added new tests.

PR checklist

I have read the CONTRIBUTING document.
The documentation is up-to-date with the changes I made (check build artifacts).
All tests passed, and additional code has been covered with new tests.
I have added the PR and Issue fix to the RELEASES.md file.

⚠️
Some notes on the checklist above:

I did not find a CONTRIBUTING.md
While I ran all related tests, I did not run the entire suite (due to some modules missing on my system). I assume the entire suite is run in the CI?
Please let me know if this is something you would like a release issue for.

bobluppes · 2024-05-17T10:43:55Z

ot/lp/__init__.py


+    if np.isnan(M).any():
+        raise ValueError('The loss matrix should not contain NaN values.')
+


Failing early here ensures that we do not segfault in the accelerated emd_c call.

I did not look too deeply into the emd_c implementation, but my assumption is that this check is somewhat pessimistic. It may be possible to formulate problems in which a subset of the values in the loss matrix is never accessed (possibly because the graph is disconnected). In that case we could support NaN values in some cases. @rflamary what is your opinion on this?

If the graph is disconnected, then the parts that are not used should have an infinite value (which is handled by the C++ solver). I'm OK with not handling NaNs.
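
To illustrate the distinction between infinite and NaN costs, here is a minimal NumPy sketch. The matrices and the `has_nan` helper are made up for illustration and are not part of the PR. It shows only that `np.isnan` flags NaN entries and leaves `inf` entries alone, so a loss matrix that marks unused edges with `inf` would still pass the proposed check.

```python
import numpy as np

# Hypothetical 2x2 loss matrices, for illustration only.
M_inf = np.array([[0.0, np.inf], [np.inf, 0.0]])  # unused edges marked with inf
M_nan = np.array([[0.0, np.nan], [1.0, 0.0]])     # one undefined cost


def has_nan(M):
    """Return True if any entry of M is NaN (inf entries do not count)."""
    return bool(np.isnan(M).any())


print(has_nan(M_inf))  # False
print(has_nan(M_nan))  # True
```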

rflamary

A few comments. Thanks @bobluppes for the PR

rflamary · 2024-05-20T07:18:51Z

ot/lp/__init__.py

@@ -302,6 +304,9 @@ def emd(a, b, M, numItermax=100000, log=False, center_dual=True, numThreads=1, c
    ot.optim.cg : General regularized OT
    """

+    if np.isnan(M).any():


A problem here is that you are using numpy on arrays that might not be numpy (see backend function below). You should do the test later in the function, on the OT loss matrix that has been converted to numpy, to avoid backend errors.
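
A rough sketch of the ordering suggested here, under stated assumptions: `check_loss_matrix` is a hypothetical helper, and its `to_numpy` argument stands in for whatever conversion the backend provides (`np.asarray` is used as a placeholder). This is not the code that was merged; it only shows the NaN test running on the converted array rather than on the raw input.

```python
import numpy as np


def check_loss_matrix(M, to_numpy=np.asarray):
    """Convert M to a NumPy array first, then reject NaN entries.

    `to_numpy` is a placeholder for the backend's conversion routine
    (an assumption for this sketch); np.asarray is the default.
    """
    M_np = np.asarray(to_numpy(M), dtype=np.float64)
    if np.isnan(M_np).any():
        raise ValueError('The loss matrix should not contain NaN values.')
    return M_np


# Usage: a finite matrix is returned as a float64 array,
# while a matrix containing NaN raises ValueError.
M_ok = check_loss_matrix([[0.0, 1.0], [1.0, 0.0]])
try:
    check_loss_matrix([[0.0, float('nan')], [1.0, 0.0]])
except ValueError as err:
    print(err)
```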

bobluppes added 3 commits May 17, 2024 11:42

add tests

fb5bb0c

test emd directly

fc53a26

perform check in emd entrypoint

9942d1e

bobluppes marked this pull request as draft May 20, 2024 08:04

rflamary added 5 commits May 28, 2024 09:18

Merge branch 'master' into 469-segfault-gw-solver

0bf7dfa

Merge branch 'master' into 469-segfault-gw-solver

1336ed2

test if nans

be8a5ea

Merge branch 'master' into 469-segfault-gw-solver

727d01d

Update test_gw.py

79d00b9

rflamary marked this pull request as ready for review June 25, 2024 08:46

rflamary added 2 commits June 25, 2024 10:48

pep8

b75d07c

Merge branch 'master' into 469-segfault-gw-solver

1713360
