fix: check for NaNs in emd loss matrix #623

Open
wants to merge 10 commits into
base: master

Conversation

bobluppes

Types of changes

This PR introduces an additional check for NaNs in the loss matrix of the emd computation. If NaNs are detected we raise an error in order to protect against segfaults in the C++ backend.

Motivation and context / Related issue

The motivation of this PR is to fail more gracefully in cases of NaN costs.
Closes #469

How has this been tested (if it applies)

Added new tests.

PR checklist

  • I have read the CONTRIBUTING document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR and Issue fix to the RELEASES.md file.

⚠️
Some notes on the checklist above:

  • I did not find a CONTRIBUTING.md
  • While I ran all related tests, I did not run the entire suite (due to some modules missing on my system). I assume the entire suite is run in the CI?
  • Please let me know if this is something you would like a release note for.

Comment on lines 306 to 309

if np.isnan(M).any():
    raise ValueError('The loss matrix should not contain NaN values.')

Author

Failing early here ensures that we do not segfault in the accelerated emd_c call.

I did not look too deeply into the emd_c implementation, but my assumption is that this check is somewhat pessimistic. It may be possible to formulate problems in which some values of the loss matrix are never accessed (possibly because the graph is disconnected), in which case we could support NaN values. @rflamary, what is your opinion on this?

Collaborator

If the graph is disconnected, then the parts that are not used should have an infinite value (which is handled by the C++ solver). I'm OK with not handling NaNs.
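
As a sketch of that suggestion, a caller could replace NaN entries with +inf before passing the cost matrix to the solver. `mask_unused_costs` is a hypothetical helper, not part of the library, and it assumes every NaN marks an entry that is genuinely meant to be unused.

```python
import numpy as np


def mask_unused_costs(M):
    """Return a copy of M with NaN entries replaced by +inf.

    Hypothetical helper: assumes each NaN stands for a connection
    that should never be used by the transport plan.
    """
    M = np.asarray(M, dtype=np.float64)
    return np.where(np.isnan(M), np.inf, M)


# Example cost matrix with one undefined entry
M = np.array([[0.0, np.nan],
              [2.0, 1.0]])
M_masked = mask_unused_costs(M)
```

Doing this on the caller's side keeps the decision explicit: the library rejects NaNs, and the user states which entries are forbidden by setting them to infinity.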

Collaborator

@rflamary rflamary left a comment

A few comments. Thanks @bobluppes for the PR

@@ -302,6 +304,9 @@ def emd(a, b, M, numItermax=100000, log=False, center_dual=True, numThreads=1, c
ot.optim.cg : General regularized OT
"""

if np.isnan(M).any():
Collaborator

A problem here is that you are using NumPy on arrays that might not be NumPy arrays (see the backend function below). You should do the test later in the function, on the OT loss matrix that has been converted to NumPy, to avoid backend errors.
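
The suggested ordering can be sketched as follows. This is a minimal illustration, not the actual emd code: `ListBackend` and `emd_sketch` are hypothetical names, and the `to_numpy` method here only stands in for whatever conversion the library's backend performs.

```python
import numpy as np


class ListBackend:
    """Hypothetical stand-in for a non-NumPy backend (here: nested lists)."""

    def to_numpy(self, x):
        return np.asarray(x, dtype=np.float64)


def emd_sketch(M, nx):
    """Convert the loss matrix first, then run the NaN check on the result."""
    # Conversion happens before any NumPy-specific call touches M
    M_np = nx.to_numpy(M)

    # The check now always operates on a NumPy array
    if np.isnan(M_np).any():
        raise ValueError('The loss matrix should not contain NaN values.')

    # The real function would pass M_np on to the C++ solver here
    return M_np
```

Placing the check after the conversion means it runs once, on the same array the solver receives, regardless of the array type the caller passed in.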

@bobluppes bobluppes marked this pull request as draft May 20, 2024 08:04
@rflamary rflamary marked this pull request as ready for review June 25, 2024 08:46
Successfully merging this pull request may close these issues.

Nan in metric cost matrix causes 'segmentation fault core dumped' for GW solver
2 participants