Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utility to "normalize" numeric string values on a Document #1988

Closed
wants to merge 9 commits into from

Conversation

kwokcb
Copy link
Contributor

@kwokcb kwokcb commented Aug 23, 2024

Resolution

Fixes: #1981

Allow for a user to "normalize" numeric string values on a document. This allows for 2 documents which may have differing formatting for these values to be comparable for equivalency if both are "normalized" before the comparison.

Changes

  • Add in string value normalization. normalizeValueString(). This supports all numeric types. For integers only leading and trailing spaces are removed. All vector / matrix values are separated using this string: ", "
  • Use in Document new value normalization interface: Document.normalizeValueStrings()
  • Add in unit tests for each.
  • Add Pythong and JS bindings.

- Use in Document new value normalization interface:  Document.normalizeValueStrings()
- Add in unit tests for each.
source/MaterialXCore/Util.h Outdated Show resolved Hide resolved
doc->normalizeValueStrings();
doc2->normalizeValueStrings();

// Note: do not check doc2 == doc as that is a pointer comparison
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if there is a way to allow pointer comparison to be a value comparison, but left a note here since I ran into it.

  - Simplify logic.
  - Remove precision clamping.
- Add JS wrapper.
source/MaterialXCore/Util.cpp Outdated Show resolved Hide resolved
source/MaterialXCore/Document.cpp Outdated Show resolved Hide resolved
source/MaterialXCore/Util.cpp Outdated Show resolved Hide resolved
source/MaterialXCore/Document.cpp Outdated Show resolved Hide resolved
source/MaterialXCore/Util.h Outdated Show resolved Hide resolved
- Add integer, matrix support
- Move type into more generic method.
- Update tests.
- Do not remove trailing 0s for scientific notation. This just checks for the 'e' character after stripping leading and trailing spaces and leading zeros.
- Skip empty tokens, though this is invalid for MaterialX.
@jstone-lucasfilm
Copy link
Member

@kwokcb As a thought experiment, it seems worthwhile to consider the following line from the OpenPBR definition:

<input name="base_weight" type="float" value="1.0" uimin="0.0" uimax="1.0" uiname="Base Weight" uifolder="Base" />

After the proposed normalization logic, would it become the following?

<input name="base_weight" type="float" value="1" uimin="0.0" uimax="1.0" uiname="Base Weight" uifolder="Base" />

If so, then I wonder if this approach will really achieve our goal of normalizing numeric value strings across a document, allowing for more straightforward comparisons. In a sense, the transformation above would make the definition of OpenPBR less consistent rather than more so!

@kwokcb
Copy link
Contributor Author

kwokcb commented Sep 2, 2024

@kwokcb As a thought experiment, it seems worthwhile to consider the following line from the OpenPBR definition:

<input name="base_weight" type="float" value="1.0" uimin="0.0" uimax="1.0" uiname="Base Weight" uifolder="Base" />

After the proposed normalization logic, would it become the following?

<input name="base_weight" type="float" value="1" uimin="0.0" uimax="1.0" uiname="Base Weight" uifolder="Base" />

If so, then I wonder if this approach will really achieve our goal of normalizing numeric value strings across a document, allowing for more straightforward comparisons. In a sense, the transformation above would make the definition of OpenPBR less consistent rather than more so!

Hi @jstone-lucasfilm,

If it's desired then a .0 can be preserved. There is logic to remove the . if all trailing zeros is removed as the trailing zeros provide no additional precision. This can instead leave a .0 instead. The same way a 0. is preserved if all leading zeros are removed.

I don't think this utility should be used to enforce format consistency as it's intent was for comparison consistency and "normalization" semantics were added . That said, the above can be done but is basically preserving a 1 decimal place precision format when none is required.

@jstone-lucasfilm
Copy link
Member

@kwokcb In this case, it's not the choice of formatting conventions that I'm thinking of, but rather that these conventions aren't being applied consistently across the document. As a user, it would seem very surprising that normalization would be applied to the value attribute, but not to the uimin or uimax attribute, leaving the three of them in a seemingly inconsistent state.

I wonder if we may be pursuing the wrong approach here, and it may be worthwhile to go back to the initial motivation for this proposed feature. If the formatting of numeric strings is changing in a roundtrip of MaterialX documents through USD, is there potentially some way to preserve this formatting in the interchange steps, rather than attempting to apply a standardized string formatting after the fact?

@kwokcb
Copy link
Contributor Author

kwokcb commented Sep 3, 2024

  1. I don't see anyway to enforce a format for any translation process or it could just be hand-written.

  2. Yes I see your point. It's not affecting all numeric values (but ui meta-data isn't structured as actual Value data)

    I think I have a simpler and more consistent implementation. This goes back to actually converting to a typed data vs a generic string.

    • Instead of writing bespoke text formatting, use the existing string<->value conversion interfaces.

      • All public API paths eventually reach stringToData() when reading and dataToString() for writing.
    • This handles all current types (including numeric vales) so would be more future-proof as well.

    • This would give you "consistent" results in terms of what is currently already being applied via these interfaces.

    • Value elements can be handled by just calling Value::setValue(Value::getValue())

    • Non-Value elements such as ui meta-data can use Value::createValueFromString() (which is what get's called from Value::getValue()). This is currently being used to for UI data for MaterialXViewer + MaterialXGraphEditor.

  3. The alternative which could be much simpler is to just add a new function which compares typed values
    instead of strings. This would never mutate data, it would get strings this way:

value = valueElemen.getValue()
valueToCompare = value.getValueString()
  • Interestingly there is no Value::operator== method which could hide the string conversion logic.

@ld-kerley and @jstone-lucasfilm let me know your thoughts.

@ld-kerley
Copy link
Contributor

In general I'm less of a fan of having to mutate the document to perform a comparison on it.

I like the idea of having explicit some sort of comparison function for all the MaterialX elements, and that comparison function accepting arguments to control the criteria for comparison. This could account for the difference between "these two documents are going to render the same" vs "these two documents are going to present the same UI", and potentially also allow us to provide "these two documents are within a specified floating point tolerance of each other" by providing an error threshold.

@dbsmythe
Copy link
Contributor

dbsmythe commented Sep 3, 2024

Yeah, I was never expecting that the document would ever get changed. I thought this was strictly to be used as a way to check "are these two values the same" when values are expressed as strings without introducing potential floating point conversion errors.

@jstone-lucasfilm
Copy link
Member

Earlier in this discussion, I recall a suggestion to add an Element::isEquivalent method, and that may actually be the best path forward. This would avoid the potential pitfalls of document normalization, which isn't a concept that we've had in MaterialX previously, and may not actually be a well-posed process.

If we took the Element::isEquivalent path, we'd be matching the nomenclature used in our epsilon-based matrix comparisons, which are distinct from our raw matrix comparisons that demand exact matches:

bool isEquivalent(const M& rhs, S tolerance) const

@kwokcb
Copy link
Contributor Author

kwokcb commented Sep 3, 2024

This sounds good as we've basically come full circle back :).
BTW, the mutation only occurs if you take the result and reassign it back to the value.

I'd just suggest closing this PR and suggest

  • a Element::isEquivalent which is generic +
  • a ValueElement::isEquivalent specialization

@jstone-lucasfilm
Copy link
Member

That approach sounds good to me, @kwokcb, and thanks for working through this tricky set of issues with us! :)

@kwokcb
Copy link
Contributor Author

kwokcb commented Sep 4, 2024

A new PR is up ##2003. I will leave this for the time being if a comparison is needed for testing.
Otherwise this PR can be closed.

@jstone-lucasfilm
Copy link
Member

Thanks @kwokcb, and #2003 looks like a good approach to me, so let's close this original PR for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Equivalence operator fails to handle vector string differences
4 participants