Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equivalence operator fails to handle vector string differences #1981

Open
kwokcb opened this issue Aug 20, 2024 · 11 comments
Open

Equivalence operator fails to handle vector string differences #1981

kwokcb opened this issue Aug 20, 2024 · 11 comments

Comments

@kwokcb
Copy link
Contributor

kwokcb commented Aug 20, 2024

Issue

The operator== operator seems to detect that a element differs when the format of the string used to represent
vectors differs.

This seems very strange as the values are identical and would assume in memory that the comparison does not take into consideration the string formatting.

I cannot currently compare documents such as these.

Example

Test file 1:

<?xml version="1.0"?>
<materialx version="1.39" colorspace="srgb_texture">
  <nodegraph name="patternGraph">
    <input name="ramp_right" type="color3" value="0.8392156862745098, 0.12156862745098039, 0.12156862745098039" />
    <input name="ramp_left" type="color3" value="0.08627450980392157, 0.44313725490196076, 0.7137254901960784" />
    <input name="hex_size" type="float" value="0.4" />
    <input name="hex_tiling" type="vector2" value="20.0, 10.0" />
    <output name="out" type="color3" nodename="multiply_color4" />
    <texcoord name="texcoord_vector3" type="vector2">
      <input name="index" type="integer" value="0" />
    </texcoord>
    <multiply name="multiply_color4" type="color3">
      <input name="in1" type="color3" nodename="tiledhexagons_color4" />
      <input name="in2" type="color3" nodename="ramplr_color4" />
    </multiply>
    <tiledhexagons name="tiledhexagons_color4" type="color3">
      <input name="texcoord" type="vector2" nodename="texcoord_vector3" />
      <input name="uvtiling" type="vector2" interfacename="hex_tiling" />
      <input name="uvoffset" type="vector2" value="0.0, 0.0" />
      <input name="size" type="float" interfacename="hex_size" />
      <input name="staggered" type="boolean" value="false" />
    </tiledhexagons>
    <ramplr name="ramplr_color4" type="color3">
      <input name="valuel" type="color3" interfacename="ramp_left" />
      <input name="valuer" type="color3" interfacename="ramp_right" />
      <input name="texcoord" type="vector2" nodename="texcoord_vector3" />
    </ramplr>
  </nodegraph>
  <gltf_pbr name="gltf_shader" type="surfaceshader">
    <input name="base_color" type="color3" nodegraph="patternGraph" />
  </gltf_pbr>
</materialx>

Test file 2:

<?xml version="1.0"?>
<materialx version="1.39" colorspace="srgb_texture">
  <nodegraph name="patternGraph">
    <input name="ramp_right" type="color3" value="0.8392156862745098, 0.12156862745098039, 0.12156862745098039" />
    <input name="ramp_left" type="color3" value="0.08627450980392157, 0.44313725490196076, 0.7137254901960784" />
    <input name="hex_size" type="float" value="0.4" />
    <input name="hex_tiling" type="vector2" value="20.0, 10.0" />
    <output name="out" type="color3" nodename="multiply_color4" />
    <texcoord name="texcoord_vector3" type="vector2">
      <input name="index" type="integer" value="0" />
    </texcoord>
    <multiply name="multiply_color4" type="color3">
      <input name="in1" type="color3" nodename="tiledhexagons_color4" />
      <input name="in2" type="color3" nodename="ramplr_color4" />
    </multiply>
    <tiledhexagons name="tiledhexagons_color4" type="color3">
      <input name="texcoord" type="vector2" nodename="texcoord_vector3" />
      <input name="uvtiling" type="vector2" interfacename="hex_tiling" />
      <input name="uvoffset" type="vector2" value="0.0, 0.0" />
      <input name="size" type="float" interfacename="hex_size" />
      <input name="staggered" type="boolean" value="false" />
    </tiledhexagons>
    <ramplr name="ramplr_color4" type="color3">
      <input name="valuel" type="color3" interfacename="ramp_left" />
      <input name="valuer" type="color3" interfacename="ramp_right" />
      <input name="texcoord" type="vector2" nodename="texcoord_vector3" />
    </ramplr>
  </nodegraph>
  <gltf_pbr name="gltf_shader" type="surfaceshader">
    <input name="base_color" type="color3" nodegraph="patternGraph" />
  </gltf_pbr>
</materialx>

The differences are:

  • space between "," in vectors
  • usage of .0 vs not using .0. e.g. 0 vs `0.0

image

@jstone-lucasfilm
Copy link
Member

This is an interesting observation, @kwokcb, but I wonder if this is actually by design?

After all, we don't want our document comparison logic to convert from value strings to precision-limited floats and vectors before performing comparisons, as this would destroy information that was present in the original value strings.

Is there another approach that you had in mind, that might improve upon our current literal string comparisons, but without destroying information that was present in the original strings?

@kwokcb
Copy link
Contributor Author

kwokcb commented Aug 20, 2024

I guess if it is indeed doing a string compare then it should do a numeric value compare instead for operator==. Then string format would not matter. Would need to see what it's actually doing.

@jstone-lucasfilm
Copy link
Member

As noted above, it's my sense that converting strings to precision-limited floats and vectors for document comparisons would be a step backwards, as it would hide meaningful differences (e.g. 0.9 and 0.899999976 would be considered identical), and would remove the ability to compare documents containing invalid value strings.

I'm definitely open to other ideas on how to improve our document comparisons, though, if you have thoughts on strategies that might provide the best of both worlds.

@ld-kerley
Copy link
Contributor

Feels to me like maybe there are different use-cases for comparison, and perhaps the best approach is to think about supporting different types of comparison. I can see the value in a true value comparison, as well as the current string comparison.

@dbsmythe
Copy link
Contributor

Here's another possibility: keep it as a string compare, but preprocess each string first to 1) remove all whitespace, and 2) remove all ".0[0*]" (e.g. a decimal followed by one or more zeroes). That way, "1,1" and "1, 1" and "1.0,1.0" would all be "equal". This wouldn't cover all possible cases, but I bet it'd cover the most important differences with no destruction of actual information in the original string representation of the numeric values, and no floating-point precision concerns.

@jstone-lucasfilm
Copy link
Member

That's a really cool idea, @dbsmythe.

Taking this one step further, what if we added a new Document::normalizeValues method, which would apply the logic that you describe above to all value elements, removing trailing zeroes and decimal points from numerals and standardizing spacing in vectors and lists.

This would allow @kwokcb to achieve the result he's after by testing the following:

Document doc1Normal = doc1->copy();
Document doc2Normal = doc2->copy()
doc1Normal->normalizeValues();
doc2Normal->normalizeValues();
if (doc1Normal == doc2Normal)
{
    ...
}

An additional advantage of this separation is that Document::normalizeValues could become a new, optional component of scripts such as mxformat.py, where the normalization of values could be applied as part of the formatting process.

@ld-kerley
Copy link
Contributor

I like this idea, but maybe normalize implies something different for vector types. Perhaps standardizeValueFormatting()?

@jstone-lucasfilm
Copy link
Member

jstone-lucasfilm commented Aug 21, 2024

Good point, and perhaps Document::normalizeValueStrings would provide sufficient clarity that this is purely a string normalization?

We do have some precedent for the usage of normalize in the context of strings and paths, so it wouldn't be a completely new meaning for the word in MaterialX:

FilePath FilePath::getNormalized() const

@kwokcb
Copy link
Contributor Author

kwokcb commented Aug 21, 2024

This all sounds very promising.
I'd suggest the following interfaces to allow the same scoping flexibility as operator==

  1. A normalizeValueString(string) utility that returns the normalized version of input string.
  2. A normalizeValueStrings(Element) which can perform this at any root assuming the root is valid.

Thus this could be run on a document / nodegraph, node, input or just a string, to normalize as needed. This also does not require any class interface modifications.

@jstone-lucasfilm
Copy link
Member

That sounds reasonable to me, @kwokcb, and I might suggest the following for consistency with similar methods in MaterialX:

  1. A normalizeValueString helper method in Util.h, that can be used on a single string.
  2. A Document::normalizeValueStrings method that normalizes all value strings in the document. Although this could be lofted to Element, this seems like a feature in the spirit of Document::validate that's perhaps only meaningful at the Document level, and developers would be likely to look for the feature there.

@kwokcb
Copy link
Contributor Author

kwokcb commented Aug 23, 2024

I will put up something for this. Note that the per Element scope is useful as I've had to write different equivalence code to work around the strict ordering check which I logged an issue for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants