Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smoothstep breaks GLSL compilation when multiple type signature nodes are used #1978

Closed
HardCoreCodin opened this issue Aug 19, 2024 · 13 comments

Comments

@HardCoreCodin
Copy link

HardCoreCodin commented Aug 19, 2024

When multiple signature variants of smoothstep participate in a graph, the viewport render goes blank and this shows in the output:

Error in compiling fragment shader:
0(493) : error C1013: function "mx_smoothstep_float" is already defined at 0(483)

This is likely because the function mx_smoothstep_float exists in multiple headers that get included in such cases.

See attached example file.
smoothstep_failure_mtlx.txt

@HardCoreCodin HardCoreCodin changed the title smoothstep_float is broken smoothstep brakes GLSL compilation when multiple type signature nodes are used Aug 19, 2024
@kwokcb
Copy link
Contributor

kwokcb commented Aug 19, 2024

There are includes which I assume are not being checked to see if that file has already been included in the non-float code implementations

#include "mx_smoothstep_float.glsl"

This breaks GLSL and Metal at least.

@jstone-lucasfilm jstone-lucasfilm changed the title smoothstep brakes GLSL compilation when multiple type signature nodes are used Smoothstep breaks GLSL compilation when multiple type signature nodes are used Aug 19, 2024
@jstone-lucasfilm
Copy link
Member

Good catch, @HardCoreCodin, and this sounds like a bug in our GLSL shader generation that should be fixed.

@ld-kerley
Copy link
Contributor

I think the fix is probably a pretty easy #ifndef/#define trick in the right place - I can investigate and put up a PR soon

@kwokcb
Copy link
Contributor

kwokcb commented Aug 20, 2024

@ld-kerley , not sure if this is the place to look but I believe #includes are parsed and code is embedded -- but it should not try to include twice based on ShaderStage::addInclude().

@ld-kerley
Copy link
Contributor

In this case the #include statement is in a file that is already being included - a quick local test and adding a standard include guard around the actual source does appear to work in both GLSL and MSL.

@ld-kerley
Copy link
Contributor

I think I see what's happening.... The float version of smooth step is added first, but because this is a ShaderSourceNode it doesn't added with addInclude() which is where we do the de-duplication of the includes, but instead just directly calls addBlock().

The second (color) smoothstep node is also a ShaderSourceNode so it also just adds itself using addBlock() which does then parse the #include statements, and call addInclude() for each of them, but because the initial inclusion of the float smoothstep code wasn't added via addInclude() its not in its de-duplication map.

I can see a few possible solves here.

  • We could just add #ifndef/#define guards around code that is included, that way if other holes appear in the de-duplication logic they won't fail. This approach could lead to longer shader code files, but the shader compilers will just ignore the additional code by way of the pre-processor.
  • We could change the ShaderSourceNode implementation to call addInclude() instead of addBlock(), I'm not sure if this might leave other holes. Also ShaderSourceNode doesn't actually directly call addBlock() it calls emitBlock() in the ShaderGenerator interface, and I don't currently see any equivalent emitInclude() in that classes interface.
  • We could also impose a constraint that shader source files cannot include any other shader source files, and instead common code must be restructured in to a common header file. Concretely here that would mean that the float smoothstep file would just include some new file in genglsl/lib that would contain the actual implementation. It would be nice to add some sort of validation/unit test that we don't break this new constraint if we adopt this approach.

I think in the long term I would vote for option 3, but I also think that option 1 might be the pragmatic answer in the short term, and perhaps we file these ideas as an issue to be working on in the future.

Curious what you think, @kwokcb, @niklasharrysson, @jstone-lucasfilm ?

@kwokcb
Copy link
Contributor

kwokcb commented Aug 20, 2024

I’ll just throw out one more option. Turn the vector versions of this into graphs that use the float version node instances — hence no include dependence exists. Would have to think some more on the other options, but adding the common include seems like a good idea if it is added on demand.

@ld-kerley
Copy link
Contributor

I like the nodegraph implementation idea, though if we applied this to all destination languages, then I think we would be constructing less optimal OSL code.

  <implementation name="IM_smoothstep_vector3_genosl" nodedef="ND_smoothstep_vector3" target="genosl" sourcecode="smoothstep({{low}}, {{high}}, {{in}})" />

OSL currently provides a smoothstep() that can take a vector type that will allow for more aggressive OSL optimizer optimizations.

I guess that raises a hole in my MaterialX understanding. If a nodegraph implementation exists, alongside a target specific implementation as above, what happens? I guess the implementation that is more specific would ideally be selected, if so then we could safely deploy the nodegraph idea (which I think in general is how we should try and build all nodes to make them more maintainable), without any loss of OSL performance.

@ld-kerley
Copy link
Contributor

Interestingly we already have "some" smoothstep nodegraph implementations, so I think that casts an even stronger vote for that solution.

@dbsmythe
Copy link
Contributor

dbsmythe commented Aug 20, 2024

Yes, according to the Spec:

It is allowable for there to be both a <nodegraph> and an <implementation> for the same nodedef target/version, with the <implementation> generally prevailing in order to allow for optimized native-code node implementations, although ultimately it would be up to the host application to determine which implementation to actually use.

(The above sentence is in the paragraph right above the https://github.com/AcademySoftwareFoundation/MaterialX/blob/main/documents/Specification/MaterialX.Specification.md#example-custom-node-defined-by-a-nodegraph section)

@HardCoreCodin
Copy link
Author

There may be strong reasons in the near term to also have a custom smoothstep function, even for float, for all back-end targets, including OSL. And they would all likely need to change soon - see this other discussion on the OSL docs issue (it would also related to this GLSL implementation from the MaterialX's perspective).

@kwokcb
Copy link
Contributor

kwokcb commented Aug 21, 2024

@ld-kerley for your implementation matching question:

jstone-lucasfilm pushed a commit that referenced this issue Aug 22, 2024
Issue #1978 reports an issue with repeated code for the smoothstep node.

This PR will resolve that issue (and also add a unit test that demonstrates the issue) by converting all the smoothstep nodes to nodegraph implementations.

This PR also simplifies the nodegraphs for smoothstep to allow follow the same pattern.
@jstone-lucasfilm
Copy link
Member

Thanks for this report, @HardCoreCodin, and to @ld-kerley for the fix in #1982!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants