Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[✨ feature] Scene: allow bypassing ray intersection acceleration data structures #297

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

merlinND
Copy link
Member

@merlinND merlinND commented Sep 30, 2022

Description

For very simple scenes, the cost of building, maintaining and using acceleration data structures can sometime be greater than the time savings.
This PR enables skipping them entirely.

Testing

See test_accel_bypass.py.

Checklist

  • My code follows the style guidelines of this project
  • My changes generate no new warnings
  • My code also compiles for cuda_* and llvm_* variants. If you can't test this, please leave below
  • I have commented my code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I cleaned the commit history and removed any "Merge" commits
  • I give permission that the Mitsuba 3 project may redistribute my contributions under the terms of its license

This can be useful to completely bypass OptiX or other heavy acceleration data structures when the scene is trivial (e.g. a single cube).
Mostly adapted from the existing render tests.
@merlinND merlinND added the enhancement New feature or request label Sep 30, 2022
accel_init_cpu(props);
// Decide whether to use acceleration data structures for ray intersections
// TODO: do we even want a heuristic? Could lead to surprising changes in performance for the user.
bool naive_intersection_desirable = m_shapes.size() <= 5;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed with @Speierers, we may not want to enable this feature unless the user explicitly asks for it. The downside is that users may not find out it exists, but it's probably better than creating surprising performance jumps due to this heuristic.

dr::masked(pi.prim_index, valid) = prim_pi.prim_index;
dr::masked(pi.shape, valid) = prim_pi.shape;
dr::masked(pi.instance, valid) = prim_pi.instance;
dr::masked(pi.shape_index, valid) = shape_index;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to fix this. Currently encountering a compilation error if trying to do a masked assignment to pi, something about type computation from masked arrays of Shape:

In file included from /home/merlin/Code/mitsuba3-master/src/render/scene.cpp:3:
In file included from /home/merlin/Code/mitsuba3-master/include/mitsuba/render/bsdf.h:4:
/home/merlin/Code/mitsuba3-master/include/mitsuba/render/interaction.h:560:36: error: no type named 'Ray3f' in 'drjit::detail::MaskedArray<mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>'
    using Ray3f = typename Shape_::Ray3f;
                  ~~~~~~~~~~~~~~~~~^~~~~
/home/merlin/Code/mitsuba3-master/ext/drjit/include/drjit/array_router.h:2044:21: note: in instantiation of template class 'mitsuba::PreliminaryIntersection<drjit::detail::MaskedArray<drjit::DiffArray<CUDAArray<float>>>, drjit::detail::MaskedArray<mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>>' requested here
        masked_t<T> result;
                    ^
/home/merlin/Code/mitsuba3-master/src/render/scene.cpp:138:17: note: in instantiation of function template specialization 'drjit::masked<mitsuba::PreliminaryIntersection<drjit::DiffArray<CUDAArray<float>>, mitsuba::Shape<dr::DiffArray<dr::CUDAArray<float>>, Color<dr::DiffArray<dr::CUDAArray<float>>, 3>>>, drjit::DiffArray<drjit::CUDAArray<bool>>>' requested here
            dr::masked(pi, valid) = prim_pi;
                ^
/home/merlin/Code/mitsuba3-master/src/render/scene.cpp:460:22: note: in instantiation of member function 'mitsuba::Scene<drjit::DiffArray<CUDAArray<float>>, mitsuba::Color<drjit::DiffArray<CUDAArray<float>>, 3>>::ray_intersect' requested here
MI_INSTANTIATE_CLASS(Scene)
                     ^

dr::masked(pi.prim_index, valid) = prim_pi.prim_index;
dr::masked(pi.shape, valid) = prim_pi.shape;
dr::masked(pi.instance, valid) = prim_pi.instance;
dr::masked(pi.shape_index, valid) = shape_index;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.


print(f'\n--- {mi.variant()} ---')
for bypass, values in results.items():
print(f'{"naive" if bypass else "accel"}: {1000 * np.mean(values):.6f} ms')
Copy link
Member Author

@merlinND merlinND Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently getting these results for a scene containing a single analytic sphere, with the direct integrator:

--- scalar_rgb ---
naive: 71.434951 ms
[0.06400513648986816, 0.06670022010803223, 0.07629013061523438, 0.07175850868225098, 0.06787610054016113, 0.08331823348999023, 0.06868195533752441, 0.0772852897644043, 0.07218050956726074, 0.0662534236907959]
accel: 73.885083 ms
[0.07530069351196289, 0.07167458534240723, 0.07322049140930176, 0.06887698173522949, 0.07498621940612793, 0.08003997802734375, 0.07488870620727539, 0.0752708911895752, 0.07259535789489746, 0.07199692726135254]
------------------

--- llvm_ad_rgb ---
naive: 51.368260 ms
[0.04977750778198242, 0.055547237396240234, 0.04845452308654785, 0.04771161079406738, 0.05972123146057129, 0.04860043525695801, 0.0483553409576416, 0.059114694595336914,
0.047663211822509766, 0.0487368106842041]
accel: 51.119304 ms
[0.0474705696105957, 0.05318307876586914, 0.0463564395904541, 0.04656219482421875, 0.061937808990478516, 0.05034995079040527, 0.046396732330322266, 0.04927563667297363, 0.06153106689453125, 0.04812955856323242]
------------------

--- cuda_ad_rgb ---
naive: 13.897085 ms
[0.014396190643310547, 0.013736724853515625, 0.013788223266601562, 0.013866901397705078, 0.013941049575805664, 0.014055013656616211, 0.013768434524536133, 0.013695955276489258, 0.013820171356201172, 0.01390218734741211]
accel: 12.030840 ms
[0.012321233749389648, 0.01208186149597168, 0.011997222900390625, 0.011968374252319336, 0.01200246810913086, 0.011963605880737305, 0.011999368667602539, 0.01200246810913086, 0.011951684951782227, 0.012020111083984375]
------------------

This doesn't include the savings from accel build up and updates, e.g. in the context of an optimization loop.
Also, the case where I saw huge gains before was with a hardcoded axis-aligned cube intersection routine. Currently our cube shape is a type of Mesh, so we'd need a new shape type to find out if it still holds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the output values be exactly the same? (They are close, but it is curious that the equivalence is not perfect).

Copy link
Member Author

@merlinND merlinND Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output values of the time measurement?

If you're talking about the rendered images, they're really close, but if the intersections are computed in different ways, then I don't find it surprising that they are small imprecisions.

Scene:
image

Difference:
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, are those things in brackets also time values? 🤦

Regarding the actually computed intersections: differences around the machine epsilon (relative error ~1e-07) would be expected.

Copy link
Member Author

@merlinND merlinND Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay yeah we're not at machine epsilon :p
To be investigated, it might be that the implementations actually compute slightly different results (not just due to e.g. order of operations).

In my test scene all of the differences seem to be concentrated around the sphere (which happens to be the only refractive object). Could also be a matter of mint handling.

MI_DECLARE_CLASS()
private:
/// Axis-aligned bounding box in world space
field<BoundingBox3f, ScalarBoundingBox3f> m_bbox;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it enough to only carry a ScalarBoundingBox3f in this plugin?

@wjakob wjakob force-pushed the master branch 4 times, most recently from 4af72bd to 199b607 Compare November 30, 2022 20:51
@wjakob wjakob force-pushed the master branch 2 times, most recently from df4cbe0 to 64fedcd Compare January 17, 2023 23:05
@wjakob wjakob force-pushed the master branch 3 times, most recently from 3f3b8d0 to 1bdea6e Compare October 12, 2023 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants