WIP: fix kernel cache miss due to changing film crop_size, crop_offset #920

Open
wants to merge 5 commits into
base: master

@KoykL commented Sep 26, 2023

Description

Work in progress. Since I won't be able to work on this for a while, I decided to open a PR with the results I have so far. Hopefully it will be useful. Feel free to take over.

The current PR only fixes kernel cache misses in perspective.cpp and integrator.cpp, which seems to eliminate most kernel cache misses in render. render_backward still incurs kernel cache misses, likely due to ImageBlock.
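
To illustrate the class of problem for readers, here is a toy model in plain Python of why a changing crop size can defeat a kernel cache. It is only a sketch under assumptions: the `ToyKernelCache` class, the `trace_literal` and `trace_opaque` helpers, and the kernel source strings are all hypothetical and are not taken from this PR or from the actual JIT. The idea shown is that a value baked into the kernel source as a constant changes the cache key whenever it changes, whereas a value passed as a runtime parameter leaves the key unchanged.

```python
import hashlib


class ToyKernelCache:
    """Hypothetical cache keyed by a hash of the generated kernel source."""

    def __init__(self):
        self.kernels = {}
        self.misses = 0

    def launch(self, source, params):
        key = hashlib.sha256(source.encode()).hexdigest()
        if key not in self.kernels:
            # Cache miss: a real JIT would compile the kernel here.
            self.misses += 1
            self.kernels[key] = source
        # params would be uploaded at launch time; they do not affect the key.
        return key


def trace_literal(crop_size):
    # Assumption: the crop width is embedded in the source as a constant.
    width = crop_size[0]
    return f"x = idx % {width}; y = idx / {width};", ()


def trace_opaque(crop_size):
    # Assumption: the crop width is read from a parameter buffer instead.
    return "x = idx % params[0]; y = idx / params[0];", tuple(crop_size)


def count_misses(trace, crop_sizes):
    cache = ToyKernelCache()
    for size in crop_sizes:
        source, params = trace(size)
        cache.launch(source, params)
    return cache.misses


# A different crop window on most iterations, as in the linked issue.
crop_sizes = [(64, 64), (32, 48), (16, 16), (64, 64)]
literal_misses = count_misses(trace_literal, crop_sizes)
opaque_misses = count_misses(trace_opaque, crop_sizes)
```

In this toy model the literal variant misses once per distinct crop width, while the parameterized variant compiles a single kernel and reuses it for every crop size.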

Fixes #908

Testing

Checklist

  • My code follows the style guidelines of this project
  • My changes generate no new warnings
  • My code also compiles for cuda_* and llvm_* variants. If you can't test this, please leave below
  • I have commented my code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I cleaned the commit history and removed any "Merge" commits
  • I give permission that the Mitsuba 3 project may redistribute my contributions under the terms of its license

@njroussel
Member

Hi @KoykL

Thanks for the head start, I'll try to work out a proper fix next week -- as you said, there are some other changes in ImageBlock that are required to fully avoid kernel recompilations due to size changes.

@wjakob wjakob force-pushed the master branch 3 times, most recently from 3f3b8d0 to 1bdea6e Compare October 12, 2023 19:49
Development

Successfully merging this pull request may close these issues.

Changes in film crop_size, crop_offset results in new kernel being generated every iteration